I am trying to compare IServerXMLHTTPRequest and WinHTTP with regard to performance.
I would like to know:
What is the maximum size of data/file that can be sent?
What is the transfer rate when the file being sent is at that maximum size?
For those who might need information about this:
IServerXMLHTTPRequest is a thin layer above WinHTTP. Being a layer over WinHTTP means SXH will carry additional overhead. SXH doesn't provide any additional functionality over WinHTTP, other than the ability to directly support XML Document objects. (source)
Thus, by using the WinHTTP object directly, you achieve higher performance, better scalability, and reduced memory consumption. (source)
If you are dealing with very large payloads (either posting or receiving multi-megabyte requests/responses), use the WinHTTP Win32 API. The SXH component does not handle large data payloads efficiently; it stores all the data in a single memory buffer. The WinHTTP Win32 API allows the application to send/receive data using separate, smaller memory buffers. (source)
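The same buffering-versus-streaming principle can be illustrated outside of Win32. The sketch below is plain Java (not WinHTTP), using HttpURLConnection's chunked streaming mode against a hypothetical endpoint and file name; it streams a large upload in small fixed-size buffers instead of holding the whole payload in memory, which is essentially what the WinHTTP Win32 API allows and SXH does not:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ChunkedUpload {
        public static void main(String[] args) throws IOException {
            // Hypothetical endpoint and file name, for illustration only.
            URL url = new URL("http://example.com/upload");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            // Stream the body in 64 KB chunks instead of buffering it all.
            conn.setChunkedStreamingMode(64 * 1024);

            byte[] buffer = new byte[64 * 1024];
            try (InputStream in = new FileInputStream("bigfile.bin");
                 OutputStream out = conn.getOutputStream()) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n); // each chunk goes out as it is read
                }
            }
            System.out.println("HTTP status: " + conn.getResponseCode());
        }
    }

Without setChunkedStreamingMode, HttpURLConnection buffers the entire body in memory to compute Content-Length, which is the same single-buffer failure mode the quote describes for SXH.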
I know this is a very generic question, but I want to understand the major architectural decisions that allow Redis (or caches like Memcached, or Cassandra) to work at amazing performance limits.
How are connections maintained?
Are connections TCP or HTTP?
I know that it is completely written in C. How is the memory managed?
What are the synchronization techniques used to achieve high throughput in spite of competing reads/writes?
Basically, what is the difference between a plain vanilla implementation of a machine with an in-memory cache and a server that can respond to commands, and a Redis box? I also understand that a complete answer would be huge and would have to include very complex details. But what I'm looking for are some general techniques used, rather than all the nuances.
There is a wealth of information in the Redis documentation to understand how it works. Now, to answer your questions specifically:
1) How are connections maintained?
Connections are maintained and managed using the ae event loop (designed by the Redis author). All network I/O operations are non-blocking. You can see ae as a minimalistic implementation using the best network I/O demultiplexing mechanism of the platform (epoll for Linux, kqueue for BSD, etc.), just like libevent, libev, libuv, etc.
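To make the demultiplexing idea concrete, here is a minimal single-threaded loop, sketched in Java NIO rather than C (the Selector sits on top of epoll/kqueue just as ae does). It is far simpler than ae, and the port and echo behavior are made up for the example:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    public class MiniEventLoop {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open(); // epoll/kqueue under the hood
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(6400)); // example port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            ByteBuffer buf = ByteBuffer.allocate(4096);
            while (true) {
                selector.select(); // block until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) { // new client connection
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) { // data from an existing client
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        int n = client.read(buf);
                        if (n == -1) { client.close(); continue; }
                        buf.flip();
                        client.write(buf); // echo; Redis would parse and run a command
                    }
                }
            }
        }
    }

One thread multiplexes all the connections, which is exactly why no locking is needed on the data path.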
2) Are connections TCP or HTTP?
Connections are TCP, using the Redis protocol: a simple, telnet-compatible, text-oriented protocol that supports binary data. This protocol is typically more efficient than HTTP.
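As an illustration of the wire format, here is a sketch in Java that speaks the protocol over a raw TCP socket, assuming a Redis server on localhost:6379 (the key and value are examples):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    public class RespDemo {
        public static void main(String[] args) throws IOException {
            try (Socket s = new Socket("localhost", 6379)) {
                // RESP encodes "SET greeting hello" as an array of 3 bulk strings,
                // each prefixed with its exact byte length.
                String cmd = "*3\r\n"
                           + "$3\r\nSET\r\n"
                           + "$8\r\ngreeting\r\n"
                           + "$5\r\nhello\r\n";
                s.getOutputStream().write(cmd.getBytes(StandardCharsets.US_ASCII));
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
                System.out.println(in.readLine()); // prints "+OK" on success
            }
        }
    }

Because every argument carries an explicit byte length, the protocol can carry binary data and is very cheap to parse.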
3) How is the memory managed?
Memory is managed by relying on a general-purpose memory allocator. On some platforms, this is actually the system memory allocator. On some other platforms (including Linux), jemalloc has been selected, since it offers a good balance between CPU consumption, concurrency support, fragmentation, and memory footprint. The jemalloc source code is part of the Redis distribution.
Contrary to other products (such as memcached), there is no implementation of a slab allocator in Redis.
A number of optimized data structures have been implemented on top of the general-purpose allocator to reduce the memory footprint.
4) What are the synchronization techniques used to achieve high throughput in spite of competing read/writes?
Redis is a single-threaded event loop, so there is no synchronization to be done, since all commands are serialized. Now, some threads also run in the background for internal purposes. In the rare cases where they access the data managed by the main thread, classical pthread synchronization primitives are used (mutexes, for instance). But 100% of the data accesses made on behalf of multiple client connections require no synchronization at all.
You can find more information here:
Redis is single-threaded, then how does it do concurrent I/O?
What is the difference between a plain vanilla implementation of a machine with an in-memory cache and a server that can respond to commands, and a Redis box?
There is no difference. Redis is a plain vanilla implementation of a machine with an in-memory cache and a server that can respond to commands. But it is an implementation which is done right:
using the single-threaded event loop model
using simple and minimalistic data structures optimized for their corresponding use cases
offering a set of commands carefully chosen to balance minimalism and usefulness
constantly targeting the best raw performance
well adapted to modern OS mechanisms
providing multiple persistence mechanisms, because the "one size fits all" approach is only a dream
providing the building blocks for HA mechanisms (replication system for instance)
avoiding stacking up useless abstraction layers like pancakes
resulting in a clean and understandable code base that any good C developer can be comfortable with
I have a RESTful Java API that provides data to a Node.js client (which gzips data to users). The question is: if they are running on the same machine, should I gzip the data from the Java API to the Node.js application?
I'm asking because in this case I don't have to worry about network latency, but gzip compression may increase CPU utilization.
Is it worth using gzip in this situation?
If the objective is to increase the speed of the overall system, then using gzip to transfer across process boundaries is not very useful, particularly if the message is small enough to fit in memory. If the message is too large to fit in memory and some paging overhead is incurred, the benefit of gzip may be greater, but still nowhere near enough to justify using it. Gzip only makes sense when the speed of compression is significantly greater than the speed of communication. This is usually not the case with inter-process communication (even when it incurs page-fault overhead).
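A rough way to check this on your own hardware: time gzip over a payload and compare against what a loopback socket can move (typically gigabytes per second, versus tens to low hundreds of megabytes per second for single-core gzip). A hypothetical Java timing sketch; the 16 MB random payload is a worst case for the compressor but still a fair timing test:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.Random;
    import java.util.zip.GZIPOutputStream;

    public class GzipCost {
        public static void main(String[] args) throws IOException {
            byte[] payload = new byte[16 * 1024 * 1024]; // 16 MB test payload
            new Random(42).nextBytes(payload);

            long start = System.nanoTime();
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(payload);
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("compressed 16 MB to %d bytes in %.3f s (%.1f MB/s)%n",
                    bos.size(), seconds, 16 / seconds);
            // Unless this rate comfortably beats what the local socket can move,
            // compressing before a same-machine hop is a net loss.
        }
    }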
Redis is "memory monster". Storing data as "compressed json string" minimizes memory usage.
Is there any built-in compression option in Redis Db?
Redis uses the LZF light data compressor at dump time, so that won't reduce memory consumption: Redis does not compress the data in memory, it stores it as a plain string. You must deploy your own client-side compression code.
Lua scripting also provides a compression algorithm, but that branch is relatively new, so it would not be advisable to use it at production level.
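A minimal sketch of the client-side approach, assuming the Jedis client and an example JSON payload (the key and value names are illustrative):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.GZIPOutputStream;

    import redis.clients.jedis.Jedis;

    public class CompressedSet {
        static byte[] gzip(String text) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return bos.toByteArray();
        }

        public static void main(String[] args) throws IOException {
            String json = "{\"user\":\"alice\",\"visits\":1024}"; // example payload
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                // Redis just sees an opaque binary string; only the client
                // knows the value is gzipped JSON.
                jedis.set("user:alice".getBytes(StandardCharsets.UTF_8), gzip(json));
            }
        }
    }

Reading it back is the mirror image: fetch the bytes and wrap them in a GZIPInputStream before parsing the JSON.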
No, there isn't any runtime compression option.
However, as dan-boa said, it might be a good idea to implement compression on your application side. Doing it that way saves CPU on the Redis server: your database server won't spend CPU time on compression.
In one of our Redis clusters we saved about 82% of memory (from circa 340 GB to 60 GB) by gzipping our JSON-based blobs. More thoughts about it, and other ways of optimizing memory usage, can be found in our article:
http://labs.octivi.com/how-we-cut-down-memory-usage-by-82/
Note: link moved to archive.org backup
We have a web-based J2EE application that allows file upload/download. Due to latency issues, upload/download is slow for many users.
1) I read that sending data using UDP can improve data transfer speed. How can we send file data using UDP?
2) We are zipping files using GZIP before upload/download to reduce the amount of data transferred. Is there a better method available to improve data compression?
UDP is a protocol that does not guarantee the arrival of messages. You are most likely using a standard file transfer protocol like FTP, which should suit you fine. Are your issues with latency or with bandwidth? You might be better off investigating why the link has high latency or bandwidth issues, as this could prove to be a problem for other parts of your web application.
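For completeness on question 1, here is what naively "sending file data using UDP" looks like in Java; the receiver host and port are hypothetical, and the comments spell out why this is not enough on its own:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;

    public class NaiveUdpSend {
        public static void main(String[] args) throws IOException {
            InetAddress host = InetAddress.getByName("example.com"); // hypothetical receiver
            byte[] chunk = new byte[1400]; // stay under a typical MTU
            try (DatagramSocket socket = new DatagramSocket();
                 FileInputStream in = new FileInputStream("upload.bin")) {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    socket.send(new DatagramPacket(chunk, n, host, 9000));
                    // Any datagram may be dropped, duplicated, or reordered;
                    // a real transfer needs sequence numbers, acknowledgements,
                    // and retransmission on top of this (what TCP or UDT provide).
                }
            }
        }
    }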
GZIP and other compression tools are good for reducing the amount of data sent, if you're willing to put up with the up-front cost of compressing. These tools usually have options to tweak the level of compression (i.e. take a long time and compress optimally, or compress quickly but produce a larger file). You will probably need to experiment and see what balance works best for you.
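Java's built-in Deflater exposes exactly this speed/size knob. A small sketch comparing standard zlib levels (1 = fastest, 9 = smallest; the repetitive payload is made up, and String.repeat needs Java 11+):

    import java.util.zip.Deflater;

    public class CompressionLevels {
        public static void main(String[] args) {
            byte[] data = "some repetitive payload ".repeat(10_000).getBytes();
            for (int level : new int[]{Deflater.BEST_SPEED, 6, Deflater.BEST_COMPRESSION}) {
                Deflater deflater = new Deflater(level);
                deflater.setInput(data);
                deflater.finish();
                byte[] out = new byte[data.length]; // ample for this compressible input
                long t0 = System.nanoTime();
                int written = deflater.deflate(out); // one call suffices here
                double ms = (System.nanoTime() - t0) / 1e6;
                System.out.printf("level %d: %d -> %d bytes in %.2f ms%n",
                        level, data.length, written, ms);
                deflater.end();
            }
        }
    }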
1) Are there protocols faster than TCP on high latency links?
Yes, UDT is the primary example, but it is not a free trade-off; for instance, you would now need a custom front-end application to download files.
2) Is there better file compression than GZIP?
Yes, see the exhaustive list at http://www.maximumcompression.com/index.html; bzip2 and 7-zip are popular alternatives to gzip.
Note that for specific domains, such as text, photographic images, or scanned text, there are domain-specific codecs that are preferable.
We have a Win CE 6.0 device that is required to consume services that will be provided using WCF. We are attempting to reduce bandwidth usage as much as possible, and with a simple test we found that using UDP instead of HTTP saved a significant amount of data.
I understand there are limitations regarding WCF on .NET Compact Framework 3.5 devices and was curious what people thought would be the appropriate way forward. Would it make sense to develop a custom UDP binding, and would that work for both sides?
Any feedback would be appreciated. Thanks.
While HTTP does have some overhead, if it is becoming a significant part of your data usage, then I would suspect that your API is too "chatty", and fewer messages (each carrying more payload) should be considered.
The next point would be: how can we reduce the bandwidth for a given amount of payload? Compression is an option, but can be a problem on some platforms. Another is to use a serialization format that is inherently dense and efficient to process (in terms of CPU cycles, since you are using low-power devices). For that purpose, something like "protocol buffers" would be ideal.
protobuf-net is a CF-compatible implementation of protocol buffers for .NET; the CF build doesn't have all the nice WCF features (because CF doesn't support them), but it can work very effectively.
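To give a flavor of why protocol buffers are dense, here is a hedged sketch using the standard Java protobuf runtime rather than protobuf-net (the wire format is identical across implementations). The Telemetry message is hypothetical and would be generated by protoc from the schema shown in the comment:

    public class WireSizeDemo {
        public static void main(String[] args) {
            // Telemetry is a HYPOTHETICAL class, generated by protoc from:
            //   message Telemetry { int32 device_id = 1; sint32 temperature = 2; }
            Telemetry reading = Telemetry.newBuilder()
                    .setDeviceId(17)
                    .setTemperature(-4)
                    .build();
            byte[] wire = reading.toByteArray();
            // A handful of bytes: roughly one tag byte plus a varint per field,
            // versus hundreds of bytes for the same data in a SOAP/XML envelope.
            System.out.println(wire.length + " bytes on the wire");
        }
    }

Each field costs roughly one tag byte plus a variable-length integer, which is where the bandwidth saving comes from.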
Additionally, if you do go with HTTP, then MTOM should be considered, as it reduces the encoding overhead of binary data (i.e. what protobuf-net would produce).
Moving to UDP can be an option, but I would try something like HTTP + protobuf-net + MTOM first (combined with a less "chatty" API), and see how it stacks up.
I should also note that the current (downloadable) version of protobuf-net has some "kinks" with CF; it works, but it isn't as fast as it could be (due to limitations in meta-programming on CF). The "v2" product (not yet released) addresses all these points, allowing fully static (and fast) execution on CF. And best of all, it is free.