What is the best way to send large data from a WCF service to a client?

I have a particular service which returns a large amount of data. What is the best practice, and what options are available in WCF, to handle this?
This data is returned after all filtering has been done, so no further filtering is possible.
The data could run into gigabytes. I do understand there is a limit to how much data a system can handle.
But given the above scenario, what options/alternatives would you recommend?

Use streaming (see MSDN).

MTOM is a mechanism for transmitting large binary attachments with SOAP messages as raw bytes, allowing for smaller messages.
See http://msdn.microsoft.com/en-us/library/aa395209.aspx for details.
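Roughly, a streamed service and binding could look like the sketch below; the service name, file path, and quota values are illustrative, not taken from the question:

```csharp
using System;
using System.IO;
using System.ServiceModel;

[ServiceContract]
public interface ILargeDataService
{
    // Returning a Stream lets WCF send bytes as the client consumes them,
    // so the full payload never has to sit in memory at once.
    [OperationContract]
    Stream GetLargeData(string dataSetName);
}

public class LargeDataService : ILargeDataService
{
    public Stream GetLargeData(string dataSetName)
    {
        // WCF closes the stream once the response has been sent.
        return File.OpenRead(Path.Combine(@"C:\exports", dataSetName));
    }
}

public static class Host
{
    public static void Main()
    {
        var binding = new BasicHttpBinding
        {
            TransferMode = TransferMode.Streamed,    // stream, don't buffer
            MaxReceivedMessageSize = long.MaxValue,  // raise the size quota
            MessageEncoding = WSMessageEncoding.Mtom // MTOM for binary payloads
        };

        using (var host = new ServiceHost(typeof(LargeDataService),
                                          new Uri("http://localhost:8080/data")))
        {
            host.AddServiceEndpoint(typeof(ILargeDataService), binding, "");
            host.Open();
            Console.WriteLine("Service running; press Enter to stop.");
            Console.ReadLine();
        }
    }
}
```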

Related

How would I expose 200k+ records via an API?

What would be the best option for exposing 220k records to third-party applications?
an SF-style 'bulk API', independent of the standard API, to maintain availability
server-side pagination
a callback to an FTP-generated file?
webhooks?
This bulk transfer will have to happen once a day or so. Any other suggestions are welcome!
How are the 220k records being used?
Must serve it all at once
Not ideal for human consumers of this endpoint without special GUI considerations and communication.
A. I think that using a 'bulk API' would be marginally better than reading a file of the same data. (Not 100% sure on this.) Opening and interpreting a file might take a little bit more time than directly accessing data provided in an endpoint's response body.
Can send it in pieces
B. If only a small amount of data is needed at once, then server-side pagination should be used; it allows the consumer to request new batches of data as desired. This reduces unnecessary server load by not sending data that has not been specifically requested (see the sketch after this list).
C. If all of it needs to be received during a user session, then find a way to send the consumer partial information along the way. Often users can be temporarily satisfied with partial data while the rest loads, so update the client periodically with information as it arrives. Consider AJAX long-polling, HTML5 Server-Sent Events (SSE), or HTML5 WebSockets, as described here: What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?. Tech-stack details and third-party requirements will likely limit your options. Make sure to communicate to users that the application is still working on the request until it is finished.
Can send less data
D. If the third-party applications only need to show updated records, could a different endpoint be created for exposing this more manageable (hopefully) subset of records?
E. If the end-result is displaying this data in a user-centric application, then maybe a manageable amount of summary data could be sent instead? Are there user-centric applications that show 220k records at once, instead of fetching individual ones (or small batches)?
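As a minimal sketch of option B in WCF terms (the contract, type names, and page shape are assumptions for illustration):

```csharp
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class Record
{
    [DataMember] public int Id;
    [DataMember] public string Name;
}

[DataContract]
public class RecordPage
{
    [DataMember] public int PageNumber;
    [DataMember] public int TotalRecords; // lets the consumer plan its requests
    [DataMember] public List<Record> Items;
}

[ServiceContract]
public interface IRecordService
{
    // Consumers ask for one batch at a time, e.g. GetPage(3, 1000);
    // server-side this maps to something like OFFSET/FETCH in SQL.
    [OperationContract]
    RecordPage GetPage(int pageNumber, int pageSize);
}
```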
I would use a streaming API. This is an API that does a "select * from table" and then streams the results to the consumer. You do this with a loop that fetches and outputs the records one at a time. This way you never use much memory, and as long as you flush the output frequently, the web server will not close the connection, so any size of result set is supported.
I know this works as I (shameless plug) wrote the mysql-crud-api that actually does this.
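Translated into C#/ASP.NET terms, the loop-and-flush idea might look like this sketch (the handler, connection string, table, and columns are all illustrative):

```csharp
using System.Data.SqlClient;
using System.Web;

// Streams query results to the client row by row, flushing as it goes,
// so memory use stays flat regardless of result-set size.
public class ExportHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "application/json";
        context.Response.BufferOutput = false; // send bytes as they are written

        using (var conn = new SqlConnection("Server=.;Database=App;Integrated Security=true"))
        using (var cmd = new SqlCommand("SELECT Id, Name FROM Records", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                context.Response.Write("[");
                bool first = true;
                int row = 0;
                while (reader.Read())
                {
                    if (!first) context.Response.Write(",");
                    first = false;
                    context.Response.Write(
                        "{\"id\":" + reader.GetInt32(0) +
                        ",\"name\":\"" +
                        HttpUtility.JavaScriptStringEncode(reader.GetString(1)) +
                        "\"}");
                    if (++row % 1000 == 0)
                        context.Response.Flush(); // keep the connection alive
                }
                context.Response.Write("]");
            }
        }
    }

    public bool IsReusable { get { return true; } }
}
```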

Azure storage queue (JSON vs binary)

What's the best way to store custom messages in the queue? I mean, if I have a queue that can store different types of messages, should I store them in binary format or as JSON?
What do you think?
The Windows Azure Storage Client Library provides overloads for binary and string payloads that handle encoding for you. As such, you can use any serialization mechanism you like, as long as the serialized form is less than 64 KB.
Hence, the answer to your question depends on your specific scenario. Handling JSON data is much easier, but if you have a specific need to send the data in another format, consider such alternatives. For larger payloads, some users store the data in blob or table storage and have the queue message simply point to it, keeping the queue message itself for reliable delivery.
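For illustration, enqueueing a JSON-serialized message with the classic client library might look something like this sketch (the queue name, payload type, and the blob fallback are assumptions):

```csharp
using System;
using System.Text;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;
using Newtonsoft.Json;

public static class QueueWriter
{
    public static void Enqueue(string connectionString, object message)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        CloudQueue queue = account.CreateCloudQueueClient()
                                  .GetQueueReference("messages");
        queue.CreateIfNotExists();

        // JSON keeps messages human-readable and easy to version.
        string json = JsonConvert.SerializeObject(message);

        if (Encoding.UTF8.GetByteCount(json) <= 64 * 1024)
        {
            queue.AddMessage(new CloudQueueMessage(json));
        }
        else
        {
            // Over the 64 KB limit: store the body in blob storage and
            // enqueue only a pointer to it (not shown here).
            throw new InvalidOperationException("Message too large for a queue message.");
        }
    }
}
```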

WCF best practices regarding MaxItemsInObjectGraph

I have run into the exception below a few times in the past and each time I just change the configuration to allow a bigger object graph.
"Maximum number of items that can be serialized or deserialized in an object graph is '65536'. Change the object graph or increase the MaxItemsInObjectGraph quota."
However, I was speaking to a colleague and he said that WCF should not be used to send large amounts of data; instead, the data should be bite-sized.
So what is the general consensus about large amounts of data being returned?
In my experience using synchronous web service operations to transmit large data sets or files leads to many different problems.
Firstly, you have performance-related issues: serialization time at the service boundary. Then you have availability issues: incoming requests can time out waiting for a response, or may be rejected because there is no dispatcher thread free to service the request.
It is much better to delegate large data transfer and processing to some offline asynchronous process.
For example, in your situation, you send a request and the service returns a URI to the eventual resource you want. You may have to wait for the resource to become available, but you can code your consumer appropriately.
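A contract-level sketch of that request-then-poll pattern (every name here is illustrative, not an established API):

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

public enum ReportState { Pending, Ready, Failed }

[DataContract]
public class ReportTicket
{
    [DataMember] public Guid ReportId;
    [DataMember] public Uri ResourceUri; // where the result will appear
}

[ServiceContract]
public interface IReportService
{
    // Returns immediately; the heavy work happens in an offline process.
    [OperationContract]
    ReportTicket RequestReport(string criteria);

    // The consumer polls until the state is Ready, then fetches ResourceUri.
    [OperationContract]
    ReportState GetReportState(Guid reportId);
}
```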
I haven't got any concrete examples, but this article seems to suggest that WCF can be used for large data sets, and I am aware of people using it for images.
Personally, I have always had to increase this property for any real world data.
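For what it's worth, increasing it programmatically on the client looks roughly like this (a sketch; the helper and endpoint configuration name are illustrative):

```csharp
using System.ServiceModel;
using System.ServiceModel.Description;

public static class QuotaHelper
{
    // Raises the object-graph quota on every operation of an endpoint.
    public static ChannelFactory<TContract> CreateFactory<TContract>(string endpointName)
    {
        var factory = new ChannelFactory<TContract>(endpointName);
        foreach (OperationDescription operation in factory.Endpoint.Contract.Operations)
        {
            var behavior = operation.Behaviors.Find<DataContractSerializerOperationBehavior>();
            if (behavior != null)
                behavior.MaxItemsInObjectGraph = int.MaxValue; // effectively unlimited
        }
        return factory;
    }
}
```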

Using WCF to provide large report datasets

I have a need for an application to access reporting data from a remote database. We currently have a WCF service that handles the I/O for this database. Normally the application just sends small messages back and forth between the WCF service and itself, but now we need to run some historical reports on that activity. The result could be several hundred to a few thousand records. I came across http://msdn.microsoft.com/en-us/library/ms733742.aspx which talks about streaming, but it also mentions segmenting messages, which I didn't find any more information on. What is the best way to send large amounts of data such as this from a WCF service?
It seems my options are streaming or chunking. Streaming restricts other WCF features, message security being one (http://msdn.microsoft.com/en-us/library/ms733742.aspx). Chunking is breaking up a message into pieces then putting those pieces back together at the client. This can be done by implementing a custom Channel which MS has provided an example of here: http://msdn.microsoft.com/en-us/library/aa717050.aspx. This is implemented below the security layer so security can still be used.
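The MS sample implements chunking inside a custom channel; a rough application-level approximation (the contract and record type are illustrative assumptions) chunks at the operation level instead, which likewise keeps message security intact because each chunk is an ordinary message:

```csharp
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class ReportRecord
{
    [DataMember] public int Id;
    [DataMember] public string Payload;
}

[ServiceContract]
public interface IChunkedReportService
{
    // The client calls this in a loop, advancing 'skip' by 'take' each
    // time, until fewer than 'take' records come back.
    [OperationContract]
    ReportRecord[] GetRecords(string reportName, int skip, int take);
}
```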

Is this possible in WCF?

I have a WCF service which returns a list of many objects, e.g. 100,000.
I get an error when calling this function because the maximum size I am allowed to pass back from WCF has been exceeded.
Is there a built-in way I could return this in smaller chunks, e.g. 20,000 at a time?
I can increase the size allowed back from WCF, but I was wondering what the alternatives were.
Thanks
Without knowing your requirements, I'd take a look at two other possible options:
Paging: If your 100,000 objects are coming from a database, then use paging to reduce the amount of data, and invoke the service in batches with a page number (see the consuming sketch after these options). If the objects are not coming from a database, then you'd need to look at how that data will be stored server-side between invocations.
Streaming: Return the data to the caller as a stream instead.
With the streaming option, you'd have to do some more work in terms of managing the serialization of the objects, but it would allow the client to 'pull' the objects from the service at its own pace. Streaming is supported in most, if not all, of the standard bindings (including HTTP).
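To make the paging option concrete, the consuming loop might look like this sketch (the proxy interface and object type are illustrative):

```csharp
using System.Collections.Generic;

public class MyObject { public int Id; public string Name; }

// The service side would expose a paged method along these lines.
public interface IObjectService
{
    MyObject[] GetObjects(int pageNumber, int pageSize);
}

public static class Consumer
{
    // Pulls all objects in batches of 20,000 until the service runs dry.
    public static List<MyObject> FetchAll(IObjectService client)
    {
        const int pageSize = 20000;
        var results = new List<MyObject>();
        for (int page = 0; ; page++)
        {
            MyObject[] batch = client.GetObjects(page, pageSize);
            results.AddRange(batch);
            if (batch.Length < pageSize)
                break; // last (possibly partial) page reached
        }
        return results;
    }
}
```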