Is this possible in WCF?

I have a WCF service which returns a list of many objects, e.g. 100,000.
I get an error when calling this function because the maximum size I am allowed to pass back from WCF has been exceeded.
Is there a built-in way I could return this in smaller chunks, e.g. 20,000 at a time?
I can increase the size allowed back from WCF, but I was wondering what the alternatives were.
Thanks

Without knowing your requirements, I'd take a look at two other possible options:
Paging: If your 100,000 objects are coming from a database, then use paging to reduce the amount of data and invoke the service in batches with a page number. If the objects are not coming from a database, then you'd need to look at how that data will be stored server-side between invocations.
Streaming: Return the data to the caller as a stream instead.
With the streaming option, you'd have to do more work managing the serialization of the objects, but it would allow the client to 'pull' the objects from the service at its own pace. Streaming is supported in most, if not all, of the standard bindings (including HTTP).
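To make the paging option concrete, a rough sketch of what the contract could look like follows; the names (IProductService, ProductDto, PagedResult) and the shape of the result are invented for illustration, not taken from the question:

```csharp
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;

// Illustrative contract only: the client asks for one page at a time
// instead of the full 100,000-item list.
[ServiceContract]
public interface IProductService
{
    [OperationContract]
    PagedResult<ProductDto> GetProducts(int pageNumber, int pageSize);
}

[DataContract]
public class ProductDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

[DataContract]
public class PagedResult<T>
{
    [DataMember] public int TotalCount { get; set; } // lets the caller work out how many pages remain
    [DataMember] public List<T> Items { get; set; }  // at most pageSize items per response
}
```

On the server, pageNumber/pageSize would typically translate into a Skip/Take (or OFFSET/FETCH) query, so only one page's worth of rows ever leaves the database.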

Related

How would I expose 200k+ records via an API?

What would be the best option for exposing 220k records to third party applications?
SF-style 'bulk API', independent of the standard API, to maintain availability
server-side pagination
callback to an FTP-generated file?
webhooks?
This bulk transfer will have to happen once a day or so. Any other suggestions welcome!
How are the 220k records being used?
Must serve it all at once
Not ideal for human consumers of this endpoint without special GUI considerations and communication.
A. I think that using a 'bulk API' would be marginally better than reading a file of the same data. (Not 100% sure on this.) Opening and interpreting a file might take a little bit more time than directly accessing data provided in an endpoint's response body.
Can send it in pieces
B. If only a small amount of data is needed at once, then server-side pagination should be used, allowing the consumer to request new batches of data as desired. This reduces unnecessary server load by only sending data when it has been specifically requested (see the consumer-side sketch after this list).
C. If all of it needs to be received during a user session, then find a way to send the consumer partial information along the way. Often users can be temporarily satisfied with partial data while the rest loads, so update the client periodically with information as it arrives. Consider AJAX long-polling, HTML5 Server-Sent Events (SSE), or HTML5 WebSockets, as described here: What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?. Tech stack details and third party requirements will likely limit your options. Make sure to communicate to users that the application is still working on the request until it is finished.
Can send less data
D. If the third party applications only need to show updated records, could a different endpoint be created for exposing this more manageable (hopefully) subset of records?
E. If the end-result is displaying this data in a user-centric application, then maybe a manageable amount of summary data could be sent instead? Are there user-centric applications that show 220k records at once, instead of fetching individual ones (or small batches)?
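As a rough sketch of option B from the consumer's side (reusing the illustrative IProductService contract from the earlier answer; the page size is an arbitrary example), the third party pulls batches until the total count is reached:

```csharp
using System.Collections.Generic;

public static class PagedConsumer
{
    // Pulls all records page by page; each individual response stays small.
    public static List<ProductDto> FetchAll(IProductService client, int pageSize = 1000)
    {
        var all = new List<ProductDto>();
        int page = 1;
        while (true)
        {
            PagedResult<ProductDto> result = client.GetProducts(page, pageSize);
            all.AddRange(result.Items);

            // Stop after the last (possibly short) page.
            if (all.Count >= result.TotalCount || result.Items.Count < pageSize)
                break;
            page++;
        }
        return all;
    }
}
```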
I would use a streaming API. This is an API that does a "select * from table" and then streams the results to the consumer. You do this using a loop to fetch and output the records. This way you never use much memory, and as long as you frequently flush the output, the web server will not close the connection and you will support any size of result set.
I know this works, as I (shameless plug) wrote mysql-crud-api, which actually does this.
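The same fetch-and-flush idea in a .NET flavour might look roughly like this (the table, column names, and JSON-lines output format are assumptions made for the sake of the example):

```csharp
using System.Data.SqlClient;
using System.IO;
using System.Text;

public static class RecordStreamer
{
    // Streams query results to the output one row at a time, so memory use
    // stays flat no matter how many rows the query returns.
    public static void StreamRows(string connectionString, Stream output)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT Id, Name FROM Records", connection))
        using (var writer = new StreamWriter(output, Encoding.UTF8))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // One JSON object per line (real code would JSON-escape the values);
                    // frequent flushing keeps the connection alive for very large result sets.
                    writer.WriteLine("{\"id\":" + reader.GetInt32(0) +
                                     ",\"name\":\"" + reader.GetString(1) + "\"}");
                    writer.Flush();
                }
            }
        }
    }
}
```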

Best way to store data between two requests

I need a bit of theoretical advice. Here is my situation: I have a search system which returns a list of found items. But the user is allowed to display only a particular number of items on one page, so when his first request is sent to my WCF service, it gets the whole list, then tests whether the list is longer than the number of items the user is allowed to get. If it isn't, there is no problem and my service returns the whole list; but when it is, there is a problem. I need to let the user choose which page he wants to display, so I let the JavaScript know that the user should choose a page, the "page number dialog" is shown, and the user then sends a second request with the page number. Based on this request the web service selects the relevant items and sends them back to the user.
So what I need to do is store the whole list on the server between the first and second request, and I'd appreciate any idea how to store it. I was thinking about session state, but I don't know if it is possible to set a timeout only for a particular session entry (e.g. Session["list"]), because the list is used only once and can have thousands of items, so I don't want to keep it on the server too long.
PS. I can't use standard pagination; the scenario has to be exactly as described above.
Thanks
This sounds like a classic use case for memcached. It is a network-based key-value store for storing temporary values. Unlike purely local in-memory state, it can be used to share temporary cached values among servers (say you have multiple nodes), and it is a great way to save state across requests (avoiding the latency that would be caused by using cookies, which are transmitted to/from the server on each HTTP request).
The basic approach is to create a unique ID for each request, and associate it with a particular (set of) memcached key for that user's requests. You then save this unique ID in a cookie (or similar mechanism).
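A minimal sketch of that idea using the Enyim client listed below (the SearchResult type, key scheme, and 10-minute expiry are assumptions; check the client's documentation for the exact configuration it expects):

```csharp
using System;
using System.Collections.Generic;
using Enyim.Caching;
using Enyim.Caching.Memcached;

public class SearchResultCache
{
    private readonly MemcachedClient _cache = new MemcachedClient();

    // Stores the full search result under a unique key and returns that key;
    // the key is what you would hand back to the browser (e.g. in a cookie).
    public string SaveResults(List<SearchResult> results)
    {
        string key = "search:" + Guid.NewGuid().ToString("N");
        _cache.Store(StoreMode.Set, key, results, TimeSpan.FromMinutes(10)); // short TTL: the list is only needed once
        return key;
    }

    // Returns null if the entry has expired or been evicted (memcached is volatile).
    public List<SearchResult> LoadResults(string key)
    {
        return _cache.Get<List<SearchResult>>(key);
    }
}

[Serializable] // the default transcoder needs the cached type to be serializable
public class SearchResult
{
    public int Id;
    public string Title;
}
```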
A warning, though: the memory is volatile, so entries can be lost at any point. In practice this is not frequent, and memcached uses an LRU eviction policy. More details: http://code.google.com/p/memcached/wiki/NewOverview
http://memcached.org/
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
I'm not a .NET programmer, but there appear to be implementations:
http://code.google.com/p/memcached/wiki/Clients
- .Net memcached client: https://sourceforge.net/projects/memcacheddotnet (.Net 2.0 memcached client)
- http://www.codeplex.com/EnyimMemcached (client developed in .NET 2.0 keeping performance and extensibility in mind; supports consistent hashing)
- http://www.codeplex.com/memcachedproviders
- BeIT Memcached Client (optimized C# 2.0): http://code.google.com/p/beitmemcached
- jehiah: http://jehiah.cz/projects/memcached-win32

Can having more WCF methods in a service decrease performance?

What is the best practice for designing WCF services with regard to having more or fewer operations in a single service?
Taking into consideration that a service must be generic and business oriented, I have encountered some SOAP services at work that have too many XML elements per operation in their contracts and too many operations in a single service.
From my point of view, without testing, I think the number of operations within a service will not have any impact on performance in the middleware, since a response is built specifically for each operation and contains only the XML elements concerning that operation.
Or are there any issues with having too many operations within a SOAP service?
There is an issue, and that is when trying to do a metadata exchange or proxy creation against a service with many methods (probably in the thousands). Since it will be trying to do the entire thing at once, it could time out, or even hit an OutOfMemoryException.
I don't think it will impact performance much, but the important thing is that methods must be logically grouped into different services. A service with a large number of methods usually means it has not been logically factored.

WCF best practices in regard to MaxItemsInObjectGraph

I have run into the exception below a few times in the past and each time I just change the configuration to allow a bigger object graph.
"Maximum number of items that can be serialized or deserialized in an object graph is '65536'. Change the object graph or increase the MaxItemsInObjectGraph quota."
However, I was speaking to a colleague and he said that WCF should not be used to send large amounts of data; instead the data should be bite-sized.
So what is the general consensus about large amounts of data being returned?
In my experience, using synchronous web service operations to transmit large data sets or files leads to many different problems.
Firstly, you have performance-related issues: serialization time at the service boundary. Then you have availability issues. Incoming requests can time out waiting for a response, or may be rejected because there is no dispatcher thread to service the request.
It is much better to delegate large data transfer and processing to some offline asynchronous process.
For example, in your situation, you send a request and the service returns a URI to the eventual resource you want. You may have to wait for the resource to become available, but you can code your consumer appropriately.
I haven't got any concrete examples, but this article seems to point to WCF being used for large data sets, and I am aware of people using it for images.
Personally, I have always had to increase this property for any real world data.
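For what it's worth, when you do decide to raise it, the quota can also be set in code on both sides rather than in config. A hedged sketch follows; the contract names and the 1,000,000 figure are just examples:

```csharp
using System.Collections.Generic;
using System.ServiceModel;
using System.ServiceModel.Description;

// Hypothetical contract names for illustration; only the quota settings are the point.
[ServiceContract]
public interface IReportService
{
    [OperationContract]
    List<int> GetLargeReport();
}

// Service side: the quota can be raised with the ServiceBehavior attribute.
[ServiceBehavior(MaxItemsInObjectGraph = 1000000)]
public class ReportService : IReportService
{
    public List<int> GetLargeReport() { return new List<int>(); }
}

public static class ClientQuota
{
    // Client side: raise the same quota on each operation of the generated proxy.
    public static void RaiseObjectGraphQuota(ClientBase<IReportService> proxy)
    {
        foreach (OperationDescription op in proxy.Endpoint.Contract.Operations)
        {
            var serializerBehavior = op.Behaviors.Find<DataContractSerializerOperationBehavior>();
            if (serializerBehavior != null)
                serializerBehavior.MaxItemsInObjectGraph = 1000000;
        }
    }
}
```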

What is the best way to send large data from WCF service to Client?

I have a particular service which returns a large amount of data. What are the best practices and options available in WCF to handle this?
This large data set is returned after all filtering has been done, so no further filtering is possible.
The data could run into gigabytes. I do understand there is a limit to how much data a system can handle.
But given the above scenario, what options/alternatives would you recommend?
Use streaming (see MSDN).
MTOM is a mechanism for transmitting large binary attachments with SOAP messages as raw bytes, allowing for smaller messages.
see: http://msdn.microsoft.com/en-us/library/aa395209.aspx for details.
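A rough sketch of the streaming option in code follows; the contract name, timeout, and quotas are illustrative, and in configuration the same settings correspond to transferMode and maxReceivedMessageSize on the binding:

```csharp
using System;
using System.IO;
using System.ServiceModel;

// Illustrative contract: streamed operations need a single Stream (or Message)
// body rather than a large object graph.
[ServiceContract]
public interface IExportService
{
    [OperationContract]
    Stream GetExport(); // the client reads the data at its own pace
}

public static class StreamedBindingFactory
{
    public static BasicHttpBinding Create()
    {
        return new BasicHttpBinding
        {
            TransferMode = TransferMode.Streamed,    // don't buffer the whole message in memory
            MaxReceivedMessageSize = long.MaxValue,  // quota for the receiving side
            SendTimeout = TimeSpan.FromMinutes(30)   // large transfers take a while
        };
    }
}
```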