OData RIA PowerPivot Large Message Size - WCF

I was playing with PowerPivot, directly loading 3 million rows from a SQL database, and performance is surprisingly good.
I tried generating a simple OData service using VS2010 and Silverlight RIA Services and accessing it from PowerPivot. That works with small numbers of rows, but blows up on the server if a single method tries to return 3 million rows. Not surprising, I guess.
I've often run into the message size issue with WCF, and it is a real pain to configure transports to support larger sizes. Besides, ideally I don't want one big message but some sort of packeting of the data. Adding a layer of RIA and OData on top of WCF seems to make the idea of changing max message sizes even more convoluted.
Is there any support in the OData interface for a transport that will stream or packet the data returned from a method?
Is this a limitation of WCF/RIA or of OData itself? Is it possible to use PowerPivot connected to an OData source that returns millions of rows?
Does anybody have ideas for better techniques for exposing large sets of data via WCF / RIA / OData?
thanks,
Adam

Found it! In the DataService<> class, the InitializeService method needed to call config.SetEntitySetPageSize.
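For reference, a minimal sketch of what that fix looks like in the DataService<> InitializeService method (the "MyEntities" context and "BigRows" entity set names are placeholders, not from the original service):

    using System.Data.Services;
    using System.Data.Services.Common;

    // Placeholder names: "MyEntities" stands in for the generated entity context and
    // "BigRows" for whatever entity set holds the 3 million rows.
    public class MyDataService : DataService<MyEntities>
    {
        public static void InitializeService(DataServiceConfiguration config)
        {
            config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);

            // Server-driven paging: each response carries at most 2,000 rows plus a
            // "next" link, which OData clients such as PowerPivot follow automatically,
            // so no single message ever has to hold the full 3 million rows.
            config.SetEntitySetPageSize("BigRows", 2000);

            // Server-driven paging needs the V2 protocol.
            config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
        }
    }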

Related

WCF streaming mode is really slow

I want to know why WCF in streaming mode is really slow compared to the buffered mode.
Basically, I'm reading a lot of data from a server (database access) and then transferring that huge data set through WCF to other clients.
I was doing some tests and benchmarks by comparing the 2 different transfer modes.
I created 2 endpoints. The first one is using transferMode="Buffered" and the other one is using transferMode="StreamedResponse".
By loading the same 1 million rows from SQL Server (a dummy table), here are the results:
Buffered: 20447 milliseconds.
Streaming: 109417 milliseconds.
The streaming is done like in that Q/A. Basically, the data is stored in an IEnumerable<T> and then streamed to the client that consumes it.
I can provide the WCF app.config files if needed.
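Not the actual configuration from the question, but roughly the programmatic equivalent of the two endpoints described above (contract, address and message size limits are illustrative; a streamed response has to be exposed as a Stream):

    using System;
    using System.IO;
    using System.ServiceModel;

    [ServiceContract]
    public interface IBulkData
    {
        // With transferMode="StreamedResponse" the operation returns a Stream
        // (or a Message); WCF writes it to the wire as it is read.
        [OperationContract]
        Stream GetRows();
    }

    public static class HostSketch
    {
        public static ServiceHost Open(Type serviceType)
        {
            var host = new ServiceHost(serviceType, new Uri("net.tcp://localhost:9000/bulk"));

            var buffered = new NetTcpBinding
            {
                TransferMode = TransferMode.Buffered,
                MaxReceivedMessageSize = 512L * 1024 * 1024
            };
            var streamed = new NetTcpBinding
            {
                TransferMode = TransferMode.StreamedResponse,
                MaxReceivedMessageSize = 512L * 1024 * 1024
            };

            // Same contract, two endpoints -- the benchmark then hits one or the other.
            host.AddServiceEndpoint(typeof(IBulkData), buffered, "buffered");
            host.AddServiceEndpoint(typeof(IBulkData), streamed, "streamed");
            host.Open();
            return host;
        }
    }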
By the way, I have already had a look at other similar questions, like the following:
WCF NetTcpBinding Buffered vs Streamed performance problems
But they don't really give an appropriate answer.

Where should we calculate fields?

I'm currently working on a Silverlight / MS SQL project where the Entity Framework has not been implemented, and I would like to know the best practice for dealing with calculated fields in this particular situation.
Considering that some external system might also consume my data directly in the DB or through a web service, here are the three options I can see right now.
1) Force any external system to consume data thru a web service and create all the calculated fields in the objects only.
2) Create the calculated fields in a DB view and resync your object with the server each time a value needs to be calculated.
3) Replicate the calculation rules in the object and the database view.
Any other suggestions would also be welcomed.
I would recommend following two principles: data decoupling and minimum duplication of functionality. Both suggest putting your calculations in one place only and serving them already calculated. So I would implement the calculations in the DB and serve them via a web service.
However, you have to consider your particular case. For example, if the calculations are VERY heavy, you could delegate them to the client to spare server resources. This could even be the reason you are using Silverlight. I am in a similar situation on a project, and I found that the best compromise is to push raw data to the client and have it do the heavy computations.
Having a single best practice or approach for this kind of problem is difficult, as circumstances change and what was formerly a good approach might start to seem less useful. That said, where possible I would do anything data related at the DB level, including calculated fields. That way, no matter where you look at the data from, you will see the same results: your web service, SQL reporting and anything else that needs to look at or receive the data will all get the same values.

Sync Framework 2.0 + WCF Service - OutofMemoryException

I have a process using Microsoft Sync Framework 2.0 across a WCF service (IIS hosted) to synchronize a SQL 2008 Standard database (server) and SQL CE 3.5 (client). All was working perfectly until a single user started receiving OutOfMemoryExceptions. As it turns out, this user has a dataset that is significantly larger than any other user's.
The dataset in question is 800,000 rows, with a total size when exported to CSV from SSMS of 174MB. Most users are in the 20-30MB range, which works fine.
I am using the DbServerSyncProvider, and SqlCeClientSyncProvider.
I have implemented batching as described in other articles and posts, to no avail. As I understand it, the batching mechanism in the DbServerSyncProvider is just a setting for how many revisions of the data to retrieve in one pull. Even with an anchor difference of 1, I still end up with the same sized dataset.
I am using transferMode="Streamed" on my service, and I have applied the fix for Streamed when hosting in IIS.
I have tried upping the maxReceivedMessageSize, first from 20MB to 200MB, then to 2GB, and finally to 10GB, all with no success. This was done on both the server and client.
My WCF trace logs show the Execute of GetChanges, but never log anything under the Process action.
I have read about the SqlSyncProvider and how it allows batching by memory size. I can't find much information about using it through a WCF service, though, and before I attempt to rewrite my client and server around it, I wanted to check whether I was being an idiot about something and whether the SqlSyncProvider could solve my issue while still transferring across a WCF service.
Thanks in advance...
The out-of-memory error is most likely caused by the way DataSets are serialized.
If you want to re-write using the SqlSyncProvider, check out the section Code Specific to N-Tier on this link: http://msdn.microsoft.com/en-us/library/dd918908.aspx#Y3096. That should give you an idea on writing the WCF service component for the SqlSyncProvider.
You may also check out the sample SQL Server and SQL Compact N-Tier with WCF
If you want to retain your existing providers, you can play around with DataSetSurrogates. Check out a sample here: Sync Framework WCF-based Synchronization for Offline scenario – Using custom dataset serialization
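To illustrate the memory-based batching mentioned above: with the SqlSyncProvider, batching is driven by a memory threshold rather than by anchors. A hedged sketch of the server-side provider setup (scope name, connection string and batch directory are made up):

    using System.Data.SqlClient;
    using Microsoft.Synchronization.Data.SqlServer;

    // Hedged sketch, not the poster's code: scope name, connection string and the
    // batch spool directory are illustrative.
    public static class SyncProviderFactory
    {
        public static SqlSyncProvider CreateServerProvider()
        {
            var connection = new SqlConnection(
                @"Data Source=.;Initial Catalog=ServerDb;Integrated Security=True");

            var provider = new SqlSyncProvider("MySyncScope", connection)
            {
                // MemoryDataCacheSize is in KB: once roughly 5 MB of change data has
                // been enumerated, the provider spools it to batch files on disk
                // instead of building one giant in-memory DataSet.
                MemoryDataCacheSize = 5 * 1024,
                BatchingDirectory = @"C:\SyncBatches"
            };

            return provider;
        }
    }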

Write-through caching of large data sets in WCF?

We've got a smart client that talks to a SQL Server database via WCF, displaying the entities in the database, and allowing the user to edit those entities.
Some of the WCF calls return a large data set. Since this data set doesn't change very often, I'm considering some sort of write-through cache on the client, and only getting the deltas from the WCF service.
That is: the client both reads from the service and writes to the service.
I'm not looking for disconnected/offline operation, but since the majority of the data doesn't change very often, I'd probably implement this with a local data store.
I don't want the local store to get too stale, and I don't think I'm too concerned about conflict resolution, because updates will always go straight to the WCF service -- think of it as a write-through cache.
Would Microsoft's Sync Framework be good for this? Could I use a local SQL-CE cache and perform the updates over WCF? The service end has a SQL Server 2005/2008 backend, but I don't want to talk to it directly. Does Sync Framework integrate well with WCF?
Are there other solutions out there? Should I roll something myself?
I don't think you have to couple it to WCF at all. FeedSync allows you to publish directly to an RSS feed.
The only thing that I'm not too sure about is whether it would be suitable for a "large dataset", though. Since you don't need two-way replication, if your dataset is extremely large you might want to write your own WCF implementation to optimize it, especially for the initial population.
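For completeness, if you do go the Sync Framework route described in the question (local SQL CE cache, sync service over WCF), the client-side wiring of the offline providers usually looks roughly like this; the WCF proxy object and table name are placeholders, and download-only sync fits the write-through idea since writes keep going straight to the service:

    using Microsoft.Synchronization;
    using Microsoft.Synchronization.Data;
    using Microsoft.Synchronization.Data.SqlServerCe;

    // Hedged sketch: "syncServiceProxy" is assumed to be a WCF client channel that
    // implements the server sync contract (GetChanges/ApplyChanges/GetSchema/GetServerInfo);
    // the table and file names are illustrative.
    public static class LocalCacheSync
    {
        public static SyncStatistics Refresh(object syncServiceProxy)
        {
            var agent = new SyncAgent
            {
                LocalProvider = new SqlCeClientSyncProvider(@"Data Source=cache.sdf"),
                RemoteProvider = new ServerSyncProviderProxy(syncServiceProxy)
            };

            // Download-only: the local SQL CE store is just a read cache, writes
            // still go straight to the WCF service.
            var orders = new SyncTable("Orders")
            {
                CreationOption = TableCreationOption.DropExistingOrCreateNewTable,
                SyncDirection = SyncDirection.DownloadOnly
            };
            agent.Configuration.SyncTables.Add(orders);

            return agent.Synchronize();
        }
    }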

WCF/Silverlight/SQL DB Caching Strategies

OK, I have a pretty complex Silverlight app that gets its data from a WCF service (an ASP.NET-hosted service layer), which in turn calls into a data layer that calls stored procedures in a SQL 2005 DB to extract the needed data. So the round trip goes like this:
Silverlight App --> WCF Service --> Data Layer --> DB --> Data Layer --> WCF Service transforms Data Entity into corresponding DTO (Data Transfer Object) or List<> thereof --> Silverlight App
Much of the data is highly relational (so it needs to exist in the DB), but it will change infrequently. It seems that I have several choices of locations to cache this "semi-constant" data:
1) I can cache it in the data layer. My data layer is already set up to use the SQLDependency class and cache the results from a stored procedure call. I think that this is, or can be, an application-level cache.
2) I can cache the resulting DTO in an application-level (or session-level, depending on the call) cache within the WCF service itself.
2a) I could even take this a step further by serializing the XML for the resulting DTO(s) into a file on the WCF service side, so that I could (a) check the memory cache, then (b) check the file cache and (c) hit the data layer.
3) I could do something similar to 2a with isolated storage on the client side within the SL app. I could serialize the data to local isolated storage with a hash (or a moddate or something) and then just make a call to check that.
One more thing to add: I am hosting this WCF service in IIS7 with dynamic compression turned on so that the (often very large and easily compressed) XML response gets gzip-ed. Ideally, it would seem, I would like IIS to cache this gzip-ed result to avoid all the extra processing. I think that it may do this already but I am not sure.
I am pretty sure that the final answer to this is some flavor of "it depends", but I would love to hear how others are approaching this. A good tactical recipe of "do X, test performance with tool Y, then do Z if needed" would be great to have.
A few links (I will add to this as I research this):
WCF Caching Approach
If you have user data that changes quite rarely and needs a fast response, going for a custom mechanism based on local storage is a great advantage: it is much faster than having to wait for a server round trip.
Dino Esposito published an interesting article about local storage and caching in MSDN Magazine; there you can also find an approach for caching assemblies (imagine loading only the minimum package required and loading the rest of the assemblies in the background... a performance rocket, at the cost of more complexity in your code :)).
As you said, it is a matter of putting it all in the balance and deciding.
HTH
Braulio
My approach would be this:
Determine whether there is actually a problem with performance (isn't it already acceptable to my users?).
Measure the performance at each tier (how long does it take the database to come up with the data? how long does it take the service to respond with the data? how much time does it take to get from the service to the client?).
Based on the measurements, I would then determine where to do my caching. Remember that the closer to your data storage you cache, the easier it is, but the closer to the client you cache, the better the performance gain (usually).
Also remember that caching should not be the first thing you do to improve performance. You should look into other performance gains as well. Are the stored procedures slow? Is there a lot of overhead in the WCF messages? Is there some inefficient processing in the service? Do I really need all that data in one message?
HTH,
Jonathan
I think #2 is your best bet for maintainability and architecture. IIS provides caching, why not use it?
You don't want to have to reference System.Web from a data layer. Client side is not the best option either, because you'd have to write a bunch of additional code to keep the data synchronized.
Is System.Web caching even available to WCF when it's not running in ASP.NET compatible mode? Probably best not to depend on it and write your own.
On the other hand, look into Microsoft's Velocity project, which looks like it will produce a very interesting caching technology not dependent on ASP.NET.
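If the service is on .NET 4, one way to "write your own" without pulling in System.Web is System.Runtime.Caching. A minimal sketch (the DTO type, cache key and expiry are illustrative):

    using System;
    using System.Collections.Generic;
    using System.Runtime.Caching;   // .NET 4; no System.Web dependency

    // Sketch only: "CustomerDto" and the data-layer callback are placeholders for
    // the real DTOs and data access code.
    public class CustomerDto { public int Id { get; set; } public string Name { get; set; } }

    public static class DtoCache
    {
        private static readonly MemoryCache Cache = MemoryCache.Default;

        public static IList<CustomerDto> GetCustomers(Func<IList<CustomerDto>> loadFromDataLayer)
        {
            const string key = "customers";

            var cached = Cache.Get(key) as IList<CustomerDto>;
            if (cached != null)
                return cached;

            // Miss: hit the data layer once, then keep the transformed DTOs around
            // for a while so subsequent WCF calls skip the DB round trip.
            var fresh = loadFromDataLayer();
            Cache.Set(key, fresh, new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(30)
            });
            return fresh;
        }
    }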
We just recently implemented #3, client-side caching using Isolated Storage.
In our app we have a lot of drop-downs and custom fields which the app used to get from the server every time it loaded. Moving this data to IS really helped. The app now makes a call to check whether there have been any changes on the server; if not, it loads the data from IS, otherwise (which is pretty rare) it refreshes IS.
That eliminated a lot of WCF calls and data transfers, the SL pages' loading time is shorter, and the app in general became more scalable because of the reduced network traffic and DB access.
Yes, there is some coding involved, but the benefits for the end users are essential.
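A rough sketch of that check-then-load pattern in Silverlight (the types, the version-stamp idea and the serialization choice are illustrative, not the actual implementation described above):

    using System.Collections.Generic;
    using System.IO;
    using System.IO.IsolatedStorage;
    using System.Runtime.Serialization;

    [DataContract]
    public class CachedLookups
    {
        [DataMember] public string Version { get; set; }          // server-side change stamp
        [DataMember] public List<string> Items { get; set; }      // stand-in for the drop-down data
    }

    public static class LookupCache
    {
        private const string FileName = "lookups.xml";
        private static readonly DataContractSerializer Serializer =
            new DataContractSerializer(typeof(CachedLookups));

        // Returns the cached data if it matches the version reported by the server,
        // otherwise null so the caller knows to fetch fresh data and call Save().
        public static CachedLookups LoadIfCurrent(string serverVersion)
        {
            using (var store = IsolatedStorageFile.GetUserStoreForApplication())
            {
                if (!store.FileExists(FileName)) return null;
                using (var stream = store.OpenFile(FileName, FileMode.Open))
                {
                    var cached = (CachedLookups)Serializer.ReadObject(stream);
                    return cached.Version == serverVersion ? cached : null;
                }
            }
        }

        public static void Save(CachedLookups data)
        {
            using (var store = IsolatedStorageFile.GetUserStoreForApplication())
            using (var stream = store.OpenFile(FileName, FileMode.Create))
            {
                Serializer.WriteObject(stream, data);
            }
        }
    }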
Andrew
If you use RIA Services, then a simple approach is to have two separate EDMX definitions: one for cached entities, one for transactional ones.
One domain context can reference the entities of another DomainContext via AddReference.
The cached entities could be loaded immediately after the user has authenticated. For simplicity, transactional data should not be loaded until the cached entities have loaded.
Depending on the size of the cache, you may also wish to consider serializing these values to local storage.