We are developing a project that involves about 10 different WCF services with several endpoints each. One of the services keeps a few big tables of data cached in memory.
We have found we need access to that data from another service. Rather than keeping 2 copies of the cache, I'd like to be able to share those tables across all services.
I have done some research and found some articles about using an IExtension attached to the servicehosts to store the shared data.
Provided that all the services are running under the same web site, will that work? And is it the right approach? Or should I be looking elsewhere?
If the data that you're caching is required by more than one service, it sounds like - from a Service Oriented Architecture perspective, anyway - it doesn't belong in either of the services you have calling it.
If the data being cached isn't really related to either service, but is something that both services need, then perhaps it belongs in its own separate service. Have you considered encapsulating your cache in a third service, and performing a service-to-service call to retrieve the data you need? Benefits include...
It solves your original dilemma, avoiding the need to read the whole cache from the database several times;
It encapsulates the cache in one place for easy maintenance/change later.
It allows you to abstract the implementation of the cache away from the other services by putting another service interface in the way.
All in all, I'd suggest that's the best approach. The only downside is the extra overhead of making the service-to-service call, but that surely outperforms having to read the whole cache from the database.
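As a rough illustration of the third benefit, here is a minimal sketch (in Python for brevity, not WCF/C#) of putting an interface in front of the cache so the calling services never see how the tables are held. `TableCacheService`, `InMemoryTableCache` and `OrderingService` are hypothetical names, and in a real deployment the cache would be reached through a service call rather than an in-process object.

```python
from abc import ABC, abstractmethod


class TableCacheService(ABC):
    """Contract the other services depend on; hides how the cache is stored."""

    @abstractmethod
    def get_rows(self, table_name):
        """Return the cached rows for one table."""


class InMemoryTableCache(TableCacheService):
    """One possible implementation: tables loaded once and held in memory."""

    def __init__(self, tables):
        self._tables = dict(tables)

    def get_rows(self, table_name):
        # Return a copy so callers cannot mutate the shared cache.
        return list(self._tables[table_name])


class OrderingService:
    """A consumer that knows only the contract, not the implementation."""

    def __init__(self, cache):
        self._cache = cache

    def product_names(self):
        return [name for _id, name in self._cache.get_rows("products")]


cache = InMemoryTableCache({"products": [(1, "Widget"), (2, "Gadget")]})
print(OrderingService(cache).product_names())
```

Because the consumer only depends on `get_rows`, the implementation behind it could later be swapped without touching the callers.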
Alternatively, if the data in your cache is very closely related to BOTH of the services that are calling the cache, i.e. both services add/change the data in the cache, etc. then perhaps the two existing services should be combined into a single service.
If what I'm saying is making some sense, then the principle of SOA I'm drawing on is Service Autonomy.
Provided all your services are part of the same application there doesn't seem to be any reason why you can't share the cache directly via a shared object reference. The simplest way of doing this is via a static field.
If you choose this approach, one thing to be very careful about is thread safety. If your cache is concurrently accessed via two WCF sessions, you must ensure that the two sessions are not going to interfere with each other by both changing the cache at the same time. If the cache is read-only, your need to do this is lessened, but you still might need to synchronise initialisation of the cache.
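A minimal sketch of synchronised initialisation for a shared, read-only cache, written in Python for brevity (in C# the equivalent would be a static field guarded by a lock). `SharedTableCache` and `load_tables_from_database` are hypothetical names, and the sketch assumes the tables are never modified after loading.

```python
import threading


class SharedTableCache:
    """Process-wide cache shared by every service hosted in the same process."""

    _lock = threading.Lock()
    _tables = None  # populated once, then treated as read-only

    @classmethod
    def get_tables(cls, loader):
        # Fast path: already initialised, so read-only access needs no lock.
        tables = cls._tables
        if tables is not None:
            return tables
        # Slow path: only one thread runs the (expensive) loader.
        with cls._lock:
            if cls._tables is None:
                cls._tables = loader()
            return cls._tables


def load_tables_from_database():
    # Hypothetical loader standing in for the real database read.
    return {"customers": [(1, "Alice")], "products": [(10, "Widget")]}


# Two "services" asking for the tables receive the same object.
tables_a = SharedTableCache.get_tables(load_tables_from_database)
tables_b = SharedTableCache.get_tables(load_tables_from_database)
print(tables_a is tables_b)
```

If the services also write to the cache, every read and write would need to go through the lock (or a concurrent collection) as well.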
A simple question about scalability. I have been studying scalability and I think I understand the basic concept behind it. You use an orchestrator like Kubernetes to manage the automatic scalability of a system. That way, as a particular microservice gets an increased demand of calls, the orchestrator will create new instances of it to deal with the demand. Now, in our case, we are building a microservice structure similar to the example one at Microsoft's "eShop On Containers":
Now, here each microservice has its own database to manage, just like in our application. My question is: When upscaling this system, by creating new instances of a certain microservice, let's say "Ordering microservice" in the example above, wouldn't that create a new set of databases? In the case of our application, we are using SQLite, so each microservice has its own copy of the database. I would assume that in order to be able to upscale such a system, each microservice would have to connect to an external SQL Server. But if that was the case, wouldn't that be a bottleneck? I mean, having multiple instances of a microservice to attend more demand of a particular service BUT with all those instances still accessing a single database server?
In the case of our application, we are using SQLite, so each microservice has its own copy of the database.
One of the most important aspects of services that scale out is that they are stateless - services on Kubernetes should be designed according to the 12-factor principles. This means that service instances cannot each have their own copy of the database, unless it is a cache.
I would assume that in order to be able to upscale such a system would require that each microservice connects to an external SQL Server.
Yes, if you want to be able to scale out, you need to use a database that is outside the instances and shared between them.
But if that was the case, wouldn't that be a bottleneck?
This depends very much on how you design your system. Comparing microservices to monoliths: a monolith typically uses one big database, whereas with microservices it is easier to use multiple different databases, so it should be much easier to scale out the database this way.
I mean, having multiple instances of a microservice to attend more demand of a particular service BUT with all those instances still accessing a single database server?
There are many ways to scale a database system as well, e.g. caching read-operations (but be careful). But this is a large topic in itself and depends very much on what and how you do things.
I have a few (small size) tables, saved in Table Storage which I use only for reading from.
When my service starts, I'd like to read all tables, save the data in a data structure (i.e. List), and read from that List from there on.
Is there a way to do that, or must I read from the Table Storage each time I need data?
If there is a way, where should the List be declared, and where should it be initialized?
Thanks.
Azure cache may be the best route, but there is an obvious cost.
Could you declare the WCF service as a singleton and store the data as a static property?
You could use the Windows Azure Cache service to store the data. See http://www.windowsazure.com/en-us/home/tour/caching/
If your list is not too big, you could use the Windows Azure caching component http://www.windowsazure.com/en-us/home/tour/caching/ . During the initialization process of your service, read the information from your tables and store it there. You are also asking where the list should be declared and initialized. Are you also hosting your service on Windows Azure? Is this a web service running on IIS, or a Windows service? Are you using WCF to expose your service?
I see others are suggesting static properties (good choice) and Azure Cache. Anyway, it is good to cache the data if it is not often updated, rather than read it from Table Storage every time.
I want to give my two cents:
I would not use Azure Cache if the data is small enough (1MB is small enough for me). A static property would do the work. But there is also something new in .NET 4.0 that most programmers seem to have overlooked: the System.Runtime.Caching namespace. I haven't personally used it yet, but it seems to be a good fit for small local caches. You could use the MemoryCache object and store your data in memory. And, of course, program against it like any other type of cache: in the getter of a property, check if the data exists in the cache. If it exists, return it. If it does not exist, retrieve it from the tables, store it in the cache, and then return it.
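The check, load, store, return flow described above can be sketched as follows. This is an illustration in Python, not the System.Runtime.Caching API; `LocalCache`, `get_or_load` and `read_settings_from_table_storage` are hypothetical names, and the sketch is not thread-safe as written.

```python
import time


class LocalCache:
    """Small in-memory cache with a time-to-live per entry."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._entries.get(key)
        # Cache hit: the entry exists and has not expired yet.
        if entry is not None and entry[0] > now:
            return entry[1]
        # Cache miss: load from the source, store, then return.
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value


loads = []

def read_settings_from_table_storage():
    # Hypothetical stand-in for reading the tables from Table Storage.
    loads.append(1)
    return ["row1", "row2"]


cache = LocalCache(ttl_seconds=300)
cache.get_or_load("settings", read_settings_from_table_storage)
cache.get_or_load("settings", read_settings_from_table_storage)
print(len(loads))  # the second call is served from the cache
```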
We have a Data Access service in our SOA WCF system. This service is responsible for doing CRUD (create, read, update, delete) operations on "system wide" database tables, and is also the source of this data for queries. Any other service in the system wanting to access the tables under the control of the DAS has to go to the DAS to get it or modify it. We use Entity Framework and built our own POCO state tracking system for this DAS.
We have other tables in our database that belong to single services and store data only for their own use, i.e. state information they can access if they crash and resume, or recordings of business information. We have a rule that any one table cannot be accessed by more than one service, so data needed by multiple services ends up in the DAS.
Truth is I have never really understood why a Data Access Service is a good idea as opposed to just accessing tables directly. It seems to me to be slower, our DAS is not transactional as it cannot send back a POCO graph for database update (only single POCOs at a time), and we also have issues where the DAS is actually a client to another service which needs data from it: a circular dependency.
Why bother with a DAS? Why is a DAS so important when it comes to SOA? What am I missing here? Single point of control?
Is it also an SOA design flaw that not all tables are part of a DAS and that some services have their own "private" tables?
Any discussion about this welcome.
You're correct in thinking that this is the proper way to do things, and you're also correct that it slows things down and can occasionally be cumbersome. SOA necessarily trades off some efficiency in exchange for ensuring single points of control for all data associated with a service. In fact, even the idea of having a "common DAS" service is slightly smelly in some SOA circles.
By centralizing all CRUD operations to one service in an SOA application, you can ensure data integrity and that business rules are being acted upon properly. To give an example, think of an entity you'd like to store that has some business rules associated with it that are difficult to approach from a pure SQL perspective - for example, let's say a table that stores file references, and create / update services that ensure that these files exist.
With SOA and a single access point to those tables, you can code the logic into the create / update methods and be reasonably assured that the data you're receiving from the service is valid - i.e. the files referenced exist. If anyone was capable of writing to these tables or retrieving data from them, no such assurance would exist - even if you're calling the service yourself, you don't know which other programmers, through malice or just plain forgetfulness, failed to implement that critical business rule. This leads to defensive programming where every bit of client code is ensuring business logic independently, and ultimately a tangled mess of business logic scattered throughout your application.
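To make the file-reference example concrete, here is a minimal sketch (in Python for brevity) of a single write path that enforces the rule. `FileReferenceService` is a hypothetical name, and a dictionary stands in for the database table.

```python
import os
import tempfile


class FileReferenceService:
    """Hypothetical single access point for a table of file references."""

    def __init__(self):
        self._rows = {}      # stands in for the database table
        self._next_id = 1

    def create(self, path):
        # The business rule lives here, so every writer is subject to it.
        if not os.path.isfile(path):
            raise ValueError(f"referenced file does not exist: {path}")
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = path
        return row_id

    def get(self, row_id):
        return self._rows[row_id]


service = FileReferenceService()

# A real file is accepted.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    existing_path = handle.name
row_id = service.create(existing_path)

# A reference to a file that does not exist is rejected.
try:
    service.create(existing_path + ".missing")
except ValueError as error:
    print("rejected:", error)

os.remove(existing_path)
```

Clients that can only write through `create` cannot store a dangling reference, so readers do not need to re-check the rule themselves.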
Another benefit is scalability and maintainability. Let's say one of your services is accessing a huge chunk of data. With SOA, everything is "black-boxed" so that your client code doesn't have much knowledge of how the data is ultimately obtained. You could change your RDBMS, partition tables, or implement caching, and make that all invisible to the client code calling it - ensuring your painful updates only need to be made in one place. With database code scattered throughout your app, this sort of upgrade becomes extremely painful.
We've got a smart client that talks to a SQL Server database via WCF, displaying the entities in the database, and allowing the user to edit those entities.
Some of the WCF calls return a large data set. Since this data set doesn't change very often, I'm considering some sort of write-through cache on the client, and only getting the deltas from the WCF service.
That is: the client both reads from the service and writes to the service.
I'm not looking for disconnected/offline operation, but since the majority of the data doesn't change very often, I'd probably implement this with a local data store.
I don't want the local store to get too stale, and I don't think I'm too concerned about conflict resolution, because updates will always go straight to the WCF service -- think of it as a write-through cache.
Would Microsoft's Sync Framework be good for this? Could I use a local SQL-CE cache and perform the updates over WCF? The service end has a SQL Server 2005/2008 backend, but I don't want to talk to it directly. Does Sync Framework integrate well with WCF?
Are there other solutions out there? Should I roll something myself?
I don't think you have to couple it to WCF at all. FeedSync allows you to publish directly to an RSS feed.
The only thing that I'm not too sure about is whether it would be suitable for a "large dataset". Since you don't need two-way replication, if your dataset is extremely large, you might want to write your own WCF implementation to optimize it, especially for the initial population.
Ok, I have a pretty complex Silverlight app that gets its data from a WCF service (ASP.NET hosted service layer) which in turn calls into a data layer that calls stored procedures in a SQL 2005 DB to extract the needed data. So the round trip goes like this:
Silverlight App --> WCF Service --> Data Layer --> DB --> Data Layer --> WCF Service transforms Data Entity into corresponding DTO (Data Transfer Object) or List<> thereof --> Silverlight App
Much of the data is highly relational (so it needs to exist in the DB), but it will change infrequently. It seems that I have several choices of locations to cache this "semi-constant" data:
1. I can cache it in the data layer. My data layer is already set up to use the SQLDependency class and cache the results from a stored procedure call. I think that this is or can be an application level cache.
2. I can cache the resulting DTO in an application level (or session level depending on the call) cache within the WCF service itself.
2(a). I could even take this a step further by serializing the XML for the resulting DTO(s) into a file on the WCF service side so that I could (a) check memory cache, then (b) check file cache and (c) hit the data layer.
3. I could do something similar to 2(a) with isolated storage on the client side within the SL app. I could serialize the data to the local isolated storage with a hash (or a moddate or something) and then just make a call to check that.
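The lookup order in option 2(a), memory cache, then file cache, then data layer, might look roughly like this. It is a sketch in Python that uses JSON files instead of serialized XML; `LayeredCache` and `load_from_data_layer` are hypothetical names, cache keys are assumed to be safe file names, and invalidation is left out entirely.

```python
import json
import os
import tempfile


class LayeredCache:
    def __init__(self, cache_dir, load_from_data_layer):
        self._memory = {}
        self._cache_dir = cache_dir
        self._load = load_from_data_layer

    def get(self, key):
        # (a) memory cache
        if key in self._memory:
            return self._memory[key]
        # (b) file cache
        path = os.path.join(self._cache_dir, key + ".json")
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                value = json.load(handle)
        else:
            # (c) data layer, then write the file for next time
            value = self._load(key)
            with open(path, "w", encoding="utf-8") as handle:
                json.dump(value, handle)
        self._memory[key] = value
        return value


data_layer_calls = []

def load_from_data_layer(key):
    # Hypothetical stand-in for the stored procedure call.
    data_layer_calls.append(key)
    return {"key": key, "items": ["a", "b"]}


cache_dir = tempfile.mkdtemp()
first = LayeredCache(cache_dir, load_from_data_layer)
first.get("countries")    # misses both caches, hits the data layer
first.get("countries")    # served from memory
second = LayeredCache(cache_dir, load_from_data_layer)  # e.g. after a restart
second.get("countries")   # served from the file cache
print(len(data_layer_calls))
```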
One more thing to add: I am hosting this WCF service in IIS7 with dynamic compression turned on so that the (often very large and easily compressed) XML response gets gzip-ed. Ideally, it would seem, I would like IIS to cache this gzip-ed result to avoid all the extra processing. I think that it may do this already but I am not sure.
I am pretty sure that the final answer to this is some flavor of "it depends", but I would love to hear how others are approaching this. A good tactical recipe of Do X, Test Performance with tool Y, then do Z if needed would be great to have.
A few links (I will add to this as I research this):
WCF Caching Approach
If you have data that will change quite rarely and needs a fast response, going for a custom mechanism based on local storage is a great advantage: it is much faster than having to wait for a server roundtrip.
Dino Esposito published an interesting article about local storage and caching in MSDN Magazine. There you can also find an approach to caching assemblies (imagine loading just the minimum package required and then loading the rest of the assemblies in the background... performance rockets, with more complexity in your code :)).
As you said, it is a matter of weighing the trade-offs and deciding.
HTH
Braulio
My approach would be this:
Determine if there is actually a problem with performance (isn't it already acceptable to my users?)
Measure the performance at each tier (how long does it take the database to come up with data? how long does it take the service to respond with data? how much time does it take from the service to the client?)
Based on the measurements I would then determine where to do my caching. Remember that, the closer to your data storage you do caching, the easier it is, but the closer to the client you do caching, the better the performance gain (usually).
Also remember that caching should not be the first thing to do to improve performance. You should also look into other performance gains as well. Are the stored procedures slow? Is there a lot of overhead in the WCF messages? Is there some inefficient processing in the service? Do I really need all that data in one message?
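One simple way to take the per-tier measurements is to wrap each tier in a timer and compare the totals. The sketch below is in Python; `timed`, `query_database` and `build_response` are hypothetical names, and the sleep merely simulates a slow query.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(tier):
    """Accumulate wall-clock time spent inside the block under a tier name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[tier] = timings.get(tier, 0.0) + time.perf_counter() - start


def query_database():
    time.sleep(0.02)            # simulated stored procedure call
    return list(range(1000))


def build_response(rows):
    return [{"id": row} for row in rows]


with timed("database"):
    rows = query_database()
with timed("service"):
    response = build_response(rows)

# Slowest tier first: that is where caching is most likely to pay off.
for tier, seconds in sorted(timings.items(), key=lambda item: -item[1]):
    print(f"{tier}: {seconds * 1000:.1f} ms")
```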
HTH,
Jonathan
I think #2 is your best bet for maintainability and architecture. IIS provides caching, why not use it?
You don't want to have to reference System.Web from a data layer. Client side is not the best option either, because you'd have to write a bunch of additional code to keep the data synchronized.
Is System.Web caching even available to WCF when it's not running in ASP.NET compatible mode? Probably best not to depend on it and write your own.
On the other hand, look into Microsoft's Velocity project, which looks like it will produce a very interesting caching technology not dependent on ASP.NET.
We just recently implemented #3, the client-side caching using Isolated Storage.
In our app we have a lot of drop-downs and custom fields which the app used to get from the server every time it loaded. Moving this data to IS really helped. The app now makes a call to check if there were any changes on the server, and if not, loads the data from the IS; otherwise (which is pretty rare) it refreshes the IS.
That eliminated a lot of WCF calls and data transfers, the SL pages' loading time is shorter, and the app in general became more scalable because of the reduced network traffic and db access.
Yes, there is some coding involved, but the benefits for the end users are substantial.
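The check-then-load flow described above can be sketched like this, in Python rather than Silverlight. `FakeServer` and `LookupClient` are hypothetical, a dictionary stands in for Isolated Storage, and the server is assumed to expose a cheap version (or hash) call alongside the full data call.

```python
class FakeServer:
    """Hypothetical stand-in for the WCF service."""

    def __init__(self):
        self.version = 1
        self.data = ["Red", "Green"]
        self.data_calls = 0

    def get_version(self, name):
        return self.version          # cheap call: a number, not the data

    def get_data(self, name):
        self.data_calls += 1         # expensive call we want to avoid
        return list(self.data)


class LookupClient:
    def __init__(self, server, store):
        self._server = server
        self._store = store          # stands in for Isolated Storage

    def load(self, name):
        server_version = self._server.get_version(name)
        cached = self._store.get(name)
        # Unchanged on the server: use the local copy.
        if cached is not None and cached["version"] == server_version:
            return cached["data"]
        # Changed (or never loaded): refresh the local copy.
        data = self._server.get_data(name)
        self._store[name] = {"version": server_version, "data": data}
        return data


server = FakeServer()
client = LookupClient(server, store={})
client.load("colours")               # first load fetches the data
client.load("colours")               # unchanged, served locally
server.version, server.data = 2, ["Red", "Green", "Blue"]
client.load("colours")               # version changed, refreshed
print(server.data_calls)
```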
Andrew
If you use RIA Services, then a simple approach is to have two separate edmx definitions. One for cached entities, one for transactional ones.
One domain context can reference the entities on another domain context via AddReference.
The cached entities could be loaded immediately after user has authenticated. For simplicity, transactional data should not load until cached entities have loaded.
Depending on the size of the cache, you may also wish to consider serializing these values to local storage.