Caching architecture for Memcached/WCF/web/RavenDB

I have an architecture question related to my RavenDB-based setup.
I have the following:
RavenDB -> WCF service -> (web/iPhone/Android)
The web/iPhone/Android level currently has connections to 7 WCF services, and that number is growing.
At the moment the 7 services talk to the same RavenDB instance. This is likely to be segmented in a future refactoring blitz, as they don't need to be on the same instance; there is minimal, if any, crossover in the model.
My question is this:
I am looking at using memcached (I have little experience setting it up). At which points can I / should I use it?
Between RavenDB and WCF?
Between WCF and (web/iPhone/Android)?
Between all of them?
Am I likely to run into stale-data issues? Is this taken care of, or am I oversimplifying things?

As many people will tell you: premature optimization is the root of all evil (and they are all quoting Donald Knuth, I think). So wait until you actually have performance issues before doing anything. (You don't need to wait for the system to crash; wait until you see something like 90% utilization of your resources.)
That being said, you should use memcached (or any kind of caching, for that matter) when you expect the cached data to be read again before it is invalidated. The improvement factor depends on many other things, such as the cost of the cached operation and the frequency with which the data is accessed.
To answer your "where" questions: that really depends on where you will save the most resources, which is application specific and cannot be answered here.
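To make the "between RavenDB and WCF" option concrete, here is a minimal cache-aside sketch in C#. It assumes the Enyim memcached client; the Product type, the key scheme, and the five-minute expiry are illustrative assumptions, not a prescribed design.

    // Minimal cache-aside sketch between a WCF service and RavenDB,
    // using the Enyim memcached client. Product, the key scheme and
    // the 5-minute expiry are illustrative assumptions.
    using System;
    using Enyim.Caching;
    using Enyim.Caching.Memcached;

    public class ProductService
    {
        private static readonly MemcachedClient Cache = new MemcachedClient();

        public Product GetProduct(string id)
        {
            // 1. Try the cache first.
            var cached = Cache.Get<Product>("product:" + id);
            if (cached != null)
                return cached;

            // 2. Cache miss: load from RavenDB (session handling elided).
            Product product = LoadFromRavenDb(id);

            // 3. Store with a short TTL so stale entries age out even
            //    if an invalidation is missed.
            Cache.Store(StoreMode.Set, "product:" + id, product,
                        TimeSpan.FromMinutes(5));
            return product;
        }

        // On writes, update the store first, then evict the cached key.
        public void UpdateProduct(Product product)
        {
            SaveToRavenDb(product);
            Cache.Remove("product:" + product.Id);
        }

        private Product LoadFromRavenDb(string id) { /* elided */ return null; }
        private void SaveToRavenDb(Product product) { /* elided */ }
    }

    [Serializable] // the default transcoder serializes with BinaryFormatter
    public class Product { public string Id { get; set; } }

Note that the short TTL is the simple answer to the stale-data question: you trade a bounded window of staleness for not having to invalidate perfectly.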

As an additional pointer, RavenDB's REST interface uses ETags to support HTTP-based caching. If your HTTP client plays well with those mechanisms, you get some nice caching out of the box.
I am not sure how this plays with the WCF stack, though.
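For what it's worth, the ETag handshake itself is plain HTTP, so you can drive it from any .NET client regardless of the WCF stack. A sketch, where the URL and the remembered ETag are hypothetical and only the generic If-None-Match round trip is being illustrated:

    // The generic If-None-Match round trip; the URL is hypothetical.
    using System;
    using System.Net;

    class EtagDemo
    {
        static void Main()
        {
            const string url = "http://localhost:8080/docs/products/1";
            string cachedEtag = null; // remembered from an earlier response

            var request = (HttpWebRequest)WebRequest.Create(url);
            if (cachedEtag != null)
                request.Headers[HttpRequestHeader.IfNoneMatch] = cachedEtag;

            try
            {
                using (var response = (HttpWebResponse)request.GetResponse())
                {
                    cachedEtag = response.Headers[HttpResponseHeader.ETag];
                    // read and cache the body alongside the ETag here
                }
            }
            catch (WebException ex)
            {
                var resp = ex.Response as HttpWebResponse;
                if (resp != null && resp.StatusCode == HttpStatusCode.NotModified)
                    Console.WriteLine("304 Not Modified: reuse the cached body");
                else
                    throw;
            }
        }
    }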


Web Server Caches - in-memory vs the OS

I'm not entirely sure if this question would be better suited for something like Server Fault; however, since I'm a programmer and not a sysadmin, I'm asking from the perspective of a programmer.
These days there are a huge number of options available for caching static web content. Things like Varnish or Squid are used throughout the industry.
However, I'm somewhat confused here. From a theoretical perspective, I don't see how caching static content requires third-party software apart from the web server and the OS.
Dynamic content, such as the result of an expensive PHP calculation, certainly could benefit from a good caching system.
But with static content, what do we gain by caching resources in memory? Wouldn't the OS page cache already provide the same benefits as a dedicated caching system like Varnish or Squid? Or am I missing some of the benefits?
Varnish, in fact, stores data in virtual memory using mmap and lets the OS handle the page swapping. So how exactly is this different from just saving cached resources to disk and opening them with fread?
You are correct. For static resources, the memory can just as well be put to use for the OS page cache instead of Varnish.
Chaining caches (Varnish, page cache) that hold identical content and compete for the same resource (server memory) is silly.
If you also have some dynamic content, you may choose to combine the two and serve everything from cache for operational reasons. For example, it is simpler to collect access logs and statistics from a single software stack than from two. The same applies to things like staff training and security patching.
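To make the point concrete, here is a small .NET sketch of the two access paths the question contrasts. Both end up being served from the same OS page cache, which is why a separate in-process copy of static files buys little. The file path is hypothetical.

    // Both paths below are served from the same OS page cache. A second
    // in-process copy (a Varnish-style memory cache) would just duplicate
    // what the kernel already keeps warm. The file path is hypothetical.
    using System.IO;
    using System.IO.MemoryMappedFiles;

    class PageCacheDemo
    {
        static void Main()
        {
            const string path = "static/logo.png";

            // Path 1: plain buffered read (the fread-style access).
            // Repeated reads are served from the page cache.
            byte[] viaRead = File.ReadAllBytes(path);

            // Path 2: memory-mapped access (what Varnish does via mmap).
            // The mapping is backed by the very same page-cache pages.
            using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
            using (var view = mmf.CreateViewStream())
            {
                var viaMap = new byte[view.Length];
                view.Read(viaMap, 0, viaMap.Length);
            }
        }
    }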

What is the best way of pulling JSON data in terms of performance?

Currently I am using HttpWebRequest to pull JSON data from an external site, and the performance is not good. Is WCF much better?
I need expert advice on this.
Probably not, but that's not the right question.
To answer it: WCF, which certainly supports JSON, is ultimately going to use HttpWebRequest at the bottom level, and it will certainly have the same network latency. Even more importantly, it will use the same server to get the JSON. WCF has a lot of advantages in building, maintaining, and configuring web services and clients, but it's not magically faster. It's possible that your method of deserializing JSON is really slow compared to what WCF would use by default, but I doubt it.
And that brings up the really important point: find out why the performance is bad. Changing frameworks is only an intelligent optimization option if you know what's slow and, by extension, how doing something different would make it less slow. Is it the server? Is it deserialization? Is it the network? Is it authentication or some other request overhead? And so on.
So the real answer is: profile! Once you know what the performance issue really is, you can make an informed decision about whether a framework like WCF would help.
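In that spirit, a crude way to split a JSON pull into network time versus deserialization time. The URL is a placeholder, and JavaScriptSerializer stands in for whatever deserializer you actually use:

    // Crude breakdown of where a JSON pull spends its time. The URL is
    // a placeholder; JavaScriptSerializer (System.Web.Extensions) stands
    // in for whatever deserializer you actually use.
    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Net;
    using System.Web.Script.Serialization;

    class JsonProfile
    {
        static void Main()
        {
            var sw = Stopwatch.StartNew();

            var request = (HttpWebRequest)WebRequest.Create("http://example.com/data.json");
            string json;
            using (var response = request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                json = reader.ReadToEnd();
            }
            long networkMs = sw.ElapsedMilliseconds;  // request + download

            var data = new JavaScriptSerializer().Deserialize<object>(json);
            long parseMs = sw.ElapsedMilliseconds - networkMs;  // CPU-side cost

            Console.WriteLine("network: {0} ms, deserialize: {1} ms",
                              networkMs, parseMs);
        }
    }

If the network number dominates, no framework swap will help; caching or a closer server will.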
The short answer is: no.
The longer answer is that WCF is an API which doesn't specify a communication method but supports multiple methods. However, those methods are normally SOAP-based, which involves more overhead than plain JSON over HTTP, and it would seem the world has decided to move on from SOAP.
What sort of performance are you looking for and what are you getting? It may be that you are simply facing physical limitations of network locations, in which case you might look towards making your interface feel more responsive, even if the data is sluggish.
It'd be worth checking whether most of the latency is simply in reaching the remote site (e.g. response times comparable to ping times). Or perhaps the problem is the time it takes for the remote site to generate and serve the page. If so, some intermediate caching might be best.
+1 on what Isaac said, but one thing I'd add: if you do use WCF here, it'll internally use HttpWebRequest in most places, so you're definitely not gaining performance there. One way you may unintentionally gain performance, however, is in how WCF recycles, reuses, pools, and caches most transport objects internally. So it ultimately goes back to Isaac's advice on profiling.

Improving WCF performance

Could I know ways to improve the performance of my .NET WCF service?
Right now it's pretty slow, and sometimes it gets clogged and eventually stops responding.
What kind of InstanceContextMode and ConcurrencyMode are you using on your service class?
If it's PerCall instances, you might want to check if you can reduce the overhead of creating a server instance for each call.
If it's Single instances (singleton) - do you really need that? :-) Try using PerCall instead.
Marc
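For reference, here is a minimal sketch of how those two settings are declared on the service class (MyService and IMyService are placeholder names):

    // The two knobs asked about, declared on the service class.
    // MyService/IMyService are placeholder names.
    using System.ServiceModel;

    [ServiceContract]
    public interface IMyService
    {
        [OperationContract]
        string Ping();
    }

    // PerCall: a fresh instance per request, so no shared mutable state.
    // If construction is expensive, hoist that cost into static/shared
    // initialization rather than switching to a Single (singleton) instance.
    [ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
                     ConcurrencyMode = ConcurrencyMode.Single)]
    public class MyService : IMyService
    {
        public string Ping() { return "pong"; }
    }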
Well, what sort of data are you sending, and over what binding?
Is the problem the size of requests (bandwidth) or the quantity of requests (latency)? If latency, then simply make fewer, but bigger, requests ;-p
For bandwidth: if you are sending binary data over HTTP, you can enable MTOM - that'll save you a few bytes. You can enable compression support at the server, but there's no guarantee it will be honored.
If you are using .NET to .NET, you might want to consider protobuf-net; this has WCF hooks for swapping the formatter (DataContractSerializer) to Google's "protocol buffers" binary format, which is very small and fast. I can advise on how, on request.
Other than that: send less data ;-p
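For the MTOM suggestion, a sketch of switching it on in code (the same can be done in config; the raised message size limit is an arbitrary example value):

    // Switching the suggested MTOM encoding on in code; the raised
    // message size limit is an arbitrary example value.
    using System.ServiceModel;

    class BindingFactory
    {
        public static BasicHttpBinding CreateMtomBinding()
        {
            return new BasicHttpBinding
            {
                MessageEncoding = WSMessageEncoding.Mtom,  // binary-friendly
                MaxReceivedMessageSize = 4 * 1024 * 1024   // default is 64 KB
            };
        }
    }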
What binding are you using? If you're using HTTP, you could get better performance with TCP.
In all likelihood though the bottleneck is going to be higher up in the WCF pipeline and possibly in your hosted objects.
We'd need some more details about your WCF set up to be able to help much.
The symptoms you describe could be caused by anything at all. You'll need to narrow it down by using a profiler such as JetBrains' dotTrace or AutomatedQA's AQtime.
Or you could do it the old-fashioned way by instrumenting your code (which is what the profilers do for you). Record the start time before your operation starts; when it finishes, subtract the start time from the current time to determine the elapsed time, then print it out or log it. Do the same around the methods that this operation calls. You'll quickly see which methods take the longest. Then move into those methods and do the same, finding out what makes them take so long, and so on.
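A minimal version of that old-fashioned instrumentation, as a disposable timer (the names and the example operation are illustrative):

    // Poor man's profiler: wrap suspect calls and log the elapsed time.
    using System;
    using System.Diagnostics;

    class Timed : IDisposable
    {
        private readonly string _label;
        private readonly Stopwatch _sw = Stopwatch.StartNew();

        public Timed(string label) { _label = label; }

        public void Dispose()
        {
            Console.WriteLine("{0}: {1} ms", _label, _sw.ElapsedMilliseconds);
        }
    }

    // Usage inside a suspect operation (LoadOrders is illustrative):
    //     using (new Timed("LoadOrders"))
    //     {
    //         LoadOrders();
    //     }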
"Improve performance of my .Net WCF service" - its very generic term you are asking, different ways we can improve performance and at the sametime you need to find which one causing performance hit like DB access in WCF methods.
Please try to know available features in WCF like oneWay WCF method it will help you in finding ways to improve performance.
Thanks
Venkat
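A minimal example of such a one-way operation (IAuditService is a placeholder name):

    // A one-way operation: the client does not wait for a reply, which
    // helps for fire-and-forget calls. IAuditService is a placeholder.
    using System.ServiceModel;

    [ServiceContract]
    public interface IAuditService
    {
        // No reply message is generated; the call returns once the
        // message is accepted by the transport. Must return void and
        // have no ref/out parameters.
        [OperationContract(IsOneWay = true)]
        void LogEvent(string message);
    }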
Here is an article with some statistics from real production systems; you could use these to compare/benchmark your performance.
WCF Service Performance
Microsoft recently released a knowledge base article:
WCF Performance and Stability Issues - http://support.microsoft.com/kb/982897
These issues include the following:
Application crashes
Hangs
General performance problems when calling a WCF service

WCF in the enterprise, any pointers from your experience?

Looking to hear from people who are using WCF in an enterprise environment.
What were the major hurdles with the roll out?
Performance issues?
Any and all tips appreciated!
Please provide some general statistics and server configs if you can!
WCF can be configuration hell. Be sure to familiarize yourself with its diagnostics and SvcTraceViewer, lest you get maddening, cryptic, useless exceptions. And watch out for the generated client's broken implementation of the dispose pattern.
I was recently hired by a company that previously handled their client/server communication with traditional ASP.NET web services, passing DataSets back and forth.
I rewrote the core so that there is now a Net.Tcp "connected" client, and everything is done through it. It was a week's worth of in-production discoveries, but well worth it.
The pain points we had to find out late in the game was:
1) The default throttling blocked the 11th user onward (it defaults to allowing only 10 concurrent sessions).
2) The default maxBufferSize was set to 65,536 bytes (64 KB), so the first bitmap that needed to be downloaded crashed the server :)
3) Other default configurations (max concurrent connections, max concurrent calls, etc.) - a configuration sketch follows this answer.
All in all, it was absolutely worth it; the app is a lot faster just from changing the infrastructure, and now that we have "connected" users, the server can send messages down to the clients.
Another beautiful gain is that, since we know 100% who is connected, we can actually enforce our licensing policy at the application level. Before (and before I was hired), my company had to simply log usage and then, at the end of the month, bill clients extra for connecting too many times.
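For anyone hitting the same walls, here is a sketch of raising those defaults in code. The numbers are examples to tune, IImageService/ImageService are placeholder types, and the same settings exist in configuration:

    // Raising the defaults mentioned above in code. The numbers are
    // examples to tune, and IImageService/ImageService are placeholders.
    using System;
    using System.ServiceModel;
    using System.ServiceModel.Description;

    [ServiceContract]
    public interface IImageService
    {
        [OperationContract]
        byte[] GetBitmap(string name);
    }

    public class ImageService : IImageService
    {
        public byte[] GetBitmap(string name) { return new byte[0]; }
    }

    class HostSetup
    {
        public static ServiceHost Create()
        {
            var binding = new NetTcpBinding
            {
                // Defaults are 65,536 bytes; the first large bitmap
                // would otherwise be rejected.
                MaxReceivedMessageSize = 10 * 1024 * 1024,
                MaxBufferSize = 10 * 1024 * 1024
            };

            var host = new ServiceHost(typeof(ImageService));
            host.AddServiceEndpoint(typeof(IImageService), binding,
                                    "net.tcp://localhost:9000/images");

            // The default throttle allowed only 10 concurrent sessions
            // (the "11th user" problem above).
            host.Description.Behaviors.Add(new ServiceThrottlingBehavior
            {
                MaxConcurrentSessions = 200,
                MaxConcurrentCalls = 64,
                MaxConcurrentInstances = 264
            });
            return host;
        }
    }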
As already stated, the configuration is a nightmare and exceptions can be cryptic. You can enable tracing and use the trace log viewer to troubleshoot a problem, but it's definitely a shift of gears to troubleshoot a WCF service, especially once you've deployed it and you are experiencing problems before your code even executes.
For communication between components within my organization I ended up using [NetDataContract] on my services and proxies, which is recommended against (you can't integrate with platforms outside .NET, and to integrate you need the assembly that contains the contracts), though I found the performance to be stellar and my overall development time reduced by using it. For us it was the right solution.
WCF is definitely great for enterprise stuff as it is designed with scalability, extensibility, security, etc... in mind.
As maxidad said, it can be very hard, though, as exceptions often tell you nearly nothing, and if you use security (obviously, for enterprise scenarios) you have to deal with certificates, meaningless MessageSecurityExceptions, and so on.
Dealing with WCF services is definitely harder than with old ASMX services, but it's worth the effort once you're in.
Supplying server configs will not be useful to you, as they have to fit your scenario. Using the right bindings is very important, as are security and concurrency. There is no single way to go when using WCF; just think about your requirements. Do you need callbacks? Who are your users? What kind of security do you need?
However, WCF will definitely be the right technology for enterprise-scale applications.

Website Hardware Scaling

So I was listening to the latest Stackoverflow podcast (episode 19), and Jeff and Joel talked a bit about scaling server hardware as a website grows. From what Joel was saying, the first few steps are pretty standard:
One server running both the webserver and the database (the current Stackoverflow setup)
One webserver and one database server
Two load-balanced webservers and one database server
They didn't talk much about what comes next though. Do you add more webservers? Another database server? Replicate this three-machine cluster in a different datacenter for redundancy? Where does a web startup go from here in the hardware department?
A reasonable setup supporting an "average" web application might evolve as follows:
Single combined application/database server
Separate database on a different machine
Second application server, with DNS round-robin (poor man's load balancing) or, e.g., Perlbal
Second, replicated database server (for read load; requires some application-logic changes so that eligible database reads go to a slave)
At this point, evaluating the current state of affairs would help to determine a better scaling path. For example, if read load is high and content doesn't change too often, it might be better to emphasise caching and introduce dedicated front-end caches, e.g. Squid, to avoid unneeded database reads, although you will need to consider how to maintain cache coherency, typically in the application.
On the other hand, if content changes reasonably often, then you will probably prefer a more spread-out solution: introduce a few more application servers and database slaves to help mitigate the effects, and use object caching, such as memcached, to avoid hitting the database for the less volatile content.
For most sites, this is probably enough, although if you do become a global phenomenon, then you'll probably want to start considering having hardware in regional data centres, and using tricks such as geographic load balancing to direct visitors to the closest "cluster". By that point, you'll probably be in a position to hire engineers who can really fine-tune things.
Probably the most valuable scaling advice I can think of would be to avoid worrying about it all far too soon; concentrate on developing a service people are going to want to use, and making the application reasonably robust. Some easy early optimisations are to make sure your database design is fairly solid and that indexes are set up so you're not doing anything painfully crazy; also, make sure the application emits cache-control headers that tell browsers how to cache the data (a sketch follows below). Doing this sort of work early in the design can yield benefits later, especially when you don't have to rework the entire thing to deal with cache coherency issues.
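On that cache-control point, a sketch of emitting the headers from an ASP.NET handler (the one-hour lifetime is an arbitrary example):

    // Emitting cache-control headers from an ASP.NET handler so browsers
    // and intermediaries (Squid, etc.) can cache without re-hitting the
    // application. The one-hour lifetime is an arbitrary example.
    using System;
    using System.Web;

    public class CachedHandler : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            context.Response.Cache.SetCacheability(HttpCacheability.Public);
            context.Response.Cache.SetMaxAge(TimeSpan.FromHours(1));
            context.Response.Cache.SetLastModified(DateTime.UtcNow);
            context.Response.Write("cacheable content");
        }

        public bool IsReusable { get { return true; } }
    }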
The second most valuable piece of advice I want to put across is that you shouldn't assume that what works for some other web site will work for you: check your logs, run some analysis on your traffic, and profile your application - see where your bottlenecks are and resolve them.
Plenty of Fish architecture
Some interesting videos:
YouTube scalability
Interview with Dan Farino, System Architect at MySpace
Joel mentioned adding a second datacenter, with the same setup, and then assigning your users randomly to each. Changes to the data are logged and sent from one location to the other, so that both locations contain all the data.
The talk Scalable Web Architectures: Common Patterns & Approaches by Cal Henderson (Yahoo) at the Web 2.0 Expo was quite interesting. I thought there was a video, but I could not find it. Here are the slides, though:
http://www.slideshare.net/techdude/scalable-web-architectures-common-patterns-and-approaches
An obvious next step would be a cluster of web servers (a web farm) and a clustered system of database servers (replication, Oracle RAC, etc.).
If you're interested in caching and using .NET, look into the Caching Application Block in Enterprise Library (and use it along with the other points above).
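A tiny sketch of that Caching Application Block in use (it assumes the block is already configured in app.config; the key and value are illustrative):

    // Minimal use of the Caching Application Block. Assumes the block
    // is configured in app.config; the key and value are examples.
    using Microsoft.Practices.EnterpriseLibrary.Caching;

    class CacheDemo
    {
        static void Main()
        {
            var cache = CacheFactory.GetCacheManager();

            cache.Add("greeting", "hello");                 // insert
            var value = (string)cache.GetData("greeting");  // read (null if absent)
            cache.Remove("greeting");                       // evict
        }
    }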