w3wp.exe consuming large amount of RAM


I noticed the other day that all the w3wp.exe processes running on my server seem to be consuming far more RAM than I would expect. So I created a test web application with a single aspx page and a vanilla global.asax (default methods, no additional code). I then deployed that site to IIS6 with a target framework of 4.0, built in release mode with debug set to "false". The site is also set up under its own application pool. I then used iisapp.vbs to figure out which w3wp.exe this test site was running under. I was surprised to see that the single site with 1 page was using almost 40 MB of RAM.
40 MB of RAM seems like a lot for a single-page website. Is this normal, and if so, why? Is there anything I can do to reduce the memory footprint?
I also noticed that each time the default page was refreshed, the w3wp.exe grew a little bit more. Is IIS6 caching the same page over and over?

40 MB for a single AppDomain (irrespective of how many pages you have) is alright. Consider that .NET is a managed environment and the app pool process holds most of the logic and heap needed for serving requests. In many cases it will also be less eager to free up RAM if the RAM is not under contention (demanded by other parts of the system).
You can share an app pool among several websites, in which case the fixed overhead of the sandbox is spread across them and looks smaller relative to what each site does. However, I don't think 40 MB is bad at all.

You're trying to optimize the case where you have no load and no clients. That really isn't sensible. Optimize for a realistic use case.
You don't build a factory and then try to get it to make one item as efficiently as possible. You try to get it to crank them out efficiently by the dozens.

This is not really an issue, since it is quite normal. The .NET Framework itself gets loaded into the process, hence the size. When a real issue happens, you would have to use post-production debugging or profiling tools to track it down. Here are some articles on memory and post-production debugging, in case you need them in future.
http://msdn.microsoft.com/en-us/magazine/cc188781.aspx
http://aspalliance.com/1350_Post_Production_Debugging_for_ASPNET_Applications__Part_1.all
http://blogs.msdn.com/b/tess/archive/2006/09/06/net-memory-usage-a-restaurant-analogy.aspx
http://blogs.msdn.com/b/tess/archive/2008/09/12/asp-net-memory-issues-high-memory-usage-with-ajaxpro.aspx

Related

Google App Engine Latency skyrockets out of nowhere

Our development team is scratching our heads wondering why Google App Engine latency tends to go through the roof from time to time with very little predictability or warning. When latency jumps like this we start to see db timeouts between our app and database. The CPU util is pretty flat across the instances at this time which also makes this hard to understand.
We are using the Flex environment to host a .NET Core API. We like AppEngine for its PaaS feel and its always on feature. We are thinking about looking at Cloud Run as an alternative to test with since we can't figure this out.
Any suggestions on where to look or how we could troubleshoot this?
Here's the latest spike in latency from last night. Plenty of Cloud SQL db timeouts and other exceptions happening here due to this spike as well.

There are a couple of possible reasons for this. This page has all the possible causes and the debugging steps. Mostly, check your monitoring for memory usage; there might be a memory leak. Also check the autoscaler configuration.

Understanding BLOCKED_TIME in PerfView

We suspect that we're experiencing thread pool starvation on a server that is running a couple of ASP.NET Core APIs and a couple of .NET Core console apps.
I ran PerfView on one of our servers where we suspect thread pool starvation. However, I'm having a bit of trouble analyzing the results.
I ran PerfView /threadTime collect for about 60 seconds. And this is the result I got (I chose to look at one of our ASP.NET Core APIs):
Looking at "By Name" we can see that there is a lot of time spent in BLOCKED_TIME. If I double click then I'm taken to the following view where I can expand one of the nodes to get the following view (the overwritten part is the name of our API process):
What does that tell me? Shouldn't I be able to see what exactly is blocking? And does it look like the problem is that a lot of threads are blocking, each one for a small amount of time?
Are there any other conclusions we can draw from this?

BLOCKED_TIME generally means a period when the thread wasn't doing anything at all. This could be periods of I/O, where network or other types of latency are involved, or time spent waiting on locks, such as in situations with semaphores. On its own, this doesn't necessarily tell you anything, as there are perfectly standard and legitimate reasons for a thread to be idle. However, a large amount of time spent blocked can be an indication of an underlying problem. Perhaps you have too much network latency. Perhaps you're trying to do too much file system work on a slow drive. In short, it may or may not indicate a problem, and even if it does indicate a problem, it doesn't really tell you what the problem is.
In general, if you're experiencing thread starvation, the first thing you should look at is thread pool utilization. Are you using async everywhere you can? Are you doing things that are big no-nos in web apps, such as using Task.Run, Task.Factory.StartNew or, worse, Thread.Start? Work queued with Task.Run or Task.Factory.StartNew runs, by default, on the same thread pool that services requests, and thus proportionally reduces your server throughput. (A thread created with Thread.Start is not a pool thread, but it still competes with the pool for CPU and memory.)
There's an all too common pattern of attempting to schedule long-running jobs by shuffling them to new threads. That's death to a web application. All threads in the pool are there to service requests, not long-running jobs, and as such, requests should be handled quickly and efficiently so that the thread can be returned to the pool in short order to field other requests. If you need to run work in the background, you need to truly background it, by offloading it to another process or even a different machine entirely.
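The starvation effect described above can be reproduced in miniature. The following is only a sketch in Python, used here as an analogy for the .NET thread pool rather than as a model of its actual scheduling; the function names (`demo_starvation`, `long_blocking_job`, `short_request`) and the pool size of 2 are made up for illustration. Every worker in a fixed-size pool is occupied by a job that blocks, so a trivial request queued afterwards cannot start until a worker is released.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def demo_starvation(pool_size: int = 2) -> dict:
    """Show a short request stuck behind blocking jobs on a fixed-size pool."""
    release = threading.Event()                  # stands in for slow I/O or a lock
    occupied = threading.Barrier(pool_size + 1)  # workers + the submitting thread

    def long_blocking_job() -> str:
        occupied.wait()    # report that this worker has picked up a job
        release.wait()     # hold the worker thread without doing any work
        return "long job done"

    def short_request() -> str:
        return "short request done"

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        blockers = [pool.submit(long_blocking_job) for _ in range(pool_size)]
        occupied.wait()    # from here on, every worker is blocked

        short = pool.submit(short_request)
        try:
            short.result(timeout=0.1)   # no free worker, so this times out
            starved = False
        except FutureTimeout:
            starved = True

        release.set()      # free the workers; the short request can now run
        return {
            "starved": starved,
            "short": short.result(timeout=5),
            "long": [b.result(timeout=5) for b in blockers],
        }


if __name__ == "__main__":
    print(demo_starvation())
```

The short request does almost no work, yet its latency is dictated entirely by how long the blocking jobs hold the workers. That is the same shape as requests timing out while CPU utilization stays flat.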
Short of all that, maybe you're just getting more load than the server can handle in general. That's always a possibility. Perhaps you need to vertically scale your system resources (and the thread pool with it). Perhaps you need to horizontally scale by replicating this server with a load balancer in front. Given that you're running multiple different things on the same server, an easy way to horizontally scale is to simply divvy out these things to their own machines. That alone would probably help tremendously. However, scaling, either vertically or horizontally, should be your last resort. Make sure you're using resources efficiently first, before throwing more resources at your inefficient things.

High CPU with ImageResizer DiskCache plugin

We are noticing occasional periods of high CPU on a web server that happens to use ImageResizer. Here are the surprising results of a trace performed with NewRelic's thread profiler during such a spike:
It would appear that the cleanup routine associated with ImageResizer's DiskCache plugin is responsible for a significant percentage of the high CPU consumption associated with this application. We have autoClean on, but otherwise we're configured to use the defaults, which I understand are optimal for most typical situations:
<diskCache autoClean="true" />
Armed with this information, is there anything I can do to relieve the CPU spikes? I'm open to disabling autoClean and setting up a simple nightly cleanup routine, but my understanding is that this plugin is built to be smart about how it uses resources. Has anyone experienced this and had any luck simply changing the default configuration?
This is an ASP.NET MVC application running on Windows Server 2008 R2 with ImageResizer.Plugins.DiskCache 3.4.3.

Sampling, or why the profiling is unhelpful
New Relic's thread profiler uses a technique called sampling - it does not instrument the calls - and therefore cannot know if CPU usage is actually occurring.
Looking at the provided screenshot, we can see that the backtrace of the cleanup thread (there is only ever one) is frequently found at the WaitHandle.WaitAny and WaitHandle.WaitOne calls. These methods are low-level synchronization constructs that do not spin or consume CPU resources, but rather efficiently return CPU time back to other threads, and resume on a signal.
Correct profilers should be able to detect idle or waiting threads and eliminate them from their statistical analysis. Because New Relic's profiler failed to do that, there is no useful way to interpret the data it's giving you.
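To make the sampling point concrete, here is a small Python sketch (an analogy only, not New Relic's profiler and not .NET code). It samples the stack of a thread that is parked on an event, much like a thread parked in WaitHandle.WaitOne, and compares how often the sampler "sees" the thread there with how much CPU the process actually used. The names `sample_waiting_thread` and `cleanup_worker` are hypothetical, and `sys._current_frames()` is a CPython-specific inspection hook.

```python
import sys
import threading
import time


def sample_waiting_thread(samples: int = 20, interval: float = 0.01) -> dict:
    """Sample a blocked thread's stack and compare hit count with CPU time."""
    wake = threading.Event()

    def cleanup_worker() -> None:
        wake.wait()   # blocks without spinning until signalled

    worker = threading.Thread(target=cleanup_worker, daemon=True)
    cpu_start = time.process_time()
    wall_start = time.perf_counter()
    worker.start()

    hits = 0
    for _ in range(samples):
        time.sleep(interval)
        # Grab the worker's current Python stack, as a sampling profiler would.
        frame = sys._current_frames().get(worker.ident)
        names = []
        while frame is not None:
            names.append(frame.f_code.co_name)
            frame = frame.f_back
        if "wait" in names:
            hits += 1   # the sample landed inside the wait call

    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    wake.set()
    worker.join()
    return {
        "samples": samples,
        "hits": hits,
        "cpu_seconds": cpu_used,
        "wall_seconds": wall_used,
    }


if __name__ == "__main__":
    print(sample_waiting_thread())
```

Nearly every sample lands in the wait call, so a naive sampler would attribute most of the "time" to it, even though the measured CPU time is a small fraction of the elapsed wall-clock time.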
If you have more than 7,000 files in /imagecache, here is one way to improve performance
By default, in V3, DiskCache uses 32 subfolders with 400 items per folder (1000 hard limit). Due to imperfect hash distribution, this means that you may start seeing cleanup occur at as few as 7,000 images, and you will start thrashing the disk at ~12,000 active cache files.
This is explained in the DiskCache documentation - see subfolders section.
I would suggest setting subfolders="8192" if you have a larger volume of images. A higher subfolder count increases overhead slightly, but also increases scalability.
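As a rough check on the numbers above, the nominal capacity is just subfolders multiplied by the per-folder target. The sketch below is plain arithmetic in Python with hypothetical helper names; it does not model the uneven hash distribution that makes cleanup start earlier than the nominal figure.

```python
def nominal_capacity(subfolders: int, items_per_folder: int = 400) -> int:
    """Files the cache holds if every subfolder sits exactly at its target size."""
    return subfolders * items_per_folder


def hard_capacity(subfolders: int, hard_limit: int = 1000) -> int:
    """Files the cache holds if every subfolder sits at the hard limit."""
    return subfolders * hard_limit


if __name__ == "__main__":
    for count in (32, 8192):
        print(count, nominal_capacity(count), hard_capacity(count))
```

With the default 32 subfolders the nominal figure is 12,800 files, which matches the "~12,000 active cache files" threshold quoted above. With subfolders="8192" it rises to 3,276,800.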

How to generate low-memory and high disk IO for testing?

I'm developing a Windows application. Currently it's a .NET application running on Windows 7 PCs.
I would like to test the performance of my app under conditions where the user may have other memory-intensive applications open as well as cases where there are other heavy hard disk accessing applications in use.
Are there accepted ways to test this?
For the memory test I can write a separate application to allocate memory in various configurations (one big array, linked list of many smaller (or larger) nodes) etc...
The disk access is trickier to me because I'm not even sure how to correctly measure how much load a separate load-generation app would be imparting, other than running various scenarios: creating a large file, deleting or copying, creating many small files etc...
I feel like these problems may already be solved by some software I just haven't found yet. Or maybe everyone writes their own for their specific needs?
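For illustration, a home-grown load generator of the kind described in the question might look like the following. This is a minimal Python sketch under stated assumptions, not an established tool: the function names, the 1 MB chunk size and the 4096-byte page stride are all choices made here. It holds a given amount of memory and measures how long a sequential write and read of a temporary file take, which gives a rough number for the disk load being imposed.

```python
import os
import tempfile
import time

MB = 1024 * 1024


def hold_memory(total_mb: int, chunk_mb: int = 1) -> list:
    """Allocate total_mb in chunk_mb pieces and touch every page."""
    chunks = []
    for _ in range(total_mb // chunk_mb):
        block = bytearray(chunk_mb * MB)
        for offset in range(0, len(block), 4096):
            block[offset] = 1   # write to each page so it is really committed
        chunks.append(block)
    return chunks   # keep the returned list alive while pressure is wanted


def write_read_throughput(size_mb: int, directory=None) -> dict:
    """Write then read a temporary file, timing each phase."""
    payload = os.urandom(MB)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb):
                f.write(payload)
            f.flush()
            os.fsync(f.fileno())   # push the data to the disk, not just the cache
        write_seconds = time.perf_counter() - start

        start = time.perf_counter()
        bytes_read = 0
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(MB), b""):
                bytes_read += len(chunk)
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)
    return {
        "bytes_read": bytes_read,
        "write_seconds": write_seconds,
        "read_seconds": read_seconds,
    }


if __name__ == "__main__":
    held = hold_memory(16)
    print(len(held), write_read_throughput(8))
```

Note that the read phase will usually be served from the operating system's file cache, so it says little about the physical disk unless the file is much larger than available RAM.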

What is the performance hit for using WCF Performance Counters (performanceCounters = "ALL")?

Does anyone have experience with using the WCF Performance Counters in a production system and running into any performance issues? I suspect that monitoring all Services, Endpoints, and Operations and logging all counters to a file, sampling every second, is the worst-case scenario. From what I gather, the hit comes when you actually sample, not when the counters are turned on. Any real-life experience out there using them in production?

I can't answer for the WCF ones in detail, but Performance Counters in general work by writing values to some shared memory all the time. So WCF always writes values to a memory mapped file or a shared section in a dll.
When the perfmon application wants to display them, it loads the shared memory and reads from it. That's not a performance hit particularly.
The problem comes when you want to do something with that counter data, like write it to a file or update a graph. That's when the performance starts to be noticeable. This goes double if the reader is running across the network.
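The writer/reader split described above can be sketched with an ordinary memory-mapped buffer. This Python example is an analogy only and makes no claim about how Windows or WCF lay out their counter memory; the `SharedCounter` class and its 8-byte layout are hypothetical. The writer updates a value in place, and the reader pays a cost only when it chooses to sample.

```python
import mmap
import struct

# One signed 64-bit little-endian integer: a made-up layout for this sketch.
COUNTER = struct.Struct("<q")


class SharedCounter:
    """A counter stored in an anonymous memory map."""

    def __init__(self) -> None:
        self._mem = mmap.mmap(-1, COUNTER.size)   # anonymous mapping
        COUNTER.pack_into(self._mem, 0, 0)

    def increment(self, amount: int = 1) -> None:
        # Writer side: a cheap in-place update (not atomic in this sketch).
        (value,) = COUNTER.unpack_from(self._mem, 0)
        COUNTER.pack_into(self._mem, 0, value + amount)

    def sample(self) -> int:
        # Reader side: only runs when a monitoring tool asks for the value.
        return COUNTER.unpack_from(self._mem, 0)[0]

    def close(self) -> None:
        self._mem.close()


if __name__ == "__main__":
    calls = SharedCounter()
    for _ in range(1000):
        calls.increment()
    print(calls.sample())
    calls.close()
```

Updating the value is a few memory operations regardless of whether anyone is watching. The expensive part is whatever the reader does with each sample afterwards, such as formatting it, writing it to a log file, or sending it across the network.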
