Unable to control memory utilization rising with a Ruby process on the server - ruby-on-rails-3

I have a Rails 3.2 application running on a production server. The server has 8 GB of RAM and every other process behaves normally, but there is one Ruby process that keeps memory utilization on the higher side. I have to log in to the server console manually, run top, and kill the process by its PID.
I am unable to figure out which Ruby process is taking so much memory, or how to keep it under control permanently.
Please suggest a solution.
Thanks.

Could be so many things; finding memory leaks is tough. What kind of application server are you using? If you're using Unicorn, consider checking out Puma. It's actually really easy to switch over, and we saw big gains in our app when we switched.
Also look through your app for N+1 queries. Optimizing some queries here and there can help tremendously.
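For example, a classic N+1 in ActiveRecord and the usual eager-loading fix (Post and comments are illustrative model names, not from the question):

    # N+1: one query to load the posts, then one more query per post
    Post.limit(20).each do |post|
      puts post.comments.size   # fires a COUNT query on every iteration
    end

    # Fix: eager-load the association up front (two queries total)
    Post.includes(:comments).limit(20).each do |post|
      puts post.comments.size   # counts the records already in memory
    end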
Another thing you could try is moving longer-running tasks into a background job with something like Sidekiq.
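A minimal Sidekiq sketch of that idea (the worker and mailer names are made up for illustration):

    # app/workers/report_worker.rb
    class ReportWorker
      include Sidekiq::Worker

      def perform(user_id)
        # runs in the Sidekiq process, outside the web request cycle
        UserMailer.weekly_report(user_id).deliver
      end
    end

    # enqueue from a controller; this call returns immediately
    ReportWorker.perform_async(current_user.id)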
Without more info it's a tough question to answer, but there are lots of performance monitoring services out there, like New Relic, that you could check out as well.
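If you want a quick home-grown check before reaching for a service, a small Rack middleware can log the process RSS after each request. A sketch, assuming Linux (it reads /proc/self/status) and a made-up class name:

    # lib/memory_logger.rb
    class MemoryLogger
      def initialize(app)
        @app = app
      end

      def call(env)
        status, headers, body = @app.call(env)
        # VmRSS is the resident set size of this process, in kB (Linux only)
        rss_kb = File.read("/proc/self/status")[/VmRSS:\s*(\d+)/, 1].to_i
        Rails.logger.info "RSS after #{env['PATH_INFO']}: #{rss_kb / 1024} MB"
        [status, headers, body]
      end
    end

    # config/application.rb:
    #   config.middleware.use "MemoryLogger"

Requests whose log lines coincide with RSS jumps are a good place to start hunting for the leak.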

Related

Google App Engine Latency skyrockets out of nowhere

Our development team is scratching our heads wondering why Google App Engine latency tends to go through the roof from time to time, with very little predictability or warning. When latency jumps like this we start to see database timeouts between our app and the database. CPU utilization is pretty flat across the instances at these times, which makes this even harder to understand.
We are using the Flex environment to host a .NET Core API. We like App Engine for its PaaS feel and its always-on behavior. Since we can't figure this out, we are thinking about testing Cloud Run as an alternative.
Any suggestions on where to look or how we could troubleshoot this?
Here's the latest spike in latency, from last night. Plenty of Cloud SQL timeouts and other exceptions are happening during this spike as well.
There are a couple of possible reasons for this. This page lists all the possible causes and the debugging steps. First, check your monitoring for memory usage; there might be a memory leak. Also check your autoscaler configuration.
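For reference, the autoscaler knobs for the flexible environment live in app.yaml and look roughly like this (the values here are illustrative, not a recommendation):

    # app.yaml (App Engine flexible environment)
    automatic_scaling:
      min_num_instances: 2        # keep warm headroom for sudden traffic
      max_num_instances: 10
      cool_down_period_sec: 120   # wait between scaling decisions
      cpu_utilization:
        target_utilization: 0.6   # scale out before instances saturate

If min_num_instances is too low, a traffic burst has to wait for new instances to boot, which shows up as exactly this kind of latency spike.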

Understanding BLOCKED_TIME in PerfView

We suspect that we're experiencing thread pool starvation on a server that is running a couple of ASP.NET Core APIs and a couple of .NET Core console apps.
I ran PerfView on one of the servers where we suspect problems with thread pool starvation, but I'm having a bit of trouble analyzing the results.
I ran PerfView /threadTime collect for about 60 seconds, and this is the result I got (I chose to look at one of our ASP.NET Core APIs):
Looking at "By Name" we can see that a lot of time is spent in BLOCKED_TIME. If I double-click, I'm taken to the following view, where I can expand one of the nodes (the redacted part is the name of our API process):
What does that tell me? Shouldn't I be able to see what exactly is blocking? And does it look like the problem is that a lot of threads are each blocking for a small amount of time?
Are there any other conclusions we can draw from this?
BLOCKED_TIME generally means a period when the thread wasn't doing anything at all. This could be periods of I/O, where network or other types of latency are involved, or time spent waiting on locks, such as in situations with semaphores. In short, this doesn't necessarily tell you anything, as there are perfectly standard and reasonable reasons for a thread to be idle. However, a goodish amount of time spent blocked can be an indication of an underlying problem. Perhaps you have too much network latency. Perhaps you're trying to do too much file system work on a slow drive. In short, it may or may not indicate a problem, and even if it does, it doesn't really tell you what the problem is.
In general, if you're experiencing thread starvation, the first thing to look at is thread pool utilization. Are you using async everywhere you can? Are you doing things that are big no-nos in web apps, such as using Task.Run, Task.Factory.StartNew or, worse, Thread.Start? Threads created via the Task APIs come out of the same thread pool that services requests, proportionally reducing your server throughput.
There's an all too common pattern of attempting to schedule long-running jobs by shuffling them off to new threads. That's death to a web application. The threads in the pool are there to service requests, not long-running jobs; requests should be handled quickly and efficiently so that each thread can be returned to the pool in short order to field other requests. If you need to do background work, you need to truly background it, by offloading it to another process or even a different machine entirely.
Short of all that, maybe you're just getting more load than the server can handle in general. That's always a possibility. Perhaps you need to vertically scale your system resources (and the thread pool with them). Perhaps you need to horizontally scale by replicating this server behind a load balancer. Given that you're running multiple different things on the same server, an easy way to horizontally scale is to simply divvy these things out to their own machines. That alone would probably help tremendously. However, scaling, either vertically or horizontally, should be your last resort: make sure you're using resources efficiently first, before throwing more resources at an inefficient system.

Apache server cannot allocate memory for new process

I have an Apache server with 32 GB of RAM. When I start the server and run top, it shows the CPU at 95 percent. This is not normal behaviour, and after a few minutes it starts raising:
apache cannot allocate memory fork unable to fork new process
I don't know how to solve the problem. Any tips?
I had the same problem. To fix it there are two options:
1. Move from micro instances to small ones. This was the change that solved the problem (micro instances on Amazon tend to have large CPU steal time).
2. Tune the MySQL database server configuration and the Apache configuration to use a lot less memory (see the prefork sketch below).
A tuning guide for a low-memory situation such as this one: http://www.narga.net/optimizing-apachephpmysql-low-memory-server/ (but don't follow its suggestion of MyISAM tables - horrible...).
These two options made the problem happen much less often. I am still looking for a better solution to close the processes that are done and kill the ones that hang around.
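For the Apache side, the memory ceiling is mostly governed by the prefork MPM limits. A sketch with illustrative low-memory values (assuming Apache 2.2-style directive names):

    # httpd.conf
    <IfModule prefork.c>
        StartServers          2
        MinSpareServers       2
        MaxSpareServers       5
        MaxClients           50     # ~ (RAM left for Apache) / (avg per-child RSS)
        MaxRequestsPerChild 500     # recycle children so leaks can't accumulate
    </IfModule>

The rule of thumb is MaxClients ≈ RAM available to Apache divided by the average per-child resident size; set it higher and a traffic burst can push the box into swap and fork failures like the one above.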

w3wp.exe consuming large amounts of RAM

I noticed the other day that all the w3wp.exe processes running on my server seem to be consuming way more RAM than I would expect. So I created a test web application with a single .aspx page and a vanilla global.asax (default methods, no additional code). I then deployed that site to IIS 6 with a target framework of 4.0, built in release mode with debug set to "false". The site is also set up under its own application pool. I then used iisapp.vbs to figure out which w3wp.exe the test site was running under, and was surprised to see that this single site with one page was using almost 40 MB of RAM.
40 MB of RAM seems like a lot for a single-page website. Is this normal, and if so, why? Is there anything I can do to reduce the memory footprint?
I also noticed that each time the default page was refreshed, the w3wp.exe grew a little more. Is IIS 6 caching the same page over and over?
40 MB for a single AppDomain (irrespective of how many pages you have) is alright. Consider that .NET is a managed environment and the app pool contains a large majority of the logic and heap for serving requests. In many cases it will also be less eager to free up RAM if the RAM is not under contention (demanded by other parts of the system).
You can share app pools amongst different websites, so the "overhead" of the sandbox is spread across them and becomes proportionally smaller. Either way, I don't think this is bad at all.
You're trying to optimize the case where you have no load and no clients. That really isn't sensible. Optimize for a realistic use case.
You don't build a factory and then try to get it to make one item as efficiently as possible. You try to get it to crank them out efficiently by the dozens.
This is not really an issue, since it is quite normal: the .NET Framework is getting loaded, hence the size. When a real issue happens, you will have to use post-production or profiling tools to get it fixed. Here are some articles on memory and post-production debugging, in case you need them in the future.
http://msdn.microsoft.com/en-us/magazine/cc188781.aspx
http://aspalliance.com/1350_Post_Production_Debugging_for_ASPNET_Applications__Part_1.all
http://blogs.msdn.com/b/tess/archive/2006/09/06/net-memory-usage-a-restaurant-analogy.aspx
http://blogs.msdn.com/b/tess/archive/2008/09/12/asp-net-memory-issues-high-memory-usage-with-ajaxpro.aspx

Rails controllers, safe to fork a block and then return?

Short and simple question:
fork { something_that_takes_a_few_seconds_and_doesnt_concern_the_user }
respond_to ...
Is there any reason not to do this sort of thing in a Rails app? In PHP we currently rely on external queueing systems like beanstalk or Amazon's SQS, coupled with an asynchronous task worker that pulls things off the queue to run in the background. A simple fork would fit better in many cases, depending on the complexity of the task.
Yes: if you get too many requests, or the process takes longer than expected, you can easily blast your CPU and memory, and the Rails app will start raising out-of-memory errors. Forked children also inherit copies of the parent's state, such as open database connections, which is a classic source of nasty bugs.
There are many dead simple background worker solutions for Rails apps, like Resque, and you can benefit greatly from using one of them instead of forking: the work always happens outside your web app environment, and even if something nasty happens it is confined to the worker machine (and not your application server).
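A minimal Resque version of the fork example above (the job class and queue name are illustrative):

    # app/jobs/slow_task.rb
    class SlowTask
      @queue = :default

      # Resque runs this in a separate worker process
      def self.perform(record_id)
        something_that_takes_a_few_seconds_and_doesnt_concern_the_user(record_id)
      end
    end

    # in the controller, instead of fork { ... }:
    Resque.enqueue(SlowTask, record.id)
    respond_to ...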
Also, I've seen a lot of people say that once you move to a virtualized world, memory fragmentation (due to spawning and killing many processes) can take its toll and make you slower.