Apache processes are increasing to max everytime even after restart - apache

We are facing problem with one of our Drupal sites hosted in apache. Today suddenly, there are 150 httpd2-prefork processes running. This causes site to be down. After every apache restart it comes up for a short while. Then again, number of processes threads grows to max and site goes down. Can anyone else help here?

Look at your server logs and see who your visitors are (real people? bots?), where they're coming from (IP), and what pages they're visiting.
If it's just a few offenders, you can try blocking them. If it's a distributed attack, it's going to be more difficult.
Do you have any modules installed to help block bad people/things? I recommend: Honeypot and Bad Behavior.

Related

controlling number of concurrent connection apache server

I'm using apache with ldirector i'm facing some issues during load times when google, bing crawlers hit my site it makes apache to choke due to which my server's cup useage went to 100% utlization. after this i have to stop apache and monitor load manually i want to automate all this scenario. here is what i want when ever load comes on apache it normalizes server according to given settings and if cpu usage goes high it should not be exceded to given cpu usage limit.
I want to control all this via shell script, please give suggestions.

The Node.js event loop - nginx/apache

Both nginx and Node.js have event loops to handle requests. I put nginx in front of Node.js as has been recommended here
Using Node.js only vs. using Node.js with Apache/Nginx
with the setup shown here
Node.js + Nginx - What now?
How do the two event loops play together? Is there any risk of conflicts between the two? I wonder because Nginx may not be able to handle as many events per second as Node.js or vice versa. For example, if Nginx can handle 1000 events per second but node.js only 500, won't that cause issues? (I have no idea if 1000,500 are reasonable orders of magnitude, you could correct me on that.)
What about putting Apache in front of Node.js? Apache has no event loop. Just threads. So won't putting Apache in front of Node.js defeat the purpose?
In this 2010 talk, Node.js creator Ryan Dahl had vision to get rid of nginx/apache/whatever entirely and make node talk directly to the internet. When do you think this will be reality?
Both nginx and Node use an asynchronous and event-driven approach. The communication between them will go more or less like this:
nginx receives a request
nginx forwards the request to the Node process and immediately goes back to wait for more requests
Node receives the request from nginx
Node handles the request with minimal CPU usage, until at some point it needs to issue one or more I/O requests (read from a database, write the response, etc). At this point it launches all these I/O requests and goes back to wait for more requests.
The above can repeat lots of times. You could have hundreds of thousands of requests all in a non-blocking wait state where nginx is waiting for Node and Node is waiting for I/O. And while this happens both nginx and Node are ready to accept even more requests!
Eventually async I/O started by the Node process will complete and a callback function will get invoked.
If there are still I/O requests that haven't completed for this request, then Node goes back to its loop one more time. It can also happen that once an I/O operation completes this data is consumed by the Node callback and then new I/O needs to happen, so Node can start more async I/O requests before going back to the loop.
Eventually all I/O operations started by Node for a particular request will be complete, including those that write the response back to nginx. So Node ends this request, and then as always goes back to its loop.
nginx receives an event indicating that response data has arrived for a request, so it takes that data and writes it back to the client, once again in a non-blocking fashion. When the response has been written to the client and event will trigger and nginx will then end the request.
You are asking about what would happen if nginx and Node can handle a different number of maximum connections. They really don't have a maximum, the maximum in general comes from operating system configuration, for example from the maximum number of open handles the system can have at a time or the CPU throughput. So your question does not really apply. If the system is configured correctly and all processes are I/O bound, neither nginx or Node will ever block.
Putting Apache in front of Node will only work well if you can guarantee that your Apache never blocks (i.e it never reaches its maximum connection limit). This is hard/impossible to achieve for large number of connections, because Apache uses an individual process or thread for each connection. nginx and Node scale really well, Apache does not.
Running Node without another server in front works fine and it should be okay for small/medium load sites. The reason putting a web server in front of it is preferred is that web servers like nginx come with features that Node does not have and you would need to implement yourself. Things like caching, load balancing, running multiple apps from the same server, etc.
I think your questions have been largely covered by some of the others answers, but there are a few pieces missing, and some that I disagree with, so here are mine:
The event loops are isolated from each other at the process level, but do interact. The issues you're most likely to encounter are around the configuration of nginx response buffers, chunked data, etc. but this is optimisation rather than error resolution.
As you point out, if you use Apache you're nullifying the benefit of using Node.js, i.e. massive concurrency and websockets. I wouldn't recommend doing that.
People are already using Node.js at the front of their stack. Searching for benchmarks returns some reasonable-looking results in Node's favour, so performance to my mind isn't an issue. However, there are still reasons to put Nginx in front of Node.
Security - Node has been given increasing scrutiny, but it's still young. You may not have problems here, but caution is often your friend.
Training - Ops staff that you hire will know how to manage Nginx, but the configuration and management of your custom Node app will only ever be understood by those people your developers successfully communicate it to. In some companies this is nobody.
Operational Flexibility - If you reach scale you might want to split out the serving of static content, purely to reduce the load on your app servers. You might want to split content amongst different domains and have it managed separately, or have different SSL or proxying behaviour for different domains or URL patterns. These are the things that are easy for Ops guys to configure in Nginx, but you'd have to code manually in a Node app.
The event loops are independent. Event loops are implemented at the application level, so neither cares what sort of architecture the other uses.
NodeJS is good at many things, but there are some places where it still falters. Once example is serving static files. At the moment, nodejs performs fairly poorly in this test, so having a dedicated web server for your static files greatly improves response time. Also, nodejs is still in its infancy, and has not been "tested and hardened" in the matters of security like Apache on nginX.
It'll take a long time for people to consider fronting nodejs all by itself. The cluster module is a step in the right direction, but it'll take a long time even after it reaches v1 before it happens.
Both event loops are unrelated. They don't play together.
Yes, it is pretty useless. Apache is not a load balancer.
What Ryan Dahl said may be applicable already. The limit of concurrent users is definitely higher than that of Apache. Before node.js websites with fair amount of concurrent users had to use nginx to balance the load. For small to medium sized businesses it can be done with node.js alone. But ruling out nginx completely will take time. Let node.js be stable before it can follow this ambitious dream.

Nginx subdomain down issue

I have two rails app which are running on linode. OS is ubuntu, nginx server. The subdomain instance giving problem. It is getting down just after 1 day. On restarting the server, it is working fine.
The error log says- "*1 upstream timed out (110: Connection timed out) while reading response header from upstream,".
I googled for the problem and found that increasing proxy_read_timeout value will solve the problem. But I am unable to find the reason.
Is there an issue of over utilization of resources? I have 24 GB of storage and 512 MB of RAM as shown in linode manager. I have 10 cron jobs in total (5 in each app). They all start at the same time. Can that be the issue?
Please tell me the reason and solution for it.
It definitely sounds like a resource issue... Or perhaps something else is killing / hogging your app. Generally the upstream request is a request from the web server to the app server, so if your app is doing something wonky, this would cause the timeout to occur. I'm not sure what the default timeout is, but I'm guessing it's rather short. Increasing the timeout would at least buy you time to look at the system resources the process stack to try to figure out what's going on.

What are the most effective tools to manage multiple apache httpd instances?

We have many Apache instances all over our intranet. Some instances run on the same machine. Some instances run on different machines.
I need a tool that can manage these instances from one central location.
Get CPU stats
Get Connection stats
Stop/start Apache instances
Get access to error log
I looked at webmin, but the documentation isn't too clear how it works. Without installing it I'd have trouble getting it to go.
Any recommendations?
I've never used it myself, but I've seen people with monitoring requirements be very happy with Cacti. Besides general health monitoring like CPU stats it has an extremely simple Apache stats plugin that might do what you need:
Script to get the requests per second and the requests currently being processed from
an Apache webserver.
maybe you can put something together with that.

Low latency web server/load balancer for the non-Twitters of the world

Apache httpd has done me well over the years, just rock solid and highly performant in a legacy custom LAMP stack application I've been maintaining (read: trying to escape from)
My LAMP stack days are now numbered and am moving on to the wonderful world of polyglot:
1) Scala REST framework on Jetty 8 (on the fence between Spray & Scalatra)
2) Load balancer/Static file server: Apache Httpd, Nginx, or ?
3) MySQL via ScalaQuery
4) Client-side: jQuery, Backbone, 320 & up or Twitter Bootstrap
Option #2 is the focus of this question. The benchmarks I have seen indicate that Nginx, Lighthttpd, G-WAN (in particular) and friends blow away Apache in terms of performance, but this blowing away appears to manifest more in high-load scenarios where the web server is handling many simultaneous connections. Given that our server does max 100gb bandwidth per month and average load is around 0.10, the high-load scenario is clearly not at play.
Basically I need the connection to the application server (Jetty) and static file delivery by the web server to be both reliable and fast. Finally, the web server should double duty as a load balancer for the application server (SSL not required, server lives behind an ASA). I am not sure how fast Apache Httpd is compared to the alternatives, but it's proven, road warrior tested software.
So, if I roll with Nginx or other Apache alternative, will there be any difference whatsoever in terms of visible performance? I assume not, but in the interest of achieving near instant page loads, putting the question out there ;-)
if I roll with Nginx or other Apache alternative, will there be any difference whatsoever in terms of visible performance?
Yes, mostly in terms of latency.
According to Google (who might know a thing or tow about latency), latency is important both for the user experience, high search-engine rankings, and to survive high loads (success, script kiddies, real attacks, etc.).
But scaling on multicore and/or using less RAM and CPU resources cannot hurt - and that's the purpose of these Web server alternatives.
The benchmarks I have seen indicate that Nginx, Lighthttpd, G-WAN (in particular) and friends blow away Apache in terms of performance, but this blowing away appears to manifest more in high-load scenarios where the web server is handling many simultaneous connections
The benchmarks show that even at low numbers of clients, some servers are faster than others: here are compared Apache 2.4, Nginx, Lighttpd, Varnish, Litespeed, Cherokee and G-WAN.
Since this test has been made by someone independent from the authors of those servers, these tests (made with virtualization and 1,2,4,8 CPU Cores) have clear value.
There will be a massive difference. Nginx wipes the floor with Apache for anything over zero concurrent users. That's assuming you properly configure everything. Check out the following links for some help diving into it.
http://wiki.nginx.org/Main
http://michael.lustfield.net/content/dummies-guide-nginx
http://blog.martinfjordvald.com/2010/07/nginx-primer/
You'll see improvements in terms of requests/second but you'll also see significantly less RAM and CPU usage. One thing I like is the greater control over what's going on with a more simple configuration.
Apache made a claim that apache 2.4 will offer performance as good or better than nginx. They made a bold claim calling out nginx and when they made that release it kinda bit them in the ass. They're closer, sure, but nginx still wipes the floor in almost every single benchmark.