Cannot allocate memory: fork: Unable to fork new process? - apache

We have our hosting in AWS. Recently, after moving our blog from WordPress to AWS, we have been experiencing a noticeable delay in server response time, mainly while accessing the blog. Below are the entries from the error_log file:
[Wed Feb 25 06:10:10 2015] [error] (12)Cannot allocate memory: fork: Unable to fork new process
[Wed Feb 25 06:12:22 2015] [error] (12)Cannot allocate memory: fork: Unable to fork new process
[Wed Feb 25 06:12:36 2015] [error] (12)Cannot allocate memory: fork: Unable to fork new process
[Wed Feb 25 06:12:50 2015] [error] (12)Cannot allocate memory: fork: Unable to fork new process
[Wed Feb 25 06:13:35 2015] [error] (12)Cannot allocate memory: fork: Unable to fork new process
[error] (12)Cannot allocate memory: fork: Unable to fork new process
[Wed Feb 25 06:27:14 2015] [error] (12)Cannot allocate memory: fork: Unable to fork new process
We increased the memory limit from 256 to 512 MB in the php.ini file, but the issue still exists.
We also set KeepAlive to On, which didn't resolve it either. Any suggestions/solutions would be of great help.

I've faced that problem too, while hosting a Java app with Jenkins, MySQL & Tomcat on Ubuntu on an AWS VM.
At first I worked around the problem by restarting the VM.
AWS doesn't give you swap space on the hard drive by default, so you'd better set it up yourself. You can find how to do this here. Worth mentioning: the swap-partition approach didn't work for me (no idea why), and I had to create a swap file instead.
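If it helps, creating and enabling a 1 GB swap file looks roughly like this (the size and the /swapfile path are just what I used; adjust as needed):
sudo dd if=/dev/zero of=/swapfile bs=1M count=1024
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
free -m
The last command just verifies that the swap space shows up.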
Good luck to you!

I had the same problem. There are two options to fix it:
1. Move from micro instances to small. This was the change that actually solved the problem for me (micro instances on Amazon tend to have large CPU steal time).
2. Tune the MySQL server configuration and the Apache configuration to use a lot less memory. A tuning guide for a low-memory situation such as this one: http://www.narga.net/optimizing-apachephpmysql-low-memory-server/ (but skip its suggestion to use MyISAM tables; that one is horrible). A minimal sketch is at the end of this answer.
These two options will make the problem occur much less often.
I am still looking for a better solution that closes processes once they are done and kills the ones that hang around.
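As a rough illustration only (these values are not from the guide and depend entirely on your workload), the low-memory tweaks to my.cnf look something like:
[mysqld]
key_buffer_size = 16M
innodb_buffer_pool_size = 64M
max_connections = 40
table_open_cache = 256
query_cache_size = 8M
Combined with a small prefork MaxClients on the Apache side (see the httpd.conf values in the next answer), this brings memory pressure down considerably.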

I changed Apache's prefork MPM settings in httpd.conf.
These are the values I ended up using:
StartServers 1
MinSpareServers 1
MaxSpareServers 5
ServerLimit 16
MaxClients 16
MaxRequestsPerChild 0
ListenBacklog 100
Then, try deactivating modules you don't need with
sudo a2dismod name_of_module
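For example (status is just a module that is often not needed; pick whatever apache2ctl -M shows that you don't actually use):
apache2ctl -M
sudo a2dismod status
sudo service apache2 restart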

AWS ElasticBeanstalk periodically goes down

I noticed lately that my Laravel project in an AWS Elastic Beanstalk setup has been acting strangely. The server goes down after a few minutes: on a t3.small it goes down roughly every 50 minutes, and on a t3.nano approximately every 5 minutes. The health tab says the memory is exhausted or something similar. It goes "Severe" for about 5-10 minutes and then recovers without me doing anything; basically the monitoring is one big zigzag.
Here are some things I've done that I suspect could be the cause:
I re-enabled Pusher for broadcasting. The project had a Pusher setup before and it was working fine; however, I had disabled it (removed all the parts that use it) since I didn't need it yet. I re-enabled it and the problem occurred.
I've played with AWS WAF and CloudFront. I was studying those two services and played with some settings; however, I can't remember using any of them in relation to my Elastic Beanstalk application, and I did remove everything I had added in WAF and CloudFront.
Here are some facts:
Whenever I remove the container commands that set up schedule:run and queue:work (sketched roughly after this list), it becomes completely fine; it even keeps an "OK" status if I simulate sending hundreds of requests per second.
I tried scaling it to 3 instances and the result is still the same; the downtime just arrives more slowly.
It gives a 503 error code whenever it's down
The Elastic Beanstalk setup is PHP 7.2 running on 64bit Amazon Linux/2.8.4
I'm testing the job and queue by sending one Pusher message every minute. It doesn't do anything except send the current time, and it is also the only cronjob running.
The cronjob works and I can also receive the Pusher messages, except during downtime
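For context, the container commands are the usual .ebextensions cron setup for the Laravel scheduler, roughly like this (the file name, cron user and paths are from my project, not anything standard), plus a similar command that starts queue:work:
container_commands:
  01_schedule_run:
    command: "echo '* * * * * root php /var/app/current/artisan schedule:run >> /dev/null 2>&1' > /etc/cron.d/laravel_schedule && chmod 644 /etc/cron.d/laravel_schedule"
    leader_only: true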
Here's an observation I had from the logs: there are "internal dummy connection" entries related to Apache, and the times they are logged are identical to the times the downtime occurs.
I've chased every hint in the logs, from juggling different settings on the cronjob to other possible causes. I've also asked my peers, but no one has encountered this error before; in fact, they tested my cronjob and it works properly for them.
I also have this error in the /var/log/httpd/error_log
[Fri Nov 23 19:07:35.208657 2018] [suexec:notice] [pid 3142] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Fri Nov 23 19:07:35.228633 2018] [http2:warn] [pid 3142] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[Fri Nov 23 19:07:35.228644 2018] [http2:warn] [pid 3142] AH02951: mod_ssl does not seem to be enabled
[Fri Nov 23 19:07:35.229188 2018] [lbmethod_heartbeat:notice] [pid 3142] AH02282: No slotmem from mod_heartmonitor
[Fri Nov 23 19:07:35.267841 2018] [mpm_prefork:notice] [pid 3142] AH00163: Apache/2.4.34 (Amazon) configured -- resuming normal operations
[Fri Nov 23 19:07:35.267860 2018] [core:notice] [pid 3142] AH00094: Command line: '/usr/sbin/httpd -D FOR
This is a case of running into surprises with the CPU credit and throttling restrictions on t2/t3.* EC2 instances. One CPU credit allows a t2/t3 instance to run one vCPU at 100% utilization for one minute. CPU credits are replenished at a constant hourly rate for running instances (the rate depends on the instance size), so prolonged periods of load above the baseline will gradually deplete these credits, leading to the behavior you have described above.
It's advisable to use higher-tier instances (m3.medium and above) to sustain production workloads consistently. Placing a load balancer in front of multiple instances is also a great way to maintain availability.
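You can watch the credit balance drain in CloudWatch, for example (the instance ID and time window are placeholders):
aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUCreditBalance --dimensions Name=InstanceId,Value=i-0123456789abcdef0 --statistics Average --period 300 --start-time 2018-11-23T00:00:00Z --end-time 2018-11-23T06:00:00Z
If the balance reaches zero at the same times the environment goes "Severe", this is the culprit.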
More information on the same can be found here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/burstable-credits-baseline-concepts.html

Failed opening .rdb for saving: Permission denied - started after a while of running successfully

I have had a Node web service running successfully on an AWS Ubuntu server for over a month, with the requests cached using Redis.
Yesterday I started getting the following error from some of my routes:
MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
I was able to stop the error occurring by using:
config set stop-writes-on-bgsave-error no
as suggested in the answers to this question, but it doesn't actually solve the underlying problem.
To find the underlying problem I checked the logs and found the following had started happening:
[1105] 09 Aug 13:17:14.800 - 0 clients connected (0 slaves), 797680 bytes in use
[1105] 09 Aug 13:17:15.101 * 1 changes in 900 seconds. Saving...
[1105] 09 Aug 13:17:15.101 * Background saving started by pid 28090
[28090] 09 Aug 13:17:15.101 # Failed opening .rdb for saving: Permission denied
[1105] 09 Aug 13:17:15.201 # Background saving error
Over the weekend no one had been using the server, but before the weekend the logs were fine, and we were getting no errors:
[12521] 06 Aug 04:49:27.308 - 0 clients connected (0 slaves), 803352 bytes in use
[12521] 06 Aug 04:49:29.012 * 1 changes in 900 seconds. Saving...
[12521] 06 Aug 04:49:29.012 * Background saving started by pid 26663
[26663] 06 Aug 04:49:29.014 * DB saved on disk
[26663] 06 Aug 04:49:29.014 * RDB: 2 MB of memory used by copy-on-write
[12521] 06 Aug 04:49:29.112 * Background saving terminated with success
As I said, no one has touched this server in the intervening time.
Looking around for people having the same problem I found this question. I checked the ownership and permissions on the directory and db file as suggested in the answers there:
drwxr-xr-x 2 redis redis 26 Aug 6 06:55 redis
-rw-r--r-- 1 redis redis 18 Aug 6 06:55 dump-6379.rdb
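For reference, that check was along these lines (/var/lib/redis is simply where CONFIG GET dir points on this install):
redis-cli config get dir
redis-cli config get dbfilename
ls -ld /var/lib/redis
ls -l /var/lib/redis/dump-6379.rdb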
The permissions and ownership both look OK to me, but I have noticed that the date on the file and folder falls between the last time I saw the service working and the first time it failed. Unfortunately that hasn't really helped me figure out what to do next, and I am at a bit of a loss.
I am looking for suggestions for next steps to find the cause of the problem, or at least a way of making redis able to write again.

Apache Tomcat and Mod_jk

We have been running Apache with Tomcat using mod_jk for about a month now without issues. This morning I started seeing the errors below in the mod_jk log files.
I am fairly new to mod_jk and am not sure how to increase the number of connections, see the number of active connections, and/or kill off connections that are idle or dead.
Any ideas/help would be much appreciated.
[Thu Sep 19 11:02:42 2013] [1644:11984] [warn] ajp_get_endpoint::jk_ajp_common.c (3177): Unable to get the free endpoint for worker Worker1 from 10 slots
[Thu Sep 19 11:02:42 2013] [1644:11984] [error] jk_handler::mod_jk.c (2726): Could not get endpoint for worker=Worker1
[Thu Sep 19 11:02:42 2013] [1644:11984] [info] jk_handler::mod_jk.c (2788): Service error=0 for worker=Worker1
So it turns out this issue was a by-product of another configuration issue. We had different Railo contexts configured to point to the same set of shared directories, and some of the contexts mapped to directories within the root context, which caused Java thread locks.
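That said, to answer the original questions: the number of AJP endpoints per worker, and a status page for watching and resetting them, are configured in workers.properties, roughly like this (the worker name matches the logs above; host, port and pool values are illustrative):
worker.list=Worker1,jkstatus
worker.Worker1.type=ajp13
worker.Worker1.host=localhost
worker.Worker1.port=8009
worker.Worker1.connection_pool_size=25
worker.Worker1.connection_pool_timeout=600
worker.jkstatus.type=status
and in httpd.conf:
JkMount /jkstatus jkstatus
The status worker then shows active and idle connections per worker and lets you reset them.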

What's the hard limit for apache ThreadsPerChild parameter in httpd.conf?

I'm using IBM HTTP Server, which is based on Apache. When I try to increase the ThreadsPerChild parameter beyond 1000, the server still only starts 1000 worker threads. Below is the related information:
error log:
[Thu Jul 05 10:50:45 2012] [debug] mpm_winnt.c(564): Child 9040: retrieved 2 listeners from parent
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Acquired the start mutex.
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Starting 1000 worker threads.
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Starting thread to listen on port 81.
[Thu Jul 05 10:50:45 2012] [notice] Child 9040: Starting thread to listen on port 80.
httpd.conf
<IfModule mpm_winnt.c>
ThreadLimit 2048
ThreadsPerChild 2000
MaxRequestsPerChild 0
</IfModule>
IHS 7.0.0.0
OS winNT
By the way, another concern with ThreadsPerChild: does one Apache thread handle exactly one client connection here, or can one thread take care of more than one client connection?
Please help me out. Thanks very much.
On the limits of the ThreadsPerChild setting, quoting from IBM HTTP Server Performance Tuning:
On 64-bit Windows OSes, each instance of the server is limited to approximately 2500 ThreadsPerChild. On 32-bit Windows, this number is closer to 5000. These numbers are not exact limits, because the real limits are the sum of the fixed startup memory cost of each thread plus the maximum runtime memory usage per thread, which varies based on configuration and workload. Raising ThreadsPerChild and approaching these limits risks child process crashes when runtime memory usage puts the process address space over the 2 GB or 3 GB barrier.
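As a rough illustration (my numbers, not IBM's): if each thread costs on the order of 1 MB of stack plus runtime memory, then 2000-2500 threads already account for roughly 2-2.5 GB, which is exactly where a child process hits that 2 GB or 3 GB address-space barrier.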
The interesting thing to note here is that ThreadsPerChild is not the only parameter for tuning concurrent connections to IHS. You may find information about other parameters (like MaxClients) and tuning methodology at the following link:
Tuning IBM HTTP Server to maximize the number of client connections to WebSphere Application Server

apache mod_fcgid problems

I have a problem on multiple servers that use the Apache module mod_fcgid to serve a CGI script which processes each request (ticket validation and similar processing) and then serves files from the server based on the result of that processing.
I keep getting the following errors repeatedly in the logs:
[Mon Jan 30 23:11:41 2012] [warn] [client 95.35.160.193] mod_fcgid: error reading data, FastCGI server closed connection
[Mon Jan 30 23:11:41 2012] [warn] [client 95.35.160.193] (32)Broken pipe: mod_fcgid: ap_pass_brigade failed in handle_request_ipc function
[Mon Jan 30 23:13:34 2012] [warn] [client 37.8.52.128] mod_fcgid: can't apply process slot for /var/www/cgi-bin/assetx.fcgi
These problems make the server slow and at other times result in a "service temporarily unavailable" error.
The servers carry heavy traffic. I have currently configured the following fcgid directives:
FcgidMaxRequestsPerProcess 0
FcgidMaxProcesses 300
FcgidMinProcessesPerClass 0
FcgidIdleTimeout 240
FcgidIOTimeout 240
FcgidBusyTimeout 300
The average load on the servers is normal, and the number of processes is around 250 on average.
I have researched this issue for days. Some say it is a permission problem; I followed their suggestions, but it didn't help. I also tried tuning the parameters above (these are the last values I tried), but that didn't work either. I am also trying out nginx to replace Apache, but I cannot find a suitable way to run the CGI script under this high load using nginx.
What can I do to fix this problem?
Your app is dying before Apache can contact it successfully. The answer is to find out why the app is dying.
A FastCGI process should never die or quit, even in an error condition; Apache expects the FastCGI script to just keep on being there.
You mention you have a cgi script. How did you modify it to support FastCGI?
Usually you need to switch to something like CGI::Fast, remove all calls to die and exit, and refactor your script to run using the CGI::Fast while loop.