I'm working on a Debian server with Tomcat 7 and Java 1.7. The application receives several TCP connections, and each TCP connection counts as an open file for the java process.
Looking at /proc/pid of java/fd, I found that the number of open files sometimes exceeds 1024. When this happens, I find the stack trace _SocketException: Too many open files_ in the catalina.out log.
Everything I find about this error refers to ulimit. I have already changed it and the error keeps happening. Here is the config:
at /etc/security/limits.conf
root soft nofile 8192
root hard nofile 8192
at /etc/sysctl.conf
fs.file-max = 300000
the ulimit -a command returns:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
But when I check the limits of the java process, the open files limit is only 1024
at /proc/pid of java/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 32339 32339 processes
Max open files 1024 1024 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 32339 32339 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
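To compare what the shell reports with what the running process actually has, the relevant row can be read straight from the limits file. This is only a minimal sketch for Linux; the function names are made up for illustration and are not part of any tool mentioned here.

```python
import re


def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings from the 'Max open files' row."""
    match = re.search(r"^Max open files\s+(\S+)\s+(\S+)", limits_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Max open files' row found")
    return match.group(1), match.group(2)


def read_process_nofile(pid):
    """Read the limits of a running process (Linux only, needs read permission)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_max_open_files(handle.read())
```

Calling read_process_nofile with the pid of the java process would show the limit that process really runs with, which can differ from what ulimit -a prints in a login shell.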
How can I increase the number of Max open files for the java process?
I just put the line ulimit -n 8192 inside catalina.sh, so when I run catalina start, java runs with the limit specified above.
The ulimit values are assigned at session start-up time, so changing /etc/security/limits.conf will not have any effect on processes that are already running. Non-login processes inherit the ulimit values from their parent, much like the inheritance of environment variables.
So after changing /etc/security/limits.conf, you'll need to logout & login (so that your session will have the new limits), and then restart the application. Only then will your application be able to use the new limits.
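The inheritance described above can be observed with a small experiment. This is just a sketch, assuming a Unix system with Python 3; it lowers the soft limit of the current process (lowering needs no privileges), starts a child, and restores the original value afterwards. The names and the value 512 are arbitrary choices for the demonstration.

```python
import resource
import subprocess
import sys

# Code run by the child: print the soft nofile limit it was started with.
CHILD_CODE = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"


def child_soft_nofile():
    """Start a child interpreter and return the soft nofile limit it sees."""
    output = subprocess.check_output([sys.executable, "-c", CHILD_CODE])
    return int(output.decode().strip())


def demo_inheritance(lowered=512):
    """Lower this process's soft limit, then report what a new child inherits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only ever lower the limit here, never raise it.
    if soft != resource.RLIM_INFINITY and soft < lowered:
        lowered = soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (lowered, hard))
    try:
        return lowered, child_soft_nofile()
    finally:
        # Put the original soft limit back for the rest of this process.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

The two returned numbers should match: the child gets whatever the parent had at the moment it was started, which is why the application has to be restarted from a session that already has the new limits.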
Setting a higher ulimit may be completely unnecessary, depending on the workload/traffic that tomcat/httpd handles. Linux creates a file descriptor per socket connection, so if tomcat is configured to use the mod_jk/ajp protocol as a connector, you may want to check whether the maximum allowed connections, the connectionTimeout, or the keepAliveTimeout is too high. These parameters play a huge role in the consumption of OS file descriptors. Sometimes it may also be feasible to limit the number of apache httpd/nginx connections if tomcat is fronted by a reverse proxy. I once reduced the serverLimit value in httpd to throttle incoming requests during a gate-rush scenario. All in all, adjusting ulimit may not be a viable option, since your system may end up consuming however many descriptors you throw at it. You will have to come up with a holistic plan to solve this problem.
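Before tuning timeouts, it can help to see how many of the open descriptors really are sockets. The following is a rough sketch for Linux that groups the entries of /proc/pid/fd by the kind of target they point to; the category names are my own, and reading another user's process needs sufficient permissions.

```python
import os
from collections import Counter


def classify_fd_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    return "file"


def count_open_fds(pid):
    """Count the open descriptors of a process, grouped by category."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        counts[classify_fd_target(target)] += 1
    return counts
```

If the socket count sits close to the limit while few requests are active, long keep-alive or connection timeouts are a plausible thing to look at next.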
eventlet==0.33.0
Flask-SocketIO==5.1.1
I'm using Flask-SocketIO. To test how many clients I can connect to it, I'm running a loop that connects up to 5000 clients. I've installed eventlet so that Flask-SocketIO will use it.
When running locally I can connect 5000 clients without issue.
When I run the same code against the server (EC2 Ubuntu 20.04 instance) I get to around 1136 clients and then all connections after that fail.
I have set the max_size property to 100000. This property has an impact, because if I change it to 10 on either local or server, only 5 clients can connect. I assume each socket.io client requires two connections.
I have set the file descriptor limits in /etc/security/limits.conf (edited with sudo nano) as follows:
* soft nproc 100000
* hard nproc 100000
* soft nofile 100000
* hard nofile 100000
I can confirm this with ulimit -a which gives output:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1806
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 100000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 100000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I am running the app on the server on a cron reboot as follows:
@reboot sudo python3 /home/ubuntu/app/app.py &
I assume that's the best way to do it.
Incidentally, this seems to be isolated to eventlet running on the web server. If I run with the Flask development server I can connect 5000 instances without issue.
Can anyone shed any light on why the server stops accepting connections when the same code works locally? Thanks.
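Note that the ulimit -a output above comes from an interactive shell, which is not necessarily what a process started from cron receives. One way to find out is to have the app report its own limit at startup and, if needed, raise the soft limit up to the hard limit. This is a minimal sketch using the standard resource module; the function name and the default target are assumptions for illustration.

```python
import resource


def raise_soft_nofile(target=100000):
    """Raise this process's soft nofile limit toward target, capped by the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed hard
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard  # already high enough, nothing to change
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Printing the result of raise_soft_nofile() near the top of app.py would show whether the cron-started process really has the 100000 limit or something much lower.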
I am evaluating rabbitmq as mqtt broker and currently doing benchmark tests to check performance. Using benchmark tool https://github.com/takanorig/mqtt-bench I tried publishing 1 byte messages for 10000 clients. The memory consumption by rabbitmq for these numbers is 2gb and it's the same for 10000 subscriptions as well. Here are the consumption details provided by rabbitmq-diagnostics memory_breakdown
connection_other: 1.1373 gb (55.89%)
other_proc: 0.3519 gb (17.29%)
allocated_unused: 0.1351 gb (6.64%)
other_system: 0.0706 gb (3.47%)
quorum_ets: 0.0675 gb (3.32%)
plugins: 0.0555 gb (2.73%)
binary: 0.0482 gb (2.37%)
mgmt_db: 0.035 gb (1.72%)
This means that the broker is taking about 200 KB per connection, which seems like a big number to me. We need to scale our system to 1 million connections in the future, and at that rate we would need to provide around 200 GB just for RabbitMQ.
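The arithmetic behind these figures, written out as a small sketch. It assumes decimal units (1 GB = 10^9 bytes) and that memory grows linearly with the number of connections, which is the same assumption the estimate above makes.

```python
GB = 10**9  # decimal gigabytes; an assumption about how the figures were rounded


def per_connection_bytes(total_bytes, connections):
    """Average memory attributed to one connection."""
    return total_bytes / connections


def projected_bytes(per_connection, target_connections):
    """Linear projection of total memory for a target connection count."""
    return per_connection * target_connections


per_conn = per_connection_bytes(2 * GB, 10_000)          # 200000.0 bytes, about 200 KB
total = projected_bytes(per_conn, 1_000_000)             # 2e11 bytes, about 200 GB
conn_other = per_connection_bytes(1.1373 * GB, 10_000)   # about 114 KB from connection_other
```

So roughly 114 KB of the 200 KB per connection comes from the connection_other category alone, which suggests that is the part worth investigating first.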
I have tried playing with some settings in my conf file and docker command
mqtt.allow_anonymous=false
ssl_options.cacertfile=/certs/ca_certificate.pem
ssl_options.certfile=/certs/server_certificate.pem
ssl_options.keyfile=/certs/server_key.pem
ssl_options.verify=verify_peer
ssl_options.fail_if_no_peer_cert=false
mqtt.listeners.ssl.default=8883
mqtt.listeners.tcp.default=1883
web_mqtt.ws_path = /mqtt
web_mqtt.tcp.port = 15675
collect_statistics_interval = 240000
management.rates_mode = none
mqtt.tcp_listen_options.sndbuf = 1000
mqtt.tcp_listen_options.recbuf = 2000
mqtt.tcp_listen_options.buffer = 1500
Below is the docker command where I've tried to reduce tcp_rmem and tcp_wmem size as well
docker run -d --rm -p 8883:8883 -p 1883:1883 -p 15675:15675 -p 15672:15672 -v /home/ubuntu/certs:/certs --sysctl net.core.somaxconn=32768 --sysctl net.ipv4.tcp_max_syn_backlog=4096 --sysctl net.ipv4.tcp_rmem='1024 4096 500000' --sysctl net.ipv4.tcp_wmem='1024 4096 500000' -e RABBITMQ_VM_MEMORY_HIGH_WATERMARK=0.9 -e RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+P 2000000" -t probusdev/hes-rabbitmq:latest
Are there any other settings I can try to reduce the memory consumption?
Update: I ran the same benchmark test against the Emq broker and it took only 400 MB for the same numbers. So is RabbitMQ MQTT more memory-consuming than Emq?
1167:M 26 Apr 13:00:34.666 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
1167:M 26 Apr 13:00:34.667 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
1167:M 26 Apr 13:00:34.667 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1167:M 26 Apr 13:00:34.685 # Creating Server TCP listening socket 192.34.62.56:6379: Name or service not known
1135:M 26 Apr 20:34:24.308 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
1135:M 26 Apr 20:34:24.309 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
1135:M 26 Apr 20:34:24.309 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1135:M 26 Apr 20:34:24.330 # Creating Server TCP listening socket 192.34.62.56:6379: Name or service not known
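The numbers in these log lines fit a simple rule: the requested maxclients plus 32 extra descriptors. A sketch of that relationship follows; the value 32 is inferred only from the log lines above (10000 -> 10032 and 4096 -> 4064), and the function names are hypothetical.

```python
RESERVED_FDS = 32  # inferred from the log: 10000 clients need 10032 descriptors


def required_fds(maxclients, reserved=RESERVED_FDS):
    """Descriptors needed for a given maxclients setting."""
    return maxclients + reserved


def effective_maxclients(requested, fd_limit, reserved=RESERVED_FDS):
    """maxclients that fits under the given open files limit."""
    return min(requested, fd_limit - reserved)
```

With a limit of 4096 this gives 4064, matching the log, and any limit of at least 10032 would leave the requested 10000 untouched.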
Well, it's a bit late for this post, but I just spent a lot of time (the whole night) configuring a new Redis server 3.0.6 on Ubuntu 16.04, so I think I should write down how I did it so that others don't have to waste their time...
For a newly installed Redis server, you are probably going to see the following issues in the Redis log file, which is /var/log/redis/redis-server.log
Maximum Open Files
3917:M 16 Sep 21:59:47.834 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
3917:M 16 Sep 21:59:47.834 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
3917:M 16 Sep 21:59:47.834 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
I have seen a lot of posts telling you to modify
/etc/security/limits.conf
redis soft nofile 10000
redis hard nofile 10000
or
/etc/sysctl.conf
fs.file-max = 100000
That might work on Ubuntu 14.04, but it certainly does not work on Ubuntu 16.04. I guess it has something to do with the change from upstart to systemd, but I am no expert on the Linux kernel!
To fix this, you have to do it the systemd way
/etc/systemd/system/redis.service
[Service]
...
User=redis
Group=redis
# should be fine as long as you add it under [Service] block
LimitNOFILE=65536
...
Then you must daemon reload and restart the service
sudo systemctl daemon-reload
sudo systemctl restart redis.service
To check if it works, try to cat proc limits
cat /run/redis/redis-server.pid
cat /proc/PID/limits
and you will see
Max open files 65536 65536 files
Max locked memory 65536 65536 bytes
At this stage, the maximum open file is solved.
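If you want to double check that LimitNOFILE really ended up under the [Service] block, the unit file can be inspected with a few lines of code. This is only a sketch: Python's configparser is an approximation of how systemd parses unit files, and it will not accept placeholder lines such as the "..." shown above.

```python
import configparser


def limit_nofile_from_unit(unit_text):
    """Return the LimitNOFILE value from the [Service] section, or None if absent."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case instead of lowercasing it
    parser.read_string(unit_text)
    if not parser.has_section("Service"):
        return None
    return parser.get("Service", "LimitNOFILE", fallback=None)
```

A return value of None would mean the directive is missing or sits in another section, in which case the restart would not change the limit.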
Socket Maximum Connection
2222:M 16 Sep 20:38:44.637 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Memory Overcommit
2222:M 16 Sep 20:38:44.637 # Server started, Redis version 3.0.6
2222:M 16 Sep 20:38:44.637 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Since these two are both sysctl settings, we will solve them together.
sudo vi /etc/sysctl.conf
# Add at the bottom of file
vm.overcommit_memory = 1
net.core.somaxconn=1024
Now for these configs to work, you need to reload the config
sudo sysctl -p
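The backlog warning makes more sense once you know that the backlog an application asks for is capped at net.core.somaxconn. Here is a small sketch that parses sysctl.conf-style text and computes the backlog that would actually apply; the function names are made up for this example.

```python
def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def effective_backlog(requested, somaxconn):
    """The kernel caps the listen backlog at net.core.somaxconn."""
    return min(requested, somaxconn)
```

With somaxconn at 128 the requested 511 is cut down to 128, and with the new value of 1024 the full 511 is honoured, which is why the warning goes away.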
Transparent Huge Pages
1565:M 16 Sep 22:48:00.993 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
To permanently solve this, follow the log's suggestion, and modify rc.local
sudo vi /etc/rc.local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
This requires a reboot, so back up your data and do anything else you need before you actually do it!!
sudo reboot
Now check your Redis log again. You should have a Redis server without any errors or warnings.
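After the reboot you can also confirm the THP setting directly. The file /sys/kernel/mm/transparent_hugepage/enabled lists the possible modes and marks the active one with square brackets, for example "always madvise [never]". A minimal sketch that extracts the active mode (the function name is my own):

```python
import re


def active_thp_mode(text):
    """Return the bracketed (active) mode from the THP 'enabled' file content."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None
```

Passing the content of that file should give "never" once the rc.local change has taken effect.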
Redis cannot raise the maximum number of open files beyond what the OS permits.
This is an OS configuration, and it can also be configured on a per-user basis. The error is descriptive and tells you: "increase 'ulimit -n'"
You can refer to this blog post on how to increase the maximum open files descriptors:
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
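You just need this command in the console, run in the same shell that then starts Redis. Note that ulimit is a shell builtin, so it cannot be run through sudo, and raising the hard limit requires a root shell:
ulimit -n 65535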
I am using HBase 0.98.3 in standalone mode. Is there a way for restarting HBase when it crashes? I have tried with supervisord with no success.
Thank you.
I use upstart to achieve this, in an ubuntu setting.
Here's my recipe, but YMMV.
# hbase-master - HBase Master
#
description "HBase Master"
start on (local-filesystems
and net-device-up IFACE!=lo)
stop on runlevel [!2345]
respawn
console log
setuid hbase
setgid hbase
nice 0
oom score -700
limit nofile 32768 32768
limit memlock unlimited unlimited
exec /usr/lib/hbase/bin/hbase master start
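For anyone unfamiliar with what the respawn line buys you, the idea is roughly the loop below. This is only an illustration and not a replacement for upstart: it treats exit code 0 as a deliberate stop, which is a simplification of mine and not a statement of upstart's exact rules, and the command shown is a stand-in rather than the real HBase command.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a server process that crashes immediately.
CRASHING_COMMAND = [sys.executable, "-c", "raise SystemExit(1)"]


def run_with_respawn(command, max_restarts=3, delay_seconds=0.0):
    """Run command and restart it after each non-zero exit.

    Returns how many times the command was started in total.
    """
    starts = 0
    while True:
        starts += 1
        exit_code = subprocess.call(command)
        if exit_code == 0 or starts > max_restarts:
            return starts
        time.sleep(delay_seconds)  # brief pause before the next attempt
```

A process that always crashes is started max_restarts + 1 times and then given up on, which is the kind of safeguard you want so that a broken install does not restart forever.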
My web app runs out of connection slots to the database when enough requests occur. This is despite setting it to run what seems to be a conservative size of connection pool, and I have limited the number of processes, threads. Am I correct that the connection pool is shared across threads, but not processes? And what is a good strategy for choosing a good combination of connection pool size, number of processes and threads, whilst avoiding running out of DB connections?
Error I'm seeing:
OperationalError: (OperationalError) FATAL: remaining connection slots are reserved for non-replication superuser connections
/etc/postgresql/9.1/main/postgresql.conf:
max_connections = 100
app.ini:
sqlalchemy.pool_size = 1
sqlalchemy.max_overflow = 5
Apache config:
WSGIDaemonProcess test1 processes=5 threads=10 maximum-requests=10000
WSGIProcessGroup test1
Looking at the processes:
$ ps aux |grep postgres |wc
This can increase to 102 under a reasonable load and stay there, even though many connections are idle, and then the errors start appearing.
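For reference, here is the worst case I would expect from the settings above, as a small sketch. It assumes one SQLAlchemy engine (and so one pool) per daemon process, with the pool shared by the threads inside that process but not across processes; each pool can hold up to pool_size + max_overflow connections. The function name and the engines_per_process parameter are only for illustration.

```python
def max_db_connections(processes, pool_size, max_overflow, engines_per_process=1):
    """Upper bound on connections opened through the pools of all processes."""
    return processes * engines_per_process * (pool_size + max_overflow)


# Values from the config above: processes=5, pool_size=1, max_overflow=5
worst_case = max_db_connections(processes=5, pool_size=1, max_overflow=5)  # 30
```

Under that assumption the total should stay at 30, well below max_connections = 100, so seeing about 100 backends suggests connections are being opened outside of this single pool per process. That is worth checking before changing the pool size.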