Restarting HBase automatically when it crashes

I am using HBase 0.98.3 in standalone mode. Is there a way to restart HBase automatically when it crashes? I have tried supervisord with no success.
Thank you.

I use upstart to achieve this on Ubuntu.
Here's my recipe, but YMMV.
# hbase-master - HBase Master
#
description "HBase Master"

start on (local-filesystems
          and net-device-up IFACE!=lo)
stop on runlevel [!2345]

# respawn the daemon whenever it exits
respawn
console log
setuid hbase
setgid hbase
nice 0
# keep the OOM killer away from this process
oom score -700
limit nofile 32768 32768
limit memlock unlimited unlimited
exec /usr/lib/hbase/bin/hbase master start
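If you are on a distribution that has moved from upstart to systemd, a roughly equivalent unit file is sketched below. This is an untested sketch: the unit name, paths, user, and limits simply mirror the upstart recipe above, so adapt them to your installation.
# /etc/systemd/system/hbase-master.service (hypothetical path)
[Unit]
Description=HBase Master
After=network.target

[Service]
User=hbase
Group=hbase
Nice=0
OOMScoreAdjust=-700
LimitNOFILE=32768
LimitMEMLOCK=infinity
ExecStart=/usr/lib/hbase/bin/hbase master start
# the systemd counterpart of upstart's "respawn"
Restart=on-failure

[Install]
WantedBy=multi-user.target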

Why can't redis set the maximum open files?

1167:M 26 Apr 13:00:34.666 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
1167:M 26 Apr 13:00:34.667 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
1167:M 26 Apr 13:00:34.667 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1167:M 26 Apr 13:00:34.685 # Creating Server TCP listening socket 192.34.62.56:6379: Name or service not known
Well, it's a bit late for this post, but since I just spent a lot of time (the whole night) configuring a new Redis server 3.0.6 on Ubuntu 16.04, I figured I should write down how I did it so others don't have to waste their time.
For a newly installed Redis server, you are probably going to see the following issues in the Redis log file, which is /var/log/redis/redis-server.log.
Maximum Open Files
3917:M 16 Sep 21:59:47.834 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
3917:M 16 Sep 21:59:47.834 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
3917:M 16 Sep 21:59:47.834 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
I have seen a lot of posts telling you to modify
/etc/security/limits.conf
redis soft nofile 10000
redis hard nofile 10000
or
/etc/sysctl.conf
fs.file-max = 100000
That might work on Ubuntu 14.04, but it certainly does not work on Ubuntu 16.04. I guess it has something to do with the switch from upstart to systemd, but I am no expert on the Linux kernel!
To fix this, you have to do it the systemd way:
/etc/systemd/system/redis.service
[Service]
...
User=redis
Group=redis
# should be fine as long as you add it under [Service] block
LimitNOFILE=65536
...
Then you must reload the systemd daemon and restart the service:
sudo systemctl daemon-reload
sudo systemctl restart redis.service
To check that it works, read the PID from the pid file and cat that process's limits:
cat /run/redis/redis-server.pid   # note the PID
cat /proc/<PID>/limits            # substitute the PID from above
and you will see
Max open files 65536 65536 files
Max locked memory 65536 65536 bytes
At this stage, the maximum open files issue is solved.
Socket Maximum Connection
2222:M 16 Sep 20:38:44.637 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Memory Overcommit
2222:M 16 Sep 20:38:44.637 # Server started, Redis version 3.0.6
2222:M 16 Sep 20:38:44.637 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Since these two are related, we will solve them at once.
sudo vi /etc/sysctl.conf
# Add at the bottom of the file
vm.overcommit_memory = 1
net.core.somaxconn = 1024
Now, for these configs to take effect, you need to reload them:
sudo sysctl -p
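To confirm that the new values took effect, you can read them back from /proc (a quick sanity check):
cat /proc/sys/vm/overcommit_memory   # should print 1
cat /proc/sys/net/core/somaxconn     # should print 1024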
Transparent Huge Pages
1565:M 16 Sep 22:48:00.993 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
To solve this permanently, follow the log's suggestion and modify /etc/rc.local:
sudo vi /etc/rc.local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
This requires a reboot, so back up your data or do anything else you need before you actually do it!
sudo reboot
Now check your Redis log again; you should have a Redis server without any errors or warnings.
Redis will never change the maximum open files itself.
This is an OS configuration, and it can also be configured on a per-user basis. The error is descriptive and tells you: "increase 'ulimit -n'".
You can refer to this blog post on how to increase the maximum number of open file descriptors:
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
Note that ulimit is a shell builtin, so sudo ulimit -n 65535 will not work as a command; run it as root (or with a sufficient hard limit) in the shell that starts Redis:
ulimit -n 65535
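To see where you currently stand, check the soft and hard limits of your shell:
ulimit -Sn   # current soft limit
ulimit -Hn   # current hard limit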

Why is redis zeroing my logstash list at about 1.85GB?

I have redis 3.0.2 running on CentOS 6 (64-bit) storing log entries for Logstash, but every time the list gets to about 1.85GB, it is zeroed out. I had watch "redis-cli llen logstash | tee -a llen.log" running and captured this:
2823399
2827076
2831776
2836436
0
4470
8684
12531
17213
Any help understanding what is going on would be greatly appreciated.
Check your maxmemory and maxmemory-policy settings to see if Redis is performing eviction on the logstash key.
If it's not eviction, you might just have to use redis-cli monitor and bump your loglevel config up to verbose or debug to see what's actually happening.
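For example, you can inspect those settings and current memory usage with standard redis-cli commands:
redis-cli config get maxmemory
redis-cli config get maxmemory-policy
redis-cli info memory | grep used_memory_human
If maxmemory is non-zero and the policy is one of the allkeys-* variants, eviction of the logstash key is a plausible explanation.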

Process Core dumps are not created after crash

I have configured the system to create process core dumps.
Below are my configurations.
/etc/sysctl.conf
kernel.core_uses_pid = 1
kernel.core_pattern = /var/core/core.%e.%p.%h.%t
fs.suid_dumpable = 2
/etc/security/limits.conf
* soft core unlimited
root soft core unlimited
Here are the steps I am following to generate process core dumps.
1) I restarted the mysql service and executed "kill -s SEGV <mysql_pid>"; I then got the core dump file in the /var/core location.
2) Then I started the mysql service with "/etc/init.d/mysql start" or "service mysql start". Now if I run "kill -s SEGV <mysql_pid>", the core dump file is not created.
3) To get a crash file again, I have to restart the mysql service; only then does "kill -s SEGV <mysql_pid>" give me a core dump file.
Can anyone please help me resolve this?
First of all, you can verify that core dumps are disabled for the MySQL process by running:
# cat /proc/`pidof -s mysqld`/limits|egrep '(Limit|core)'
Limit                     Soft Limit           Hard Limit           Units
Max core file size        0                    unlimited            bytes
The "soft" limit is the one to look for, zero in this case means core dumps are disabled.
Limits set in /etc/security/limits.conf by default only apply to programs started interactively. You may have to include 'ulimit -c unlimited' in the mysqld startup script to enable core dumps permanently.
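As a sketch (the exact script and the right place within it vary by distribution), that would look like:
# in /etc/init.d/mysql, before the daemon is launched (placement is distribution-specific)
ulimit -c unlimited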
If you're lucky, then you can enable coredumps for your current shell and restart the daemon using its init.d script:
# ulimit -c unlimited
# /etc/init.d/mysql restart
* Stopping MySQL database server mysqld [ OK ]
* Starting MySQL database server mysqld [ OK ]
* Checking for tables which need an upgrade, are corrupt
or were not closed cleanly.
# cat /proc/`pidof -s mysqld`/limits|egrep '(Limit|core)'
Limit                     Soft Limit           Hard Limit           Units
Max core file size        unlimited            unlimited            bytes
As you can see, this works for MySQL on my system.
Please note that this won't work for applications like Apache, which call ulimit internally to disable core dumps, nor for init.d scripts that use upstart.
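For daemons managed by systemd, the analogous knob is LimitCORE in the unit file. A hedged sketch, with a hypothetical drop-in path:
# /etc/systemd/system/mysql.service.d/coredump.conf (hypothetical drop-in)
[Service]
LimitCORE=infinity
Then run sudo systemctl daemon-reload and restart the service for it to take effect.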

How to sync time on host wake-up within VirtualBox?

I am running an Ubuntu 12.04-based box inside of Vagrant using VirtualBox. So far, everything is fine - except for one thing:
Let's assume that the VM is running. Then the host goes into standby mode. After waking it up again, the VM is still running, but its internal clock continues from where it stopped when the host went down. So this basically means: put the host to sleep for 15 minutes, wake it up again, and the VM's internal clock is 15 minutes late.
How can I fix this (setting the time manually is not an option, for obvious reasons ;-))? Is there a way to run a script inside a Vagrant VM whenever the host system changes its state?
I've read in the documentation that by default the VirtualBox Guest Additions sync the time with the host every 10 seconds. Apparently this is not happening, but I cannot find any place where it is disabled. So, any ideas?
PS: The Guest Additions are installed and match the version of VirtualBox being used.
The documentation lacks some details here.
What VirtualBox does every 10 seconds is just a slight adjustment (something like 0.005 seconds). Only when the time difference reaches a threshold (20 minutes by default) is a "real" resync done.
You can reduce the threshold (e.g., to 10 seconds) with the following command:
VBoxManage guestproperty set <vm-name> "/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold" 10000
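You can verify that the property was set by reading it back:
VBoxManage guestproperty get <vm-name> "/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold"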
Summarizing the answers of @zilupe and @Slobodan Kovacevic, the solution is to add the following to your Vagrantfile:
config.vm.provider 'virtualbox' do |vb|
  vb.customize [ "guestproperty", "set", :id, "/VirtualBox/GuestAdd/VBoxService/--timesync-set-threshold", 1000 ]
end
This will synchronize the clocks each time the desync becomes greater than 1 s (1000 ms).
Here is another solution to sync time between guest and host without installing the VirtualBox Guest Additions:
Install ntp on your guest, and uncomment these lines in /etc/ntp.conf:
disable auth
broadcastclient
Then, restart ntp with service ntp restart
Activate broadcast on your host:
For Linux users, edit your /etc/ntp.conf file and configure broadcast (you must adapt the IP):
broadcast 192.168.123.255
For Windows users, activate the "Windows Time" service. You can then read this page to configure it to broadcast time
Then, restart time service on host.
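To verify that the guest is actually receiving broadcast time, you can check the peer list on the guest with the standard ntp query tool:
ntpq -p   # the broadcast source should appear in the peer list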
For me, to get timesync working I had to do this:
vboxmanage setextradata <machine-name> "VBoxInternal/Devices/VMMDev/0/Config/GetHostTimeDisabled" 0
It turns the timesync on. It was, for some reason, off.
I found a solution:
install ntpdate
add "s" permission for ntpdate, this allows non-root users to run ntpdate as root: sudo chmod u+s /usr/sbin/ntpdate
add one line in ~/.bashrc: ntpdate -u ntp.ubuntu.com
After that, each time you login to the linux system, the time will be sync once.
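If you would rather not tie the sync to logins, a cron entry achieves the same thing periodically (a sketch; the 15-minute interval is arbitrary):
# crontab -e, then add:
*/15 * * * * /usr/sbin/ntpdate -u ntp.ubuntu.com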
You can also install the VirtualBox Guest Additions in the VM to have VirtualBox sync the time automatically.

Redis crashes instantly without error

I've got Redis installed on my VM, and I haven't used it in a while. (Last time I used it, it did work, and now it doesn't; nothing has changed in that time, which is about a month.) Needless to say I'm deeply confused, but I'll post as much info as I can.
$ redis-server
The server starts, but throws a warning about overcommit_memory being set to 0. I'm on a VM, so I couldn't change this setting from 0 to 1 if I wanted to, which I wouldn't want to anyway for my purposes. I've written a custom redis.config file, though, which I want it to use (and which I was using in the past), so starting it with the default config file doesn't do me much good. Let's try this again.
$ redis-server redis.config
$
Nothing. Silence. No error message; it just didn't start.
$ nohup redis-server redis.config > nohup.out&
I get a process ID, but then $ ps shows the process listed as stopped, and it shortly disappears. Again, no errors, and no output in nohup.out nor in the Redis log file. Below is the redis.config I'm using (with the comments stripped to keep it short):
daemonize yes
pidfile [my-user-account-path]/redis/redis.pid
port 0
bind 127.0.0.1
unixsocket [my-user-account-path]/tmp/redis.sock
unixsocketperm 770
timeout 10
tcp-keepalive 60
loglevel warning
logfile [my-user-account-path]/redis/logs/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error no
rdbcompression no
rdbchecksum no
dbfilename dump.rdb
dir [my-user-account-path]/redis/db
slave-serve-stale-data yes
slave-priority 100
appendonly no
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
# ADVANCED CONFIG is set to all default settings
I'm sure it's probably something stupid, probably even a permissions thing somewhere (I've tried executing this as root, FYI), to no avail. Has anyone ever experienced something similar with Redis?
I have been experiencing Redis crashes as well. Just an FYI: the guy responsible for much of Redis' development, Salvatore Sanfilippo, aka antirez, keeps an interesting blog that has some insight on Redis crashes:
http://antirez.com/news/43
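One other debugging step worth trying (a suggestion, not from the blog post): run the server in the foreground so any startup error is printed to the terminal. Since Redis 2.6 you can override config directives on the command line, so you can keep your config file and just disable daemonizing and raise the log level:
redis-server redis.config --daemonize no --loglevel verbose
With daemonize yes, a bad logfile, dir, or unixsocket path can make the server exit before it ever writes a log line, which would match the silent failure described above.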