Can I use "stopsignal=WINCH" to have supervisord gracefully stop an Apache process? - apache

According to the Apache documentation, the WINCH signal can be used to gracefully stop Apache.
So it would seem that, in supervisord, I should be able to use stopsignal=WINCH to configure supervisord to stop Apache gracefully.
However, Google turns up 0 results for "stopsignal=WINCH". It seems odd that no-one has tried this before.
Just wanted to confirm: is stopsignal=WINCH the way to get supervisord to stop Apache gracefully?

I had the same problem running/stopping apache2 under supervisord inside a Docker container. I don't know if your problem is related to Docker or not, or how familiar you are with Docker. Just to give you some context: when calling docker stop <container-name>, Docker sends SIGTERM to the process with PID 1 running inside the container (some details on the topic), in this case supervisord. I wanted supervisord to pass the signal on to all its programs to terminate them gracefully, because I realized that if you don't gracefully terminate apache2, you might not be able to restart it, since the PID file is not removed. I tried both with and without stopsignal=WINCH, and the result didn't change for me: in both cases apache2 was terminated gently (the exit status was 0 and no PID file was left in /var/run/apache2). To stay on the safe side, I kept stopsignal=WINCH in the supervisord config, but as of today I have also not been able to find a clear answer online, neither here nor by googling.
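For reference, a minimal sketch of what the relevant supervisord program section could look like (the apache2ctl command line is an assumed Debian-style foreground invocation and the stopwaitsecs value is an arbitrary choice, not something from the question):
[program:apache2]
; assumed Debian-style foreground invocation so supervisord manages the master process directly
command=/usr/sbin/apache2ctl -D FOREGROUND
; ask Apache for a graceful stop
stopsignal=WINCH
; give workers time to finish requests before supervisord escalates
stopwaitsecs=30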

According to supervisord's source code:
# all valid signal numbers
SIGNUMS = [ getattr(signal, k) for k in dir(signal) if k.startswith('SIG') ]

def signal_number(value):
    try:
        num = int(value)
    except (ValueError, TypeError):
        name = value.strip().upper()
        if not name.startswith('SIG'):
            name = 'SIG' + name
        num = getattr(signal, name, None)
        if num is None:
            raise ValueError('value %r is not a valid signal name' % value)
    if num not in SIGNUMS:
        raise ValueError('value %r is not a valid signal number' % value)
    return num
It recognizes all signal names, and if the name you give doesn't start with 'SIG', the prefix is added automatically, so stopsignal=WINCH is parsed as SIGWINCH.
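A quick interactive check of that behaviour (a sketch that assumes the function above has been imported and that you are on a Unix-like platform where SIGWINCH exists):
>>> import signal
>>> signal_number('WINCH') == signal.SIGWINCH
True
>>> signal_number('SIGWINCH') == signal.SIGWINCH
True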

Related

Monit false alerts

I am monitoring a Java daemon process via its PID file. Below is the configuration.
check process SemanticReplication with pidfile "/ngs/app/edwt/opsmonit /monit/scripts/process.pid"
start = "/ngs/app/edwt/scripts/javadaemon/start_daemon.ksh"
stop = "/ngs/app/edwt/scripts/javadaemon/stop_daemon.ksh"
Many times, even though the Java daemon process is up and running, I get a false alert saying the process is not running.
In the next monit check cycle (after a minute), another monit alert fires saying the process is up and running.
Can someone help me avoid these false alerts?
Your check statement has monit check for the existence of the pid file (which looks weird with the space in the path, by the way). If the file isn't there, monit sends an alert by default and then runs the start directive.
I get around this by having a check process ... matching statement like so:
check process app-pass matching 'Passenger RubyApp: \/home\/app\/app-name\/public'
Essentially, "matching" does the equivalent of ps aux | grep ... which does a better job when I can't rely on a pid file existing, like with a child process.
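Applied to the original configuration, a sketch of the matching approach might look like this (the "java.*SemanticReplication" pattern is a guess at what the daemon's command line contains, so adjust it to whatever ps shows for your process):
check process SemanticReplication matching "java.*SemanticReplication"
  start program = "/ngs/app/edwt/scripts/javadaemon/start_daemon.ksh"
  stop program = "/ngs/app/edwt/scripts/javadaemon/stop_daemon.ksh"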

Sun Grid Engine resubmit job stuck in 'Rq' state

I have what I hope is a pretty simple question, but I'm not super familiar with Sun Grid Engine, so I've been having trouble finding the answer. I am currently submitting jobs to a grid using a bash submission script that generates a command and then executes it. I have read online that if a Sun Grid Engine job exits with a code of 99, it gets resubmitted to the grid. I have successfully written my bash script to do this:
[code to generate command, stores in $command]
$command
STATUS=$?
if [[ $STATUS -ne 0 ]]; then
    exit 99
fi
exit 0
When I submit this job to the grid with a command that I know has a non-zero exit status, the job does indeed appear to be resubmitted; however, the scheduler never sends it to another host. Instead it just remains stuck in the queue with the status "Rq":
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
2150015 0.55500 GridJob.sh my_user Rq 04/08/2013 17:49:00 1
I have a feeling that this is something simple in the config options for the queue, but I haven't been able to find anything googling. I've tried submitting this job with the qsub -r y option, but that doesn't seem to change anything.
Thanks!
Rescheduled jobs will only get run in queues that have their rerun attribute (FALSE by default) set to TRUE, so check your queue configuration (qconf -mq myqueue). Without this, your job remains in the rescheduled-pending state indefinitely because it has nowhere to go.
IIRC, submitting jobs with qsub -r yes only qualifies them for automatic rescheduling in the event of an exec node crash, and exiting with status 99 should trigger a reschedule regardless of that setting.
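A sketch of how to check and change that attribute (myqueue is a placeholder for your queue name; qconf -mq opens the queue configuration in an editor):
# show the current setting for the queue
qconf -sq myqueue | grep rerun
# edit the queue configuration and change "rerun FALSE" to "rerun TRUE"
qconf -mq myqueue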

How do I set a backend for django-celery? I set CELERY_RESULT_BACKEND, but it is not recognized

I set CELERY_RESULT_BACKEND = "amqp" in celeryconfig.py
but I get:
>>> from tasks import add
>>> result = add.delay(3,5)
>>> result.ready()
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/djangoprojects/venv/local/lib/python2.7/site-packages/celery/result.py", line 105, in ready
return self.state in self.backend.READY_STATES
File "/djangoprojects/venv/local/lib/python2.7/site-packages/celery/result.py", line 184, in state
return self.backend.get_status(self.task_id)
File "/djangoprojects/venv/local/lib/python2.7/site-packages/celery/backends/base.py", line 414, in _is_disabled
raise NotImplementedError("No result backend configured. "
NotImplementedError: No result backend configured. Please see the documentation for more information.
I just went through this, so I can shed some light on it. One might think that, with all of the great documentation, some of this would have been a bit more obvious.
I'll assume you have RabbitMQ up and functioning (it needs to be running), and that you have django-celery installed.
Once you have that, all you need to do is include this single line in your settings.py file.
BROKER_URL = "amqp://guest:guest@localhost:5672//"
Then you need to run syncdb and start this thing up using:
python manage.py celeryd -E -B --loglevel=info
The -E flag says that you want events captured and -B says you want celerybeat running. The former enables you to actually see something in the admin window and the latter allows you to schedule tasks. Finally, you need to ensure that you are actually going to capture the events and the status, so in another terminal run this:
./manage.py celerycam
And then finally you're able to see the working example provided in the docs, again assuming you created the tasks.py it tells you to.
>>> result = add.delay(4, 4)
>>> result.ready() # returns True if the task has finished processing.
False
>>> result.result # task is not ready, so no return value yet.
None
>>> result.get() # Waits until the task is done and returns the retval.
8
>>> result.result # direct access to result, doesn't re-raise errors.
8
>>> result.successful() # returns True if the task didn't end in failure.
True
You are then also able to view task status in the admin panel.
I hope this helps! I would add one more thing which helped me: watching the RabbitMQ log file was key, as it helped me identify that django-celery was actually talking to RabbitMQ.
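Pulling the above together, a minimal settings.py sketch (the broker URL uses RabbitMQ's default guest credentials, and CELERY_RESULT_BACKEND = "amqp" mirrors the value from the question; both are assumptions about your setup):
# settings.py (only the celery-related parts)
import djcelery
djcelery.setup_loader()

INSTALLED_APPS += ("djcelery",)  # assumes INSTALLED_APPS is a tuple defined above

BROKER_URL = "amqp://guest:guest@localhost:5672//"
CELERY_RESULT_BACKEND = "amqp"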
Are you running django celery?
If so, you need to start a python shell in the context of django (or whatever the technical term is).
Type:
python manage.py shell
And try your commands from that shell
Hi, I tried everything to get celery v3.1.25 working with Django 1.8 and nothing worked.
Finally the line below helped me (feeling happy):
app = Celery('documents',backend="celery.backends.amqp:AMQPBackend")
Setting backend="celery.backends.amqp:AMQPBackend" fixed my error.
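In context, that line would sit in the module where the Celery app is created; a minimal sketch (the 'documents' name comes from the answer above, while the broker URL is an assumption about a local RabbitMQ):
from celery import Celery

app = Celery('documents',
             broker='amqp://guest:guest@localhost:5672//',
             backend='celery.backends.amqp:AMQPBackend')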

proc_open interaction

Here's what I'm trying to achieve: open a shell (korn or bash, doesn't matter) and, from that shell, open an ssh connection (ssh user@host). At some point it is likely that I will be prompted either for a password or asked whether or not I'm sure I want to connect (offending keys).
Before anyone asks: yes, I am aware there is a plugin for ssh2 exec calls, but the servers I'm working on don't support it, and are unlikely to do so.
Here's what I've tried so far:
$desc = array(array('pipe','r'),array('pipe','w'));//used in all example code
$p = proc_open('ssh user@host',$desc,$pipes);
if(!is_resource($p)){ die('#!#$%');}//will omit this line from now on
sleep(1);//omitting this,too but it's there every time I need it
Then I tried to read the console output (stream_get_contents($pipes[1])) to see what I have to pass next (either the password, "yes", or return 'connection failed: '.stream_get_contents($pipes[1]) and proc_close($p)).
This gave me the following error:
Pseudo-terminal will not be allocated because stdin is not a terminal.
So, I thought ssh was called in the php:// io-stream context, which seems a plausible explanation of the above error.
Next: I thought about my first SO question and decided it might be a good idea to open a bash/ksh shell first:
$p = proc_open('bash',$desc,$pipes);
And take it from there, but I got the exact same error message, only this time, the script stopped running but ssh did run. So I got hopeful, then felt stupid and, eventually, desperate:
$p=proc_open('bash && ssh user@host',$desc,$pipes);
After a few seconds wait, I got the following error:
PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 133693440 bytes)
The Call Stack keeps bringing up the stream_get_contents line, even in my last desperate attempt:
#!/path/to/bin/php -n
<?php
$p = proc_open('bash && ssh user@host',array(array('pipe','r'),array('pipe','w')),$ps);
if (!is_resource($p)) {
    die('FFS');
}
usleep(10);
fwrite($ps[0],'yes'."\n");
fflush($ps[0]);
usleep(20);
fwrite($ps[0],'password'."\n");
fflush($ps[0]);
usleep(20);
fwrite($ps[0],'whoami'."\n");
fflush($ps[0]);
usleep(2);
$msg = stream_get_contents($ps[1]);
fwrite($ps[0],'exit'."\n");
fclose($ps[0]);
fclose($ps[1]);
proc_close($p);
?>
I know it's a mess, with a lot of fflush and redundancy, but the point is: I know this connection will first prompt me about offending keys, and then ask for a password. My guess is the stream in $pipes[1] holds the ssh connection, hence its content is huge. What I need then is a pipe inside a pipe... is this even possible? I must be missing something; what good is a pipe if this isn't possible...
My guess is the proc_open command is wrong to begin with (error: Broken pipe). But I really can't see any other way around the first error... any thoughts? Or follow-up questions if the above rant isn't at all clear (which it probably isn't)?
Before anyone asks: yes, I am aware there is a plugin for ssh2 exec calls, but the servers I'm working on don't support it, and are unlikely to do so.
There are actually two: the PECL module, which is a PITA that most servers don't have installed anyway, and phpseclib, a pure-PHP SSH2 implementation. An example of its use:
<?php
include('Net/SSH2.php');
$ssh = new Net_SSH2('www.domain.tld');
if (!$ssh->login('username', 'password')) {
    exit('Login Failed');
}
echo $ssh->exec('pwd');
echo $ssh->exec('ls -la');
?>

mod_perl segmentation fault

Hi,
I'm running Apache 2.2.3 on 64-bit Oracle Linux (a Red Hat clone) and I'm hitting a brick wall with an issue. I have a program which uses MIME::Lite to send mail through sendmail (I apologize, I'm not sure what versions of sendmail or mod_perl I'm running, although I do believe the sendmail portion is irrelevant, as you'll see in a moment).
On occasion, Apache will segfault (signal 11), and digging deep into the MIME::Lite module, I see it is on the following line:
open SENDMAIL, "|$sendmailcmd" or Carp::croak "open |$sendmailcmd: $!\n"; (this is in MIME::Lite)
Now, one would automatically suspect sendmail, but if I change the same line to use /bin/cat (as shown):
open SENDMAIL, "|/bin/cat"
apache still segfaults.
I attached an strace to the apache processes and see the following:
(when it does NOT crash)
12907 write(2, "SENDMAIL send_by_sendmail 1\n", 28) = 28
12907 write(2, "SENDMAIL /usr/lib/sendmail -t -o"..., 40) = 40
12907 pipe([24, 26]) = 0
12907 pipe([28, 29]) = 0
12907 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b4bcbbd75d0) = 13186
Note the "SENDMAIL sent_by_sendmail" are my comments. You can clearly see pipes opening. When it DOES crash, you'll see the following:
10805 write(2, "SENDMAIL send_by_sendmail (for y"..., 40) = 40
10805 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Now notice it never pipes. I've tried GDB and it hasn't really shown me anything.
Finally, I wrote a simple program to run through mod_perl and regular cgi:
use CGI qw(:standard);  # provides header()
use Carp;

print header();
print "test";
open SENDMAIL, "|/bin/cat" or Carp::croak "open |sendmailcmd: $!\n";
print SENDMAIL "foodaddy";
close SENDMAIL;
print "test done <br/>";
Under mod_perl it has successfully crashed.
My analysis tells me it has to do with trying to open a file handle: the piping function returns either false or a corrupt file handle.
I also increased the file descriptor limit to 2048, no dice.
Does anyone have any thoughts as to where I should look?
I appreciate the help
I just spent a long time tracking down a problem that started with identical symptoms. I eventually discovered that Test::More does not play well with mod_perl. Removing this module from my code appears to have solved the problem (so far!). I didn't follow this any deeper, but I suspect that the problem actually lies in Test::Builder.
I managed to treat perhaps only the symptoms, not the cause. I happened to have this issue when I used global/package-scope variables inside a Perl object instance; as soon as I passed them in as object properties instead of relying on Perl's default variable scoping, the Perl segmentation faults suddenly stopped.
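A small sketch of the change being described (the package name Foo and the $config variable are made-up illustrations, not code from the question):
package Foo;

# before: a package-level global, shared across mod_perl requests
our $config;

sub new {
    my ($class, %args) = @_;
    # after: the value is passed in and stored as an object property
    my $self = { config => $args{config} };
    return bless $self, $class;
}

sub do_work {
    my ($self) = @_;
    return $self->{config};   # read from the instance, not from the global
}

1;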