Handling cache warm-up with twisted and systemd - twisted

I have a simple twisted application which I run using a systemd service, executing a script, which subsequently executes a .tac file.
The application is structured as a JSON RPC endpoint (fastjsonrpc), built into a t.w.r.Resource, which is in a t.w.s.Site, served by a t.a.i.TCPServer, with the whole thing packed into a t.a.Application. This works fine.
Where I do run into trouble is when I try to warm up caches at startup. This warm-up process is pretty slow (~300 seconds), and makes systemd time out and kill the process. Increasing the timeout is not really a viable option, since I wouldn't want this to block system boot.
Analogous code is used in a separate stack running on Flask from within Apache and wsgi. That server starts itself off and lets systemd go on while it takes its time building the caches. This behaviour is fine for me.
I've tried calling the warmup function using the following within the setup function of the t.w.r.Resource:
reactor.callLater(1, ep.warmup, None)
I've not yet tried using this from within systemd, and have been testing it from twistd directly on the command line. The server does work as expected, however it no longer responds to SIGINT (^C). Removing the callLater is all that's needed to let the server respond to SIGINT.
If the warmup function is called directly (not by callLater, i.e., the arrangement which makes systemd give up while waiting for warm up to complete), the resulting server also continues to respond to SIGINT.
Is there a better / good way to handle this sort of long-running warmup code?
Why would twistd / the reactor not respond to SIGINT? Am I missing something here?

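Twisted's reactor runs in a single thread. It sounds like your "cache warmup" code is blocking the reactor for those 300 seconds. While it is blocked, the reactor cannot service anything else, which would also explain the ignored ^C, since twistd acts on SIGINT from inside the reactor loop. One easy way to fix this is to run the warm-up with deferToThread (twisted.internet.threads), so it runs in a worker thread without blocking the reactor.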

Related

Nlog in hangfire jobs deadlock

I am using Hangfire as a custom workflow engine in a dotnet core application, which works perfectly fine until I start logging. I log to a DB as a requirement, so it is a normal database target, and I enabled async=true. Once I turn logging on I start getting deadlocks. Everything uses DI. Again, all works fine without logging; once I add a rule to write the logs to the DB I get a deadlock. It’s not even a lot of concurrent jobs, only 11 in my test. Help pls?! Already lost a week on this and still stuck.
I am not sure it’s related, but I am also getting this at the same time: asynchronous exception: timeout flushing all targets

What happens AFTER Apache says "Script timed out before returning headers" to the running script?

I have a Perl web app served by Apache httpd using plain mod_cgi or optionally mod_perl with PerlHandler ModPerl::Registry. Recently the app encountered the error Script timed out before returning headers on some invocations and behaved differently afterwards: While some requests seemed to be processed successfully in the background, after httpd sent status 504 to the client, others didn't.
So how exactly does httpd behave AFTER it has reached its configured timeout and sent the error to the client? The request/response cycle is finished now, so I guess things like KeepAlive come into play to decide whether the TCP connection stays alive or not, etc. But what happens to the running script in each environment, e.g. mod_cgi vs. mod_perl?
Especially in mod_cgi, where new processes are started for each request, I would have guessed that httpd keeps the processes simply running. Because all our Perl files have a shebang, I'm not even sure if httpd is able to track the processes and does so or not. That could be completely different with mod_perl, because in that case httpd is aware of the interpreters and what they are doing etc. In fact, the same operation which timed out using plain mod_cgi, succeeded using mod_perl without any timeout, but even with a timeout in mod_cgi at least one request succeeded afterwards as well.
I find this question interesting for other runtimes than Perl as well, because they share the concepts of plain mod_cgi vs. some persistent runtime embedded into the httpd processes or using some external daemons.
So, my question is NOT about how to make the error message go away. Instead I want to understand how httpd behaves AFTER the error has occurred, because I don't seem to find much information on that topic. It's all just about increasing configuration values and trying to avoid the problem in the first place, which is fine, but not what I need to know currently.
Thanks!
Both mod_cgi and mod_cgid set a cleanup function on the request scope to kill the child process, but they do it in slightly different ways. This would happen shortly after the timeout is reported (allowing a little time for mod_cgi to return control, the error response to be written, the request to be logged, etc.).
mod_cgi uses a core facility in httpd that does SIGTERM, sleeps for 3 seconds, then does a SIGKILL.
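The terminate-then-kill sequence can be illustrated with a short Python sketch. This is not httpd's code, only the same pattern applied to a child process; stop_child and its grace parameter are made-up names, and unlike a fixed 3-second sleep this version returns as soon as the child exits.

```python
import subprocess
import sys


def stop_child(proc, grace=3.0):
    """Send SIGTERM, wait up to `grace` seconds, then SIGKILL if needed."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        return proc.wait()


# Example: a child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_status = stop_child(child)
```

A script that handles SIGTERM and cleans up within the grace period gets to exit on its own terms; one that ignores it is killed outright.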

jmeter hangs up and won't return

I am running 340 concurrent users to load test on server using jmeter.
But in most cases jmeter hangs and won't return. Even if I try to close the connection it just hangs, and eventually I have to close the application.
Any idea how to check what is holding the requests, and how to inspect the requests sent by jmeter to find the bottleneck?
Got the following message on closing the thread
Shutting down thread please be patient message
I've hit this several times over the past few years. In each of my cases (may not be in yours) the issue was with the load balancer (F5) I was sending my traffic through. Basically a property called OneConnect was holding the connections in a time-wait state and never killing the connection.
Run a packet capture tool like Wireshark and see what's happening with the requests.
Try distributed testing, 340 concurrent users is not a big deal, but still you can try if that decreases your pain. Also take a look at the following link:
http://jmeter.apache.org/usermanual/best-practices.html#lean_mean
First check your script is ok with one user.
Ensure you use assertions.
Then run your test following jmeter best practices:
no gui
no costly listeners
You should then be able to see in csv output the longest request and be able to fix your issue.
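As a rough sketch of that last step, the results file can be sorted by response time with a few lines of Python. This assumes the results were saved as CSV with a header row containing the elapsed and label columns; slowest_requests is a made-up helper name and the sample rows are invented for illustration.

```python
import csv
import io


def slowest_requests(jtl_file, top=5):
    """Return (label, elapsed_ms) for the `top` slowest samples."""
    rows = list(csv.DictReader(jtl_file))
    rows.sort(key=lambda row: int(row["elapsed"]), reverse=True)
    return [(row["label"], int(row["elapsed"])) for row in rows[:top]]


# Tiny inline sample standing in for a real results file.
sample = io.StringIO(
    "timeStamp,elapsed,label,responseCode,success\n"
    "1700000000000,120,Login,200,true\n"
    "1700000001000,4500,Search,200,true\n"
    "1700000002000,300,Logout,200,true\n"
)
print(slowest_requests(sample, top=2))  # [('Search', 4500), ('Logout', 300)]
```

For a real run, pass an open file instead, e.g. open("test.jtl", newline="").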
I also encountered this problem before. When I ran JMeter on my laptop (Core 2 Duo 1.5GHz) it always hung in the middle of processing. I tried running it on another PC, more powerful than my laptop, and it now works smoothly. So JMeter will run more effectively if your PC or laptop has better specs.
Note: It is also advisable to run your JMeter in non-gui mode.
Example to run JMeter in Linux box:
$ ./jmeter -t test.jmx -n -l /Users/home/test.jtl
I had the "one or more test threads won't exit" message because of a firewall blocking some requests. So I had to wait out the firewall's timeout for every blocked request... then it returned.
You are probably getting this error because the JVM is not capable of running so many threads. If you take a look at your terminal, you will see the exception you get:
Uncaught Exception java.lang.OutOfMemoryError: unable to create new native thread. See log file for details.
You can solve this by doing Remote Testing and have multiple clusters running, instead of one.

Apache taking 20sec to start running CGI script

Any ideas as to why my Apache/2.0.49 server always waits 20 seconds from receiving a request that is executed using a cgi-bin script to starting to run that script?
The server responds immediately to normal HTTP requests that only use static files, but always takes 20 seconds to respond to cgi-bin requests. I've used tcpdump to time the arrival of the request and printed the time at the beginning of the script to determine that the delay is between those two events.
I can't see anything in the configuration that relates to 20 seconds. The server runs Red Hat Linux 3.3.3-7 & I'm pretty sure that it used to respond to cgi-scripts immediately, but am unsure when it started being slow & what might have changed to cause that.
Thanks in advance for any help/suggestions you may be able to provide.
We had a similar problem with our Apache server running PHP, but only with a specific client. In the end it turned out that the PHP code was trying to fetch a file from a remote server but didn’t have connectivity; after 20 seconds the PHP “gave up” and the flow completed. You should try searching your code for something similar.

Parallel.For Termination vb.net

I have a service that scans network folders using a parallel.for method.
However, recently I am finding that if I stop the service, Windows says the service is stopped but the process is still running in Task Manager. It sits at 0 CPU and the memory does not change. If I try to end the task (even forcing it from the command prompt) it just says access denied and I have to reboot the server.
What would be the best way to make sure everything terminates?
I thought of adding a global Boolean that in the stop procedure it turns true and part of my parallel code will check for that and call s.stop.
Thank you
In brief, when your service is stopped, it needs to cancel all pending and running operations, then wait for those operations to actually finish. Check out the MSDN reference for Task Cancellation.
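The cancel-then-wait shape can be sketched in a few lines. This is a Python illustration of the pattern only, not VB.NET code; in .NET the equivalent pieces would be a CancellationTokenSource and ParallelOptions.CancellationToken. The names scan_folder and run_and_stop are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait


def scan_folder(folder, stop_requested):
    """Hypothetical worker: checks the stop flag between units of work."""
    scanned = 0
    for _ in range(1000):
        if stop_requested.is_set():
            break  # give up early so shutdown is not held up
        time.sleep(0.01)  # stands in for scanning one file
        scanned += 1
    return folder, scanned


def run_and_stop(folders, stop_after=0.1):
    stop_requested = threading.Event()
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(scan_folder, f, stop_requested) for f in folders]
        time.sleep(stop_after)  # the service runs until asked to stop
        stop_requested.set()    # 1. request cancellation
        wait(futures)           # 2. wait until every worker has returned
        return [f.result() for f in futures]
```

The key point is the second step: the stop handler only reports "stopped" after the workers have actually returned, rather than leaving them running behind a service that claims to be stopped.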