I have a service that scans network folders using a Parallel.For loop.
Recently, however, I have found that if I stop the service, Windows reports it as stopped but the process is still running in Task Manager, sitting at 0% CPU with unchanging memory usage. If I try to end the task (even forcing it from the command prompt), I get "access denied" and have to reboot the server.
What would be the best way to make sure everything terminates?
I thought of adding a global Boolean that the stop procedure sets to true; part of my parallel code would check that flag and call s.stop.
Thank you
In brief, when your service is stopped, it needs to cancel all pending and running operations, then wait for those operations to actually finish. Check out the MSDN reference for Task Cancellation.
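To illustrate the "signal, then wait" pattern in a language-neutral way, here is a minimal Python sketch (a stand-in, not .NET code). The names scan_folder, run_service and stop_service are hypothetical, and the sleep calls only simulate file I/O. In .NET the closest counterpart would be passing a CancellationToken through ParallelOptions rather than using a hand-rolled Boolean.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Shared flag, playing the role of a cancellation token.
stop_requested = threading.Event()


def scan_folder(path):
    """Hypothetical worker: scan one folder, checking the flag between items."""
    scanned = 0
    for _ in range(100):            # stand-in for iterating over files
        if stop_requested.is_set():
            break                   # leave the loop promptly on cancellation
        time.sleep(0.01)            # stand-in for I/O on a single file
        scanned += 1
    return path, scanned


def run_service(folders, executor):
    """Start one scan per folder and keep the futures so we can wait on them."""
    return [executor.submit(scan_folder, folder) for folder in folders]


def stop_service(futures, executor, timeout=10.0):
    """Request cancellation, then wait for the workers to actually finish."""
    stop_requested.set()
    done, not_done = wait(futures, timeout=timeout)
    executor.shutdown(wait=False)
    return len(not_done) == 0       # True only if every worker has exited
```

The important part is the second half of stop_service: setting the flag alone is not enough, the stop handler should not return until the workers have really exited (or a timeout has been reported).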
Monit cannot start or stop the service.
If I stop the service, Monit just stops monitoring it.
I have attached the log and config for reference.
#Monitor vsftpd#
check process vsftpd
  matching vsftpd
  start program = "/usr/sbin/vsftpd start"
  stop program = "/usr/sbin/vsftpd stop"
  if failed port 21 protocol ftp then restart
The log states: "stop on user request". The process is stopped and monitoring is disabled, since monitoring a stopped (= non existing) process makes no sense.
If you restart the service (via the CLI or the web interface), Monit should write info: 'test' restart on user request to the log, call the stop program, and then continue with the start program (if no dedicated restart program is provided).
One problem can arise, though: if the stop script fails to produce the expected state (= NOT(check process matching vsftpd)), the start program is not called. So if any task matching vsftpd is still running, Monit will not call the start program. For that reason it is always better to use a PID file for monitoring where possible.
Finally, since I don't know what system or versions you are on, an assumption: the vsftpd binary on my system is really only the daemon. It does not support any options; all arguments are treated as configuration files, as stated in the man page. So supplying start and stop only tries to launch new daemons that load config files named start and stop. If this is true on your system too, the problem described above applies, since your vsftpd is never stopped.
I have a simple twisted application which I run using a systemd service, executing a script, which subsequently executes a .tac file.
The application is structured as a JSON RPC endpoint (fastjsonrpc), built into a t.w.r.Resource, which is in a t.w.s.Site, served by a t.a.i.TCPServer, with the whole thing packed into a t.a.Application. This works fine.
Where I do run into trouble is when I try to warm up caches at startup. This warm-up process is pretty slow (~300 seconds), and makes systemd time out and kill the process. Increasing the timeout is not really a viable option, since I wouldn't want this to block system boot.
Analogous code is used in a separate stack running on Flask from within Apache and wsgi. That server starts itself off and lets systemd go on while it takes its time building the caches. This behaviour is fine for me.
I've tried calling the warmup function using the following within the setup function of the t.w.r.Resource:
reactor.callLater(1, ep.warmup, None)
I've not yet tried using this from within systemd, and have been testing it from twistd directly on the command line. The server does work as expected, however it no longer responds to SIGINT (^C). Removing the callLater is all that's needed to let the server respond to SIGINT.
If the warmup function is called directly (not by callLater, i.e., the arrangement which makes systemd give up while waiting for warm up to complete), the resulting server also continues to respond to SIGINT.
Is there a better / good way to handle this sort of long-running warmup code?
Why would twistd / the reactor not respond to SIGINT? Am I missing something here?
Twisted is a single-threaded thing. It sounds like your "cache warmup" code is blocking the reactor for those 300 seconds. One easy way to fix this would be using deferToThread to let it run without blocking the reactor.
EDIT 2: I now believe this is an issue with the machine I'm attempting to run the service on. I tried moving the service to a different machine that is set up similarly, and the service was able to start successfully even as a Local User. Now I just need to figure out what's different between the two machines...
I have a Windows Service project (written in VB.net) that is installed and configured with a Startup Type of Automatic and the Log On As set to a Local User account. This service will start when the computer first starts up. However, if I stop the service and try to start it again, I get "Error 1053: The service did not respond to the start or control request in a timely fashion." immediately. However, if I change the Log On As to "Local System account" then the service will start.
Summary:
Service will run as Local User when computer first starts
Service will not run as Local User if started manually
Service will run as Local System when computer first starts
Service will run as Local System if started manually
I have read that Error 1053 is caused by the OnStart method not returning quickly enough. The fact that the service has started previously, and that I get the error message immediately, leads me to believe a timeout is not what's going on. To verify this, I created a completely new Windows Service Project and without changing anything I built and installed it. I get the same behavior.
I am at a loss as to what's happening. As far as I can tell, the Local User has all of the correct privileges to run a service (as is evident from the fact that it will start with those credentials when the computer is first starting up), and the OnStart method isn't actually timing out (as is evident from the completely blank dumb service exhibiting the same behavior).
Any ideas as to what's preventing the service from starting, or where I can look for better error messages (I have looked in the Application Event Log, but nothing shows up there)?
EDIT:
Here is the code from the dumb service I created (using the EventLogger from here as a module).
Protected Overrides Sub OnStart(ByVal args() As String)
    ' Add code here to start your service. This method should set things
    ' in motion so your service can do its work.
    EventLogger.WriteToEventLog("On Start")
End Sub
Protected Overrides Sub OnStop()
    ' Add code here to perform any tear-down necessary to stop your service.
    EventLogger.WriteToEventLog("On Stop")
End Sub
And the Main method of the same project.
' The main entry point for the process
<MTAThread()> _
<System.Diagnostics.DebuggerNonUserCode()> _
Shared Sub Main()
    EventLogger.WriteToEventLog("Starting Main Method")
    Dim ServicesToRun() As System.ServiceProcess.ServiceBase
    ServicesToRun = New System.ServiceProcess.ServiceBase() {New Service1}
    System.ServiceProcess.ServiceBase.Run(ServicesToRun)
    EventLogger.WriteToEventLog("Leaving Main Method")
End Sub
When I try to run the Service as the Local User, none of the messages show in the Event Log and I get Error 1053. When I run the Service as the Local System, the messages show in the Event Log.
The reason I need to run the actual service as the Local User is so that it can access a network share. I am currently looking into using Windows User Impersonation, but I still think I should be able to start a simple service as a Local User.
Use this resource to create an event logger. Then wrap the code in each sub in a Try/Catch, because most likely something in your OnStart sub is preventing the service from starting. Post some sample code from your OnStart, OnStop, and/or service Main subs, and clarify why you need to use the Local User rather than the Local System account.
Is there a way to know the return code or process ID of the process that gets executed when a privileged helper tool is installed as a launch daemon and launched via SMJobSubmit()?
I have an application that uses the SMJobSubmit API, as mentioned here, to execute some tasks in a privileged manner.
Now in order to know whether the tasks succeeded or not, I will have to do one of the following.
The best option is to get the return code of the executable that ran.
Another option would be if I could create a pipe between my application and the launchd.
If the above two are not possible, I will have to resort to some hack like writing a file in /tmp location and reading it from my app.
I guess SMJobSubmit internally submits the executable, along with a launch daemon dictionary, to launchd, which is then responsible for its execution. So is there a way I could query launchd to find out the return code of the executable run with the label "mylabel"?
There is no way to do this directly.
SMJobSubmit is a simple wrapper around a complicated task. It also returns synchronously despite launching a task asynchronously. So, while it can give you an error if it fails to submit the job, if it successfully submits a job that fails to run, there is no way to find that out.
So, you will have to explicitly write some code to communicate from your helper to your app, to report that it's up and running.
If you've already built some communication mechanism (signals, files, Unix or TCP sockets, JSON-RPC over HTTP, whatever), just use that.
If you're designing something from scratch, XPC may be the best answer. You can't use XPC to launch your helper (since it's privileged), but you can manually create a connection by registering a Mach service and calling xpc_connection_create_mach_service.
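As a rough illustration of the "helper reports back to the app" idea, here is a small Python sketch using a Unix domain socket, one of the mechanisms listed above. It is only a stand-in for whatever language your helper is written in; the function names and the JSON message layout are hypothetical, and a real privileged helper would also need to think about socket permissions, which this sketch ignores.

```python
import json
import os
import socket


def open_status_listener(sock_path):
    """App side: create a listening Unix socket before submitting the job."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(1)
    return server


def report_status(sock_path, exit_code):
    """Helper side: send one JSON line describing how the task went."""
    message = {"pid": os.getpid(), "exit_code": exit_code}
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall(json.dumps(message).encode("utf-8") + b"\n")


def wait_for_status(server, timeout=5.0):
    """App side: block until the helper reports, or time out."""
    server.settimeout(timeout)
    conn, _ = server.accept()
    with conn:
        conn.settimeout(timeout)
        line = conn.makefile("rb").readline()
    return json.loads(line)
```

The app would call open_status_listener before submitting the job and wait_for_status afterwards; a timeout then tells you the helper never came up, which is exactly the case SMJobSubmit cannot report.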
This is sort of related to a previous, so far unsuccessful, question of mine. I have a daemon that is placed in the LaunchAgents folder (on Mac) and should run perpetually in the background, but after a couple of days it just stops for no apparent reason. I have no idea why, hence my question:
What are the reasons that a daemon might randomly stop?
Thanks for the help!
A daemon is just a long-lasting (forked) process. The reasons a daemon crashes are the same as for any other program:
attempting to read or write memory that is not allocated for reading or writing by that application (segmentation fault) or x86 specific (general protection fault)
attempting to execute privileged or invalid instructions
attempting to perform I/O operations on hardware devices to which it does not have permission to access
passing invalid arguments to system calls
attempting to access other system resources to which the application does not have permission to access (bus error)
attempting to execute machine instructions with bad arguments (depending on CPU architecture): divide by zero, operations on denorms or NaN values, memory access to unaligned addresses, etc.
Since it's a LaunchAgent, it runs as part of your login session, and hence will be killed if you log out.
On the other hand, if it's dying before you log out and you can't find or fix whatever is causing it to crash or exit, you can tell launchd to automatically restart it by adding
<key>KeepAlive</key>
<true/>
to its .plist
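If you would rather generate the file than hand-edit the XML, a short sketch with Python's standard plistlib shows where the key ends up. The label and program path in the usage note are hypothetical placeholders, and a real agent will likely need more keys than these.

```python
import plistlib


def build_agent_plist(label, program_args):
    """Return launchd job XML (as bytes) with KeepAlive switched on."""
    job = {
        "Label": label,                        # hypothetical job label
        "ProgramArguments": list(program_args),
        "KeepAlive": True,                     # serialized as <true/>
    }
    return plistlib.dumps(job, sort_keys=False)
```

For example, build_agent_plist("com.example.mydaemon", ["/usr/local/bin/mydaemon"]) returns XML that you could write to a .plist file in the LaunchAgents folder.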