Why would std::process::Command::output fail? - error-handling

What would cause std::process::Command::output to fail? If the callee program fails, the error will be captured as part of the resulting Output.stderr, so I guess output will only return an Error if the OS fails to create a new process for some reason? Is that something that I can safely ignore for my simple CLI tool?

There could be some issue opening the binary being executed (e.g. access is denied, or the file doesn't exist)
When waiting for the process to finish, the waitpid syscall could be interrupted
Getting the output involves creating a pipe, which will fail if the file descriptor limit is hit (cat /proc/sys/fs/file-max to check)
It also involves opening a file, which will fail if the limit on open files is reached (ulimit -n to check)
You probably only need to worry about the first two: you can't do anything about hitting limits in the kernel.

Related

How can an uncalled test affect another in Go?

I have a test function TestJobqueue() in https://github.com/VertebrateResequencing/wr/blob/develop/jobqueue/jobqueue_test.go that I can call in isolation: go test -tags netgo ./jobqueue -v -run 'TestJobqueue$'.
I recently started getting test failures related to boltdb (one of my dependencies) bombing out with signal SIGBUS: bus error code panics, or just normally failing tests because the database couldn't be opened. But only when working off an NFS mounted directory. Fair enough, I or boltdb have some kind of NFS-related bug.
But the thing I can't wrap my head around is that I only get these errors when an entirely different test function exists.
As per the comments in TestREST() in https://github.com/VertebrateResequencing/wr/blob/92fb61ccd7819c8f1edfa8cce8468c4250d40ea7/jobqueue/rest_test.go, if I call Serve(serverConfig) (a function in the package being tested, a function call which is made many times in TestJobqueue() and other test functions) in that test function, TestJobqueue() fails. If I don't, it doesn't.
In short, the failure of tests in one test function can be controlled by the value of a boolean in a test function that I'm not running.
How is this possible?
Edit: to address some points brought up by the first answer, TestJobqueue() is being run in isolation. No other test runs before or after it. If the database file already exists, Serve() results in those files being deleted first, then a new one created to run the new set of tests. The odd thing that I'm seeking an answer for is how an unexecuted function can have this side effect. I can demonstrate it is really unexecuted by beginning or ending TestREST() with a panic call: the output of that panic is never seen, but TestJobqueue() failure can still be controlled by the boolean in TestREST() (if the panic comes at the end).
Edit2: this turns out to be caused by an unusual thing I do in TestJobqueue(), which is to call go test on itself. Needless to say, if you do this, strange things can happen...
In short, the failure of tests in one test function can be controlled by the value of a boolean in a test function that I'm not running.
This is not a great summary. Your test starts a server. The other test starts a server. Clearly, the problem is there. You appear to have commented out the bit of code that stops the server at the end of the test? You can't run two servers on the same port.
You probably have a port conflict or some network condition that is triggered by running the two servers at once, because they both appear to use a similar (identical?) config loaded like this:
config := internal.ConfigLoad("development", true)
Running with no config uses default values, avoiding the conflict; running with config causes the conflict. So to pin it down, try creating a config with one setting at a time until you find the config setting that causes the problem (most likely Port or WebPort). Alternatively, make sure the tests stop the server at the end.
[EDIT] Looks like you have narrowed it down to DBFile config setting by changing one at a time. This implies the server starts a new db instance - if both try to use the same file for a new db, this would cause contention and the second test to run would fail.
It's not entirely clear from your description above what you're doing or what the problem is, so you could try to improve that to state exactly the sequence of actions and the problem. If for example you have previously run a test which creates a db, it could affect later test runs because of the presence of a db file, so your tests are not completely independent.
[EDIT 2 - after further edits to question]
If commenting out TestREST completely solves your problem (or a panic before it starts), and given changing it breaks the other test, you are executing TestREST somehow.
Looking at your code for jobqueue_test, it appears to invoke go test, so you might be running more tests than you assume. Given you don't see the panic output, I'd suspect your use of exec.Command in this big test. Try removing bits of the failing test until it works, to narrow down exactly which invocation is running the other test. Calling go test within a test is pretty unusual!
https://github.com/VertebrateResequencing/wr/blob/develop/jobqueue/jobqueue_test.go#L2445

Breaking/opening files after failed jobs (sas7bdat.lck issue)

Good day,
Tl;dr:
a) Is it feasible to recover data from a .lck file?
b) If the .lck issue appears, can SAS be made to work around it?
We have automated mundane jobs running on SAS machines. Every now and then a job fails. This sometimes leaves a locked file behind (<filename>.sas7bdat.lck instead of <filename>.sas7bdat).
This issue prevents re-running the program, because SAS sees that a file with the specified name already exists and tries to access it, which fails. Message:
Attempt to rename temporary member of <dataset> failed.
Currently we handle them by manually deleting the file and adjusting generation number.
The question is twofold: a) Is it feasible to recover data from a .lck file? b) If the .lck issue appears, can SAS be made to work around it? (Note that we have a lot of jobs, and adding checking code to all of them is work-intensive.)
The .sas7bdat.lck file is the one that SAS writes to as it's creating a data set. If the data step (or PROC) completes successfully, the original data set file is deleted and the .sas7bdat.lck file gets renamed to remove the .lck part. If any errors occur, the .lck file gets deleted and the original data set is left in place, unmodified. That's how SAS avoids overwriting existing data sets when errors occur.
Therefore, you should be able to just rename the file to remove the .lck, or maybe rename it to damaged.sas7bdat for example, and then try accessing the file. You can try a PROC DATASETS REPAIR (https://v8doc.sas.com/sashtml/proc/z0247721.htm) if you really need to get whatever data might be present.
The best solution will obviously be to correct whatever fault is causing your jobs to bomb out like this in the first place. No SAS program should ever leave .lck files lying about, even if it encounters errors - your jobs must actually be crashing the SAS environment itself, or perhaps they're being killed prematurely by another process. Simply accepting that this happens and trying to work around it is likely to just be storing up more problems for the future.

How to allow only one Python process to run if the same script is executed several times at once

Suppose I have two or more instances of the same Python console application running at the same time, started several times by hand or in some other way.
Is there any way, from the Python code itself, to stop all the extra processes, close their console windows, and keep only one running?
The solution I would use would be to have a lockfile created in the tmp directory.
The first instance would start, check for the existence of the file, create the file since it is not there, then run; the following instances will start, check for the existence of the file, then quit since it's there. The original instance would remove the lockfile as its last instruction. NOTE: If the app runs into an error and does not execute the instruction to remove the lockfile, you would need to manually remove it else the app will always see the file.
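A minimal sketch of the lockfile approach described above. The lockfile name (myapp.lock) and the function names are hypothetical, chosen only for illustration. One deliberate difference from the description: instead of checking for the file and then creating it as two separate steps, this uses os.open with O_CREAT | O_EXCL, so that the check and the creation happen in a single atomic step and two instances started at the same moment cannot both conclude the file is missing. The try/finally removes the lockfile on normal exit and on exceptions, but not if the process is killed outright, so the note about manual cleanup still applies.

```python
import os
import tempfile

# Hypothetical lockfile name; pick one unique to your application.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "myapp.lock")


def acquire_lock(path=LOCK_PATH):
    """Return True if this process created the lockfile, False if it already exists."""
    try:
        # O_CREAT | O_EXCL makes "check and create" a single atomic step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner to help manual cleanup
    return True


def release_lock(path=LOCK_PATH):
    """Remove the lockfile; ignore the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def main():
    if not acquire_lock():
        print("Another instance is already running; exiting.")
        return 1
    try:
        print("Running as the only instance.")  # real work goes here
    finally:
        # Runs on normal exit and on exceptions, but not if the process is killed.
        release_lock()
    return 0


if __name__ == "__main__":
    main()
```

This only makes later instances quit; it does not stop instances that are already running or close their console windows.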
I've seen on other threads that some suggest using the ps command and look for your app's name, which would work; however, if your app will ever run on Windows, you would need to use tasklist.

Debug Diag generate large memory dump files

I configured Debug Diag in production, where I set a Crash rule for a specific app pool with the action type Log Stack Trace. The problem is that it's generating dump files that are very large, approximately 700 MB each. I'm not sure why these files are so large. Is there a way to truncate them?
When you use the "Log Stack Trace" option, the call stack for the exception is logged to a text file (not a dump file) that Debug Diagnostic generates for the process to which it is attached. I am assuming that the dump is getting generated if your process is crashing with a 2nd chance exception (that is, if you didn't change anything else in the default crash rule).
If you look at the name of the dump file, you would be able to identify on what exact condition the dump got generated.

PHP script stops running arbitrarily with no errors

I have a PHP script that seemed to stop running after about 20 minutes.
To try to figure out why, I made a very simple script to see how long it would run without any complex code to confuse me.
I found that the same thing was happening with this simple infinite loop. At some point between 15 and 25 minutes of running, it stops without any message or error. The browser says "Done".
I've been over every single possible thing I could think of:
set_time_limit (and session.gc_maxlifetime in the php.ini)
memory_limit
max_execution_time
The point that the script is stopped is not consistent. Sometimes it will stop at 15 minutes, sometimes 22 minutes.
Please, any help would be greatly appreciated.
It is hosted on a 1and1 server. I contacted them and they don't provide support for bugs caused by developers.
At some point your browser times out and stops loading the page. If you want to test, open up the command line and run the code in there. The script should run indefinitely.
Have you considered just running the script from the command line, eg:
php script.php
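and have the script flush out a message every so often to show that it's still running:
<?php
while (true) {
    doWork(); // placeholder for the real work
    echo "still alive...\n";
    flush(); // send the message out immediately instead of buffering it
}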
In such cases, I turn on all the development settings in php.ini, on a development server of course. This displays many more messages, including deprecation warnings.
In my experience of debugging long running php scripts, the most common cause was memory allocation failure (Fatal error: Allowed memory size of xxxx bytes exhausted...)
I think what you need to find out is the exact time that it stops (you can set an initial time and keep dumping out the current time minus initial). There is something on the server side that is stopping the file. Also, consider doing an ini_get to check to make sure the execution time is actually 0. If you want, set the time limit to 30 and then EVERY loop you make, continue setting at 30. Every time you call set_time_limit, the counter resets and this might allow you to bypass the actual limits. If this still isn't working, there is something on 1and1's servers that might kill the script.
Also, did you try the ignore_user_abort?
I appreciate everyone's comments. Especially James Hartig's, you were very helpful and sent me on the right path.
I still don't know what the problem was. I got it to run on the server over SSH, using the exec() command as well as ignore_user_abort(). But it would still time out.
So, I just had to break it into small pieces that will run for only about 2 minutes each, and use session variables/arrays to store where I left off.
I'm glad to be done with this fairly simple project now, and am supremely pissed at 1and1. Oh well...
I think this is caused by some process monitor killing off "zombie processes" in order to allow resources for other users.
Run the exec using "2>&1" to log anything including stderr.
In my output I managed to catch this:
...
script.sh: line 4: 15932 Killed php5-cli -d max_execution_time=0 -d memory_limit=128M myscript.php
So something (an external force, not PHP itself) is killing my process!
I use IdWebSpace which is excellent BTW but I think most shared hosting providers impose this resource/process control mechanism just to be sane.