Deactivated environment in Colab - google-colaboratory

When running a notebook on Colab I get the message:
"desactivated execution environment"
and the execution stops.
Note that I disconnected from Colab and then reconnected again just before running this notebook.
Any idea what the problem is?
Thanks

I think we are having a similar issue, and it's probably due to memory. The session just disconnects and you have to start the run all over again.
The two things I did to somewhat reduce how often this occurs are:
Before I start or restart running my code, I run the "kill" command below. This basically clears any leftover items in memory.
!kill -9 -1
I use Python, so every now and then I run the garbage collector below.
gc.collect()
This helps, but it isn't bulletproof.
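For the gc.collect() step, here is a minimal sketch of the pattern (big_array is a hypothetical stand-in for whatever large intermediate your notebook builds):

```python
import gc

# Hypothetical example: big_array stands in for a large intermediate
# object created earlier in the notebook.
big_array = list(range(1_000_000))

# Drop the reference, then ask the garbage collector to reclaim
# unreachable objects (including reference cycles).
del big_array
freed = gc.collect()  # returns the number of objects collected

# The count varies by session; it is only a rough signal.
print(freed)
```

Deleting the reference first matters: gc.collect() can only reclaim objects that are no longer reachable.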

Related

Terminal process gets killed with code ELIFECYCLE errno: 137 when VS Code is open. Quitting VS Code resolves the issue?

I've only begun encountering this issue in the last two days.
When I attempt to build my Angular project, it gets to one point and fails with the errors below.
The only way I can get it to run is to quit VS Code and rerun the exact same command, and then it builds without issue.
Any ideas what may be causing this?
137 is 128 + 9. In some situations—and I'm guessing that this is one of them—this indicates that the process died with a signal 9. Signal 9 is, on macOS (and multiple other OSes), SIGKILL. This signal is sent by the "out of memory" killer.
This also explains why exiting VSCode fixes things: VSCode is a memory hog. Exiting it returns the memory to the system.
To fix this more permanently, either reduce the memory needs of your build and/or of VSCode, or add more memory to your system.
See also What killed my process and why?
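The 128 + 9 arithmetic can be checked in a few lines of Python; decode_exit_status is a hypothetical helper written for illustration, not a standard API:

```python
import signal

def decode_exit_status(code: int) -> str:
    """Interpret a shell exit code: values above 128 conventionally mean
    the process died from signal (code - 128)."""
    if code > 128:
        signum = code - 128
        return f"killed by signal {signum} ({signal.Signals(signum).name})"
    return f"exited normally with status {code}"

print(decode_exit_status(137))  # killed by signal 9 (SIGKILL)
```

So an ELIFECYCLE 137 from npm is the shell's way of reporting that the build process received SIGKILL, which is exactly what the OOM killer sends.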

reset memory usage of a single GPU

I have access to 4 GPUs (not as the root user). One of the GPUs (no. 2) behaves oddly: there is some memory blocked, but the power consumption and temperature are very low (as if nothing were running on it). See the details from nvidia-smi in the image below:
How can I reset GPU 2 without disturbing the processes running on the other GPUs?
PS: I am not a root user, but I think I can get hold of a root user as well.
Resetting the GPU can resolve your problem, though it may be impossible depending on your GPU configuration:
nvidia-smi --gpu-reset -i "gpu ID"
For example, if you have NVLink enabled between GPUs, the reset does not always go through. It also seems that, in your case, nvidia-smi is unable to find the process running on your GPU. The solution then is to find and kill the process associated with that GPU: run the following command, and fill in the PID with the one you find via fuser.
fuser -v /dev/nvidia*
kill -9 "PID"
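As a sketch of automating the last step, the PID column of fuser-style output can be extracted before killing anything; parse_fuser_pids is a hypothetical helper, and the sample output is illustrative, not from your machine:

```python
import re

def parse_fuser_pids(fuser_output: str) -> list[int]:
    """Extract PIDs from `fuser -v /dev/nvidia*`-style verbose output.

    A data line looks roughly like:
    /dev/nvidia2:        alice     12345 F...m python
    """
    pids = []
    for line in fuser_output.splitlines():
        stripped = line.strip()
        # Skip the header row and blank lines.
        if not stripped or stripped.startswith("USER"):
            continue
        # PID is the number followed by the ACCESS flags column.
        match = re.search(r"\s(\d+)\s+[Ff.]", line)
        if match:
            pids.append(int(match.group(1)))
    return pids

sample = """\
                     USER        PID ACCESS COMMAND
/dev/nvidia2:        alice     12345 F...m python
"""
print(parse_fuser_pids(sample))  # [12345]
```

Always review the list before killing: kill -9 gives the process no chance to clean up.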

Electron threw a compile error and Windows command prompt goes non writable

So my code had invalid syntax, which I was trying out to see if it would work, and while compiling I got "App threw an error during load".
Now in the command prompt the error is listed in detail, with the cursor still blinking but NOT WRITABLE. Closing, re-opening, re-navigating, and restarting with electron . is the only thing that seems to work.
I can't do that for every error I might face. So, is there a way to prevent this from happening? How are you dealing with it? Is it in any way connected to stopping the npm server? If it helps, I'm using a Win 7 64-bit OS.
Found a way: terminate the batch job by hitting CTRL + C, which asks Terminate batch job (Y/N)?, where choosing Y terminates it and makes the command prompt writable again.
I searched for methods to terminate without confirmation and learned that it cannot be done.

Tensorflow Serving Client Script Hangs At stub.predict.future()

This is my first time asking a question here so I will try to be descriptive. I am relatively new to python and tensorflow, and have been learning it specifically for a project.
I am currently working to deploy a tensorflow model using tensorflow serving and flask with wsgi. I have been following this architecture: https://github.com/Vetal1977/tf_serving_flask_app
I am running tensorflow_model_server on port=9000.
I know that tensorflow_model_server is working, because when I execute the tensorflow_serving_client.py from command line, I get the expected response. I have tested this command line execution on every user account.
Similarly, I know that Flask + WSGI is working, because I can see log.info points dropping into the apache error log as it works its way through the script. If I return something before it gets to the line in question, it works just fine.
However, when the application is executed with Flask + WSGI, it hangs at this line: result = stub.Predict.future(request, 5.0) # 5 seconds (https://github.com/Vetal1977/tf_serving_flask_app/blob/master/api/gan/logic/tf_serving_client.py#L70)
I can see that it hangs as I monitor top and tail -f error.log and see the same apache process sit there until it is killed or apache is restarted.
I am really stuck on the fact that it works when executed via command line, but not when Flask + WSGI runs it. Can anyone provide suggestions or point me in the right direction? Am I headed down the right path with this? Any assistance at all would be greatly appreciated.
EDIT: I have uploaded the minimal code to a github repo here: https://github.com/raymondtri/client-test along with a minimal setup that does require flask, wsgi, tensorflow, and tensorflow-serving.
Thanks in advance,
Ray
After much research and many hours, I think that this has something to do with how mod_wsgi forks processes, and how grpc has some known issues with that. I have a feeling that things are getting orphaned as the process is forked and that is what is causing the script to hang.
I could be wrong though, that is just my current hypothesis.
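If the forking hypothesis is right, one commonly suggested workaround is to create the gRPC channel lazily, after mod_wsgi has forked the worker process, instead of at module import time (running the app in the main interpreter via WSGIApplicationGroup %{GLOBAL} is another frequently mentioned mitigation). A minimal sketch of the lazy pattern follows; make_stub is a placeholder for your own channel and stub construction, not a real API:

```python
import os

# Cache of stubs keyed by worker PID. A channel created before the fork
# is inherited by the child in a broken state; keying on os.getpid()
# guarantees each worker builds its own channel after the fork.
_stub_cache = {}

def get_stub(make_stub):
    """Return a stub created in the *current* process.

    `make_stub` is a zero-argument factory, e.g. a function wrapping
    grpc.insecure_channel(...) plus the prediction-service stub setup.
    It is called at most once per worker process.
    """
    pid = os.getpid()
    if pid not in _stub_cache:
        _stub_cache[pid] = make_stub()
    return _stub_cache[pid]
```

In the Flask handler you would then call get_stub(...) inside the request, so the first request in each worker pays the channel-creation cost and later requests reuse it.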

How to run forever without sudo on Ec2

The title pretty much tells what the question is about. I am trying to use forever to start a script on EC2, but it does not work unless I use sudo.
If I start it without sudo, I get
warn: --minUptime not set. Defaulting to: 1000ms
warn: --spinSleepTime not set. Your script will exit if it does not stay up for at least 1000ms
info: Forever processing file: ci.js
But when I do forever list
info: No forever processes running
You should run forever list under the same user you started forever with (it seems like you are doing that right).
Try checking ps aux | grep node after you do forever start. Maybe you haven't started any process (because of errors on the command line or in your Node.js file), so forever list returns an empty list.
P.S. I've checked forever on my machine and it behaves exactly as you said: if I run it under my 'ubuntu' user, the list of running processes is empty even though the process is alive... Seems like a bug in forever.