TensorFlow Serving Client Script Hangs At stub.Predict.future()

This is my first time asking a question here so I will try to be descriptive. I am relatively new to python and tensorflow, and have been learning it specifically for a project.
I am currently working to deploy a tensorflow model using tensorflow serving and flask with wsgi. I have been following this architecture: https://github.com/Vetal1977/tf_serving_flask_app
I am running tensorflow_model_server on port=9000.
I know that tensorflow_model_server is working, because when I execute tensorflow_serving_client.py from the command line, I get the expected response. I have tested this command-line execution under every user account.
Similarly, I know that Flask + WSGI is working, because I can see log.info points dropping into the apache error log as it works its way through the script. If I return something before it gets to the line in question, it works just fine.
However, when the application is executed with Flask + WSGI, it hangs at this line: result = stub.Predict.future(request, 5.0) # 5 seconds (https://github.com/Vetal1977/tf_serving_flask_app/blob/master/api/gan/logic/tf_serving_client.py#L70)
I can see that it hangs as I monitor top and tail -f error.log and see the same apache process sit there until it is killed or apache is restarted.
I am really stuck on the fact that it works when executed via command line, but not when Flask + WSGI runs it. Can anyone provide suggestions or point me in the right direction? Am I headed down the right path with this? Any assistance at all would be greatly appreciated.
EDIT: I have uploaded the minimal code to a github repo here: https://github.com/raymondtri/client-test along with a minimal setup that does require flask, wsgi, tensorflow, and tensorflow-serving.
Thanks in advance,
Ray

After much research and many hours, I think that this has something to do with how mod_wsgi forks processes, and how grpc has some known issues with that. I have a feeling that things are getting orphaned as the process is forked and that is what is causing the script to hang.
I could be wrong though, that is just my current hypothesis.
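
One way to test this hypothesis is to avoid creating the gRPC channel before mod_wsgi forks, and to build it lazily inside each worker process instead. The sketch below is only an illustration, not code from the linked project: PerProcess is a hypothetical helper, and the commented usage assumes the server address localhost:9000 from the question. Whether this removes the hang depends on the gRPC and mod_wsgi versions involved.

```python
import os


class PerProcess:
    """Hypothetical helper: build a resource lazily, once per process.

    If the process forks after the resource was created, the child sees a
    different PID and builds its own copy instead of reusing the parent's.
    Not thread-safe; add a lock if the workers are multithreaded.
    """

    def __init__(self, factory):
        self._factory = factory
        self._pid = None
        self._value = None

    def get(self):
        pid = os.getpid()
        if self._pid != pid:
            # First call in this process (or first call after a fork).
            self._value = self._factory()
            self._pid = pid
        return self._value


# Assumed usage with gRPC (not run here; requires the grpcio package):
#   import grpc
#   channel_holder = PerProcess(lambda: grpc.insecure_channel("localhost:9000"))
#   channel = channel_holder.get()  # call this inside the request handler
```

The point of the design is that nothing gRPC-related exists at import time, so there is nothing for a forked child to inherit in a broken state.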

Related

How to tell GitHub Actions that the job has finished successfully?

I use a GitHub Action to deploy my website to my server. The last SSH command is npm run start. It eventually outputs ready - started server on http://localhost:4000 (I use Next.js), but GitHub does not seem to recognize that as completion and prints:
2021/01/09 14:24:14 Error: command timeout
err: Run Command Timeout!
Although the website is deployed successfully, the GitHub Action is reported as failed.
So how do I tell GitHub Actions that the job has finished successfully?

npm run start keeps running in the foreground, so the SSH step never returns and eventually times out. You should find a way to start the application in a daemon process of its own, rather than as a process within the SSH session. Perhaps this tool (pm2) might solve your problem? This question and answer is somewhat related.
There are definitely other ways to start your app in a daemon process, or perhaps as a service, but this might be the most straightforward for you since it's a Node tool.
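
pm2 is the Node-native route. In general terms, the idea is to start the server in its own session with its standard streams detached, so that the SSH command can return. Here is a minimal Python sketch of that idea, assuming a POSIX server; start_detached is a hypothetical helper, and the npm command in the comment is the one from the question.

```python
import subprocess


def start_detached(command):
    """Start command in its own session with stdio detached; return its PID."""
    process = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: detach from the caller's session
    )
    return process.pid


# Assumed usage on the server (not run here):
#   pid = start_detached(["npm", "run", "start"])
```

Unlike pm2, this does not restart the app if it crashes and keeps no logs, so treat it as an illustration of the mechanism rather than a deployment tool.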

Elm install always fails with "ConnectionTimeout" error (in WSL)

I'm new to Elm, and English is not my first language, so if anything here is ambiguous or wrong, please let me know and I will correct it.
----------- edit -----------
All of the problems below occur on WSL. When I try on Windows, everything works fine. So why doesn't elm install work on WSL? Do you have any idea?
-------- problem --------
When I run elm-test init, it fails as shown below:
$ elm-test init
Here is my plan:
Add:
elm/random 1.0.0
elm-explorations/test 1.2.2
Would you like me to update your elm.json accordingly? [Y/n]:
-- PROBLEM DOWNLOADING PACKAGE -------------------------------------------------
I was trying to download the source code for elm/random 1.0.0, so I tried to
fetch:
https://github.com/elm/random/zipball/1.0.0/
But my HTTP library is giving me the following error message:
ConnectionTimeout
Are you somewhere with a slow internet connection? Or no internet? Does the link
I am trying to fetch work in your browser? Maybe the site is down? Does your
internet connection have a firewall that blocks certain domains? It is usually
something like that!
But my browser (Chrome) works fine, and so does curl inside WSL (the environment where I run the elm-test init command):
$ curl https://github.com/elm/random/zipball/1.0.0/
<html><body>You are being <a href="https://codeload.github.com/elm/random/legacy.zip/1.0.0">redirected</a>.</body></html>
Then I also tried fetching the redirect URL:
$ curl https://codeload.github.com/elm/random/legacy.zip
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
So I don't think there is a network problem. My PC can connect to the repository, and I expect the download would succeed.
Finally, I tried plain elm install, and it throws the same error. It seems that Elm cannot connect to the repository even though my PC can.
$ elm install elm/random
Here is my plan:
Add:
elm/random 1.0.0
Would you like me to update your elm.json accordingly? [Y/n]:
-- PROBLEM DOWNLOADING PACKAGE -------------------------------------------------
I was trying to download the source code for elm/random 1.0.0, so I tried to
fetch:
https://github.com/elm/random/zipball/1.0.0/
But my HTTP library is giving me the following error message:
ConnectionTimeout
Are you somewhere with a slow internet connection? Or no internet? Does the link
I am trying to fetch work in your browser? Maybe the site is down? Does your
internet connection have a firewall that blocks certain domains? It is usually
something like that!
Please help me, what should I do?

I recently had reason to use WSL for Elm development. It wasn't much fun and I'm glad to be back on Mac! What I found was that certain disk operations on WSL seemed to go very slowly, and that could lead to a variety of weird issues.
I was using WebStorm, which does not handle WSL well, so in the end I did everything on the C: drive (rather than in /home/...) so that WebStorm could run the Windows version of elm-format, while my Node development environment ran on the Linux layer.
That's not a precise answer to your question, just a note that it can be done, though in my experience it's not an ideal way to write Elm code.

I had the same issue and solved it.
It was caused by the DNS server settings.
Create the file /etc/resolv.conf and write the following line:
nameserver 8.8.8.8
WSL will then use Google Public DNS, and everything works fine.
However, the setting reverts when WSL restarts, so the following is also required.
Create the file /etc/wsl.conf and write the following lines:
[network]
generateResolvConf = false
Then run wsl --shutdown (from Windows) and restart WSL.
Reference link:
https://github.com/microsoft/WSL/issues/4285#issuecomment-522201021
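
To confirm that the nameserver setting survived a WSL restart, a small script can list the nameservers currently in /etc/resolv.conf. This is a minimal sketch; parse_nameservers and current_nameservers are hypothetical helper names, and the parsing covers only the simple "nameserver ADDRESS" lines used in this answer.

```python
from pathlib import Path


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def current_nameservers(path="/etc/resolv.conf"):
    """Read the given file and return its nameservers ([] if it is missing)."""
    file = Path(path)
    if not file.is_file():
        return []
    return parse_nameservers(file.read_text())


# Assumed usage inside WSL after the restart:
#   print(current_nameservers())
```

If the list no longer contains 8.8.8.8 after a restart, the file was probably regenerated, which suggests the /etc/wsl.conf setting was not picked up.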

Segmentation Error: Local Machine Fails (16gb) but AWS EC2 works (1gb)

I understand this is a little vague, but I'm not sure where else to go or what else to debug. My Python script was running fine yesterday. I made minor changes today and now it only runs successfully on my Amazon Lightsail (EC2) machine. Everything I have read says that segmentation faults come from running out of memory, yet my local machine has 16 GB of RAM while the cloud machine has only 1 GB. I am also not working with big files: the 7-10 files being imported and manipulated are typically under 2 MB each.
I feel it may be related to my terminal/zsh rather than my code.
Below is the error I cannot get around.
After some research I found the Python faulthandler module (import faulthandler; faulthandler.enable()), which gives the debugging output below:
Fatal Python error: Segmentation fault
Current thread 0x000000010c58edc0 (most recent call first):
File "/Users/garrett/opt/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 1795 in <genexpr>
File "/Users/garrett/opt/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 1797 in <listcomp>
File "/Users/garrett/opt/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 1797 in count
File "GmailDownloader.py", line 215 in <module>
zsh: segmentation fault python *.py
The code consistently breaks on line 215 while computing a groupby in pandas, even though it is very similar to other groupbys earlier in the code that succeed.
I am on macOS Catalina using the default zsh for my terminal, but even when I switch to good ol' bash with chsh -s /bin/bash and then run the code, I still get a zsh segmentation fault.
I tried out PyCharm for the first time today, and it asked for permission to store something in a bin folder, to which I just said yes. I'm not sure whether that is related.
The full code repository: https://github.com/GarrettMarkScott/AutomotiveCRMPuller
Ongoing list of other things I have tried:
Trashing the Terminal preferences (~/Library/Preferences/com.apple.Terminal.plist)

I almost threw in the towel, but since pandas appeared in the traceback I tried reinstalling it, and what do you know, it worked after running pip install --upgrade pandas.
It would have been impossible without faulthandler! Hopefully this helps someone out there!
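
For anyone who wants to reproduce the debugging step, here is a minimal sketch of enabling faulthandler at the top of a script. The function name enable_crash_tracebacks is a hypothetical wrapper; faulthandler itself is part of the standard library.

```python
import faulthandler
import sys


def enable_crash_tracebacks():
    """Turn on faulthandler so fatal signals print a Python traceback."""
    # sys.__stderr__ is the interpreter's original stderr, which normally has
    # a real file descriptor even if sys.stderr was replaced by a wrapper.
    faulthandler.enable(file=sys.__stderr__, all_threads=True)
    return faulthandler.is_enabled()


# Call this before the imports or code that might crash.
enable_crash_tracebacks()
```

The same effect is available without editing the script by running python -X faulthandler GmailDownloader.py or by setting the PYTHONFAULTHANDLER environment variable.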

Running a JSON graph locally

I've been using Flowhub.io to do my development on the Node.js device. Now that the GUI-based design is done, I'm ready to take it offline and run the code via the command line. How do I do this? I have the JSON file corresponding to the graph I created online, but I'm not sure how to use the noflo Node.js module.
Could someone help me by showing me an example of how to load a graph using the noflo module, please? Thanks!

If you want to run an existing graph, you can use the --graph option.
noflo-nodejs --graph graphs/MyMainGraph.json
If you also want the process to exit when the network stops, you can pass --batch.
PS: I added this to the noflo-nodejs README.

RabbitMQ startup error on OS X Leopard

Can anyone help diagnose why RabbitMQ is failing to start? It has worked fine in the past, but suddenly is refusing to start. I'm running the command sudo rabbitmqctl start, and I get the following output:
Error: {'EXIT',
{function_clause,
[{rabbit_control,action,
[start,rabbit@BELLOMACPRO,[],
[{"-p","/"},{"-n","rabbit@BELLOMACPRO"},{"-q",false}],
#Fun<rabbit_control.1.100323439>],
[{file,"src/rabbit_control.erl"},{line,168}]},
{rabbit_control,start,0,
[{file,"src/rabbit_control.erl"},{line,84}]},
{init,start_it,1,[]},
{init,start_em,1,[]}]}}
I'm not finding this particularly informative, but perhaps there's something obvious to someone who better understands Erlang. The log file has no recent entries to shed any light on the issue.

I never figured out what the cause of the problem was, but I worked around it by executing the server directly with sudo rabbitmq-server, rather than attempting to use rabbitmqctl. Explanations are still welcome.
