tf.distribute.experimental.MultiWorkerMirroredStrategy() does not work properly - tensorflow

The environment settings for this project are:
ubuntu 18.04
cuda 10.0
cudnn 7.6.2
created a virtual environment
installed tensorflow-gpu==2.0.0
nccl 2.6.4
The code is:
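import json
import os

import tensorflow as tf

# TF_CONFIG must be set before the strategy is created
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {
        'worker': ["35.223.196.89:2222"]
    },
    'task': {'type': 'worker', 'index': 0}
})
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()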
This does not work properly.
The console output is attached below.
It stopped after...
Unable to destroy server_ object, so releasing instead. Servers don't support clean shutdown.
Why does this error occur?
Or how should MultiWorkerMirroredStrategy() be used?
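For reference, here is a minimal sketch of how TF_CONFIG is typically laid out when more than one worker takes part: every worker process gets the same cluster list and its own task index, and the variable is set before the strategy is created. The helper name make_tf_config and the second worker address are hypothetical; only the first address comes from the question, whose cluster lists a single worker.

```python
import json
import os


def make_tf_config(workers, index):
    """Return the TF_CONFIG JSON string for the worker at position `index`."""
    if not 0 <= index < len(workers):
        raise ValueError("index must refer to an entry in workers")
    return json.dumps({
        "cluster": {"worker": list(workers)},
        "task": {"type": "worker", "index": index},
    })


# Hypothetical two-worker cluster; only the first address is from the question.
workers = ["35.223.196.89:2222", "10.0.0.2:2222"]

# On the first machine (index 0); the second machine would use index 1.
os.environ["TF_CONFIG"] = make_tf_config(workers, 0)
```

This only builds the environment variable; it does not start TensorFlow or explain the shutdown message in the log.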

Related

KFServing pod "error: container storage-initializer is not valid"

I am new to KFServing and Kubeflow.
I was following https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1alpha2/tensorflow to deploy a simple inference service.
However, when looking at the logs, I am unable to find the container storage-initializer. The only containers my predict service pod has are kfserving and queue-proxy.
I am currently on Kubeflow 1.2 and Kubernetes 1.17 on IBM Cloud.
[Image: error message]
storage-initializer is an init container, so if you describe the pod you won't find it in the containers section of pod spec but in the initContainers section.
$ kubectl get pod flowers-sample-predictor-default-00002-deployment-58bb9557sf7g2 -o json | jq .status.initContainerStatuses
[
  {
    "containerID": "docker://e40e5f86401b3715118b873fec4ae6c3ef57765ffbb5c9ab48757234c4f53b6f",
    "image": "gcr.io/kfserving/storage-initializer:v0.5.0",
    "imageID": "docker-pullable://gcr.io/kfserving/storage-initializer@sha256:1d396c0c50892f5562a1c24d925691ec786e5d48e08200f3f9bb17bb48da40ae",
    "lastState": {},
    "name": "storage-initializer",
    "ready": true,
    "restartCount": 0,
    "state": {
      "terminated": {
        "containerID": "docker://e40e5f86401b3715118b873fec4ae6c3ef57765ffbb5c9ab48757234c4f53b6f",
        "exitCode": 0,
        "finishedAt": "2021-02-27T20:13:25Z",
        "reason": "Completed",
        "startedAt": "2021-02-27T20:13:11Z"
      }
    }
  }
]
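If jq is not available, the same check can be sketched in Python against the output of kubectl get pod ... -o json. The function name init_container_summary is hypothetical, and the embedded JSON is a trimmed copy of the status shown above rather than live cluster output.

```python
import json


def init_container_summary(pod):
    """Return (name, state, exit_code) for each init container in a pod dict."""
    rows = []
    for status in pod.get("status", {}).get("initContainerStatuses", []):
        state = status.get("state", {})
        # The state object holds a single key: waiting, running, or terminated.
        kind = next(iter(state), "unknown")
        exit_code = state.get(kind, {}).get("exitCode")
        rows.append((status.get("name"), kind, exit_code))
    return rows


# Trimmed copy of the status shown above.
pod = json.loads("""
{"status": {"initContainerStatuses": [
  {"name": "storage-initializer", "ready": true, "restartCount": 0,
   "state": {"terminated": {"exitCode": 0, "reason": "Completed"}}}
]}}
""")

print(init_container_summary(pod))
```

A terminated state with exit code 0 means the init container finished successfully, which matches the "Completed" reason in the output.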
I'm not familiar with the model label you are using. Can you retry using the app label or the pod name directly?
$ kubectl logs -l app=flowers-sample-predictor-default-00002 -c storage-initializer
[I 210227 20:13:12 initializer-entrypoint:13] Initializing, args: src_uri [gs://kfserving-samples/models/tensorflow/flowers] dest_path[ [/mnt/models]
[I 210227 20:13:12 storage:43] Copying contents of gs://kfserving-samples/models/tensorflow/flowers to local
[W 210227 20:13:15 _metadata:104] Compute Engine Metadata server unavailable onattempt 1 of 3. Reason: timed out
[W 210227 20:13:15 _metadata:104] Compute Engine Metadata server unavailable onattempt 2 of 3. Reason: [Errno 113] No route to host
[W 210227 20:13:18 _metadata:104] Compute Engine Metadata server unavailable onattempt 3 of 3. Reason: timed out
[W 210227 20:13:18 _default:250] Authentication failed using Compute Engine authentication due to unavailable metadata server.
[I 210227 20:13:19 storage:127] Downloading: /mnt/models/0001/saved_model.pb
[I 210227 20:13:19 storage:127] Downloading: /mnt/models/0001/variables/variables.data-00000-of-00001
[I 210227 20:13:25 storage:127] Downloading: /mnt/models/0001/variables/variables.index
[I 210227 20:13:25 storage:76] Successfully copied gs://kfserving-samples/models/tensorflow/flowers to /mnt/models

My nightwatch.js tests do not run in headless Chrome on CentOS

I run nightwatch.js tests using Nightwatch version 1.0.18. They work in a Windows environment, but when I run them on CentOS after installing Xvfb I get the errors below.
Error while running .navigateTo() protocol action: invalid session id
Error while running .locateMultipleElements() protocol action: invalid session id
Error while running .locateMultipleElements() protocol action: invalid session id
Here is my nightwatch.json file code:
{
  "src_folders": [
    "./tests"
  ],
  "output_folder": "./reports",
  "custom_commands_path": "./custom_commands",
  "custom_assertions_path": "",
  "test_workers": false,
  "webdriver": {
    "start_process": true
  },
  "test_settings": {
    "default": {
      "webdriver": {
        "port": 9515,
        "server_path": "./node_modules/chromedriver/lib/chromedriver/chromedriver",
        "cli_args": [
          "--log",
          "debug"
        ]
      },
      "skip_testcases_on_fail": true,
      "desiredCapabilities": {
        "browserName": "chrome",
        "javascriptEnabled": true,
        "acceptSslCerts": true,
        "chromeOptions": {
          "args": [
            "headless",
            "no-sandbox",
            "disable-gpu"
          ]
        }
      }
    }
  }
}
Am I missing something needed to run my tests on CentOS, given that they run fine on Windows?
I had the same issue with Nightwatchjs and the npm chromedriver setup.
Background:
Everything was working until I just recently updated Chromium on my system. In addition to the errors in the original post, verbose logging also showed:
{
  message: 'unknown error: Chrome failed to start: exited abnormally',
  error: [
    "(unknown error: DevToolsActivePort file doesn't exist)",
    '(The process started from chrome location /usr/bin/chromium is no longer running, so ChromeDriver is assuming that Chrome has crashed.)',
    '(Driver info: chromedriver=2.46.628388 (4a34a70827ac54148e092aafb70504c4ea7ae926),platform=Linux 4.9.0-8-amd64 x86_64)'
  ],
}
After downloading the standalone chromedriver (2.46.628388) to match my Chromium version (72.0.3626.69) it was still showing the same errors.
Solution:
I ended up downloading an older version of Chromium (71.0.3578.127) and setting chromeOptions.binary to the new path of the chromium 71 binary. I also had to include 'no-sandbox' with chromeOptions.args.
Here is the snippet from the site mentioned above:
Downloading old builds of Chrome / Chromium
Let's say you want a build of Chrome 44 for debugging purposes. Google does not offer old builds as they do not have up-to-date security fixes.
However, you can get a build of Chromium 44.x which should mostly match the stable release. Here's how you find it:
1. Look in https://googlechromereleases.blogspot.com/search/label/Stable%20updates for the last time "44." was mentioned.
2. Look up that version history ("44.0.2403.157") in the Position Lookup
3. In this case it returns a base position of "330231". This is the commit of where the 44 release was branched, back in May 2015.*
4. Open the continuous builds archive
5. Click through on your platform (Linux/Mac/Win)
6. Paste "330231" into the filter field at the top and wait for all the results to XHR in.
7. Eventually I get a perfect hit: https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Mac/330231/
   Sometimes you may have to decrement the commit number until you find one.
8. Download and run!
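The chromeOptions change described in the solution (pointing chromeOptions.binary at the older Chromium and making sure 'no-sandbox' is among the args) could be sketched as a small script that patches a nightwatch config. The function name use_chromium_binary and the path /opt/chromium-71/chrome are hypothetical; the config dict is a minimal stand-in for the nightwatch.json in the question.

```python
import copy


def use_chromium_binary(config, binary_path):
    """Return a copy of a nightwatch config that launches Chrome from binary_path."""
    patched = copy.deepcopy(config)
    default = patched["test_settings"]["default"]
    caps = default.setdefault("desiredCapabilities", {})
    options = caps.setdefault("chromeOptions", {})
    options["binary"] = binary_path
    # Keep existing args and add no-sandbox only if it is missing.
    args = options.setdefault("args", [])
    if "no-sandbox" not in args:
        args.append("no-sandbox")
    return patched


# Minimal stand-in for the nightwatch.json from the question.
config = {"test_settings": {"default": {"desiredCapabilities": {
    "browserName": "chrome",
    "chromeOptions": {"args": ["headless", "disable-gpu"]},
}}}}

# Hypothetical install location of the older Chromium build.
patched = use_chromium_binary(config, "/opt/chromium-71/chrome")
print(patched["test_settings"]["default"]["desiredCapabilities"]["chromeOptions"])
```

Working on a deep copy leaves the original config untouched, so the same base file can be reused for environments that do not need the override.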
Upgrading to the latest version of chromedriver solved the issue for me. You can find the latest version here: https://www.npmjs.com/package/chromedriver
In my situation, when that error occurs:
Error while running .navigateTo() protocol action: invalid session id
I added the following code into .travis.yml:
addons:
  chrome: stable

Win10, VirtualBox, Ubuntu, Vue-cli 3 - watching not working

I cloned a project from GitHub and deployed it on a VM.
Everything works fine except watching for file changes in the project.
The Vagrantfile has a line to sync the folder:
config.vm.synced_folder './', '/app', owner: 'vagrant', group: 'vagrant'
I tried adding a vue.config.js with:
module.exports = {
  configureWebpack: {
    devServer: {
      watchOptions: {
        ignored: ['node_modules'],
        aggregateTimeout: 300,
        poll: 1500
      },
      public: '192.168.83.181' // vagrant machine address
    }
  }
}
Below is how the project structure and the terminal look after running vue-cli-service build --watch --mode development:
node --version
v8.12.0
vue --version
3.1.3
Tried on Ubuntu 16.04 and 18.04 versions.
I have the same issue, but I think the problem is not related to vue-cli. If you change your JS code inside the VM with vi, vue-cli detects the change and recompiles. But when the change is made from Windows 10, nothing happens, even though the changed code is reflected in the shared folder.
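One plausible explanation, offered as an assumption rather than a confirmed diagnosis: edits made on the Windows host reach the guest through the shared folder without producing the file-change notifications that watchers inside the VM normally rely on, so only polling would notice them. The sketch below shows what polling amounts to, comparing two snapshots of modification time and size. The function names snapshot and changed_paths are hypothetical and this is not how webpack implements its poll option.

```python
import os
import tempfile


def snapshot(root):
    """Map every file under root to its (mtime_ns, size)."""
    seen = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip node_modules, mirroring the ignored setting in the question.
        dirnames[:] = [d for d in dirnames if d != "node_modules"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file disappeared between listing and stat
            seen[path] = (info.st_mtime_ns, info.st_size)
    return seen


def changed_paths(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(p for p in before.keys() | after.keys()
                  if before.get(p) != after.get(p))


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "main.js")
    with open(target, "w") as fh:
        fh.write("console.log(1)\n")
    first = snapshot(root)
    # Simulate an edit; the size changes, so the difference is always visible.
    with open(target, "w") as fh:
        fh.write("console.log(2222)\n")
    changes = changed_paths(first, snapshot(root))
    print(changes == [target])
```

A real watcher would take such snapshots in a loop with a sleep between them, which is roughly the role the poll interval plays.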

PyQt5 QtQuick Error - "libQt5Network undefined symbol: _Z24qt_subtract_from_timeoutii"

After successfully installing PyQt 5.5.1 together with Qt 5.5.1 on my Ubuntu 14.04, I ran my simple PyQt file using QtQuick and got this error:
libQt5Network.so.5: undefined symbol: _Z24qt_subtract_from_timeoutii
Has anyone run into this before?
Thanks.
Python.py:
import sys
from PyQt5.QtCore import QUrl
from PyQt5.QtQuick import QQuickView
from PyQt5.QtWidgets import QApplication

# Main Function
if __name__ == '__main__':
    # Create main app
    myApp = QApplication(sys.argv)
    # Create a QML view and load the QML file
    appLabel = QQuickView()
    appLabel.setSource(QUrl('basic.qml'))
    # Show the view
    appLabel.show()
    # Execute the application and exit with its return code
    sys.exit(myApp.exec_())
Basic.qml:
Grid {
    id: colorPicker
    rows: 2; columns: 3; spacing: 3
    Rectangle { color: "white"; }
    Rectangle { color: "green"; }
    Rectangle { color: "blue"; }
    Rectangle { color: "yellow"; }
    Rectangle { color: "steelblue"; }
    Rectangle { color: "black"; }
}
The reason is that I had also installed python-Qt5, which is based on an older Qt5 version.
With
find / -name "libQt*"
I could see some old Qt libraries residing in the /usr/lib folder:
/usr/lib/i386-linux-gnu/libQt5Network.so
/usr/lib/i386-linux-gnu/libQt5Network.so.5
/usr/lib/i386-linux-gnu/libQt5Network.so.5.2
/usr/lib/i386-linux-gnu/libQt5Network.so.5.2.1
/home/tad/Qt5.5.1/gcc/lib/libQt5Network.so.5.5
/home/tad/Qt5.5.1/gcc/lib/libQt5Network.so
/home/tad/Qt5.5.1/gcc/lib/libQt5Network.so.5.5.1
/home/tad/Qt5.5.1/gcc/lib/libQt5Network.so.5
/home/tad/Qt5.5.1/Tools/QtCreator/lib/qtcreator/libQt5Network.so.5
/home/tad/Qt5.5.1/Tools/QtCreator/lib/qtcreator/libQt5Network.so.5.5.1
/home/tad/Qt5.5.1/Tools/QtCreator/lib/qtcreator/libQt5Network.so.5
The problem may be inconsistent Qt libraries, so I removed all Qt libraries in /usr/lib and replaced them with the ones in my home folder. It worked! However, this is not recommended, since some built-in Ubuntu components may use libQt* in the /usr/lib folders. So, just remove python-qt5 and reinstall PyQt5 from scratch!
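To spot this kind of mismatch quickly, the find output can be grouped by directory and version. The following is a rough sketch under the assumption that fully versioned files are named libQt5<Module>.so.X.Y.Z; the function name versions_by_directory is hypothetical and the sample paths are taken from the listing above.

```python
import posixpath
import re
from collections import defaultdict


def versions_by_directory(paths):
    """Group fully versioned libQt5*.so.X.Y.Z files by their directory."""
    pattern = re.compile(r"^(libQt5\w+)\.so\.(\d+\.\d+\.\d+)$")
    found = defaultdict(set)
    for path in paths:
        match = pattern.match(posixpath.basename(path))
        if match:
            found[posixpath.dirname(path)].add(match.group(2))
    return dict(found)


# Paths copied from the find output above.
paths = [
    "/usr/lib/i386-linux-gnu/libQt5Network.so.5.2.1",
    "/home/tad/Qt5.5.1/gcc/lib/libQt5Network.so.5.5.1",
    "/home/tad/Qt5.5.1/Tools/QtCreator/lib/qtcreator/libQt5Network.so.5.5.1",
]

for directory, versions in sorted(versions_by_directory(paths).items()):
    print(directory, sorted(versions))
```

Directories that report different versions of the same library are candidates for the loader picking up the wrong one.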
By the way, for the error relating to the Sip API version, we just have to remove all sip-related packages and then reinstall sip. First list them:
dpkg -l | grep sip
then
sudo apt-get purge python3-sip python3-sip-dev

Running simple QtWebEngine app on Raspberry Pi 2, page not showing

I compiled and installed QtWebEngine + QML plugins on a Raspberry Pi 2 with Yocto recipes, following this tutorial on the Yocto dizzy branch, and ran the following script:
root@raspberrypi2:~# more chromium.qml
import QtQuick 2.1
import QtQuick.Controls 1.1
import QtWebEngine 0.9
ApplicationWindow {
    width: 1280
    height: 720
    color: "lightgray"
    visible: true
    WebEngineView {
        id: webview
        url: "file:///home/root/hello.html"
        anchors.fill: parent
    }
}
Note that the import version is 0.9, not 1.0.
I have tried both url: "file:///home/root/hello.html" and url: "https://duckduckgo.com" but all I am getting is a red screen with the black square mouse pointer.
root@raspberrypi2:~# more hello.html
<html>
<header><title>This is title</title></header>
<body>
Hello world
</body>
</html>
On the console:
root@raspberrypi2:~# /usr/bin/qt5/qmlscene -v -platform eglfs chromium.qml
[0605/163256:WARNING:resource_bundle.cc(280)] locale_file_path.empty()
[0605/163257:WARNING:proxy_service.cc(890)] PAC support disabled because there is no system implementation
[0605/163257:WARNING:resource_bundle.cc(280)] locale_file_path.empty()
The "PAC support disabled ..." warning seems to be a non-issue; read here.
UPDATE
I have followed this step-by-step tutorial (Poky fido branch), then added qtwebengine (import QtWebEngine 1.0 this time) and qtwebengine-qmlplugins to my Yocto image and rebuilt the image with bitbake.
When I booted and ran /usr/bin/qt5/qmlscene -v -platform eglfs chromium.qml I could see my HTML page.
I have tested a couple of dozen websites and not all pages show, so there might be a little more to it.
e.g.
http://wikipedia.com shows!!!
http://google.com doesn't show ???
https://stackoverflow.com/ shows!!!
http://facebook.com doesn't
Any further pointers are welcome
UPDATE 20160309
root@raspberrypi2:~/app# uname -a
Linux raspberrypi2 4.1.10 #1 SMP PREEMPT Wed Feb 17 16:51:44 CET 2016 armv7l GNU/Linux
root@raspberrypi2:~/app# lsb_release -a
LSB Version: core-4.1-noarch:core-4.1-arm
Distributor ID: poky
Description: Poky (Yocto Project Reference Distro) 2.0.1
Release: 2.0.1
Codename: jethro
QML
root@raspberrypi2:~/app# more chromium.qml
import QtQuick 2.1
import QtQuick.Controls 1.1
import QtWebEngine 1.0
ApplicationWindow {
    width: 800
    height: 600
    color: "lightgray"
    visible: true
    WebEngineView {
        id: webview
        //url: "http://raspberrypi.stackexchange.com/" // PASS
        //url: "http://google.com" // FAIL
        //url: "http://video.webmfiles.org/big-buck-bunny_trailer.webm" // PASS but no Sound
        //url: "https://youtube.com/" // FAIL
        //url: "https://opentokrtc.com/anybots" // FAIL
        //url: "http://speedof.me/" // PASS
        url: "http://facebook.com" // FAIL
        anchors.fill: parent
    }
}
Maybe it is a little late but I tried to build QtWebEngine in Qt 5.6 alpha and it works properly for me on Raspberry Pi 2 for all the URLs you listed. This is a demo. Maybe they fixed something in QtWebEngine, so you can try 5.6-alpha.
Unfortunately, qtwebengine on the jethro branch of meta-qt5 caused many problems.
I was happy to see that the master branch is based on Chromium 45:
Branch jethro:
QT_MODULE_BRANCH_CHROMIUM = "40.0.2214-based"
Branch master:
QT_MODULE_BRANCH_CHROMIUM = "45-based"
I will try to build it ;)