Why does TensorBoard not refresh when used with rsync?

I am running a TensorFlow experiment on a remote machine that continuously writes to the same events.out.tfevents.xxx file. I would expect TensorBoard to refresh automatically every minute or so, displaying the new logs. This does work when I use sshfs to mount the remote machine on my laptop and run TensorBoard on the mounted directory.
However, when I use rsync to copy the files over and run TensorBoard on the local copies, TensorBoard never refreshes; I have to restart it to get the updates.
This is my rsync command:
rsync -aP --del -e ssh server_name:folder_on_server local_folder --exclude='*checkpoints*' --exclude='*.json' --exclude='*.DS_Store'
Any help would be greatly appreciated!

It's a known issue with TensorBoard; see this issue on GitHub.
Here's a quote from the issue (emphasis mine):
It looks like when the tensorboard reads an event file from local directory - it will not notice that the event file was deleted and recreated (which is quite valid case when you are using [...] rsync to sync the data)
One workaround is to use --inplace as an option in your rsync command.
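With the command from the question, that would look like this (same paths and excludes as above):
rsync -aP --del --inplace -e ssh server_name:folder_on_server local_folder --exclude='*checkpoints*' --exclude='*.json' --exclude='*.DS_Store'
With --inplace, rsync updates the destination file in place instead of writing a temporary copy and renaming it over the original, so TensorBoard keeps reading the same file (same inode) and picks up the appended events.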

Related

redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snapshots

I'm having this problem when I try to save to Redis. It shows the message below.
MISCONF Redis is configured to save RDB snapshots, but it's currently unable to persist to disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Red
The redis log file displays this:
Background saving started by pid 73
Write error saving DB on disk: Function not implemented
Has anyone ever experienced this?
I found the answer: you need WSL 2. To find out which version you have, run the command below in PowerShell:
wsl -l -v
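Typical output looks like this (the distribution name and state will vary):
  NAME      STATE           VERSION
* Ubuntu    Running         1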
If it is version 1, run the command below (using your distribution's name as listed) and open Ubuntu again:
wsl --set-version Ubuntu 2
More information: https://learn.microsoft.com/en-us/windows/wsl/install

packer vmware iso local and on esxi

I am curious to find out why Packer is failing to get SSH access on an ESXi server. The build works just fine for vmware_fusion locally.
As JSON does not display nicely here on SF, here is a link to a gist with the builder configuration: https://gist.github.com/geoHeil/5acf06cb0f3afadfa347d437c2695a7c
When running
packer build -var-file variables.json -only=vmwarevmwareRemote template.json
the kickstart file is loaded and the OS is configured and installed. However, with ESXi as the builder, the build seems to be stuck waiting for SSH to become available.
I noticed in the logs that:
/var/log/auth.log
2017-02-08T17:33:20Z sshd[94210]: User 'root' running command 'esxcli --formatter csv network vm list\n'
2017-02-08T17:33:25Z sshd[94210]: User 'root' running command 'esxcli --formatter csv network vm list\n'
shows the same command repeated over and over.
Executing this command manually shows:
esxcli --formatter csv network vm list
Name,Networks,NumPorts,WorldID,
ubunu-test,"VM Network,",1,87986,
someOther,"VM Network,",1,84833,
What could be wrong here?
edit
Packer version is the latest (0.12.2), ESXi 6.5.
edit2
When applying the suggestion of setting a network, the same problem persists, but now I see two commands in the logs:
[root@vm-bd-dev:/var/log] tail -f auth.log
2017-02-09T09:05:56Z sshd[111376]: User 'root' running command 'esxcli --formatter csv network vm list\n'
2017-02-09T09:05:56Z sshd[111376]: User 'root' running command 'esxcli --formatter csv network vm port list -w 111433\n'
The second (new) one has the following output:
ActiveFilters,DVPortID,IPAddress,MACAddress,PortID,Portgroup,TeamUplink,UplinkPortID,vSwitch,
,,0.0.0.0,00:0c:29:47:d5:3d,33554450,VM Network,vmnic2,33554437,vSwitch0,
You probably need some more vmx_data settings so the VM actually gets a connected NIC; Packer runs those esxcli commands to discover the guest's IP address, and the 0.0.0.0 in the output above shows it never got one. Something like:
"vmx_data": {
"ethernet0.networkName": "VM Network",
"ethernet0.present": "true",
"ethernet0.virtualDev": "vmxnet3",
"ethernet0.startConnected": "true",
"ethernet0.addressType": "generated"
}
Switching the network line in the kickstart file to something that does not hard-code the interface name, like
network --bootproto=dhcp --ipv6=auto --activate
solved the problem for me.
Apparently different interface names (no eth0) were available on ESXi.
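For comparison, a kickstart line that pins the device name, hypothetically something like the following, fails when the guest's interface is not actually named eth0:
network --bootproto=dhcp --device=eth0 --ipv6=auto --activate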

tmux : config files are not used

I use tmux (version 1.8) on Ubuntu 14.04.
I wanted to configure it a bit via ~/.tmux.conf, but whatever I set in this file, my tmux session looks the same. Then I tried a fresh /etc/tmux.conf, but I still get the same display.
It seems that my config is hardcoded and that I cannot change it.
If I remove both files (~/.tmux.conf and /etc/tmux.conf), my tmux session is still the same. tmux runs, but I cannot configure it. It should be so simple...
Has anybody seen this before? How could I solve it? Do I need to compile a fresh release of tmux?
Today, I have more details:
On one machine it works as expected, even though I did not change anything! Strange...
But on another machine (running the same Ubuntu release, equally up to date) it does not work.
The file /etc/tmux.conf does not exist on either machine. I put this little config file in ~/.tmux.conf:
# start Window Numbering at 2
set -g base-index 2
When I launch tmux on this second machine, window numbering starts at 0. On the first machine, with the same config file, it behaves correctly: numbering starts at 2.
I'm going crazy!
After you make changes to ~/.tmux.conf, make sure tmux sources them by running tmux source-file ~/.tmux.conf from the shell.
Try removing all sessions before running tmux. If the server is still running, tmux keeps using the configuration it read when the server started, not your edited .tmux.conf.
Executing tmux kill-server stops the server; then start it again with the tmux command, as sketched below.
Please note that after killing the server you will lose all open sessions and windows.
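A minimal sequence (plain tmux commands, nothing machine-specific):
# reload the config in an already-running session
tmux source-file ~/.tmux.conf
# or start over: kill every session so the server re-reads ~/.tmux.conf on launch
tmux kill-server
tmux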

Docker: How to live sync host folder with container folder?

I am working on a website powered by Node. So I have made a simple Dockerfile that adds my site's files to the container's FS, installs Node and runs the app when I run the container, exposing the private port 80.
But if I want to change a file in that app, I have to rebuild the container image and re-run it. That takes some seconds.
Is there an easy way to have some sort of "live sync", NFS like, to have my host system's app files be in sync with the ones from the running container?
That way I would only have to relaunch the app to apply the changes, or even better, with something like supervisor it would happen automatically.
You can use volumes in order to do this. You have two options:
Docker managed volumes:
docker run -v /src/path nodejsapp
docker run -i -t --volumes-from <container id> <image> bash
Files you edit in the second container will be visible in the first one.
Host directory volume:
docker run -v `pwd`/host/src/path:/container/src/path nodejsapp
The changes you make on the host will update the container.
If you are on OSX, this kind of volume share can become very slow, especially with Node-based apps (a lot of files). For that problem, http://docker-sync.io can help: it provides volume-share-like synchronisation without using volume shares, which usually speeds up read/write of the code directory in the container by 50-80x, depending on which docker-machine you use.
For performance numbers see https://github.com/EugenMayer/docker-sync/wiki/4.-Performance, and for easy usage examples see the boilerplates at https://github.com/EugenMayer/docker-sync-boilerplate. For your case, the unison example https://github.com/EugenMayer/docker-sync-boilerplate/tree/master/unison is the one you would need for NFS-like sync.
docker run -dit -v ~/my/local/path:/container/path/ myimageId
For /container/path/ you could use for instance /usr/src/app.
The flags:
-d = detached mode,
-it = interactive with a pseudo-TTY,
-v + paths = specifies the volume.
(If you just care about the volume, you can drop the -d and -it flags.)
Docker run reference
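To get the automatic-relaunch behaviour mentioned in the question, one option is to combine a host volume with a file watcher such as nodemon running inside the container. A sketch, where the image name and paths are placeholders and the image is assumed to have nodemon installed:
docker run -d -v `pwd`/src:/usr/src/app my-node-image nodemon /usr/src/app/app.js
nodemon restarts the Node process whenever a file under the mounted directory changes, so edits on the host take effect without rebuilding the image.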
I use Skaffold's File Sync functionality for this. It gets the job done without needing overly complex configuration.
Setting up Skaffold in my project was as simple as installing it (through chocolatey, since I'm on Windows), running skaffold init --generate-manifests in my project folder, and answering a couple of questions it asked.
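The steps from that answer as shell commands, plus the watch mode (the install method will vary by platform):
choco install skaffold
skaffold init --generate-manifests
skaffold dev
skaffold dev is the long-running mode: it builds, deploys, and keeps syncing files into the running container as they change.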

rsync "failed to set times on XYZ": No such file or directory (2)

I have a Dlink NAS (dns-323) in RAID1 that I use to backup family photos, videos and some other data. I also manually rsync to a dedicated backup drive on a little Atom Linux box whenever we add a lot of new files to the NAS. I finally lost a drive on the NAS and through a misstep of my own, also lost the entire volume. No problem, that's what the backup drive is for. I used the same rsync command in reverse to restore files to the NAS after I replaced the bad drive and created a new RAID volume. This worked well, except that after the command finished, I noticed that it did not preserve timestamps. Timestamps were preserved in the NAS->backup direction, but not the backup->NAS direction.
I run the rsync command on the Atom Linux box with these options (this does preserve timestamps):
rsync --archive --human-readable --inplace --numeric-ids --delete /mnt/dns-323 /mnt/dlink_backup --progress --verbose --itemize-changes
The reverse command to restore the volume from the backup (which did not preserve timestamps) is very similar:
rsync --archive --human-readable --inplace --numeric-ids --delete /mnt/dlink_backup/dns-323/ /mnt/dns-323/ --progress --verbose --itemize-changes
which actually restores the files, but gives many errors like:
rsync: failed to set times on "/mnt/dns-323/Rich/Code/.emacs": No such file or directory (2)
I've been googling most of the afternoon and trying different things, but so far haven't solved my problem. I used the touch command to successfully modify the times of one or two files on the NAS, just to prove it can be done, since I believe that is one thing rsync must do. I've tried this as my user and as root; by that I mean I've run sudo rsync ..... as well as rsync --rsync-path='/usr/bin/sudo /usr/bin/rsync' ....., where ..... is all of the previously mentioned parameters. My /etc/fstab has these entries for the NAS and the backup drive, respectively:
# the dns-323
//192.168.1.202/Volume_1 /mnt/dns-323 cifs guest,rw,uid=1000,gid=1000,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 0 0
# the dlink_backup drive
/dev/sdb /mnt/dlink_backup ext3 defaults 0 0
It's not absolutely critical to preserve timestamps if it just plain can't be done, but it seems like it should be possible - I'm just stumped.
Thanks in advance. Let me know if I can provide any additional information.
I've earned my "tumbleweed" badge as a result of this one. pats self on back
My solution:
1) Removed the left hard drive from the dns-323, which is half of the RAID1 volume.
2) Mounted (ext3) this drive using a USB-to-SATA adapter to the machine where I run rsync.
3) Performed the rsync restore command outlined above, removing the --delete option (which really shouldn't have been there) and adding --size-only. With --size-only, timestamps were essentially the only thing that got updated, since the files themselves had already been restored properly; see the sketch after this list.
4) Unmounted the left drive from the Atom machine and returned that drive to the dns-323, while also removing the right drive. The right drive needs to be removed so that the dns-323 recognizes that the RAID volume is degraded.
5) Re-added the right drive to the dns-323 and told it to rebuild the RAID volume.
6) All timestamps are now good.
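A sketch of the step 3 command, assuming the USB-mounted left drive sits at /mnt/left_drive (that mount point is hypothetical; the source is the same backup path as above):
rsync --archive --human-readable --inplace --numeric-ids --size-only /mnt/dlink_backup/dns-323/ /mnt/left_drive/ --progress --verbose --itemize-changes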
A possible alternate solution:
I've read enough about rsync and NFS/Samba/CIFS now to understand that this problem is likely related to permissions on the file server (the dns-323). Internally, the user/group IDs on the dns-323 are 501/501. No permutation of how I mounted the dns-323 on the Atom box would allow rsync to properly set timestamps. I do believe that changing my user account on the Atom box to have uid/gid 501/501 would have worked, though. My user had the default 1000/1000 and root had 0/0, IIRC.
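Along the same lines, an untested alternative might be to map the CIFS mount itself to those IDs in /etc/fstab (the 501/501 values come from the dns-323 internals mentioned above):
//192.168.1.202/Volume_1 /mnt/dns-323 cifs guest,rw,uid=501,gid=501,nounix,iocharset=utf8,file_mode=0777,dir_mode=0777 0 0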