TensorBoard: Unable to get first event timestamp for run - tensorflow

I am trying to visualize a training session that I ran on a remote server. I used scp to copy the events file to my local iMac and then ran TensorBoard on it. The TensorBoard page loads, but I can't get the visualization: every chart shows a single dot at zero. I get these warnings in the terminal:
WARNING:tensorflow:Unable to get first event timestamp for run
470_313_0.0001_2500_200/train
WARNING:tensorflow:Unable to get first event timestamp for run
470_313_0.0001_2500_200/train
WARNING:tensorflow:Unable to get first event timestamp for run
470_313_0.0001_2500_200/val
WARNING:tensorflow:Unable to get first event timestamp for run
470_313_0.0001_2500_50/train
WARNING:tensorflow:Unable to get first event timestamp for run
470_313_0.0001_2500_50/val
Any idea what is going on?

I ran into the same problem (with a TensorFlow 1.4 trainer running in the cloud with Google Cloud ML Engine).
Explicitly closing tf.summary.FileWriters with close() solved it in my case.
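For reference, a minimal sketch of what that looks like with the TF 1.x summary API (the logdir name below is just taken from the question's run names):
import tensorflow as tf

writer = tf.summary.FileWriter("470_313_0.0001_2500_200/train")
with tf.Session() as sess:
    pass  # ... training loop calling writer.add_summary(summary, global_step) ...
writer.close()  # flushes the events file so TensorBoard can read it completely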

I ran into a similar problem. There are two solutions to it:
Delete all old tfevents files from the directory and keep only the latest one (temporary fix).
Create a new directory for writing your logs (permanent fix); see the sketch below.
In the picture below, I first wrote the logs into directory 2, which resulted in the same errors/warnings. Later I changed the directory to 3, wrote the logs there, and TensorBoard ran successfully.
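A rough sketch of the second option, assuming the TF 1.x summary API used above (the timestamped path pattern is only an example):
import os
import time
import tensorflow as tf

# Give each run its own fresh directory so old tfevents files never mix in.
logdir = os.path.join("logs", time.strftime("run_%Y%m%d-%H%M%S"))
os.makedirs(logdir, exist_ok=True)

writer = tf.summary.FileWriter(logdir)
# ... writer.add_summary(summary, global_step) during training ...
writer.close()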

In my case, the problem was that the directory names for the runs were too long. After I manually renamed them to shorter names, the problem was solved.

Related

How to save tensorflow models in RAMDisk?

In my original Python code, the ckpt model file is restored frequently, and reading the checkpoints again and again takes too much time. So I decided to keep the model in memory. A simple way to do this is to create a RAMDisk and save the model on that disk. However, something unexpected happened.
I created a 1 GB RAMDisk following the tutorial How to Create RAM Disk in Windows 10 for Super-Fast Read and Write Speeds. My system is Windows 11.
I made two attempts. In the first, I copied my code to the RAMDisk E: and used tf.train.Saver().save(self.sess,'./') to save the model, but it raised UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb4 in position 114: invalid start byte. If I put the code in a normal folder instead, it runs successfully.
In the second attempt, I put the code under D: and changed the line to tf.train.Saver().save(self.sess,'E:\\'), and it reported cannot create directory E: Permission Denied. Obviously, E:\ is not a directory that needs to be created, so I don't know how to handle this.
Your jupyter/python environment cannot write outside the directory from which jupyter/python was started, which is why you get a permission denied error.
However, you can run shell commands from the Jupyter notebook. If your user has write access to the destination, you can do the following:
model.save("my_model") # This will save the model to the current directory.
!mv "my_model" "E:\my_model" # This will move the model from the current directory to your required directory.
On a side note, when searching for tf.train.Saver().save(), I get this page as the only relevant result, which says it is used for saving checkpoints, not models. It also recommends switching to the newer tf.train.Checkpoint or tf.keras.Model.save_weights. Nonetheless, the above method should work as expected.
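If you do switch to the newer API, a minimal sketch (TF 2.x) might look like the following; the model, the optimizer, and the E:/ckpts path are only placeholders, not taken from the question:
import tensorflow as tf

# Placeholder model and optimizer, just to have something to checkpoint.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam()

ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
manager = tf.train.CheckpointManager(ckpt, directory="E:/ckpts", max_to_keep=3)

save_path = manager.save()               # write a checkpoint to E:/ckpts
ckpt.restore(manager.latest_checkpoint)  # restore the most recent checkpoint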

Sentinel 2 download not triggered

I am trying to download Sentinel-2 data from the Copernicus Open Access Hub with a Python script, using the sentinelsat library. I know that the download rate is limited to at most 2 concurrent downloads. However, I run into problems when I try to download a larger quantity of data (10-20 products) one after the other in a for loop, i.e. not concurrently.
The command api.download(<product_id>) does not trigger the download; it just prints the output shown in the screenshot, rather than actually downloading anything.
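For context, the loop is roughly the following (credentials, footprint, and query parameters are placeholders, not from the original script):
from sentinelsat import SentinelAPI

api = SentinelAPI("username", "password", "https://scihub.copernicus.eu/dhus")

footprint = "POLYGON((12.4 41.8, 12.6 41.8, 12.6 42.0, 12.4 42.0, 12.4 41.8))"
products = api.query(area=footprint,
                     date=("20200101", "20200131"),
                     platformname="Sentinel-2")

for product_id in products:  # sequential, so never more than one download at a time
    api.download(product_id)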

Why does Google Colab say I have too many sessions?

I'm trying to run two notebooks on Google Colab but could only connect one notebook at a time to the virtual machine. There's a pop-up message saying "Too many sessions. You have too many active sessions. Terminate an existing session to continue." when I click the "connect" button on the second notebook. Does anybody know why?
Screenshot:
Edit: I'm using Google Chrome on Windows 10
Edit March 3, 2020: I ended up not using Colab that day, but I came back the next day and was able to run two Colab notebooks just fine. Strange. I have had this issue a couple of times since I posted this question, but the error always disappeared the following day.
I usually use it with 2 active sessions; I mean, it gives that error when I try to connect a third notebook. Today, however, I could only connect 1 notebook at a time, and it did not permit a second connection. So the limit changes from time to time.
I had the same problem and found my solution in this issue: I changed the Runtime Shape to Standard in the 2nd notebook, and it worked for me.
Try this:
Go to Menu "Runtime" > "Manage Sessions", you should see a list of active sessions. Terminate those you don't need. Although you think you are opening only 2 notebooks, some previous sessions may still linger around if you just close the browser tab.
Note: However, when I hit the problem today, I did the above to check, and I had only 1 other session. Usually, I am able to run up to 3 separate sessions. I am not sure if Google dynamically adjusts this based on overall demand. I also suspect that ever since they introduced Colab Pro, priority may be given to subscribers.
I'm not sure, but maybe Colab only provides service to an account one session at a time.

Tensorboard, only show latest tfevents

TensorBoard shows all the events it finds in the given logdir.
If I run my training (or whatever) multiple times, I end up with multiple tfevents files in the logdir. TensorBoard then shows all these variable summaries merged together in one graph, which looks strange.
On stdout, it writes something like:
WARNING:tensorflow:Found more than one graph event per run. Overwriting the graph with the newest event.
WARNING:tensorflow:Found more than one "run metadata" event with tag step_0000. Overwriting it with the newest event.
How can I make it only show the summaries/events from the latest tfevents file, so that it ignores all older tfevents files?
Try putting your tfevents files into unique directories named after each run, as documented here.
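A small sketch of that layout, assuming the TF 1.x summary API (the run name is only an example); TensorBoard then lets you toggle individual runs in the UI instead of merging them:
import os
import tensorflow as tf

run_name = "run_lr0.0001_steps2500"      # pick a unique name per run
logdir = os.path.join("logs", run_name)

writer = tf.summary.FileWriter(logdir)
# ... writer.add_summary(summary, global_step) during training ...
writer.close()

# Then start TensorBoard on the parent folder and select runs in the UI:
#   tensorboard --logdir logs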

How to allow only one Python process to run when the same application is executed multiple times

Suppose I have two or more console instances of the same Python application running at the same time, started several times by hand or in some other way.
Is there any way, from the Python code itself, to stop all the extra processes, close their console windows, and keep only one running?
The solution I would use would be to create a lockfile in the tmp directory.
The first instance starts, checks for the existence of the file, creates it since it is not there, and then runs; subsequent instances start, check for the existence of the file, and quit since it is there. The original instance removes the lockfile as its last instruction. NOTE: If the app runs into an error and never executes the instruction that removes the lockfile, you will need to remove it manually, or the app will always see the file. A sketch follows below.
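A minimal sketch of that lockfile approach (the file name is just an example; the try/finally is an extra safeguard for the error case mentioned in the note):
import os
import sys
import tempfile

LOCKFILE = os.path.join(tempfile.gettempdir(), "myapp.lock")

def main():
    if os.path.exists(LOCKFILE):
        print("Another instance is already running, exiting.")
        sys.exit(1)
    open(LOCKFILE, "w").close()   # create the lock
    try:
        ...                       # the actual work of the application
    finally:
        os.remove(LOCKFILE)       # remove the lock, even if the work raised an error

if __name__ == "__main__":
    main()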
I've seen other threads where some suggest using the ps command and looking for your app's name, which would also work; however, if your app will ever run on Windows, you would need to use tasklist instead.