DQN does not converge while the example with the same graph works - tensorflow

Here is the problem. I tried to rewrite a DQN example for CartPole-v0. The example converges in fewer than 2000 episodes; however, my version does not converge.
Therefore, I used TensorBoard to help me figure out the problem, and I found that the two programs have the same graph.
However, I find that the loss curves are very different:
loss of the example
loss of mine
I wonder what the problem is. I would be grateful if anyone could help me with it! Thank you!
P.S. If the code is needed, I can upload it; both versions are shorter than 200 lines.
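For reference, a minimal sketch of how the loss can be logged so that two runs show up side by side in TensorBoard. This uses the TF 2.x summary API with an illustrative log directory; the original code is presumably TF 1.x, where tf.summary.scalar plus a FileWriter plays the same role.
import tensorflow as tf

writer = tf.summary.create_file_writer("logs/run1")  # "logs/run1" is illustrative

for step in range(100):
    loss = 1.0 / (step + 1)                # stand-in for the real training loss
    with writer.as_default():
        tf.summary.scalar("loss", loss, step=step)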


PyTorch Adam optimizer's awkward behavior? Better with restart?

I'm trying to train a CNN text classifier with Pytorch. I'm using the Adam optimizer like this.
optimizer = torch.optim.Adam(CNN_Text.parameters(), lr=args.lr)
I found that the optimizer converges really fast and then the accuracy keeps slowly dropping (the validation loss decreases a lot in the first 1-2 minutes, then keeps slowly increasing).
So I implemented learning-rate decay:
if curr_loss > val_loss:
    prev_lr = param_group['lr']
    param_group['lr'] = prev_lr / 10
I found that it didn't really help much. But if I manually save the model, load it, and continue training with a decreased learning rate, the performance gets much better!
This puts me in a hard spot, because I need to keep watching the training and manually change the settings. I tried SGD and other optimizers because I thought this was Adam's problem, but I couldn't find a good solution.
Can anyone help me with it?
What is param_group? From that code snippet it looks like a variable not associated with the optimizer in any way. What you need to modify is the 'lr' entry of each element of optimizer.param_groups, which is what Adam actually reads.
Either way, unless you have a good reason to hand-roll it yourself, I suggest you use the LR scheduler provided with PyTorch. And if you do need to reimplement it, check out its code and take inspiration from there.
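For illustration, a minimal sketch of both suggestions; the nn.Linear model is only a stand-in for the actual network.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                       # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Option 1: change the learning rate that Adam actually reads.
for group in optimizer.param_groups:
    group['lr'] = group['lr'] / 10

# Option 2: let PyTorch's built-in scheduler cut the rate when the
# validation loss plateaus; call scheduler.step(val_loss) after each epoch.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=2)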
The problem is that Adam has additional internal state (running averages of the gradients and their squares) that also needs to be reset.
For this reason, you have a better chance deleting the optimizer and instantiating a new one with a lower learning rate.
At least that worked for me.
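A minimal sketch of that suggestion, with a stand-in model: re-creating the optimizer discards Adam's running moment estimates along with the old learning rate.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                       # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# ... train until the plateau ...
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # fresh state, lower lr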

Very weird behaviour when running the same deep learning code on two different GPUs

I am training networks using the PyTorch framework. I had a K40 GPU in my computer; last week, I added a 1080 to the same machine.
In my first experiment, I observed identical results on both GPUs. Then I tried my second program on both GPUs. In this case, I "consistently" got good results on the K40 while "consistently" getting awful results on the 1080 for "exactly the same code".
First, I thought the only reason for getting such diverse outputs would be the random seeds in the codes. So, I fixed the seeds like this:
torch.manual_seed(3)
torch.cuda.manual_seed_all(3)
numpy.random.seed(3)
But this did not solve the issue. I believe the issue cannot be randomness, because I was "consistently" getting good results on the K40 and "consistently" getting bad results on the 1080. Moreover, I tried exactly the same code on 2 other computers and 4 other 1080 GPUs and always achieved good results. So the problem has to be with the 1080 I recently plugged in.
I suspect the problem might be the driver, or the way I installed PyTorch. But it is still weird that I only get bad results for "some" of the experiments; for the other experiments, the results were identical.
Can anyone help me on this?
Q: Can you please tell us what type of experiment this is and what NN architecture you use?
In the tips below, I will assume you are running a straight backpropagation neural net.
You say learning in your test experiment is "unstable"? Training of a NN should not be "unstable". When it is, different processors can end up with different outcomes, influenced by numeric precision and rounding errors. Saturation could have occurred: check whether your weight values have become too large. In that case, 1) check whether your training inputs and outputs are logically consistent, and 2) add more neurons to the hidden layers and train again.
It is a good idea to check the random() calls, but take into account that in a backprop NN there are several places where random() functions can be used. Some backprop NNs also add dynamic noise to the training patterns to prevent early saturation of the weights. When this training noise is scaled wrong, you can get bizarre results; when the noise is missing or too small, you can end up with saturation.
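As a concrete version of the "check if your weight values have become too large" tip, a small PyTorch sketch; the model here is only a stand-in for whatever network is being trained.
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

# Print per-parameter statistics; very large magnitudes suggest saturation
# or exploding weights.
for name, p in model.named_parameters():
    print(f"{name}: max|w| = {p.detach().abs().max().item():.4f}, "
          f"norm = {p.detach().norm().item():.4f}")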
I had the same problem. I solved it by simply changing sum to torch.sum. Please try to change all the built-in functions to their torch equivalents, which run on the GPU.
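A small sketch of the difference; move x to the GPU with x.cuda() if one is available.
import torch

x = torch.randn(10_000)

# Python's built-in sum iterates over the tensor element by element,
# launching many tiny ops and accumulating in a different order.
slow_total = sum(x)

# torch.sum runs as a single reduction on whatever device x lives on.
fast_total = torch.sum(x)
print(slow_total.item(), fast_total.item())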

Tensorflow Prediction with LSTM

I have a question about predicting multiple steps into the future with an LSTM net in Tensorflow.
I know and understand the option of predicting one step into the future and then feeding the predicted value back into the input for the next step.
I have also found some information on predicting all the necessary steps into the future at once.
I just couldn't figure out whether this second option actually works and, if so, how to implement it in TensorFlow. I would be grateful if somebody knew this and could shed some light on it for me.
It would also be great to know which approach works better, if anyone has tried both.
Thx LIZ.
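For what it's worth, the second option ("all steps at once") can be expressed as a plain many-output regression. A minimal Keras sketch follows; the window lengths and layer sizes are illustrative assumptions, not taken from the question.
import tensorflow as tf

N_PAST, N_FUTURE, N_FEATURES = 30, 5, 1        # illustrative window sizes

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(N_PAST, N_FEATURES)),
    tf.keras.layers.Dense(N_FUTURE),           # one output per future step
])
model.compile(optimizer="adam", loss="mse")
model.summary()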

The GAN Gaussian simulator doesn't converge.

I tried to reimplement the Gaussian simulator described in the GAN paper with mxnet.
But there are two problems with my code.
First, the model doesn't converge very well, even after I tried setting a learning-rate scheduler.
This is how it looks after about 500 epochs; the accuracy is bouncing around 0.5 ~ 0.6.
Second, I don't know how to draw the output of the discriminative model. The current curve doesn't look like the one described in the paper.
Could anyone please offer any suggestion?
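For the second point, a framework-agnostic sketch of how such a curve can be drawn: evaluate the discriminator on a grid of x values and plot it next to the target density. The discriminator below is a placeholder, not a trained model.
import numpy as np
import matplotlib.pyplot as plt

def discriminator(x):
    return 1.0 / (1.0 + np.exp(-x))            # placeholder; substitute the real D

xs = np.linspace(-5.0, 5.0, 200)
plt.plot(xs, discriminator(xs), label="D(x)")
plt.plot(xs, np.exp(-0.5 * xs ** 2) / np.sqrt(2.0 * np.pi),
         label="target density (standard normal, for illustration)")
plt.legend()
plt.show()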
I think I solved both problems by myself. The code in the GitHub link should now be correct.

Tensorflow - Any input gives me the same output

I am facing a very strange problem: I build an RNN model using tensorflow and then store all the model variables using tf.Saver after I finish training.
During testing, I just build the inference part again and restore the variables into the graph. The restoration does not give any error.
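For context, a TF 1.x-style sketch of that save/restore workflow; the variable and checkpoint path are illustrative, not taken from the question.
import tensorflow as tf

w = tf.Variable(tf.zeros([3]), name="w")       # stand-in for the model variables
saver = tf.train.Saver()

# Training script: initialise, train, then save.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "./model.ckpt")

# Test script: rebuild the same graph, then restore instead of re-initialising.
with tf.Session() as sess:
    saver.restore(sess, "./model.ckpt")
    print(sess.run(w))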
But when I start testing on the evaluation set, I always get the same output from the inference call, i.e. for all test inputs I get the same output.
I printed the output during training, and I do see that the output is different for different training samples and that the cost is also decreasing.
But when I test, it always gives me the same output no matter what the input is.
Can someone help me understand why this could be happening? I want to post a minimal example, but since I am not getting any error, I am not sure what I should post here. I will be happy to share more information if it helps.
One difference between my inference graphs for training and testing is the number of time steps in the RNN. During training I run n steps (n = 20 or more) per batch before updating the gradients, while for testing I use just one step, since I only want the prediction for that input.
Thanks
I have been able to resolve this issue. It seemed to happen because one of my input features was very dominant in its original values, due to which, after some operations, all outputs were converging to a single number.
Scaling that feature resolved it.
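A sketch of the fix described above: standardise each input feature with statistics computed on the training set only (the data here is synthetic; column 0 plays the role of the dominant feature).
import numpy as np

X_train = np.random.rand(1000, 4) * np.array([1e6, 1.0, 1.0, 1.0])

mean = X_train.mean(axis=0)
std = X_train.std(axis=0) + 1e-8
X_train_scaled = (X_train - mean) / std
# At test time, reuse the same statistics:
# X_test_scaled = (X_test - mean) / std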
Thanks
Can you create a small reproducible case and post this as a bug to https://github.com/tensorflow/tensorflow/issues? That will help this question get attention from the right people.