Error when online training a stateful deep LSTM stack - tensorflow

I would like to use online training with a deep LSTM model.
When using online training, my batch size is 1.
I would also like the model to be stateful.
As such, I don't want my input to be a sequence but rather a single state, so the number of time steps in my input is also 1. I would like the model to learn the sequence from the individual input states.
Thus, I've constructed the following model (TensorFlow.js):
this.model = tf.sequential();
this.model.add(tf.layers.lstm({ units: 256, returnSequences: false, batchInputShape: [1, 1, input_dim], stateful: true }));
this.model.add(tf.layers.dense({ units: output_dim}));
this.model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });
this.model.summary();
This model works perfectly fine. However, when I want to stack more LSTM layers, for example in the following model:
this.model = tf.sequential();
this.model.add(tf.layers.lstm({ units: 256, returnSequences: true, batchInputShape: [1, 1, input_dim], stateful: true }));
this.model.add(tf.layers.lstm({ units: 128, returnSequences: false}));
this.model.add(tf.layers.dense({ units: output_dim}));
this.model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });
Above I've changed returnSequences to true in the first LSTM layer so that its output shape can be accepted by the second LSTM layer.
When fitting the second model on the same input and output that work with the first model, I receive the following error:
Error: Argument tensors passed to stack must be a `Tensor[]` or `TensorLike[]`
What does this error mean?
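One thing worth checking (my note, not from the original thread): stateful defaults to false, so the second LSTM layer in the stack is not stateful even though the first one is, which also runs against the stated goal of a fully stateful model. A minimal sketch of the stack with every recurrent layer opting into statefulness:
// Hedged sketch: stateful is declared on every recurrent layer, not just the first.
this.model = tf.sequential();
this.model.add(tf.layers.lstm({ units: 256, returnSequences: true, batchInputShape: [1, 1, input_dim], stateful: true }));
this.model.add(tf.layers.lstm({ units: 128, returnSequences: false, stateful: true }));
this.model.add(tf.layers.dense({ units: output_dim }));
this.model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });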

Related

TensorflowJS model doesn't predict multiclass data properly

As a beginner, I have tried to build a really simple multi-class classifier in TensorFlow.js which is supposed to predict the direction of my eye sight.
Step 1: I created a data set in the browser to train my model, storing images of my eyes rendered by the webcam on an HTML5 canvas. I use the arrow keys to label my images as 0 = left, 1 = normal and 2 = right. To train the model, I convert these labels using tf.oneHot() before passing them to the method.
// data collection
let imageArray = [];
let labelArray = [];
let collectData = (label) => {
  const img = tf.tidy(() => {
    const captureImg = getImage();
    //console.log(captureImg.shape)
    return captureImg;
  });
  imageArray.push(img);
  labelArray.push(label); // --- labels are 0, 1, 2
};
// label conversion
let labelSet = tf.oneHot(tf.tensor1d(labelArray, 'int32'), 3);
Step 2: Instead of loading any pre-trained model, I used a custom model that I built in TensorFlow.js.
let createModel = () => {
  const model = tf.sequential();
  let config_one = {
    kernelSize: 3,
    filters: 40,
    strides: 1,
    activation: 'relu',
    inputShape: [imageHeight, imageWidth, imageChannels]
  };
  model.add(tf.layers.conv2d(config_one));
  let config_two = {
    poolSize: [2, 2],
    strides: [2, 2],
  };
  model.add(tf.layers.maxPooling2d(config_two));
  model.add(tf.layers.flatten());
  model.add(tf.layers.dropout({ rate: 0.2 })); // tfjs expects a config object here, not a bare number
  // Three output units, one per class (left / normal / right)
  let config_output = {
    units: 3,
    activation: 'tanh',
  };
  model.add(tf.layers.dense(config_output));
  // Use the Adam optimizer with a learning rate of 0.00005
  let config_compile = {
    optimizer: tf.train.adam(0.00005),
    loss: 'categoricalCrossentropy',
  };
  model.compile(config_compile);
  tf.memory(); // note: this returns an object; it does not log anything by itself
  return model;
};
Problems: There are several problems I am facing right now.
When I use meanSquaredError as the loss function with an Adam learning rate of 0.000005, my model starts predicting, but it only predicts two of the eye states (normal and left/right). To do multi-class classification I changed the loss function to categoricalCrossentropy, but the result is still the same or sometimes worse.
I tried other combinations of hyperparameters, but no luck. The worst situation I got into was my loss function showing only three constant values repeatedly.
My browser would crash in some cases, for example if I pass too much data or use another type of optimizer in the compile config, such as sgd. When I did a quick search on Google, I found I could use tf.memory() to check for a memory leak that could be causing the browser crash, but that line didn't log anything in the console.
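As an aside (my note, not part of the original question): tf.memory() only returns a diagnostics object; it never prints anything on its own, which is why that line logged nothing. To watch for leaks, a field of the returned object has to be logged explicitly:
// tf.memory() returns an object like { numTensors, numDataBuffers, numBytes, ... }.
// Watching numTensors between training steps reveals leaked tensors.
console.log('tensors in memory:', tf.memory().numTensors);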
I was adjusting various values and parameters in the code and training the model, which made it work sometimes, partially, and most of the time not at all. It was all trial and error. Eventually I learned which parameters to use for the loss function in the compile method and for the activation function in the conv2d input layer, but other things are still confusing, such as the number of epochs, the batch size, the learning rate in Adam, etc.
I understood, or I think I understood, kernelSize, filters, strides and inputShape, but I still have no idea how to decide the number of layers, the various hyperparameters, etc.
Edit: this is what I get after updating the code as per the suggestion. I still don't get proper classification. I am training with a minimum of 1000+ images.
A. I still get the loss recurring with fixed values.
B. Accuracy also repeats itself, cycling between 1, 0.5 and 0.
function getImage() {
  return tf.tidy(function () {
    const image = tf.browser.fromPixels($('#eyes')[0]);
    const batchedImage = image.expandDims(0);
    // note: div(255).sub(1) maps pixel values into the range [-1, 0]
    const norm = batchedImage.toFloat().div(tf.scalar(255)).sub(tf.scalar(1));
    return norm;
  });
}
(Console output and sample images omitted.)
The most obvious thing to me that is wrong with this is your output layer's activation function: where you use tanh, you should be using softmax instead. Next, your learning rate is way too low; try setting it to 0.001, which is a good default to use.
You also probably don't need dropout, as you haven't got any results to justify that the model is overfitting. You could also add more convolutional layers; try the example below.
model.add(tf.layers.conv2d({
  inputShape: [28, 28, 1], // replace with your own [imageHeight, imageWidth, imageChannels]
  kernelSize: 5,
  filters: 8,
  strides: 1,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({
  poolSize: [2, 2],
  strides: [2, 2],
}));
model.add(tf.layers.conv2d({
  kernelSize: 5,
  filters: 16,
  strides: 1,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({
  poolSize: [2, 2],
  strides: [2, 2],
}));
model.add(tf.layers.flatten());
model.add(tf.layers.dense({
  units: 3,
  activation: 'softmax',
}));
const LEARNING_RATE = 0.001;
const optimizer = tf.train.adam(LEARNING_RATE);
model.compile({
  optimizer: optimizer,
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});
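To make the suggestion end-to-end, here is a hedged sketch of the training call (my illustration, reusing imageArray and labelArray from the question; since each collected image keeps its leading batch dimension from expandDims(0), concatenating along axis 0 yields the training batch):
// Hypothetical training call built from the question's own variables.
const xs = tf.concat(imageArray); // [numImages, h, w, c]
const ys = tf.oneHot(tf.tensor1d(labelArray, 'int32'), 3); // [numImages, 3]
await model.fit(xs, ys, { epochs: 20, batchSize: 32, shuffle: true });
xs.dispose();
ys.dispose();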

Error when checking target: expected dense_Dense5 to have shape [,1], but got array with shape [3,4]

I've been trying to set up a simple reinforcement learning example using tfjs. However, when trying to train the model I am running into the following error:
Uncaught (in promise) Error: Error when checking target: expected dense_Dense5 to have shape [,1], but got array with shape [3,4]
I built the model up as follows:
const NUM_OUTPUTS = 4;
const model = tf.sequential();
// First hidden layer, which also defines the input shape of the model
model.add(
  tf.layers.dense({
    units: LAYER_1_UNITS,
    batchInputShape: [null, NUM_INPUTS],
    activation: "relu",
  })
);
// Second hidden layer
model.add(tf.layers.dense({ units: LAYER_2_UNITS, activation: "relu" }));
// Third hidden layer
model.add(tf.layers.dense({ units: LAYER_3_UNITS, activation: "relu" }));
// Fourth hidden layer
model.add(tf.layers.dense({ units: LAYER_4_UNITS, activation: "relu" }));
// Defining the output layer of the model
model.add(tf.layers.dense({ units: NUM_OUTPUTS, activation: "relu" }));
model.compile({
  optimizer: tf.train.adam(),
  loss: "sparseCategoricalCrossentropy",
  metrics: "accuracy",
});
The training is done by a function that calculates the Q-values for some examples:
batch.forEach((sample) => {
  const { state, nextState, action, reward } = sample;
  // We let the model predict the rewards of the current state.
  const current_Q: tf.Tensor = <tf.Tensor>model.predict(state);
  // We also let the model predict the rewards for the next state,
  // if there was a next state in the game.
  let future_reward = tf.zeros([NUM_ACTIONS]);
  if (nextState) {
    future_reward = <Tensor>model.predict(nextState);
  }
  let totalValue =
    reward + discountFactor * future_reward.max().dataSync()[0];
  current_Q.bufferSync().set(totalValue, 0, action);
  // We can now push the state to the input collector
  x = x.concat(Array.from(state.dataSync()));
  // For the labels/outputs, we push the updated Q values
  y = y.concat(Array.from(current_Q.dataSync()));
});
await model.fit(
  tf.tensor2d(x, [batch.length, NUM_INPUTS]),
  tf.tensor2d(y, [batch.length, NUM_OUTPUTS]),
  {
    batchSize: batch.length,
    epochs: 3,
  }
);
This appeared to be the right way to provide the examples to the fit function, seeing as when logging the model, the shape of the last dense layer is correct:
(Screenshot of the logged shape of dense_Dense5 omitted.)
However, it results in the error shown above, where instead of the expected shape [3,4] it checks for the shape [,1]. I really don't understand where this shape is suddenly coming from and would much appreciate some help with this!
For a better overview, you can simply view/check out the whole project from its GitHub repo:
GitHub Repo
The TensorFlow code in question is in the AI folder.
EDIT:
Providing a summary of the model plus some info on the shape of the tensor I'm providing for y in model.fit(x, y) (screenshots omitted):
Solved: The issue occurred due to using the wrong loss function. Moving from sparseCategoricalCrossentropy to meanSquaredError fixed the mismatch between the shape of the output layer and the batch shape.
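For context (my summary, not from the original post): sparseCategoricalCrossentropy expects integer class indices of shape [batchSize, 1], which is where the [,1] in the error comes from, while meanSquaredError (like categoricalCrossentropy) expects targets with the same shape as the output layer, here [batchSize, NUM_OUTPUTS]. A minimal sketch of the difference:
// With loss: 'sparseCategoricalCrossentropy' - targets are class indices, shape [3, 1]
const sparseTargets = tf.tensor2d([[0], [2], [1]]);
// With loss: 'meanSquaredError' - targets match the output layer, shape [3, 4]
const denseTargets = tf.tensor2d([
  [0.1, 0.9, 0.0, 0.2],
  [0.4, 0.1, 0.3, 0.0],
  [0.0, 0.0, 1.2, 0.7],
]);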

Enhance a Machine Learning model for periodic data

I am learning ML through TensorFlow (tfjs).
My first test was to train my model to predict cos(x) as a function of x (from 0 to 2*Math.PI*4, i.e. 4 periods).
feature: 2000 values of x (random)
label: 2000 values of cos(x)
model:
const model = tf.sequential({
  layers: [
    tf.layers.dense({ inputShape: [1], units: 22, activation: 'tanh' }),
    tf.layers.dense({ units: 1 }),
  ]
});
model.compile({
  optimizer: tf.train.adam(0.01),
  loss: 'meanSquaredError',
  metrics: ['mae']
});
...
await model.fit(feature, label, {
  epochs: 500,
  validationSplit: 0.2,
})
The result is quite "fun" (plot omitted).
Now I would like to know how to enhance my model to fit the periodic nature of cos(x) (without using the mathematical periodicity of cos(x), like y = cos(x modulo 2PI)).
Is it possible for my model to "understand" that there is periodicity?
I think the network you built is too small to learn the periodic behaviour of the cosine function; try increasing the number of hidden units and/or adding hidden layers. Also, I don't think a regular fully connected neural network is the right choice if you want to learn a function with a periodic, sequential nature; try using an RNN or LSTM for this.
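A minimal sketch of the first suggestion (my own illustration; the two 64-unit layers are an arbitrary starting point, not tuned values):
const model = tf.sequential({
  layers: [
    // Wider and deeper than the original single 22-unit hidden layer
    tf.layers.dense({ inputShape: [1], units: 64, activation: 'tanh' }),
    tf.layers.dense({ units: 64, activation: 'tanh' }),
    tf.layers.dense({ units: 1 }),
  ]
});
model.compile({
  optimizer: tf.train.adam(0.01),
  loss: 'meanSquaredError',
  metrics: ['mae']
});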

Time series CNN, trying to use 1,1 input shape

I'm trying to build a 1D CNN for time series data.
First issue:
When trying to use an input shape of [1,1] I get an error:
Error: Negative dimension size caused by adding layer average_pooling1d_AveragePooling1D1 with input shape [,0,128]
Second issue:
I have 2 different 1D arrays for my data: the first array is the input data containing the time series, and the second array contains the output data with the closing values of a stock.
Something that got me a few more results was to set the input shape to [6, 1].
Model summary:
_________________________________________________________________
Layer (type) Output shape Param #
=================================================================
conv1d_Conv1D1 (Conv1D) [null,5,128] 384
_________________________________________________________________
average_pooling1d_AveragePoo [null,4,128] 0
_________________________________________________________________
conv1d_Conv1D2 (Conv1D) [null,3,64] 16448
_________________________________________________________________
average_pooling1d_AveragePoo [null,2,64] 0
_________________________________________________________________
conv1d_Conv1D3 (Conv1D) [null,1,16] 2064
_________________________________________________________________
average_pooling1d_AveragePoo [null,0,16] 0
_________________________________________________________________
flatten_Flatten1 (Flatten) [null,0] 0
_________________________________________________________________
dense_Dense1 (Dense) [null,1] 1
=================================================================
Here, training the model got me into issues:
const trainX = tf.tensor1d(data.inTime).reshape([100, 6, 1])
100 - size of my array
6 - features
1 - 1 unit as output
Error: Size(100) must match the product of shape 100,6,1
I'm stuck at the training step because I don't know how to train it.
I would prefer to have a [1, 1] input shape, to give only 1 time series and to have 1 output from it.
The model
async function buildModel() {
  const model = tf.sequential()
  // settings
  const kernelSize = 2
  const poolSize = [2]
  // tf layers
  model.add(tf.layers.conv1d({
    inputShape: [6, 1],
    kernelSize: kernelSize,
    filters: 128,
    strides: 1,
    useBias: true,
    activation: 'relu',
    kernelInitializer: 'varianceScaling'
  }))
  model.add(tf.layers.averagePooling1d({ poolSize: poolSize, strides: [1] }))
  // 2nd layer
  model.add(tf.layers.conv1d({
    kernelSize: kernelSize,
    filters: 64,
    strides: 1,
    useBias: true,
    activation: 'relu',
    kernelInitializer: 'varianceScaling'
  }))
  model.add(tf.layers.averagePooling1d({ poolSize: poolSize, strides: [1] }))
  // 3rd layer
  model.add(tf.layers.conv1d({
    kernelSize: kernelSize,
    filters: 16,
    strides: 1,
    useBias: true,
    activation: 'relu',
    kernelInitializer: 'varianceScaling'
  }))
  model.add(tf.layers.averagePooling1d({ poolSize: poolSize, strides: [1] }))
  model.add(tf.layers.flatten())
  model.add(tf.layers.dense({
    units: 1,
    kernelInitializer: 'varianceScaling',
    activation: 'linear'
  }))
  // optimizer + learning rate
  const optimizer = tf.train.adam(0.0001)
  model.compile({
    optimizer: optimizer,
    loss: 'meanSquaredError',
    metrics: ['accuracy'],
  })
  return model
}
Training, where the error is occurring:
async function train(model, data) {
  console.log(`MODEL SUMMARY:`)
  model.summary()
  // Train the model
  const epochs = 2
  // train data
  const trainX = tf.tensor1d(data.inTime).reshape([100, 6, 1])
  const trainY = tf.tensor([data.outClosed], [1, data.size, 1])
  let result = await model.fit(trainX, trainY, {
    epochs: epochs
  })
  console.log("Loss after last Epoch (" + result.epoch.length + ") is: " + result.history.loss[result.epoch.length - 1])
  return result
}
Any ideas into how to fix it will be much appreciated!
A time series is a sequence taken at successive, equally spaced points in time, according to Wikipedia. The goal of a neural network (NN) used on a time series is to find the pattern in the series of data. Convolutional neural networks (CNNs) are rarely, if ever, used on this kind of data; the NNs most often used here are RNNs and LSTMs. If we are interested in finding a pattern in a series of data, the inputShape can't be [1, 1]; otherwise it would mean finding a pattern in a single point. It can be done theoretically, but in practice it does not capture the essence of a time series. A sketch of the RNN alternative follows below.
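A minimal sketch of that suggestion (my illustration, assuming windows of 6 past values are used to predict the next closing value; the layer size is a placeholder):
const model = tf.sequential()
// One LSTM layer reading windows of shape [6, 1], one regression output
model.add(tf.layers.lstm({ units: 32, inputShape: [6, 1] }))
model.add(tf.layers.dense({ units: 1 }))
model.compile({ optimizer: tf.train.adam(0.001), loss: 'meanSquaredError' })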
The model used here is a CNN with average pooling layers. Of course, a pooling layer cannot be applied to a layer whose shape is smaller than the pooling size, thus throwing the error:
Error: Negative dimension size caused by adding layer average_pooling1d_AveragePooling1D1 with input shape [,0,128]
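Tracing the shapes makes this concrete (my arithmetic, based on the summary above): each conv1d with kernelSize 2 and stride 1 shortens the time dimension by 1, and each averagePooling1d with poolSize 2 and stride 1 does the same. Starting from 6: 6 → 5 (conv) → 4 (pool) → 3 (conv) → 2 (pool) → 1 (conv) → 0 (pool), which is exactly the [null,0,16] row in the summary. With an input shape of [1, 1], the very first convolution would already produce 1 - 2 + 1 = 0 steps, and the following pooling layer a negative size, hence the error.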
The last error:
Error: Size(100) must match the product of shape 100,6,1
indicates a mismatch in tensor sizes: a shape of [100, 6, 1] implies 100 * 6 * 1 = 600 elements, whereas the tensor created from data.inTime contains only 100 elements, resulting in the error.
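One common way to reconcile the sizes (my sketch, assuming the goal is to predict the next value from the previous 6, and that data.inTime is a plain 1D array) is to build overlapping windows, so that N points yield N - 6 training samples:
// Hypothetical helper: build overlapping windows of length 6 from a 1-D series.
function makeWindows(series, windowSize = 6) {
  const xs = []
  const ys = []
  for (let i = 0; i + windowSize < series.length; i++) {
    xs.push(series.slice(i, i + windowSize).map(v => [v])) // each window: [6, 1]
    ys.push(series[i + windowSize])                        // target: the next value
  }
  return {
    trainX: tf.tensor3d(xs),                 // shape [N - 6, 6, 1]
    trainY: tf.tensor2d(ys, [ys.length, 1]), // shape [N - 6, 1]
  }
}
const { trainX, trainY } = makeWindows(data.inTime)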

model predicts NaN

I am trying to learn and practice on Tensorflow.js.
So, I tried to train a neural network on a [,2]-shaped array as x (as I understand it, this simulates a problem where I have samples that each have 2 variables) and a [,1]-shaped array as y (which would mean, if I'm correct, that the combination of my 2 variables generates 1 output).
And I tried to code it:
const model = tf.sequential();
model.add(tf.layers.dense({ units: 2, inputShape: [2] }));
model.add(tf.layers.dense({ units: 64, inputShape: [2] }));
model.add(tf.layers.dense({ units: 1, inputShape: [64] }));
// Prepare the model for training: Specify the loss and the optimizer.
model.compile({ loss: 'meanSquaredError', optimizer: 'sgd' });
// Generate some synthetic data for training.
const xs = tf.tensor([[1,5], [2,10], [3,15], [4,20], [5,25], [6,30], [7,35], [8,40]], [8, 2]);
const ys = tf.tensor([1, 2, 3, 4, 5, 6, 7, 8], [8, 1]);
// Train the model using the data.
model.fit(xs, ys, { epochs: 100 }).then(() => {
  // Use the model to do inference on a data point the model hasn't seen before:
  // Open the browser devtools to see the output
  model.predict(tf.tensor([10, 50], [1, 2])).print();
});
But what I am facing is that when I try to predict the [10, 50] input, I get the following console output:
Tensor
[[NaN],]
So, I think my problem might be very simple, but I am really stuck on this; it is probably a matter of some background knowledge I'm missing.
Thank you!
The first layer takes the shape of the input data:
model.add(tf.layers.dense({ units: 2, inputShape: [2] }))
The inputShape is [2], which means that your input x has shape [2].
The last layer's units value gives the dimension of the output y:
model.add(tf.layers.dense({ units: 1, inputShape: [64] }));
So the shape of y should be [1].
In this case, the NaN prediction is related to the number of epochs in your training: if you decrease it to 2 or 3, it will return a numerical value. The underlying issue is how the optimizer updates the weights; with these settings the updates diverge, and after enough epochs the weights overflow to NaN. Alternatively, you can change the optimizer to adam and it will be fine.
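A minimal sketch of that last suggestion (my illustration; the 0.01 learning rate is an assumed value, not taken from the answer):
// Same model as in the question, recompiled with adam instead of plain 'sgd';
// adam's adaptive step sizes keep the updates from blowing up here.
model.compile({ loss: 'meanSquaredError', optimizer: tf.train.adam(0.01) });
model.fit(xs, ys, { epochs: 100 }).then(() => {
  // Should now print a numeric prediction rather than NaN.
  model.predict(tf.tensor([10, 50], [1, 2])).print();
});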
I think I am late, but I hope this helps someone.
I had the same problem once, and it was because I was getting the training and testing data from a file using the "fs" dependency. I solved it by applying this to the returned variable before returning it to the main function to start training:
JSON.parse(JSON.stringify(data))
I don't know the reason, but it seems the TensorFlow model only accepts a plain JSON array and not just any JavaScript array, so by doing this you convert your array to a JSON array instead of leaving it as it was.
Hope this saves someone's time.
I dealt with this same issue for the past 2 days, and the problem was that I trained my model with a GPU (using Google Colab) and performed inference on the CPU. After changing the settings on Google Colab to use no hardware acceleration, my problem was fixed!
Hope this helps someone in the future.