I've been stuck on this problem for a while and can't find a solution for this case here on StackOverflow.
I'm trying to build a prediction model for a chatbot. The predictor is a message in bag-of-words format, a tensor of shape [1,108], while the target is another tensor of dummy variables encoding a nominal categorical variable, with shape [1,15].
When I run a prediction, it returns an array of seemingly random probabilities for each of the dummies.
I used brain.js with the same functions and it gave me a good prediction, but I am having trouble doing the same thing in TensorFlow.js.
Here is my code:
async function trainModel() {
  XandY = await createTrainingData()
  let X = XandY[0];
  var y = XandY[1];
  const inputShape = [X.length,X[0].length] // [# of observations, # of features]
  const outputShape = [y.length,y[0].length]
  X = tf.tensor2d( X, inputShape )
  y = tf.tensor2d( y, outputShape)
  traningDataObject = {
    data: X,
    target: y,
  }
  const model = tf.sequential();
  model.add(tf.layers.dense(
    { units: 128, activation: 'relu', inputShape: [108] }));
  model.add(tf.layers.dense(
    { units: 64, activation: 'relu' }));
  model.add(tf.layers.dense(
    { units: 32, activation: 'relu' }));
  model.add(tf.layers.dense(
    { units: 15, activation: 'softmax' }));
  model.compile({
    optimizer: tf.train.adam(0.01),
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy']
  });
  model.fit(X, y,
    {epochs: 150, validationData: [X,y]});
  return model
}
async function getPrediction(message,wordset) {
  model = await trainModel();
  bow = bagOfWords(message, wordset);
  const input = tf.tensor([bagOfWords(message, wordset)]);
  var prediction = model.predict(input);
  var prediction_values = prediction.dataSync();
  var prediction_array = Array.from(prediction_values);
  console.log(prediction_array)
  let greatestProba = 0;
  prediction_array.forEach((element) => {
    if (greatestProba < element) {
      greatestProba = element;
    }
  });
  if (greatestProba > 0.02) {
    return intents[prediction_array.indexOf(greatestProba)];
  } else {
    return 'undefined'
  }
}
async function main(){
  const wordset = await getWordset();
  const message = "Meu pagamento não caiu";
  console.log(await getPrediction(message,wordset));
}
main()
How can I solve this? Where is the problem?
The same network in Python gives me good predictions, but not here.
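For readers unfamiliar with the bag-of-words input described above, here is a minimal sketch in plain JavaScript. The question's own bagOfWords function is not shown, so bagOfWordsSketch, its whitespace tokenizer and the five-word wordset are all assumptions made for illustration only.

```javascript
// Hypothetical helper: the question's real bagOfWords implementation is not shown.
function bagOfWordsSketch(message, wordset) {
  // Assumption: lowercase the message and split it on whitespace.
  const tokens = new Set(message.toLowerCase().split(/\s+/).filter(Boolean));
  // One slot per vocabulary word: 1 if the word occurs in the message, 0 otherwise.
  return wordset.map((word) => (tokens.has(word) ? 1 : 0));
}

// Hypothetical five-word vocabulary (the question uses 108 words).
const wordset = ["meu", "pagamento", "não", "caiu", "obrigado"];
const vector = bagOfWordsSketch("Meu pagamento não caiu", wordset);

// Wrapping the vector in an outer array gives the [1, wordset.length]
// shape that a single prediction expects.
const batch = [vector];
console.log(batch);
```

With the real 108-word vocabulary the same wrapping would yield the [1,108] input shape mentioned in the question.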
Alright, so the problem seemed to be the number of epochs I used.
This is odd, because I'm pretty sure the number of epochs I used was reasonable.
I also removed validationData.
If you do it like this:
const res = await model.fit(X, y, { epochs: 50 });
console.log(res.history.loss[0]);
it should work.
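One difference between this snippet and the question's code that may matter is the await: model.fit returns a Promise, so without await the caller can go on to predict before training has finished. The sketch below imitates that with a hypothetical fakeFit function and plain objects instead of a real model, so it shows only the timing behaviour, not actual training.

```javascript
// Hypothetical stand-in for an asynchronous training call such as model.fit.
function fakeFit(model) {
  return new Promise((resolve) => {
    setTimeout(() => {
      model.trained = true; // "training" completes only later
      resolve({ history: { loss: [0.1] } });
    }, 10);
  });
}

async function trainWithoutAwait() {
  const model = { trained: false };
  fakeFit(model); // the Promise is ignored, so we return immediately
  return model;
}

async function trainWithAwait() {
  const model = { trained: false };
  await fakeFit(model); // wait until "training" has finished
  return model;
}

async function demo() {
  const early = await trainWithoutAwait();
  const withoutAwait = early.trained; // captured before the timer fires
  const late = await trainWithAwait();
  return { withoutAwait, withAwait: late.trained };
}

demo().then((result) => console.log(result));
```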
I am training a simple image classification model with TFJS, but the model keeps overfitting. I've experimented with various models and hyperparameters to no avail. I'm beginning to think the problem is with the dataset itself rather than the model, as all the models and parameters I've tried work perfectly on different datasets.
About the data itself: I have a small dataset of 600 flower images and a CSV file (three, actually) containing the image paths and their labels, which looks like this:
path        label
05_067.png  9
05_068.png  2
...         ...
The code below loads the CSV file, accesses the images through their paths and converts them to tensors.
/** load and normalize data */
const loadData = function (dataUrl: string, batches = batchSize) {
  /** transform input array (xs) to 3D tensor, binarize output label (ys) */
  const transform = ({ xs, ys }: any) => {
    /** array of numbers (0 to labels.length - 1) for use in one-hot encoding label values */
    const zeros = [...Array(labels.length).keys()];
    let buffer = fs.readFileSync(path.join(directory, `${xs[0]}`)),
      imageTensor = tf.node.decodeImage(buffer, 3)
        .resizeNearestNeighbor([128, 128])
        .toFloat()
        .div(tf.scalar(225.0)) // dividing pixel values by 225, converting the tensor to a Float32 datatype
    // .expandDims()
    return {
      xs: imageTensor,
      ys: tf.tensor1d(zeros.map(i => (i === ys ? 1 : 0)))
    };
  };
  // load, normalize, transform, batch
  return tf.data.csv(dataUrl, { columnConfigs: { label: { isLabel: true } } })
    .map(({ xs, ys }: any) => ({ xs: Object.values(xs), ys: ys.label }))
    .map(transform)
    .batch(batches)
};
And this is the model that performs best on this dataset (0.003 training loss, 0.999 training accuracy, 1.9 validation loss and 0.5 validation accuracy).
/** Define the model architecture */
const model = tf.sequential();
model.add(tf.layers.conv2d({
  inputShape: [128, 128, 3],
  filters: 64,
  kernelSize: 3,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({ poolSize: 2, strides: 2 }));
model.add(tf.layers.dropout({ rate: 0.25 }));
model.add(tf.layers.conv2d({
  filters: 128,
  kernelSize: 3,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({ poolSize: 2, strides: 2 }));
model.add(tf.layers.dropout({ rate: 0.25 }));
model.add(tf.layers.flatten());
model.add(tf.layers.dense({ units: 64, activation: 'relu' }));
model.add(tf.layers.dense({ units: labels.length, activation: 'softmax' }));
model.compile({
  optimizer: tf.train.adam(0.001),
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});
export default model;
This model reaches 0.96 validation accuracy on the MNIST dataset, but that dataset contains pixel values from the start, not image paths.
My question now is: what exactly am I doing wrong here? Am I loading the data incorrectly, or am I just using the wrong model?
I would like to add that I have the same dataset in another project, structured as individual folders for each label, and it trains and validates perfectly with the model above.
As a beginner, I have tried to build a really simple multi-class classifier in TensorFlow.js that is supposed to predict the direction of my gaze.
Step 1: I created a dataset in the browser to train my model, storing images of my eyes captured by the webcam and rendered on an HTML5 canvas. I use the arrow keys to label the images as 0=left, 1=normal and 2=right. To train the model, I convert these labels using tf.oneHot() before passing them to the training method.
// data collection
let imageArray = [];
let labelArray = [];
let collectData = (label) => {
  const img = tf.tidy(() => {
    const captureImg = getImage();
    //console.log(captureImg.shape)
    return captureImg;
  })
  imageArray.push(img)
  labelArray.push(label) //--- labels are 0,1,2
}
// label conversion
let labelSet = tf.oneHot(tf.tensor1d(labelArray, 'int32'), 3);
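For reference, the one-hot conversion described in Step 1 can be sketched in plain JavaScript. The oneHot function below is a hypothetical illustration of what the encoding produces, not the TensorFlow.js implementation, and the sample labels are made up.

```javascript
// Hypothetical plain-JavaScript illustration of one-hot encoding.
function oneHot(labels, numClasses) {
  return labels.map((label) => {
    const row = new Array(numClasses).fill(0);
    row[label] = 1; // mark the position of the class index
    return row;
  });
}

// Made-up labels: 0 = left, 1 = normal, 2 = right.
const encoded = oneHot([0, 1, 2, 1], 3);
console.log(encoded);
```

Each row has exactly one 1, which is the target format that categoricalCrossentropy works with.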
Step 2: Instead of loading a pre-trained model, I used a custom model that I built with TensorFlow.js.
let createModel = () => {
  const model = tf.sequential();
  let config_one = {
    kernelSize: 3,
    filters: 40,
    strides: 1,
    activation: 'relu',
    inputShape: [imageHeight, imageWidth, imageChannels]
  }
  model.add(tf.layers.conv2d(config_one));
  let config_two = {
    poolSize: [2, 2],
    strides: [2, 2],
  }
  model.add(tf.layers.maxPooling2d(config_two));
  model.add(tf.layers.flatten());
  model.add(tf.layers.dropout(0.2));
  // Three output units, one per class (left, normal, right)
  let config_output = {
    units: 3,
    activation: 'tanh',
  }
  model.add(tf.layers.dense(config_output));
  // Use the Adam optimizer with a learning rate of 0.00005 and categorical cross-entropy loss
  let config_compile = {
    optimizer: tf.train.adam(0.00005),
    loss: 'categoricalCrossentropy',
  }
  model.compile(config_compile);
  tf.memory()
  return model;
}
Problems: There are several problems I am facing right now.
When I use meanSquaredError as the loss function with an Adam learning rate of 0.000005, my model starts predicting, but it only predicts two of the eye states: normal and left/right. To do multi-class classification I changed the loss function to categoricalCrossentropy, but the result is still the same or sometimes worse.
I tried other combinations of hyperparameters, but no luck. The worst situation I got into was the loss showing only three constant values repeatedly.
My browser would crash in some cases, for example if I pass too much data or use another optimizer in the compile config, such as sgd. A quick Google search suggested using tf.memory() to check for memory leaks that could be causing the crash, but that line didn't log anything in the console.
I was adjusting various values and parameters in the code and retraining the model, which made it work sometimes, partially, but most of the time it didn't work at all. It was all trial and error. Eventually I learned which parameters to use for the loss function in the compile method and the activation function in the conv2d input layer, but other things are still confusing, such as the number of epochs, the batch size and the learning rate in Adam.
I understand, or I think I understand, kernelSize, filters, strides and inputShape, but I still have no idea how to decide the number of layers and the various hyperparameters.
Edit: this is what I get after updating the code as per the suggestion. I still don't get proper classification. I am training with at least 1000 images.
A. I still get the loss recurring with fixed values
B. Accuracy is also repeating itself with 1, 0.5 and 0
function getImage() {
  return tf.tidy(function () {
    const image = tf.browser.fromPixels($('#eyes')[0]);
    const batchedImage = image.expandDims(0);
    const norm = batchedImage.toFloat().div(tf.scalar(255)).sub(tf.scalar(1));
    return norm;
  });
}
Here is the console output:
Sample images -
The most obvious thing that is wrong here is your output layer's activation function: where you use tanh, you should be using softmax instead. Next, your learning rate is far too low; try setting it to 0.001, which is a good default.
You also probably don't need dropout, as you have not yet gotten any results showing that the model is overfitting. You could also add more convolutional layers; try the example below.
model.add(tf.layers.conv2d({
  inputShape: [28, 28, 1],
  kernelSize: 5,
  filters: 8,
  strides: 1,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({
  poolSize: [2, 2],
  strides: [2, 2],
}));
model.add(tf.layers.conv2d({
  kernelSize: 5,
  filters: 16,
  strides: 1,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({
  poolSize: [2, 2],
  strides: [2, 2],
}));
model.add(tf.layers.flatten());
model.add(tf.layers.dense({
  units: 3,
  activation: 'softmax',
}));
const LEARNING_RATE = 0.001;
const optimizer = tf.train.adam(LEARNING_RATE);
model.compile({
  optimizer: optimizer,
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});
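To illustrate why softmax rather than tanh suits the output layer of a classifier, here is a small plain-JavaScript sketch. The softmax helper and the logits are made up for illustration; this is not the TensorFlow.js implementation. Softmax turns the raw scores into positive values that sum to 1, whereas tanh can return negative values that do not form a probability distribution.

```javascript
// Plain-JavaScript softmax, written for illustration only.
function softmax(logits) {
  const max = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((z) => Math.exp(z - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Made-up raw scores for the three classes (left, normal, right).
const logits = [2.0, -1.0, 0.5];

const tanhOut = logits.map(Math.tanh); // values in (-1, 1), may be negative
const softmaxOut = softmax(logits);    // positive values that sum to 1
const softmaxSum = softmaxOut.reduce((a, b) => a + b, 0);

console.log(tanhOut, softmaxOut, softmaxSum);
```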
I've been trying to set up a simple reinforcement learning example using tfjs. However, when trying to train the model I am running into the following error:
Uncaught (in promise) Error: Error when checking target: expected dense_Dense5 to have shape [,1], but got array with shape [3,4]
I built the model as follows:
const NUM_OUTPUTS = 4;
const model = tf.sequential();
// First hidden layer, which also defines the input shape of the model
model.add(
  tf.layers.dense({
    units: LAYER_1_UNITS,
    batchInputShape: [null, NUM_INPUTS],
    activation: "relu",
  })
);
// Second hidden layer
model.add(tf.layers.dense({ units: LAYER_2_UNITS, activation: "relu" }));
// Third hidden layer
model.add(tf.layers.dense({ units: LAYER_3_UNITS, activation: "relu" }));
// Fourth hidden layer
model.add(tf.layers.dense({ units: LAYER_4_UNITS, activation: "relu" }));
// Defining the output layer of the model
model.add(tf.layers.dense({ units: NUM_OUTPUTS, activation: "relu" }));
model.compile({
  optimizer: tf.train.adam(),
  loss: "sparseCategoricalCrossentropy",
  metrics: "accuracy",
});
The training is done by a function that calculates the Q-values for some examples:
batch.forEach((sample) => {
  const { state, nextState, action, reward } = sample;
  // We let the model predict the rewards of the current state.
  const current_Q: tf.Tensor = <tf.Tensor>model.predict(state);
  // We also let the model predict the rewards for the next state,
  // if there was a next state in the game.
  let future_reward = tf.zeros([NUM_ACTIONS]);
  if (nextState) {
    future_reward = <Tensor>model.predict(nextState);
  }
  let totalValue =
    reward + discountFactor * future_reward.max().dataSync()[0];
  current_Q.bufferSync().set(totalValue, 0, action);
  // We can now push the state to the input collector
  x = x.concat(Array.from(state.dataSync()));
  // For the labels/outputs, we push the updated Q values
  y = y.concat(Array.from(current_Q.dataSync()));
});
await model.fit(
  tf.tensor2d(x, [batch.length, NUM_INPUTS]),
  tf.tensor2d(y, [batch.length, NUM_OUTPUTS]),
  {
    batchSize: batch.length,
    epochs: 3,
  }
);
This appeared to be the right way to provide the examples to the fit function, seeing as when logging the model, the shape of the last dense layer is correct:
Log of the shape of dense_Dense5
However, it results in the error shown above, where instead of the expected shape [3,4] it checks for the shape [,1]. I really don't understand where this shape is suddenly coming from and would much appreciate some help with this!
For a better overview, you can simply view/check out the whole project from its Github repo:
Github Repo
The tensorflow code in question is in the AI folder.
EDIT:
Providing a summary of the model plus some info on the shape of the tensor I'm providing for y in model.fit(x,y):
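Solved: The issue occurred due to using the wrong loss function. sparseCategoricalCrossentropy expects one integer class index per sample as the target, which is where the expected shape [,1] came from. Moving from sparseCategoricalCrossentropy to meanSquaredError fixed the mismatch between the shape of the output layer and the batch shape.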
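The shape mismatch can be illustrated without TensorFlow.js. The sketch below uses made-up numbers and hypothetical helpers (shapeOf, meanSquaredError) to contrast the target a sparse categorical loss expects (one class index per sample) with the Q-value targets built in the training loop (one value per action), and shows that mean squared error compares full vectors element by element.

```javascript
// Shape of a nested array, e.g. [[1, 2], [3, 4]] gives [2, 2].
function shapeOf(arr) {
  const shape = [];
  let cur = arr;
  while (Array.isArray(cur)) {
    shape.push(cur.length);
    cur = cur[0];
  }
  return shape;
}

// Targets for a sparse categorical loss: one integer class index per sample.
const sparseTargets = [[2], [0], [3]];

// Q-learning targets: one real value per action, per sample (made-up numbers).
const qTargets = [
  [0.1, 0.0, 0.7, 0.2],
  [0.5, 0.1, 0.0, 0.3],
  [0.0, 0.2, 0.1, 0.9],
];

// Mean squared error over all elements of two equally shaped 2D arrays.
function meanSquaredError(yTrue, yPred) {
  let sum = 0;
  let count = 0;
  yTrue.forEach((row, i) =>
    row.forEach((value, j) => {
      sum += (value - yPred[i][j]) ** 2;
      count += 1;
    })
  );
  return sum / count;
}

console.log(shapeOf(sparseTargets), shapeOf(qTargets));
```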
const trainingData = tf.tensor3d(fixedData.map(item =>
  [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],[...],[...],[...]]
))
model.add(tf.layers.dense({
  inputShape: [4,16],
  activation: "relu",
  units: 4,
}))
model.compile({
  loss: "meanSquaredError",
  optimizer: tf.train.adam(0.05),
  metrics: ['accuracy']
})
model.fit(trainingData, outputData, {epochs: 10})
  .then((history) => {
    // console.log(history)
    model.predict(testingData).print()
  })
Error:
(node:5118) UnhandledPromiseRejectionWarning: Error: Error when checking input: expected dense_Dense1_input to have 2 dimension(s). but got array with shape 935,4,16.
Can the inputShape be 2-dimensional?
You have not provided the complete code. It would be important to see your labels (output) data. I have faked the output data to match the output of your single dense layer.
Also, as @yudhiesh mentioned in the comments, your tensor had just 2 dimensions. I have fixed that as well, in case you want to stick with [4,16] for each input.
Here is the code running:
const trainingData = tf.tensor3d(
  [[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],
    [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],
    [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],
    [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]]]
)
const output = tf.tensor3d(
  [[[1,2,3,4],
    [1,2,3,4],
    [1,2,3,4],
    [1,2,3,4]]]
)
const model = tf.sequential()
model.add(tf.layers.dense({
  inputShape: [4, 16],
  activation: "relu",
  units: 4
}))
model.compile({
  loss: "meanSquaredError",
  optimizer: tf.train.adam(0.05),
  metrics: ['accuracy']
})
model.fit(trainingData,output, {epochs: 2})
  .then((history) => {
    model.predict(trainingData).print()
  }).catch((e) => {
    console.log(e.message);
  });
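The reason the fake output above has shape [1,4,4] is that a dense layer acts on the last axis of its input: each of the 4 rows of 16 features is mapped to 4 units. The plain-JavaScript sketch below mimics that for one sample. The denseLastAxis helper, the constant kernel of 0.01 and the zero bias are assumptions for illustration, and the relu activation is left out.

```javascript
// Apply a dense (fully connected) transform to the last axis of one
// [rows, features] sample. Illustration only, no activation applied.
function denseLastAxis(sample, kernel, bias) {
  return sample.map((row) =>
    bias.map((b, j) => row.reduce((acc, x, i) => acc + x * kernel[i][j], b))
  );
}

// One sample of shape [4, 16], matching the training data above.
const sample = Array.from({ length: 4 }, () =>
  Array.from({ length: 16 }, (_, i) => i + 1)
);

// Hypothetical weights: kernel of shape [16, 4] and bias of shape [4].
const kernel = Array.from({ length: 16 }, () => new Array(4).fill(0.01));
const bias = new Array(4).fill(0);

// The result has shape [4, 4]: 16 features per row become 4 units per row.
const out = denseLastAxis(sample, kernel, bias);
console.log(out.length, out[0].length);
```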
I am new to TensorFlow and am reading mnist_export.py in the TensorFlow Serving example.
There is something here I cannot understand:
sess = tf.InteractiveSession()
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
feature_configs = {
'x': tf.FixedLenFeature(shape=[784], dtype=tf.float32),
}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
x = tf.identity(tf_example['x'], name='x') # use tf.identity() to assign name
Above, serialized_tf_example is a Tensor.
I have read the API documentation for tf.parse_example, but it seems that serialized should be a batch of serialized Example protos, like:
serialized = [
features
{ feature { key: "ft" value { float_list { value: [1.0, 2.0] } } } },
features
{ feature []},
features
{ feature { key: "ft" value { float_list { value: [3.0] } } }
]
So how should I understand tf_example = tf.parse_example(serialized_tf_example, feature_configs) here, given that serialized_tf_example is a Tensor and not an Example proto?
Here serialized_tf_example is a string tensor that holds serialized tf.train.Example protos. See the tf.parse_example documentation for usage; the Reading data chapter links to some examples.
tf_example.SerializeToString() converts a tf.train.Example to a string, and tf.parse_example parses the serialized string into a dict.
The code below is a simple example of using parse_example:
import tensorflow as tf
sess = tf.InteractiveSession()
serialized_tf_example = tf.placeholder(tf.string, shape=[1], name='serialized_tf_example')
feature_configs = {'x': tf.FixedLenFeature(shape=[1], dtype=tf.float32)}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
feature_dict = {'x': tf.train.Feature(float_list=tf.train.FloatList(value=[25]))}
example = tf.train.Example(features=tf.train.Features(feature=feature_dict))
f = example.SerializeToString()
sess.run(tf_example,feed_dict={serialized_tf_example:[f]})