Feed tensorflow or keras neural nets input with custom dimensions

I would like to feed a neural net inputs of the following shape:
Each training entry is a 2D array with dimensions 700x10. There are 204 training entries in total.
The labels are just a 1-dimensional array of size 204 (binary output).
I tried to just use Dense layers:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(300, activation='relu', input_shape=(700, 10)))
model.add(Dense(1, activation='sigmoid'))
But then I get the following error (not related to the input_shape of the first layer, but raised while validating the output):
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (204, 1)
(204 is the number of training entries.)
Stacktrace:
model.fit(xTrain, yTrain, epochs=4, batch_size=6)
File "keras\models.py", line 867, in fit
initial_epoch=initial_epoch)
File "keras\engine\training.py", line 1522, in fit
batch_size=batch_size)
File "keras\engine\training.py", line 1382, in _standardize_user_data
exception_prefix='target')
File "keras\engine\training.py", line 132, in _standardize_input_data
What I found out while debugging the Keras code:
It fails during validation, before training, while validating the output array.
Given the network structure, the first Dense layer somehow produces a (700, 1)-shaped output per sample, and validation then fails because my output is just a 1-D array of length 204.
How do I overcome this issue? I tried adding Flatten() after the Dense() layer, but that probably hurts accuracy: I would like to keep the information belonging to each of the 700 points grouped together.

The Dense layer works on only one dimension, the last.
If you input (700, 10) to it, it will output (700, units). Check model.summary() to see this.
A simple solution is to flatten your data before applying dense:
model.add(Flatten(input_shape=(700,10)))
model.add(Dense(300,...))
model.add(Dense(1,...))
This way, the Dense layer will see a simple (7000,) input.
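Put together, a minimal end-to-end sketch of this fix (the data here is random placeholder data with the question's shapes, not the asker's real data):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Flatten

# Placeholder data matching the question: 204 samples of 700x10, binary labels.
xTrain = np.random.rand(204, 700, 10)
yTrain = np.random.randint(0, 2, size=(204,))

model = Sequential()
model.add(Flatten(input_shape=(700, 10)))   # (700, 10) -> (7000,)
model.add(Dense(300, activation='relu'))
model.add(Dense(1, activation='sigmoid'))   # output (None, 1) now matches the labels

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()                             # verify the shapes layer by layer
model.fit(xTrain, yTrain, epochs=4, batch_size=6)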
Now, if you do want your model to understand those two dimensions separately, you should try more elaborate structures. What to do will depend a lot on what your data is, what you want to do, and how you want your model to understand it. One possibility is sketched below.
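For instance (this particular architecture is my illustration, not something from the original answer), a Conv1D layer treats the 700 axis as a sequence while keeping each point's 10 values grouped:

from keras.models import Sequential
from keras.layers import Conv1D, GlobalAveragePooling1D, Dense

model = Sequential()
# The convolution slides along the 700 axis; each position keeps its
# 10 features grouped instead of being flattened away.
model.add(Conv1D(64, kernel_size=5, activation='relu', input_shape=(700, 10)))
model.add(GlobalAveragePooling1D())        # collapse the sequence axis -> (None, 64)
model.add(Dense(1, activation='sigmoid'))  # (None, 1), matching the binary labels
model.summary()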

Related

Keras model.fit, dimensions must be equal?

I am a newbie in ML. I have a set of time-series data with Date and Temp columns that I want to use for anomaly detection. I used the MinMax scaler on the data and got an array normal_train_data with shape (200, 0).
Then I used an autoencoder which uses
keras.layers.Dense(128, activation='sigmoid').
After that, when I call
history = model.fit(normal_train_data, normal_train_data, epochs=50, batch_size=128, validation_data=(train_data_scaled[:,1:], train_data_scaled[:,1:]) ...)
I get the error:
ValueError: Dimensions must be equal but are 128 and 0 with input shapes: [?,128], [?,0].
As far as I understand, the input has shape (200, 0) and the output (1, 128).
Can you help me fix this error, please? Thank you.
I tried using tf.keras.layers.Flatten() in the encoder part. I am not sure whether it's OK to use a Dense layer here or whether I should choose another one.
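A shape of (200, 0) means zero features per sample, which is what makes the [?,128] and [?,0] dimensions irreconcilable; it usually points to a slicing mistake rather than a layer choice. A quick check (the column layout below is an assumption for illustration, not taken from the question):

import numpy as np

data = np.random.rand(200, 2)   # stand-in for the scaled Date/Temp columns
bad = data[:, 1:1]              # an empty slice like this yields shape (200, 0)
print(bad.shape)                # (200, 0) -- zero features, nothing to reconstruct
good = data[:, 1:]              # keeping the Temp column gives shape (200, 1)
print(good.shape)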

Tensorflow Keras output layer shape weird error

I am fairly new to TF, Keras, and ML in general.
I am trying to implement a very simple MLP with an input shape of (batch_size, 3, 2) and an output shape of (batch_size, 3); that is (if I got it right), for every 3x2 feature there is a corresponding 3-value label array.
Here is how I create the model:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(3)
])
and these are the X and y shapes:
X_train.shape,y_train.shape
TensorShape([64,3,2]),TensorShape([64,3])
On model.fit I am facing a weird error I cannot understand:
ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]
I have no clue what's going on. I understand the batch size is 32, but where does that [32,3,3] come from?
Moreover, if I lower the number of samples from the original 64, say to shapes (19, 3, 2) and (19, 3), I get the following error instead:
InvalidArgumentError: required broadcastable shapes at loc(unknown)
What's even weirder is that if I specify a single unit for the output (last) layer instead of 3, like this:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(1)
])
model.fit works, but the predictions have shape (1, 3, 1) instead of my expected (3,).
I am very confused.
Whenever you have no idea how data travels through your model, use model.summary() to see the details and what happens to the shape of the data in each layer.
In this case, the input is a 2D array per sample and the output is a 1D array, and you used only Dense layers. Dense layers cannot natively handle 2D features; for example, you cannot feed an image directly to a Dense layer. Instead, you should use other layers such as Conv2D, or Flatten your input (make it 1D) before feeding your data to the Dense layer. Otherwise, the extra dimension carries through to the output.
In short: if your input and output dimensionalities differ, the shape needs to change somewhere in your model. The most common ways to do so are a Flatten layer, GlobalAveragePooling, and so on.
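To see where the [32,3,3] in the error comes from, here is a quick sketch of the original (unfixed) model: each Dense layer transforms only the last axis, so the middle axis of size 3 survives all the way to the output:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, tf.keras.activations.relu, input_shape=(3, 2)),
    tf.keras.layers.Dense(3)
])
model.summary()
# summary() reports output shapes (None, 3, 50) and (None, 3, 3): with the
# default batch size of 32, the loss then compares [32, 3, 3] predictions
# against [32, 3] labels -- exactly the error message.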
When you pass an input to a Dense layer, the input should be flattened first. There are two ways to deal with this:
Way 1: Adding a Flatten layer as the first layer of your model:
model = Sequential()
model.add(Flatten(input_shape=(3,2)))
model.add(Dense(50, 'relu'))
model.add(Dense(3))
Way 2: Converting each 2D sample to 1D before passing the inputs to your model. Note that the batch axis has to be preserved, so reshape to (-1, 6) rather than (6,):
X_train = tf.reshape(X_train, shape=(-1, 6))
Then change the input shape of the first layer:
model.add(Dense(50, 'relu', input_shape=(6,)))
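As a sanity check, the corrected model should now train on data with the question's shapes (random placeholder data here, not the asker's):

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten

X_train = tf.random.normal((64, 3, 2))
y_train = tf.random.normal((64, 3))

model = Sequential([
    Flatten(input_shape=(3, 2)),   # (3, 2) -> (6,)
    Dense(50, 'relu'),
    Dense(3)                       # (None, 3) now matches y_train
])
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=2)   # runs without the shape error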

Why is it asking for labels to have some other shape?

Hello, I am trying to get an output array of 7 classes, but when I run my code it says that it expects my output labels to have some other shape. Here is my code -
def make_model(self):
    self.model.add(InceptionV3(include_top=False,
                               input_shape=(self.WIDTH, self.HEIGHT, 3),
                               weights="imagenet"))
    self.model.add(Dense(7, activation='softmax'))
    self.model.layers[0].trainable = False
My model compilation and fitting part -
def train(self):
    self.model.compile(optimizer=self.optimizer, loss='mse', metrics=['accuracy'])
    self.model.fit(x=x, y=y, batch_size=64,
                   validation_split=0.15, shuffle=True, epochs=self.epochs,
                   callbacks=[self.tensorboard, self.reducelr])
I get the error -
File "model.py", line 60, in train
callbacks=[self.tensorboard, self.reducelr])
ValueError: A target array with shape (23639, 7) was passed for an output of shape (None, 6, 13, 7) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
It is saying that it expected (None, 6, 13, 7), but I gave it labels of shape (23639, 7).
We can clearly see that in self.model.add(Dense(7, activation='softmax')) I specified 7 as the number of output categories.
Here is the model summary -
So can someone tell me what is wrong here?
By the way, I did try using categorical_crossentropy to see if it makes a difference, but it didn't.
In case you wanted the full code -
Full Code
The problem is in the output of the InceptionV3: with include_top=False it returns 4D feature maps, so you need to reduce the dimensionality before the final Dense layer in order to match the target dimensionality (2D). You can do this using Flatten or GlobalPooling layers.
If yours is a classification problem, I also recommend you use categorical_crossentropy (if you have one-hot encoded labels) or sparse_categorical_crossentropy (if you have integer-encoded labels); mse is suited to regression problems.
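A standalone sketch of that fix (the asker's class wrapper is omitted; the input size and optimizer below are placeholders, not values from the question):

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.applications import InceptionV3

model = Sequential([
    InceptionV3(include_top=False, weights="imagenet", input_shape=(224, 224, 3)),
    GlobalAveragePooling2D(),       # collapse the spatial axes of the 4D feature maps
    Dense(7, activation='softmax')  # output (None, 7) now matches (23639, 7) targets
])
model.layers[0].trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])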

How to use embedding models in tensorflow hub with LSTM layer?

I'm learning tensorflow 2 by working through the text classification with TF hub tutorial. It uses an embedding module from TF hub. I was wondering if I could modify the model to include an LSTM layer. Here's what I've tried:
train_data, validation_data, test_data = tfds.load(
    name="imdb_reviews",
    split=('train[:60%]', 'train[60%:]', 'test'),
    as_supervised=True)

embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[],
                           dtype=tf.string, trainable=True)

model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Embedding(10000, 50))
model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()

model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_data.shuffle(10000).batch(512),
                    epochs=10,
                    validation_data=validation_data.batch(512),
                    verbose=1)

results = model.evaluate(test_data.batch(512), verbose=2)

for name, value in zip(model.metrics_names, results):
    print("%s: %.3f" % (name, value))
I don't know how to get the vocabulary size from the hub_layer, so I just put 10000 there. When I run it, it throws this exception:
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[480,1] = -6 is not in [0, 10000)
[[node sequential/embedding/embedding_lookup (defined at .../learning/tensorflow/text_classify.py:36) ]] [Op:__inference_train_function_36284]
Errors may have originated from an input operation.
Input Source operations connected to node sequential/embedding/embedding_lookup:
sequential/embedding/embedding_lookup/34017 (defined at Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py:112)
Function call stack:
train_function
I'm stuck here. My questions are:
How should I use the embedding module from TF hub to feed an LSTM layer? It looks like the embedding lookup has some issues with this setup.
How do I get the vocabulary size from the hub layer?
Thanks
I finally figured out how to link pre-trained embeddings to an LSTM or other layers. I'm posting the steps here in case anyone finds them helpful.
The Embedding layer has to be the first layer in the model (hub_layer plays the same role as an Embedding layer). The not very intuitive part is that any text input to the hub layer will be converted to only one vector of shape [embedding_dim]. You need to do sentence splitting and tokenization to make sure whatever goes into the model is a sequence in the form of an array of arrays, e.g. "Let us prepare the data." should be converted to [["let"], ["us"], ["prepare"], ["the"], ["data"]]. You will also need to pad the sequences if you are using batch mode.
In addition, you will need to convert your target tokens to int if your training labels are strings. The input to the model is an array of strings with shape [batch, seq_length]; the hub embedding layer converts it to [batch, seq_length, embed_dim]. (If you add an LSTM or other RNN layer, the output from that layer is [batch, seq_length, rnn_units].) The output Dense layer will output an index of text instead of the actual text. The text indices are stored in the downloaded tfhub directory as "tokens.txt"; you can load that file and convert text to the corresponding indices, otherwise you cannot compute the loss.
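A rough sketch of the wiring described above. The reshape trick and the sequence length are my own assumptions; the hub module is only assumed to map a flat batch of token strings to [batch, 20] vectors, as it does for sentences in the tutorial:

import tensorflow as tf
import tensorflow_hub as hub

embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, dtype=tf.string, trainable=True)

seq_len, embed_dim = 40, 20      # gnews-swivel embeddings are 20-dimensional

tokens = tf.keras.Input(shape=(seq_len,), dtype=tf.string)  # padded token matrix
flat = tf.reshape(tokens, [-1])                        # [batch * seq_len] strings
embedded = hub_layer(flat)                             # [batch * seq_len, 20]
seqs = tf.reshape(embedded, [-1, seq_len, embed_dim])  # [batch, seq_len, 20]
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64))(seqs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(tokens, outputs)
model.summary()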

Regarding setting up the target tensor shape for sparse_categorical_crossentropy

I am trying to experiment with a multi-layer encoder-decoder type of network. The screenshot of the last several layers of the network architecture is as follows. This is how I set up the model compiling and training process:
optimizer = SGD(lr=0.001, momentum=0.9, decay=0.0005, nesterov=False)
autoencoder.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=['accuracy'])
model.fit(imgs_train, imgs_mask_train, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1,callbacks=[model_checkpoint])
imgs_train and imgs_mask_train are both of shape (2000, 1, 128, 128). imgs_train holds the raw images and imgs_mask_train the mask images; I am trying to solve a semantic segmentation problem. However, running the program generates the following error message (I only keep the main related part):
tensorflow.python.pywrap_tensorflow.StatusNotOK: Invalid argument: logits first dimension must match labels size. logits shape=[4096,128] labels shape=[524288]
[[Node: SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_364, Cast_158)]]
It seems to me that the sparse_categorical_crossentropy loss function causes the problem for the current (imgs_train, imgs_mask_train) shape setup. The Keras API does not document how to set up the target tensor for this case. Any suggestions are highly appreciated!
I am currently trying to figure out the same problem, and as far as I can tell the loss takes a sparse representation of the target category. That means integers as the target labels instead of the one-hot encoded binary class matrix.
Concerning your problem: do you have categories in your masking, or do you just have information about the outline of an object? With outline information it becomes a pixel-wise binary loss instead of a categorical one. If you have categories, the output of your decoder should have dimensionality (None, number_of_classes, 128, 128). On that you should be able to use a sparse target mask, but I haven't tried this myself...
Hope that helps
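A tiny sketch of the shape contract sparse_categorical_crossentropy expects: integer labels with one fewer axis than the logits. This illustrates sparse targets in general, not the asker's network; the shapes and class count are placeholders and channels-last layout is assumed:

import numpy as np
import tensorflow as tf

num_classes = 128
logits = tf.random.normal((2, 128, 128, num_classes))      # (batch, H, W, classes)
labels = np.random.randint(0, num_classes, (2, 128, 128))  # (batch, H, W) integer ids

loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(loss(labels, logits).numpy())  # works: labels are sparse integer class ids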