I'd like to classify a signal with the following data:
X = (n_samples, n_timesteps, n_features), where n_samples=476, n_timesteps=400, and n_features=16 are the number of samples, timesteps, and features (or channels) of the signal.
y = (n_samples, n_timesteps, 1). Each timestep is labeled by either 0 or 1 (binary classification).
My model architecture is as follows.
The input is fed into a 32-unit LSTM. The LSTM output enters a 1-unit Dense layer to generate a 400x1 vector, where 400 is the number of timesteps. I would then like to feed this 400x1 vector into a 400-unit Dense layer. I tried to flatten the output of the 1-unit Dense layer, but the shape of the final output does not match the 400x1 label vector.
The snippet and model are shown as follows.
from tensorflow.keras.layers import Input, LSTM, Dense, Flatten, TimeDistributed
from tensorflow.keras.models import Model

n_timestep, n_feature = 400, 16

input_layer = Input(shape=(n_timestep, n_feature))
lstm1 = LSTM(32, return_sequences=True)(input_layer)                # (None, 400, 32)
dense1 = Dense(1, activation='sigmoid')(lstm1)                      # (None, 400, 1)
flat1 = TimeDistributed(Flatten())(dense1)                          # (None, 400, 1)
dense2 = TimeDistributed(Dense(400, activation='sigmoid'))(flat1)   # (None, 400, 400)
model = Model(inputs=input_layer, outputs=dense2)
model.summary()
The error is seen below.
ValueError: Error when checking target: expected time_distributed_4 to have shape (400, 400) but got array with shape (400, 1)
Please let me know how to fix it. Thanks.
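For reference, a minimal sketch of one way to get an output matching the (400, 1) per-timestep labels (an assumption-based sketch, not an accepted answer): since a Dense layer acts only on the last axis of a 3D tensor, the Dense(1, activation='sigmoid') applied to the (None, 400, 32) LSTM output already yields (None, 400, 1), so the Flatten and 400-unit Dense layers can simply be dropped.

from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

input_layer = Input(shape=(400, 16))
lstm1 = LSTM(32, return_sequences=True)(input_layer)   # (None, 400, 32)
output = Dense(1, activation='sigmoid')(lstm1)         # (None, 400, 1) -- matches y
model = Model(inputs=input_layer, outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam')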
Related
I am building a Siamese network using Keras (TensorFlow) where the target is a binary column, i.e., match or mismatch (1 or 0). But the model's fit method throws an error saying that y_pred is not compatible with the y_true shape. I am using the binary_crossentropy loss function.
Here is the error I see:
Here is the code I am using:
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[tf.keras.metrics.Recall()])
history = model.fit([X_train_entity_1.todense(), X_train_entity_2.todense()], np.array(y_train),
                    epochs=2,
                    batch_size=32,
                    verbose=2,
                    shuffle=True)
My Input data shapes are as follows:
Inputs:
X_train_entity_1.shape is (700,2822)
X_train_entity_2.shape is (700,2822)
Target:
y_train.shape is (700,1)
In the error it throws, y_pred is a variable that was created internally. Why is the y_pred dimension 2822 when I have a binary target? The 2822 actually matches the input size, but how should I understand this?
Here is the model I created:
from tensorflow.keras.layers import (Input, Embedding, LSTM, Bidirectional, Dense,
                                     Dropout, BatchNormalization, concatenate)
from tensorflow.keras.models import Model

in_layers = []
out_layers = []
for i in range(2):
    input_layer = Input(shape=(1,))
    embedding_layer = Embedding(embed_input_size + 1, embed_output_size)(input_layer)
    lstm_layer_1 = Bidirectional(LSTM(1024, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(embedding_layer)
    lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
    in_layers.append(input_layer)
    out_layers.append(lstm_layer_2)

merge = concatenate(out_layers)
dense1 = Dense(256, activation='relu', kernel_initializer='he_normal', name='data_embed')(merge)
drp1 = Dropout(0.4)(dense1)
btch_norm1 = BatchNormalization()(drp1)
dense2 = Dense(32, activation='relu', kernel_initializer='he_normal')(btch_norm1)
drp2 = Dropout(0.4)(dense2)
btch_norm2 = BatchNormalization()(drp2)
output = Dense(1, activation='sigmoid')(btch_norm2)
model = Model(inputs=in_layers, outputs=output)
model.summary()
Since my data is very sparse, I used .todense(). The types are as follows:
type(X_train_entity_1) is scipy.sparse.csr.csr_matrix
type(X_train_entity_1.todense()) is numpy.matrix
type(X_train_entity_2) is scipy.sparse.csr.csr_matrix
type(X_train_entity_2.todense()) is numpy.matrix
Summary of last few layers as follows:
You have a mismatched shape in the Input layer. The input shape needs to match the shape of a single element passed as x, i.e. dataset.shape[1:]. Since your dataset size is (700, 2822), that is 700 samples of size 2822, so your input shape should be (2822,).
Change:
input_layer = Input(shape=(1,))
To:
input_layer = Input(shape=(2822,))
You also need to set return_sequences to False in lstm_layer_2:
lstm_layer_2 = Bidirectional(LSTM(512, return_sequences=False, recurrent_dropout=0.2, dropout=0.2))(lstm_layer_1)
Otherwise the output still carries the timestep dimension of your input; that is why you have the shape (None, 2822, 1). You could also add a Flatten layer before your output layer instead, but I would recommend setting return_sequences=False.
Note that a Dense layer computes the dot product between the inputs and the kernel along the last axis of the inputs.
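To see why y_pred ends up with a shape like (None, 2822, 1), here is a small illustrative snippet (the tensor shape is hypothetical, chosen only to mirror the error) showing that Dense leaves every axis except the last one untouched:

import tensorflow as tf
from tensorflow.keras.layers import Dense

x = tf.zeros((32, 2822, 2048))           # (batch, timesteps, features) -- hypothetical sizes
y = Dense(1, activation='sigmoid')(x)    # dot product along the last axis only
print(y.shape)                           # (32, 2822, 1): the timestep axis survives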
I want to do sentiment analysis using bert-embedding and lstm layer.
This is my code:
import tensorflow as tf

# bert_preprocess and bert_encoder are pre-built preprocessing/encoder layers
# (e.g. loaded from TF Hub) defined earlier.
i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)
model = tf.keras.Model(i, x)
When running this code I got the following error:
ValueError: Input 0 of layer "lstm_2" is incompatible with the layer: expected
ndim=3, found ndim=2. Full shape received: (None, 768)
Is the logic of my code correct? Can anyone please correct my code?
From BERT-like models you can generally expect three kinds of outputs (taken from Hugging Face's TFBertModel documentation):
last_hidden_state with shape (batch_size, sequence_length, hidden_size)
pooler_output with shape (batch_size, hidden_size)
hidden_states with shape (batch_size, sequence_length, hidden_size)
hidden_size is 768 above.
As the error says, the output of the dropout layer (essentially the bert_encoder output, since dropout layers do not change the tensor shape) has only 2 dimensions, while the LSTM expects 3.
x = bert_encoder(x)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x['pooled_output'])
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)
So if you are planning to use an LSTM layer after the bert_encoder layer, you need a three-dimensional input to the LSTM in the form of (batch_size, num_timesteps, num_features); hence you have to use either the hidden_states or the last_hidden_state output instead of pooler_output.
You will have to choose between the two depending on your objective/use-case.
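For example, with a TF Hub-style encoder the sequence-level output (exposed under the 'sequence_output' key, the equivalent of last_hidden_state) can feed the LSTM directly. A minimal sketch under that assumption, reusing the bert_preprocess/bert_encoder layers from the question:

import tensorflow as tf

i = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
x = bert_preprocess(i)
x = bert_encoder(x)['sequence_output']             # (batch_size, seq_len, 768)
x = tf.keras.layers.Dropout(0.2, name="dropout")(x)
x = tf.keras.layers.LSTM(128, dropout=0.2)(x)      # now receives the 3D tensor it expects
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(x)
model = tf.keras.Model(i, x)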
I'm trying to build a Sequential model with tensorflow.
import tensorflow as tf
import keras
from tensorflow.keras import layers
from keras import optimizers
import numpy as np
model = keras.Sequential(name="model")
model.add(keras.Input(shape=(786,)))
model.add(layers.Dense(2048, activation="relu", name="layer1"))
model.add(layers.Dense(786, activation="relu", name="layer2"))
model.add(layers.Dense(786, activation="relu", name="layer3"))
output = model.add(layers.Dense(786, activation="relu", name="output"))
model.summary()
model.compile(
    optimizer=tf.optimizers.Adam(),  # Optimizer
    loss=keras.losses.CategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
history = model.fit(
    x_train,
    y_train,
    batch_size=1,
    epochs=5,
)
The input is a vector of length 768 (so the input shape is (768,), right?), representing a chess board:
def get_dataset():
    container = np.load('/content/drive/MyDrive/test_data_vector.npz')
    b, v = container['arr_0'], container['arr_1']
    v = np.asarray(v / abs(v).max() / 2 + 0.5, dtype=np.float32)  # normalization (0 - 1)
    return b, v

xtrain, ytrain = get_dataset()
print(xtrain.shape)
print(ytrain.shape)
>> (37, 786) #there are 37 samples
>> (37, 786)
But I always get the error:
ValueError: Input 0 of layer model is incompatible with the layer: expected axis -1 of input shape to have value 786 but received input with shape (1, 1, 768)
I tried np.expand_dims(), which resulted in the same error.
The error is just a typo: as the user mentioned, the issue is resolved by changing the shape from 786 to 768.
One suggestion based on the model structure:
The number of units is not related to your input shape; you don't have to match that number.
Dense layers with 2048 or 786 units are quite large and may not help the model learn better.
Try smaller numbers like 32 or 64; you can refer to some of the examples in the TensorFlow documentation.
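A minimal sketch of the corrected architecture, assuming the board vectors are 768 long (the typo fix above) and following the smaller-layer suggestion; the layer sizes here are illustrative, not tuned:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(name="model")
model.add(keras.Input(shape=(768,)))                            # matches the 768-long input vectors
model.add(layers.Dense(64, activation="relu", name="layer1"))
model.add(layers.Dense(64, activation="relu", name="layer2"))
model.add(layers.Dense(768, activation="relu", name="output"))  # output size kept from the question
model.summary()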
I'm trying to build a model in tensorflow that uses sentences in order to predict images. I transformed all the sentences into lists of length 300 each.
0 [-0.22607538080774248, 0.30380163341760635, 0....
1 [-0.10856867488473654, 0.17990960367023945, 0....
2 [-0.15721752890385687, 0.1608753204345703, 0.4...
3 [-0.12894394318573177, 0.13585415855050087, 0....
4 [-0.27382510248571634, 0.22385768964886665, 0....
40449 [-0.28715573996305466, 0.2722414545714855, 0.6...
40451 [-0.04035807272884995, 0.2275269404053688, 0.3...
40452 [-0.19741788890678436, 0.3378600552678108, 0.7...
40453 [-0.10771899553947151, 0.13040382787585258, 0....
40454 [-0.07718773453962058, 0.28313175216317177, 0....
Name: Text, Length: 31978, dtype: object
How can I give it to tensorflow as an input?
I tried
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Reshape

model = Sequential([
    Dense(2, activation="relu", input_shape=(300,)),
    Reshape((256, 256, 3), input_shape=(300,))
])
model.compile(loss='mse', optimizer='adam')
history = model.fit(x_ent, y_ent, epochs=3, batch_size=64)
But when I compile the model, it says
ValueError: Error when checking input: expected dense_2_input to have shape (300,) but got array with shape (1,)
Also, I used the Reshape layer in order to transform vectors to images, but I don't know if there is a better way to do that.
Does each image need 300 sentences for classification, or does each sentence have a feature vector of size 300? If each sentence is a list of length 300 and you have 40454 sentences, your input shape must be 40454x300. So you could pass input_shape = (40454, 300) to the Dense input layer. It should work.
I referred to the TensorFlow Keras documentation:
N-D tensor with shape: (batch_size, ..., input_dim). The most common
situation would be a 2D input with shape (batch_size, input_dim).
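Related to the quoted documentation: Dense expects a plain numeric array of shape (batch_size, input_dim), not a pandas Series whose entries are Python lists (dtype object). A minimal sketch of building such an array from the Series shown above, assuming x_ent is that Series of 300-element lists:

import numpy as np

# Convert the object-dtype Series of 300-element lists into a (n_samples, 300) float array.
x_arr = np.array(x_ent.tolist(), dtype="float32")
print(x_arr.shape)   # e.g. (31978, 300)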
I have a dataset:
100 timesteps
10 variables
for example,
dataset = np.arange(1000).reshape(100,10)
The 10 variables are related to each other, so I want to reduce the dimensionality from 10 to 1.
The 100 time steps are also related to each other.
Which deep learning architecture is suitable for this?
edit:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

X = np.arange(1000).reshape(100, 10)
model = Sequential()
model.add(LSTM(input_shape = (100, 10), return_sequences=False))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(???, epochs=50, batch_size=5)
In order to compress your data, the best course of action is to use an autoencoder.
Autoencoder architecture:
Input ---> Encoder (reduces dimensionality of the input) ----> Decoder (tries to recreate input) ---> Lossy version of input
By extracting the trained encoder, we can find a way to represent your data using fewer dimensions.
from keras.layers import Input, Dense
from keras.models import Model

input = Input(shape=(10,))  # will take an input in shape of (num_samples, 10)
encoded = Dense(1, activation='relu')(input)  # returns a 1D vector from the input
decoded = Dense(10, activation='sigmoid')(encoded)  # tries to recreate the input from the 1D vector

autoencoder = Model(input, decoded)  # input ---> lossy reconstruction from decoded
Now that we have the autoencoder, we need to extract what you really want: the encoder part that reduces the input's dimensionality:
encoder = Model(input, encoded) #maps input to reduced-dimension encoded form
Compile and train the autoencoder:
autoencoder.compile(optimizer='adam', loss='mse')
X = np.arange(1000).reshape(100, 10)
autoencoder.fit(X, X, batch_size=5, epochs=50)
Now you can use the encoder to reduce dimensionality:
encoded_form = encoder.predict(<something with shape (samples, 10)>) #outs 1D vector
You probably also want the decoder. If you are going to use it, put this block of code right before you compile and fit the autoencoder:
encoded_form = Input(shape=(1,))
decoder_layer = autoencoder.layers[-1]
decoder = Model(encoded_form, decoder_layer(encoded_form))
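A short usage sketch once the autoencoder has been trained (the variable names below are hypothetical):

encoded_form_data = encoder.predict(X)                  # shape (100, 1): compressed representation
reconstruction = decoder.predict(encoded_form_data)     # shape (100, 10): lossy copy of X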