3 dimensional array as input with Embedding Layer and LSTM in Keras - tensorflow

Hey guys, I have built an LSTM model that works, and now I am trying (unsuccessfully) to add an Embedding layer as the first layer.
This solution didn't work for me.
I also read these questions before asking:
Keras input explanation: input_shape, units, batch_size, dim, etc,
Understanding Keras LSTMs and keras examples.
My input is a one-hot encoding (ones and zeros) of the characters of a language that consists of 27 letters. I chose to represent each word as a sequence of 10 characters. The input size for each word is (10, 27) and I have 465 of them, so X_train.shape is (465, 10, 27). I also have labels with y_train.shape (465, 1). My goal is to train a model and, while doing that, to build character embeddings.
Now this is the model that compiles and fits.
main_input = Input(shape=(10, 27))
rnn = Bidirectional(LSTM(5))
x = rnn(main_input)
de = Dense(1, activation='sigmoid')(x)
model = Model(inputs = main_input, outputs = de)
model.compile(loss='binary_crossentropy',optimizer='adam')
model.fit(X_train, y_train, epochs=10, batch_size=1, verbose=1)
After adding Embedding layer:
main_input = Input(shape=(10, 27))
emb = Embedding(input_dim=2, output_dim = 10)(main_input)
rnn = Bidirectional(LSTM(5))
x = rnn(emb)
de = Dense(1, activation='sigmoid')(x)
model = Model(inputs = main_input, outputs = de)
model.compile(loss='binary_crossentropy',optimizer='adam')
model.fit(X_train, y_train, epochs=10, batch_size=1, verbose=1)
output: ValueError: Input 0 is incompatible with layer bidirectional_31: expected ndim=3, found ndim=4
How do I fix the output shape?
Your ideas would be much appreciated.

"My input is a one-hot encoding (ones and zeros) of characters of a language that consists of 27 letters."
You shouldn't pass a one-hot encoding into an Embedding layer. Embedding layers map an integer index to an n-dimensional vector, so you should pass in the integer indexes from before the one-hot encoding directly. This is also where the error comes from: Embedding appends an output_dim axis to every input element, so a (10, 27) input becomes a 4D tensor of shape (batch, 10, 27, 10), while the LSTM expects 3D.
I.e. before, you had a one-hot input like [[0, 1, 0], [1, 0, 0], [0, 0, 1]], which was created from a set of integers like [1, 0, 2]. Instead of passing in the (10, 27) one-hot array, pass in the original index vector of shape (10,).
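# imports assumed; the question does not show them
from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Model

main_input = Input(shape=(10,)) # only pass in the indexes
emb = Embedding(input_dim=27, output_dim=10)(main_input) # vocab size is 27
rnn = Bidirectional(LSTM(5))
x = rnn(emb)
de = Dense(1, activation='sigmoid')(x)
model = Model(inputs=main_input, outputs=de)
model.compile(loss='binary_crossentropy', optimizer='adam')
# X_train must now hold the integer indexes, shape (465, 10), not the one-hot array
model.fit(X_train, y_train, epochs=10, batch_size=1, verbose=1)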
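If only the one-hot array is at hand, the integer indexes can be recovered with an argmax over the last axis. This is a minimal NumPy sketch, assuming every character position contains exactly one 1. The random data and the names char_ids, X_onehot and X_indexes are stand-ins for illustration, not the asker's data.

```python
import numpy as np

# Stand-in for the asker's data: 465 words, 10 characters each, 27 letters.
rng = np.random.default_rng(0)
char_ids = rng.integers(0, 27, size=(465, 10))

# One-hot encode, giving the (465, 10, 27) array described in the question.
X_onehot = np.eye(27, dtype=np.float32)[char_ids]

# Undo the one-hot encoding: the position of the 1 is the character index.
X_indexes = np.argmax(X_onehot, axis=-1)

print(X_onehot.shape)   # (465, 10, 27)
print(X_indexes.shape)  # (465, 10)
```

An array shaped like X_indexes is what would be passed to model.fit in place of the one-hot X_train.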

Related

Keras functional api input shape error, lstm layer received 2d instead of 3d shape

I am using the Keras functional API, but I'm getting an error about the input shape of the model:
ValueError: Input 0 is incompatible with layer financial_model: expected shape=(None, 1, 62), found shape=(1, 62)
samples = np.array(samples, dtype=np.float64)
labels = np.array(labels, dtype=np.uint8)
x_train, x_test, y_train, y_test = train_test_split(samples, labels, test_size=0.33, random_state=42)
min_max = MinMaxScaler()
x_train = min_max.fit_transform(x_train)
lstm_input = np.expand_dims(x_train, axis=1).shape
inputs = keras.Input(shape=(lstm_input[1],lstm_input[2]))
hidden = keras.layers.LSTM(lstm_input[2], activation='tanh')(inputs)
output = keras.layers.Dense(2)(hidden)
model = keras.Model(inputs=inputs, outputs=output, name="financial_model")
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    metrics=["accuracy"],
)
model.summary()
history = model.fit(x_train, y_train, batch_size=1, epochs=5, validation_split=0.2)
I've learnt from similar questions that the batch size is omitted from the input shape. How do I feed a 3-dimensional input into the LSTM layer when the batch size is left out of the Input object?
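As the error says, the shape of the data is wrong, not the Input layer: Input(shape=(1, 62)) already expects batches of shape (None, 1, 62). Your code only takes the .shape of the expanded array and never expands x_train itself, so fit still receives 2D batches of shape (1, 62). Adding another dimension to Input would not help, because the LSTM would then receive 4D input. Expand the data before fitting instead:
x_train = np.expand_dims(x_train, axis=1) # (n_samples, 62) -> (n_samples, 1, 62)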
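The found shape=(1, 62) in the error is one batch of a 2D array, which suggests the array passed to fit never gained its timestep axis. This is a small NumPy sketch of the difference between reading the expanded shape and actually expanding the data. The sample count of 100 and the name x_train_3d are assumptions for illustration.

```python
import numpy as np

# Stand-in for the scaled training data: 100 samples with 62 features each.
x_train = np.zeros((100, 62), dtype=np.float64)

# Taking .shape of the expanded array leaves x_train itself unchanged (still 2D).
lstm_input = np.expand_dims(x_train, axis=1).shape

# Keeping the expanded array gives (samples, timesteps, features).
x_train_3d = np.expand_dims(x_train, axis=1)

print(lstm_input)        # (100, 1, 62)
print(x_train.shape)     # (100, 62)
print(x_train_3d.shape)  # (100, 1, 62)
```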

Ragged tensors as input for LSTM

I am learning about ragged tensors and how I can use them with TensorFlow.
My example
xx = tf.ragged.constant([
[0.1, 0.2],
[0.4, 0.7 , 0.5, 0.6]
])
yy = np.array([[0, 0, 1], [1,0,0]])
mdl = tf.keras.Sequential([
tf.keras.layers.InputLayer(input_shape=[None], batch_size=2, dtype=tf.float32, ragged=True),
tf.keras.layers.LSTM(64),
tf.keras.layers.Dense(3, activation='softmax')
])
mdl.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=tf.keras.optimizers.Adam(1e-4),
metrics=['accuracy'])
mdl.summary()
history = mdl.fit(xx, yy, epochs=10)
The error
Input 0 of layer lstm_152 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [2, None]
I am not sure if I can use ragged tensors like this. All the examples I found have an Embedding layer before the LSTM, but I don't want to create an additional Embedding layer.
I recommend using Input rather than InputLayer; you rarely need InputLayer directly. The problem is that the shape of your data and the input shape expected by the LSTM layer do not match. Here is the modification I made, with some comments.
# xx should be 3d for LSTM
xx = tf.ragged.constant([
[[0.1, 0.2]],
[[0.4, 0.7 , 0.5, 0.6]]
])
"""
Labels are one-hot encoded, so you should use
CategoricalCrossentropy instead of SparseCategoricalCrossentropy
"""
yy = np.array([[0, 0, 1], [1,0,0]])
# For ragged tensor , get maximum sequence length
max_seq = xx.bounding_shape()[-1]
mdl = tf.keras.Sequential([
# Input Layer with shape = [Any, maximum sequence length]
tf.keras.layers.Input(shape=[None, max_seq], batch_size=2, dtype=tf.float32, ragged=True),
tf.keras.layers.LSTM(64),
tf.keras.layers.Dense(3, activation='softmax')
])
# CategoricalCrossentropy; the Dense layer already applies softmax, so from_logits stays at its default (False)
mdl.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
optimizer=tf.keras.optimizers.Adam(1e-4),
metrics=['accuracy'])
mdl.summary()
history = mdl.fit(xx, yy, epochs=10)
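The choice of loss follows from the label format. As an alternative to switching the loss, the one-hot labels could be converted to integer class ids, which is the format SparseCategoricalCrossentropy expects. This is a minimal NumPy sketch of that conversion; the name yy_sparse is introduced here for illustration.

```python
import numpy as np

# One-hot labels as in the example: one row per sample, one column per class.
yy = np.array([[0, 0, 1], [1, 0, 0]])

# Integer class ids: the column index of the 1 in each row.
yy_sparse = np.argmax(yy, axis=1)

print(yy_sparse)  # [2 0]
```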

How can I reduce the dimension of data, loaded through the flow_from_directory function of ImageDataGenerator?

Since I load my data (images) from structured folders, I use the flow_from_directory function of the ImageDataGenerator class provided by Keras. I have no issues feeding this data to a CNN model. But with an LSTM model, I get the following error: ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (64, 28, 28, 1). How can I reduce the dimension of the input data while reading it via ImageDataGenerator objects, so that I can use an LSTM model instead of a CNN?
p.s. The shape of the input images is (28, 28) and they are grayscale.
train_valid_datagen = ImageDataGenerator(validation_split=0.2)
train_gen = train_valid_datagen.flow_from_directory(
directory=TRAIN_IMAGES_PATH,
target_size=(28, 28),
color_mode='grayscale',
batch_size=64,
class_mode='categorical',
shuffle=True,
subset='training'
)
Update: The LSTM model code:
inp = Input(shape=(28, 28, 1))
inp = Lambda(lambda x: squeeze(x, axis=-1))(inp) # from 4D to 3D
x = LSTM(num_units, dropout=dropout, recurrent_dropout=recurrent_dropout, activation=activation_fn, return_sequences=True)(inp)
x = BatchNormalization()(x)
x = Dense(128, activation=activation_fn)(x)
output = Dense(nb_classes, activation='softmax', kernel_regularizer=l2(0.001))(x)
model = Model(inputs=inp, outputs=output)
You start by feeding your network 4D data, such as your images, to stay compatible with ImageDataGenerator, and then reshape it into the 3D format the LSTM expects.
These are the possibilities:
With only one channel, you can simply squeeze the last dimension:
inp = Input(shape=(28, 28, 1))
x = Lambda(lambda x: tf.squeeze(x, axis=-1))(inp) # from 4D to 3D
x = LSTM(32)(x)
If you have multiple channels (as with RGB images, or if you would like to apply an RNN after a Conv2D), one solution is:
inp = Input(shape=(28, 28, 1))
x = Conv2D(32, 3, padding='same', activation='relu')(inp)
x = Reshape((28,28*32))(x) # from 4D to 3D
x = LSTM(32)(x)
Training works as usual with model.fit_generator (in TensorFlow 2.1 and later, model.fit accepts generators directly and fit_generator is deprecated).
UPDATE: model review
inp = Input(shape=(28, 28, 1))
x = Lambda(lambda x: squeeze(x, axis=-1))(inp) # from 4D to 3D
x = LSTM(32, dropout=dropout, recurrent_dropout=recurrent_dropout, activation=activation_fn, return_sequences=False)(x)
x = BatchNormalization()(x)
x = Dense(128, activation=activation_fn)(x)
output = Dense(nb_classes, activation='softmax', kernel_regularizer=l2(0.001))(x)
model = Model(inputs=inp, outputs=output)
model.summary()
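Pay attention when you define the inp variable: don't overwrite it with the Lambda output, because Model(inputs=inp, ...) needs the original Input tensor.
Set return_sequences=False in the LSTM in order to get a 2D output.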
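The two reshaping options can be checked on plain arrays before building the model. This is a NumPy sketch of the shape arithmetic only, not the Keras layers themselves. The batch size of 64 comes from the question, and the 32 channels mirror the Conv2D example.

```python
import numpy as np

# One batch as produced by the generator: (batch, height, width, channels).
batch = np.zeros((64, 28, 28, 1), dtype=np.float32)

# Single channel: drop the last axis, so each image row is one timestep of 28 features.
squeezed = np.squeeze(batch, axis=-1)

# Multiple channels (e.g. after a Conv2D with 32 filters): fold width and channels together.
conv_out = np.zeros((64, 28, 28, 32), dtype=np.float32)
folded = conv_out.reshape(64, 28, 28 * 32)

print(squeezed.shape)  # (64, 28, 28)
print(folded.shape)    # (64, 28, 896)
```

In both cases the result is (batch, timesteps, features), which is the 3D layout an LSTM layer takes.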

Keras "return_sequences" option returns 2D array instead of 3D

I'm trying to use a simple character-level Keras model to extract key text from a sentence.
I feed it x_train, a padded sequence of dim (n_examples, 500) representing the entire sentence, and y_train, a padded sequence of dim (n_examples, 100) representing the important text to extract.
I tried a simple model like this:
vocab_size = 1000
src_txt_length = 500
sum_txt_length = 100
inputs = Input(shape=(src_txt_length,))
encoder1 = Embedding(vocab_size, 128)(inputs)
encoder2 = LSTM(128)(encoder1)
encoder3 = RepeatVector(sum_txt_length)(encoder2)
decoder1 = LSTM(128, return_sequences=True)(encoder3)
outputs = TimeDistributed(Dense(100, activation='softmax'))(decoder1)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
When I try to train it with the following code:
hist = model.fit(x_train, y_train, verbose=1, validation_data=(x_test, y_test), batch_size=batch_size, epochs=5)
I get the error:
ValueError: Error when checking target: expected time_distributed_27 to have 3 dimensions, but got array with shape (28500, 100)
My question is: I have the return_sequences parameter set to True on the last LSTM layer, but the Dense fully-connected layer is telling me that the input is 2-dimensional.
What am I doing wrong here? Any help would be greatly appreciated!
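It isn't complaining about the input to TimeDistributed but about the target: y_train.shape == (n_examples, 100), which isn't 3D. The model predicts a distribution over 100 classes at each of the 100 output timesteps, so outputs is 3D with shape (n_examples, 100, 100), while y_train is 2D. With categorical_crossentropy, the targets must be one-hot encoded into that same 3D shape.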
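One way to bring the targets to 3D is to one-hot encode them along a new last axis. This is a minimal NumPy sketch, assuming y_train holds integer class ids that are all smaller than the width of the final Dense layer (100 in the posted model; a vocabulary of 1000 would need a correspondingly wider Dense layer). The random targets and the name y_onehot are stand-ins.

```python
import numpy as np

n_examples, sum_txt_length, n_classes = 4, 100, 100  # n_classes matches Dense(100)

# Stand-in targets: one integer class id per output timestep.
rng = np.random.default_rng(0)
y_train = rng.integers(0, n_classes, size=(n_examples, sum_txt_length))

# One-hot encode along a new last axis to match the 3D model output.
y_onehot = np.eye(n_classes, dtype=np.float32)[y_train]

print(y_train.shape)   # (4, 100)
print(y_onehot.shape)  # (4, 100, 100)
```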

How to use convolution 1D with LSTM?

I have time series data with 72 values, and I hold out the last 6 values for test prediction. I want to use Conv1D with LSTM.
This is my code.
df = pd.read_csv('D://data.csv', engine='python')
df['DATE_'] = pd.to_datetime(df['DATE_']) + MonthEnd(1)
df = df.set_index('DATE_')
df.head()
split_date = pd.Timestamp('03-01-2015')
train = df.loc[:split_date, ['COLUMN3DATA']]
test = df.loc[split_date:, ['COLUMN3DATA']]
sc = MinMaxScaler()
train_sc = sc.fit_transform(train)
test_sc = sc.transform(test)
X_train = train_sc[:-1]
y_train = train_sc[1:]
X_test = test_sc[:-1]
y_test = test_sc[1:]
################### Convolution #######################
X_train_t = X_train[None,:]
print(X_train_t.shape)
X_test_t = X_test[:, None]
K.clear_session()
model = Sequential()
model.add(Conv1D(6, 3, activation='relu', input_shape=(12,1)))
model.add(LSTM(6, input_shape=(1,3), return_sequences=True))
model.add(LSTM(3))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam' )
model.summary()
model.fit(X_train_t, y_train, epochs=400, batch_size=10, verbose=1)
y_pred = model.predict(X_test_t)
When I run it, I get this error:
ValueError: Error when checking input: expected conv1d_1_input to have shape (None, 12, 1) but got array with shape (1, 64, 1)
How do I use Conv1D with LSTM?
The problem is between your input data and your input shape.
You said in the model that your input shape is (12,1) (= batch_shape=(None,12,1))
But your data X_train_t has shape (1,64,1).
Either fix the input shape of the model, or fix your data if (1, 64, 1) is not the shape you intended.
For variable lengths/timesteps, you can use input_shape=(None, 1).
You don't need an input_shape in the second layer; Keras infers it from the output of the Conv1D layer.
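If the intent is to keep input_shape=(12, 1), one way to fix the data is to cut the series into overlapping windows of 12 steps, each paired with the value that follows it. This is a sketch under that assumption: make_windows is a hypothetical helper, and the evenly spaced series is a stand-in for the scaled training data.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1D series into overlapping windows and next-step targets."""
    series = np.asarray(series, dtype=np.float32).reshape(-1)
    n = len(series) - window
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:]
    # Add the trailing feature axis that Conv1D and LSTM expect.
    return X[..., np.newaxis], y

# Stand-in for the scaled training series (65 values, consistent with the 64 inputs in the error).
train_sc = np.linspace(0.0, 1.0, 65)
X_windows, y_windows = make_windows(train_sc, window=12)

print(X_windows.shape)  # (53, 12, 1)
print(y_windows.shape)  # (53,)
```

Arrays shaped like X_windows match a model declared with input_shape=(12, 1), with one target per window.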