Ragged tensors as input for LSTM - tensorflow

Learning about ragged tensors and how can I use them with tensorflow.
My example
xx = tf.ragged.constant([
[0.1, 0.2],
[0.4, 0.7 , 0.5, 0.6]
])
yy = np.array([[0, 0, 1], [1,0,0]])
mdl = tf.keras.Sequential([
tf.keras.layers.InputLayer(input_shape=[None], batch_size=2, dtype=tf.float32, ragged=True),
tf.keras.layers.LSTM(64),
tf.keras.layers.Dense(3, activation='softmax')
])
mdl.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
optimizer=tf.keras.optimizers.Adam(1e-4),
metrics=['accuracy'])
mdl.summary()
history = mdl.fit(xx, yy, epochs=10)
The error
Input 0 of layer lstm_152 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [2, None]
I am not sure if I can use ragged tensors like this. All examples I found have embedding layer before LSTM, but what I don't want to create additional embedding layer.

I recommend to use Input layer rather than InputLayer, you often not need to use InputLayer, Anyway the probelm that the shape of your input and LSTM layer input shape was wrong , here the modification i have made with some comments.
# xx should be 3d for LSTM
xx = tf.ragged.constant([
[[0.1, 0.2]],
[[0.4, 0.7 , 0.5, 0.6]]
])
"""
Labels represented as OneHotEncoding so you
should use CategoricalCrossentropy instade of SparseCategoricalCrossentropy
"""
yy = np.array([[0, 0, 1], [1,0,0]])
# For ragged tensor , get maximum sequence length
max_seq = xx.bounding_shape()[-1]
mdl = tf.keras.Sequential([
# Input Layer with shape = [Any, maximum sequence length]
tf.keras.layers.Input(shape=[None, max_seq], batch_size=2, dtype=tf.float32, ragged=True),
tf.keras.layers.LSTM(64),
tf.keras.layers.Dense(3, activation='softmax')
])
# CategoricalCrossentropy
mdl.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
optimizer=tf.keras.optimizers.Adam(1e-4),
metrics=['accuracy'])
mdl.summary()
history = mdl.fit(xx, yy, epochs=10)

Related

Tensorflow Macro F1 Score for multiclass and also for binary classification

I am trying to train 2 1D Conv neural networks - one for a multiclass classification problem and second for a binary classification problem. One of my metrics has to be Macro F1 score for both problems. However I am having a problem using tfa.metrics.F1Score from tensorflow addons.
Multiclass classification
I have 3 classes encoded as 0, 1, 2.
The last layer of the network and the compile method look like this (int_sequeces_input is the input layer):
preds = layers.Dense(3, activation="softmax")(x)
model = keras.Model(int_sequences_input, preds)
f1_macro = F1Score(num_classes=3, average='macro')
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy',f1_macro])
However when I run model.fit(), I get the following error:
ValueError: Dimension 0 in both shapes must be equal, but are 3 and 1. Shapes are [3] and [1]. for '{{node AssignAddVariableOp_7}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_7/resource, Sum_6)' with input shapes: [], [1].
shapes of data:
X_train - (23658, 150)
y_train - (23658,)
Binary classification
I have 2 classes encoded as 0,1
The last layer of the network and the compile method look like this (int_sequeces_input is the input layer):
preds = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(int_sequences_input, preds)
print(model.summary())
f1_macro = F1Score(num_classes=2, average='macro')
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy',f1_macro])
Again, when I run model.fit() I get error:
ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1. Shapes are [2] and [1]. for '{{node AssignAddVariableOp_4}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_4/resource, Sum_3)' with input shapes: [], [1].
shapes of data:
X_train - (15770, 150)
y_train - (15770,)
So my question is: how to evaluate both of my models using macro F1 score? How can I fix my implementation to make it work with tfa.metrics.F1Score? Or is there any other way to calculate macro F1 score without using tfa.metrics.F1Score? Thanks.
Have a look at the usage example from its doc page.
metric = tfa.metrics.F1Score(num_classes=3, threshold=0.5)
y_true = np.array([[1, 1, 1],
[1, 0, 0],
[1, 1, 0]], np.int32)
y_pred = np.array([[0.2, 0.6, 0.7],
[0.2, 0.6, 0.6],
[0.6, 0.8, 0.0]], np.float32)
metric.update_state(y_true, y_pred)
You can see that it expects the label to be in one-hot format.
But given the shapes you mentioned above:
shapes of data:
X_train - (23658, 150)
y_train - (23658,)
It looks like your labels are in index format. Try converting them to one hot with tf.one_hot(y_train, num_classes). You'll also need to change your loss to loss='categorical_crossentropy'.

Dimensions must be equal, but are 2 and 3 for node binary_crossentropy/mul

I was checking the code I found here, the example at Multivariate Multi-Step LSTM Models - > Multiple Input Multi-Step Output.
I altered the code and used binary_crossentropy and sigmoid activation for the last layer.
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
X, y = list(), list()
for i in range(len(sequences)):
# find the end of this pattern
end_ix = i + n_steps_in
out_end_ix = end_ix + n_steps_out-1
# check if we are beyond the dataset
if out_end_ix > len(sequences):
break
# gather input and output parts of the pattern
seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
X.append(seq_x)
y.append(seq_y)
return array(X), array(y)
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 3
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
n_features = X.shape[2]
# define model
model = Sequential()
model.add((LSTM(5, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features))))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# fit model
model.fit(X, y, epochs=20, verbose=0, batch_size=1)
The above code runs fine. But, when I try to change the n_steps_in, n_steps_out and use for example: n_steps_in, n_steps_out = 3, 2, it gives:
ValueError: Dimensions must be equal, but are 2 and 3 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](binary_crossentropy/Cast, binary_crossentropy/Log)' with input shapes: [1,2], [1,3].
Why this error comes up and how can I overcome this?
this is because your network is build to output 3D sequences of shape (None, 3, 1) while your targets have shape (None, 2, 1)
The best and automated way to handle this situation correctly is to build an encoder-decoder structure... Below the example:
model = Sequential()
model.add(LSTM(5, activation='relu', return_sequences=False,
input_shape=(n_steps_in, n_features))) # ENCODER
model.add(RepeatVector(n_steps_out))
model.add(LSTM(5, activation='relu', return_sequences=True)) # DECODER
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=20, batch_size=1)

ValueError: If your data is in the form of symbolic tensors, you cannot use `validation_split`

images = images / 255.0
model = tf.keras.Sequential([
keras.layers.InputLayer(input_shape=(227, 227, 3)),
keras.layers.Conv2D(96, [7, 7], [4, 4], data_format='channels_last'),
keras.layers.MaxPooling2D(pool_size=(3, 2), padding="same"),
keras.layers.Conv2D(2566, [5, 5], [1, 1], data_format='channels_last'),
keras.layers.MaxPooling2D(pool_size=(3, 2), padding='same'),
keras.layers.Conv2D(384, [3, 3], [1, 1], data_format='channels_last'),
keras.layers.MaxPooling2D(pool_size=(3, 2), padding='same'),
keras.layers.Flatten(),
keras.layers.Dense(512),
keras.layers.Dropout(1 - pkeep),
keras.layers.Dense(512),
keras.layers.Dropout(1 - pkeep),
])
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
batch_size = 128
epochs = 2
# Compute end step to finish pruning after 2 epochs.
validation_split = 0.1 # 10% of training set will be used for validation set.
num_images = int(images.shape[0]) * (1 - validation_split)
end_step = np.ceil(num_images / batch_size).astype(np.int32) * epochs
# Define model for pruning.
pruning_params = {
'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=0.50,
final_sparsity=0.80,
begin_step=0,
end_step=end_step)
}
model_for_pruning = prune_low_magnitude(model, **pruning_params)
# `prune_low_magnitude` requires a recompile.
model_for_pruning.compile(optimizer='Adadelta',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
model_for_pruning.summary()
callbacks = [
tfmot.sparsity.keras.UpdatePruningStep(),
]
model_for_pruning.fit(images,
batch_size=batch_size, epochs=epochs, validation_split=validation_split, steps_per_epoch=100,
callbacks=callbacks)
with tf.variable_scope('output') as scope:
weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
output = tf.add(tf.matmul(model.output, weights), biases, name=scope.name)
return output
How do I fix images so it's not a symbolic tensor? Images is a tensor input for the preprocessed image size, with the input images being from the Adience benchmark.
The error says I can't use validation_split with a symbolic tensor. I tried converting the symbolic tensor to a variable and a numpy array, but the program still throws that the input images is a symbolic tensor.
Python 3.7
Keras 2.2.3
Tensorflow 1.14.0

3 dimensional array as input with Embedding Layer and LSTM in Keras

Hey guys I have built an LSTM model that works and now I am trying(unsuccessfully) to add an Embedding layer as a first layer.
This solution didn't work for me.
I also read these questions before asking:
Keras input explanation: input_shape, units, batch_size, dim, etc,
Understanding Keras LSTMs and keras examples.
My input is a one-hot encoding(of ones and zeros) of characters of a language that consists 27 letters. I chose to represent each word as a sequence of 10 characters. Input size for each word is (10,27) and I have 465 of them so it's X_train.shape (465,10,27), I also have a label of size y_train.shape (465,1). My goal is to train a model and while doing that to build a character embeddings.
Now this is the model that compiles and fits.
main_input = Input(shape=(10, 27))
rnn = Bidirectional(LSTM(5))
x = rnn(main_input)
de = Dense(1, activation='sigmoid')(x)
model = Model(inputs = main_input, outputs = de)
model.compile(loss='binary_crossentropy',optimizer='adam')
model.fit(X_train, y_train, epochs=10, batch_size=1, verbose=1)
After adding Embedding layer:
main_input = Input(shape=(10, 27))
emb = Embedding(input_dim=2, output_dim = 10)(main_input)
rnn = Bidirectional(LSTM(5))
x = rnn(emb)
de = Dense(1, activation='sigmoid')(x)
model = Model(inputs = main_input, outputs = de)
model.compile(loss='binary_crossentropy',optimizer='adam')
model.fit(X_train, y_train, epochs=10, batch_size=1, verbose=1)
output: ValueError: Input 0 is incompatible with layer bidirectional_31: expected ndim=3, found ndim=4
How do I fix the output shape?
Your ideas would be much appreciated.
My input is a one-hot encoding(of ones and zeros) of characters of a language that consists 27 letters.
You shouldn't pass a one-hot-encoding into an Embedding. Embedding layers map an integer index to an n-dimensional vector. As a result you should pass in the pre-one-hotted indexes directly.
I.e. before you have an one-hotted input like [[0, 1, 0], [1, 0, 0], [0, 0, 1]], which was created from a set of integers like [1, 0, 2]. Instead of passing on the (10, 27) one-hotted vector pass in original vector of (10,).
main_input = Input(shape=(10,)) # only pass in the indexes
emb = Embedding(input_dim=27, output_dim = 10)(main_input) # vocab size is 27
rnn = Bidirectional(LSTM(5))
x = rnn(emb)
de = Dense(1, activation='sigmoid')(x)
model = Model(inputs = main_input, outputs = de)
model.compile(loss='binary_crossentropy',optimizer='adam')
model.fit(X_train, y_train, epochs=10, batch_size=1, verbose=1)

Using Estimator for building an LSTM network

I am trying to build an LSTM network using an Estimator. My data looks like
X = [[1,2,3], [2,3,4], ... , [98,99,100]]
y = [2, 3, ... , 99]
I am using an Estimator:
regressor = learn.Estimator(model_fn=lstm_model,
params=model_params,
)
where the lstm_model function is
def lstm_model(features, targets, mode, params):
def lstm_cells(layers):
if isinstance(layers[0], dict):
return [tf.nn.rnn_cell.BasicLSTMCell(layer['steps'],state_is_tuple=True) for layer in layers]
return [tf.nn.rnn_cell.BasicLSTMCell(steps, state_is_tuple=True) for steps in layers]
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell(lstm_cells(params['rnn_layers']), state_is_tuple=True)
output, layers = tf.nn.rnn(stacked_lstm, [features], dtype=tf.float32)
return learn.models.linear_regression(output, targets)
and params are
model_params = {
'steps': 1000,
'learning_rate': 0.03,
'batch_size': 24,
'time_steps': 3,
'rnn_layers': [{'steps': 3}],
'dense_layers': [10, 10]
}
and then I do the fitting
regressor.fit(X, y)
The issue I am facing is
output, layers = tf.nn.rnn(stacked_lstm, [features], dtype=tf.float32)
requires a sequence but I am not sure how to split my features to into list of tensors. The shape of features inside the lstm_model function is (?, 3)
I have two questions, how do I do the training in batches? and how do I split 'features' so
output, layers = tf.nn.rnn(stacked_lstm, [features], dtype=tf.float32)
doesn't throw and error. The error I am getting is
raise TypeError("%s that don't all match." % prefix)
TypeError: Tensors in list passed to 'values' of 'Concat' Op have types [float64, float32] that don't all match.
I am using tensorflow 0.12
I had to set the shape for features to be
(batch_size, time_step, 1) or (None, time_step, 1) and then unstack the features to go in the rnn. Unstacking the features in the "time_step" so you have a list of tensors with the size of time steps and the shape for each tensor should be (None, 1) or (batch_size, 1)