Keras LSTM: How to give true value for every timestep (Many to many) - tensorflow

This might be a noob question; I have tried my best to find the answer. Basically, I want the LSTM to calculate the error at every timestep, i.e. I want to give it a true value for every timestep. I have tried giving dimensions x=(2,10,1) and y=(2,10,1), which doesn't work: the predict function outputs a 3D array instead of a 2D array. What am I doing wrong here?

You should use an LSTM with return_sequences=True, followed by a Dense layer, and then flatten the output of the Dense layer so you get one prediction per timestep.
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model
ins = Input(shape=(10, 3)) # considering 3 input features
lstm = LSTM(256, return_sequences=True)(ins)
dense = Dense(1)(lstm)
flat = Flatten()(dense)
model = Model(inputs=ins, outputs=flat)
model.summary()
This will build the following model:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 10, 3)] 0
_________________________________________________________________
lstm_1 (LSTM) (None, 10, 256) 266240
_________________________________________________________________
dense_1 (Dense) (None, 10, 1) 257
_________________________________________________________________
flatten (Flatten) (None, 10) 0
=================================================================
Total params: 266,497
Trainable params: 266,497
Non-trainable params: 0
_________________________________________________________________
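With this model, the true values can be passed per timestep as a 2-D array of shape (batch, timesteps). A minimal sketch with random data (the shapes are illustrative, not taken from the question):
import numpy as np
x = np.random.randn(2, 10, 3)  # (batch, timesteps, features)
y = np.random.randn(2, 10)     # one true value per timestep
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, epochs=1)
print(model.predict(x).shape)  # (2, 10) -- a 2D array, one prediction per timestep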

Related

How does model.weights in tensorflow/keras work?

I have a trained model. Its summary is as follows:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 256) 2560
dense_1 (Dense) (None, 128) 32896
dropout (Dropout) (None, 128) 0
dense_2 (Dense) (None, 1) 129
=================================================================
Total params: 35,585
Trainable params: 35,585
Non-trainable params: 0
_________________________________________________________________
And I extract the weights with:
for i, weight in enumerate(Model.weights):
    exec('w{}=np.array(weight)'.format(i))
I have test data to predict on:
x = test_data.iloc[0]
Then I predict with the model:
Model.predict(np.array(x).reshape(1,9))
and get array([[226241.66]], dtype=float32).
Then I predict with the weights directly:
((x@w0 + w1)@w2 + w3)@w4 + w5
and get array([98039.99664026]).
Can someone explain how the weights in model works?
And how to get the model-predict result with weights?
Try Model.layers, which returns a list of all layers in your model; each layer has a get_weights() method that returns its weights as numpy arrays. I was able to reproduce the output of a simple 3-layer feed-forward model with this approach.
for i, layer in enumerate(model.layers):
    if layer.get_weights():  # skip layers without weights (Input, Dropout)
        exec('w{}=np.array(layer.get_weights()[0])'.format(i))  # kernel
        exec('b{}=np.array(layer.get_weights()[1])'.format(i))  # bias
X = np.random.randn(1, 9)
np.allclose(((X@w1 + b1)@w2 + b2)@w4 + b4, model.predict(X))  # True
Note: In my example layer 0 was an input layer (no weights) and layer 3 a dropout layer (no weights). When calling model.predict(), dropout is not applied, so you can ignore it in this case.
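One caveat: this matmul-only reconstruction matches model.predict() only if the Dense layers are linear. If your hidden layers use an activation such as relu (the question does not say), you have to apply it between the matrix multiplications. A hedged sketch, assuming relu hidden activations:
import numpy as np

def manual_predict(X, params):
    # params: list of (kernel, bias) tuples in layer order, e.g. [(w1, b1), (w2, b2), (w4, b4)]
    out = X
    for k, (w, b) in enumerate(params):
        out = out @ w + b
        if k < len(params) - 1:
            out = np.maximum(out, 0.0)  # assumed relu on the hidden layers
    return out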

Why does Tensor Flow add a dimension to my input & output?

Here is my code:
from tensorflow.keras import layers
import tensorflow as tf
from tensorflow import keras
TFDataType = tf.float16
XTrain = tf.cast(tf.ones((10,10)), dtype=TFDataType)
YTrain = tf.cast(tf.ones((10,10)), dtype=TFDataType)
model = tf.keras.models.Sequential()
model.add(layers.Dense(1, dtype=TFDataType, input_shape=(10, 10)))
model.add(layers.Dense(1, dtype=TFDataType, input_shape=(10, 10)))
print(model.summary())
I am feeding it a 2-dimensional matrix. But when I look at the model summary, I see:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 10, 1) 11
_________________________________________________________________
dense_1 (Dense) (None, 10, 2) 4
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
Why is the model asking for a 3-dimensional (None, 10, 1) array?
How do I pass an array that meets the dimensionality of (None, 10, 1)?
I cannot call numpy.ones(None, 10, 1), and I cannot reshape the array with -1 in the first dimension.
In your first layer, input_shape=(10, 10) describes a single sample, so Keras prepends the extra None dimension to account for the batch size of the data. Note you only need input_shape for the FIRST layer in your model, so remove input_shape=(10, 10) from your second layer.
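If each sample is actually a length-10 vector rather than a 10x10 matrix, declaring input_shape=(10,) gives the 2-D output you probably expect. A minimal sketch (the layer sizes are illustrative):
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.models.Sequential()
model.add(layers.Dense(1, input_shape=(10,)))   # each sample is a vector of 10 features
print(model.output_shape)                       # (None, 1)
print(model.predict(tf.ones((10, 10))).shape)   # (10, 1): 10 samples, 1 output each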

Purpose of additional parameters in Quantization Nodes of TensorFlow Quantization Aware Training

Currently, I am trying to understand quantization aware training in TensorFlow. I understand that fake quantization nodes are required to gather dynamic range information as a calibration for the quantization operation. When I compare the same model once as a "plain" Keras model and once as a quantization aware model, the latter has more parameters, which makes sense since we need to store the minimum and maximum values for activations during quantization aware training.
Consider the following example:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model
def get_model(in_shape):
    inpt = layers.Input(shape=in_shape)
    dense1 = layers.Dense(256, activation="relu")(inpt)
    dense2 = layers.Dense(128, activation="relu")(dense1)
    out = layers.Dense(10, activation="softmax")(dense2)
    model = Model(inpt, out)
    return model

standard = get_model((784,))  # 784 input features, matching the summary below
The model has the following summary:
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 784)] 0
_________________________________________________________________
dense_3 (Dense) (None, 256) 200960
_________________________________________________________________
dense_4 (Dense) (None, 128) 32896
_________________________________________________________________
dense_5 (Dense) (None, 10) 1290
=================================================================
Total params: 235,146
Trainable params: 235,146
Non-trainable params: 0
_________________________________________________________________
However, if I make my model optimization aware, it prints the following summary:
import tensorflow_model_optimization as tfmot
quantize_model = tfmot.quantization.keras.quantize_model
# q_aware stands for quantization aware.
q_aware_model = quantize_model(standard)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 784)] 0
_________________________________________________________________
quantize_layer (QuantizeLaye (None, 784) 3
_________________________________________________________________
quant_dense_3 (QuantizeWrapp (None, 256) 200965
_________________________________________________________________
quant_dense_4 (QuantizeWrapp (None, 128) 32901
_________________________________________________________________
quant_dense_5 (QuantizeWrapp (None, 10) 1295
=================================================================
Total params: 235,164
Trainable params: 235,146
Non-trainable params: 18
_________________________________________________________________
I have two questions in particular:
What is the purpose of the quantize_layer with 3 parameters after the Input layer?
Why do we have 5 additional non-trainable parameters per layer and what are they used for exactly?
I appreciate any hint or further material that helps me (and others that stumble upon this question) understand quantization aware training.
The quantize layer is used to convert the float inputs to int8; its quantization parameters are used for the output min/max and zero-point calculations.
Quantized Dense layers need a few additional parameters: min/max for the kernel and min/max/zero-point for the output activations.
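You can see these extra parameters yourself by listing the non-trainable variables of each wrapped layer. A minimal sketch (the exact variable names depend on the tensorflow_model_optimization version):
for layer in q_aware_model.layers:
    for var in layer.non_trainable_variables:
        print(layer.name, var.name, var.shape)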

How to feed only half of the RNN output to next RNN output layer in tensorflow?

I want to feed only the RNN outputs at odd positions to the next RNN layer. How can I achieve that in TensorFlow?
I basically want to build the top layer in the following diagram, which halves the sequence size. The bottom layer is just a simple RNN.
Is this what you need?
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(10, 5))
out = layers.LSTM(50, return_sequences=True)(inp)
# keep every other timestep (positions 0, 2, 4, ...)
out = layers.Lambda(lambda x: tf.stack(tf.unstack(x, axis=1)[::2], axis=1))(out)
out = layers.LSTM(50)(out)
out = layers.Dense(20)(out)
m = models.Model(inputs=inp, outputs=out)
m.summary()
You get the following model. You can see that the second LSTM only gets 5 timesteps out of the total 10 (i.e. every other output of the previous layer):
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 10, 5)] 0
_________________________________________________________________
lstm_2 (LSTM) (None, 10, 50) 11200
_________________________________________________________________
lambda_1 (Lambda) (None, 5, 50) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 50) 20200
_________________________________________________________________
dense_1 (Dense) (None, 20) 1020
=================================================================
Total params: 32,420
Trainable params: 32,420
Non-trainable params: 0
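If you don't want the unstack/stack round trip, a strided slice in the Lambda does the same thing, assuming the usual (batch, timesteps, features) layout:
out = layers.Lambda(lambda x: x[:, ::2, :])(out)  # keep timesteps 0, 2, 4, ...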

Tensorflow keras Sequential .add is different than inline definition?

Keras is giving different results when I define my model via the declarative method instead of the functional method. The two models appear to be equivalent, but using the ".add()" syntax works while using the declarative syntax gives errors -- it's a different error each time, but usually something like:
A target array with shape (10, 1) was passed for an output of shape (None, 16) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.
There seems to be something going on with auto-conversion of input shapes, but I can't tell what. Does anyone know what I'm doing wrong? Why aren't these two models exactly equivalent?
import tensorflow as tf
import tensorflow.keras
import numpy as np
x = np.arange(10).reshape((-1,1,1))
y = np.arange(10)
#This model works fine
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences = True))
model.add(tf.keras.layers.LSTM(16))
model.add(tf.keras.layers.Dense(1))
model.add(tf.keras.layers.Activation('linear'))
#This model fails. But shouldn't this be equivalent to the above?
model2 = tf.keras.Sequential(
    {
        tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences=True),
        tf.keras.layers.LSTM(16),
        tf.keras.layers.Dense(1),
        tf.keras.layers.Activation('linear')
    })
#This works
model.compile(loss='mean_squared_error', optimizer='adagrad')
model.fit(x, y, epochs=1, batch_size=1, verbose=2)
#But this doesn't! Why not? The error is different each time, but usually
#something about the input size being wrong
model2.compile(loss='mean_squared_error', optimizer='adagrad')
model2.fit(x, y, epochs=1, batch_size=1, verbose=2)
Why aren't those two models equivalent? Why does one handle the input size correctly while the other doesn't? The second model fails with a different error each time (once in a while it even works), so I thought maybe there's some interaction with the first model. But I've tried commenting out the first model and that doesn't help. So why doesn't the second one work?
UPDATE: Here is the model.summary() for the first and second models. They do seem different, but I don't understand why.
For model.summary():
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 1, 32) 4352
_________________________________________________________________
lstm_1 (LSTM) (None, 16) 3136
_________________________________________________________________
dense (Dense) (None, 1) 17
_________________________________________________________________
activation (Activation) (None, 1) 0
=================================================================
Total params: 7,505
Trainable params: 7,505
Non-trainable params: 0
For model2.summary():
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_2 (LSTM) (None, 1, 32) 4352
_________________________________________________________________
activation_1 (Activation) (None, 1, 32) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 16) 3136
_________________________________________________________________
dense_1 (Dense) (None, 1) 17
=================================================================
Total params: 7,505
Trainable params: 7,505
Non-trainable params: 0
When you create the model with the inline declaration, you put the layers in curly braces {}, which makes them a set, and a set is inherently unordered. Change the curly braces to square brackets [] to put the layers in an ordered list. This ensures the layers end up in the correct order in your model.
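For reference, the corrected inline definition with square brackets (same layers as the working model in the question):
model2 = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(1, 1), return_sequences=True),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Activation('linear')
])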