Why does TensorFlow add a dimension to my input & output? - tensorflow

Here is my code:
from tensorflow.keras import layers
import tensorflow as tf
from tensorflow import keras
TFDataType = tf.float16
XTrain = tf.cast(tf.ones((10,10)), dtype=TFDataType)
YTrain = tf.cast(tf.ones((10,10)), dtype=TFDataType)
model = tf.keras.models.Sequential()
model.add(layers.Dense(1, dtype=TFDataType, input_shape=(10, 10)))
model.add(layers.Dense(2, dtype=TFDataType, input_shape=(10, 10)))
print(model.summary())
I am feeding it a 2-dimensional matrix, but when I print the model summary, I see:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 10, 1)             11
_________________________________________________________________
dense_1 (Dense)              (None, 10, 2)             4
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
Why is the model asking for a 3-dimensional (None, 10, 1) array?
How do I pass an array that meets the dimensionality of (None, 10, 1)?
I cannot call numpy.ones((None, 10, 1)), and I cannot reshape the array with -1 in the first dimension.

In your first layer, input_shape=(10, 10) declares the shape of a single sample; Keras prepends the extra None dimension to account for the batch size of the data. Note you only need input_shape for the FIRST layer in your model, so remove input_shape=(10, 10) from your second layer.
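Note that None is just the batch size, so an array such as np.ones((32, 10, 1)) already satisfies the shape (None, 10, 1). If the goal is instead one output per 10-feature sample (an assumption about the asker's data), declaring the per-sample shape as a plain vector avoids the extra axis entirely; a minimal sketch:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.models.Sequential()
model.add(layers.Dense(1, input_shape=(10,)))  # each sample is a plain 10-vector
model.summary()                                # dense output shape: (None, 1)

XTrain = np.ones((10, 10), dtype='float32')    # 10 samples, 10 features each
print(model.predict(XTrain).shape)             # (10, 1) -- one output per sample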

Related

Keras LSTM: How to give true value for every timestep (Many to many)

This might be a noob question. I have tried my best to find the answer.
Basically I want the LSTM to calculate the error based on every timestep. I want to give the true value for every timestep. I have tried giving dimensions x=(2,10,1) and y=(2,10,1), which doesn't work; the predict function outputs a 3d array instead of a 2d array. What am I doing wrong here?
You should use LSTM with return_sequences=True followed by a Dense layer, and then flatten the output of the Dense layer.
from tensorflow.keras.layers import Input, LSTM, Dense, Flatten
from tensorflow.keras.models import Model
ins = Input(shape=(10, 3))  # considering 3 input features
lstm = LSTM(256, return_sequences=True)(ins)  # one 256-wide output per timestep
dense = Dense(1)(lstm)  # one prediction per timestep: (None, 10, 1)
flat = Flatten()(dense)  # squeeze the trailing axis: (None, 10)
model = Model(inputs=ins, outputs=flat)
model.summary()
This will build the following model
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 10, 3)]           0
_________________________________________________________________
lstm_1 (LSTM)                (None, 10, 256)           266240
_________________________________________________________________
dense_1 (Dense)              (None, 10, 1)             257
_________________________________________________________________
flatten (Flatten)            (None, 10)                0
=================================================================
Total params: 266,497
Trainable params: 266,497
Non-trainable params: 0
_________________________________________________________________
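With the flattened output the targets can be passed as a plain (batch, timesteps) array; a quick check with random data, using the 3-feature input shape assumed above:
import numpy as np
x = np.random.rand(2, 10, 3).astype('float32')  # 2 sequences, 10 timesteps, 3 features
y = np.random.rand(2, 10).astype('float32')     # one true value per timestep
model.compile(loss='mse', optimizer='adam')
model.fit(x, y, epochs=1, verbose=0)
print(model.predict(x).shape)                   # (2, 10) -- 2d, as the asker wanted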

Using dilated convolution in Keras

In WaveNet, dilated convolution is used to increase the receptive field of the layers above.
From the illustration, you can see that layers of dilated convolution with kernel size 2 and dilation rates that double at each layer (powers of 2) create a tree-like structure of receptive fields. I tried to (very simply) replicate the above in Keras.
import tensorflow.keras as keras
nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=2)(nn)
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=4)(nn)
nn = keras.layers.Dense(1)(nn)
model = keras.Model(input_layer, nn)
opt = keras.optimizers.Adam(lr=0.001)
model.compile(loss='mse', optimizer=opt)
model.summary()
And the output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_4 (InputLayer)         [(None, 200, 2)]          0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 200, 5)            55
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 200, 5)            130
_________________________________________________________________
dense_2 (Dense)              (None, 200, 1)            6
=================================================================
Total params: 191
Trainable params: 191
Non-trainable params: 0
_________________________________________________________________
I was expecting axis=1 to shrink after each conv1d layer, similar to the gif. Why is this not the case?
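For reference, the time axis keeps its length because padding='causal' left-pads the input so each output step only sees current and past inputs. With padding='valid' (a guess at the shrinking behavior the gif suggests), the axis does shrink by dilation_rate * (kernel_size - 1) per layer:
nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='valid', dilation_rate=2)(nn)  # 200 - 2*(5-1) = 192
nn = keras.layers.Conv1D(5, 5, padding='valid', dilation_rate=4)(nn)  # 192 - 4*(5-1) = 176
keras.Model(input_layer, nn).summary()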

How to feed only half of the RNN output to next RNN output layer in tensorflow?

I want to feed only the RNN output at odd positions to the next RNN layer. How can I achieve that in TensorFlow?
I basically want to build the top layer in the following diagram, which halves the sequence size. The bottom layer is just a simple RNN.
Is this what you need?
import tensorflow as tf
from tensorflow.keras import layers, models
inp = layers.Input(shape=(10, 5))
out = layers.LSTM(50, return_sequences=True)(inp)
# keep every other timestep: unstack along time, take every second tensor, restack
out = layers.Lambda(lambda x: tf.stack(tf.unstack(x, axis=1)[::2], axis=1))(out)
out = layers.LSTM(50)(out)
out = layers.Dense(20)(out)
m = models.Model(inputs=inp, outputs=out)
m.summary()
You get the following model. You can see that the second LSTM only gets 5 of the original 10 timesteps (i.e. every other output of the previous layer).
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 10, 5)]           0
_________________________________________________________________
lstm_2 (LSTM)                (None, 10, 50)            11200
_________________________________________________________________
lambda_1 (Lambda)            (None, 5, 50)             0
_________________________________________________________________
lstm_3 (LSTM)                (None, 50)                20200
_________________________________________________________________
dense_1 (Dense)              (None, 20)                1020
=================================================================
Total params: 32,420
Trainable params: 32,420
Non-trainable params: 0
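A shorter equivalent of the Lambda (a sketch, same behavior via strided slicing) would be:
out = layers.Lambda(lambda x: x[:, ::2, :])(out)  # keep every other timestep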

ValueError: Error when checking input: expected dense_1_input to have 3 dimensions, but got array with shape (5, 1)

I know this question has been asked before, but I was unable to get my answer out of those posts.
So
state is [[ 0.2]
[ 10. ]
[ 1. ]
[-10.5]
[ 41.1]]
and
(5, 1) # np.shape(state)
and when I call model.predict(state) it throws
ValueError: Error when checking input: expected dense_1_input to have
3 dimensions, but got array with shape (5, 1)
But....
model = Sequential()
model.add(Dense(5,activation='relu',input_shape=(5,1)))
The first layer of my model has input_shape=(5,1), which is equal to the shape of the state I am passing.
I also have 2 more dense layers after this.
And
print(model.summary())
# output
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 5, 5)              10
_________________________________________________________________
dropout_1 (Dropout)          (None, 5, 5)              0
_________________________________________________________________
dense_2 (Dense)              (None, 5, 5)              30
_________________________________________________________________
dropout_2 (Dropout)          (None, 5, 5)              0
_________________________________________________________________
dense_3 (Dense)              (None, 5, 3)              18
=================================================================
And the model definition is (!!noob alert)
model = Sequential()
model.add(Dense(5,activation='relu',input_shape=(5,1)))
model.add(Dropout(0.2))
# model.add(Flatten())
model.add(Dense(5,activation='relu'))
model.add(Dropout(0.2))
# model.add(Flatten())
model.add(Dense(3,activation='softmax'))
model.compile(loss="mse", optimizer=Adam(lr=0.001), metrics=['accuracy'])
A couple of things. First, the predict function assumes the first dimension of the input tensor is the batch size (even if you're only predicting for one sample), but the input_shape attribute on the first layer of a Sequential model excludes the batch size, as indicated here. Second, Dense layers are applied over the last dimension, which is not going to give you what you want: I assume your input vector has 5 features, but you are adding a trailing dimension of size 1, which makes your model output the wrong size. Try the following code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
model = Sequential()
model.add(Dense(5, activation='relu', input_shape=(5,)))
model.add(Dropout(0.2))
model.add(Dense(5,activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(3,activation='softmax'))
model.compile(loss="mse", optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
print(model.summary())
state = np.array([0.2, 10., 1., -10.5, 41.1]) # shape (5,)
print("Prediction:", model.predict(np.expand_dims(state, 0))) # expand_dims adds batch dimension
You should see the following model summary, and the prediction should print a vector:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 5)                 30
_________________________________________________________________
dropout (Dropout)            (None, 5)                 0
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 30
_________________________________________________________________
dropout_1 (Dropout)          (None, 5)                 0
_________________________________________________________________
dense_2 (Dense)              (None, 3)                 18
=================================================================
Total params: 78
Trainable params: 78
Non-trainable params: 0
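The printed prediction will have shape (1, 3): one softmax distribution over the 3 classes for the single sample in the batch.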

Understanding Keras model architecture (tensor index)

This script defines a dummy model using the functional API:
from keras.layers import Input, Dense
from keras.models import Model
import keras
inputs = Input(shape=(100,), name='A_input')
x = Dense(20, activation='relu', name='B_dense')(inputs)
shared_l = Dense(20, activation='relu', name='C_dense_shared')
x = keras.layers.concatenate([shared_l(x), shared_l(x)], name='D_concat')
model = Model(inputs=inputs, outputs=x)
print(model.summary())
It yields the following output:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
A_input (InputLayer)             (None, 100)           0
____________________________________________________________________________________________________
B_dense (Dense)                  (None, 20)            2020        A_input[0][0]
____________________________________________________________________________________________________
C_dense_shared (Dense)           (None, 20)            420         B_dense[0][0]
                                                                   B_dense[0][0]
____________________________________________________________________________________________________
D_concat (Concatenate)           (None, 40)            0           C_dense_shared[0][0]
                                                                   C_dense_shared[1][0]
====================================================================================================
My question concerns the content of the Connected to column.
I understand that a layer can have multiple nodes.
In this case C_dense_shared has two nodes, and D_concat is connected to both of them (C_dense_shared[0][0] and C_dense_shared[1][0]). So the first index (the node_index) is clear to me. But what does the second index mean? From the source code I read that this is the tensor_index:
layer_name[node_index][tensor_index]
But what does the tensor_index mean? And in what situations can it have a value different from 0?
I think the docstring of the Node class makes it quite clear:
tensor_indices: a list of integers,
    the same length as `inbound_layers`.
    `tensor_indices[i]` is the index of `input_tensors[i]` within the
    output of the inbound layer
    (necessary since each inbound layer might
    have multiple tensor outputs, with each one being
    independently manipulable).
tensor_index will be nonzero if a layer has multiple output tensors. It's different from the situation of multiple "datastreams" (e.g. layer sharing), where layers have multiple outbound nodes. For example, an LSTM layer will return 3 tensors if given return_state=True:
1. The output: the hidden state of the last time step, or all hidden states if return_sequences=True
2. The hidden state of the last time step
3. The memory cell of the last time step
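A quick sketch of that case (shapes noted in the comments, layer sizes chosen arbitrarily):
from tensorflow.keras.layers import Input, LSTM

inp = Input(shape=(10, 3))
# return_sequences=True + return_state=True -> three output tensors from one node
seq, last_h, last_c = LSTM(16, return_sequences=True, return_state=True)(inp)
# seq:    (None, 10, 16)  all hidden states
# last_h: (None, 16)      hidden state of the last time step
# last_c: (None, 16)      memory cell of the last time step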
As another example, feature transformation can be implemented as a Lambda layer:
from keras.layers import Input, Lambda, Concatenate, Dense
from keras.models import Model
from keras import backend as K

def generate_powers(x):
    # returns a list, so the Lambda layer has three output tensors
    return [x, K.sqrt(x), K.square(x)]

model_input = Input(shape=(10,))
powers = Lambda(generate_powers)(model_input)
x = Concatenate()(powers)
x = Dense(10, activation='relu')(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(model_input, x)
From model.summary(), you can see that concatenate_5 is connected to lambda_7[0][0], lambda_7[0][1] and lambda_7[0][2]:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_7 (InputLayer)             (None, 10)            0
____________________________________________________________________________________________________
lambda_7 (Lambda)                [(None, 10), (None, 1 0           input_7[0][0]
____________________________________________________________________________________________________
concatenate_5 (Concatenate)      (None, 30)            0           lambda_7[0][0]
                                                                   lambda_7[0][1]
                                                                   lambda_7[0][2]
____________________________________________________________________________________________________
dense_8 (Dense)                  (None, 10)            310         concatenate_5[0][0]
____________________________________________________________________________________________________
dense_9 (Dense)                  (None, 1)             11          dense_8[0][0]
====================================================================================================
Total params: 321
Trainable params: 321
Non-trainable params: 0
____________________________________________________________________________________________________