In a Sequential model, I'm trying to go from a layer output shape of (None, 300) to something like (1,1,None*300) to apply an AveragePooling layer. In fact I would like to flatten everything (even the batch axis), while both Flatten and Reshape layers always skip the batch axis. Any idea?
You can use a Lambda layer and the K.reshape from backend like this:
from keras import backend as K
from keras.layers import Lambda
# inp is the (None, 300) tensor produced by the previous layer
out = Lambda(lambda x: K.reshape(x, (1, 1, -1)))(inp)
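To see what that reshape does to the shapes, here is a minimal NumPy sketch. It assumes NumPy's reshape behaves like K.reshape here (row-major order, -1 inferred from the remaining size); the batch size of 4 is a made-up value standing in for None.

```python
import numpy as np

batch_size = 4  # hypothetical batch size; Keras displays this axis as None
x = np.arange(batch_size * 300, dtype=np.float32).reshape(batch_size, 300)

# Collapse every axis, including the batch axis, into the last one.
flat = x.reshape(1, 1, -1)

print(flat.shape)  # (1, 1, 1200)
```

Note that the result has a leading axis of size 1, so any layer after it sees a single sample rather than the original batch.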
Related
A standard RNN computational graph looks as follows (in my case, for regression to a single scalar value y)
I want to construct a network which accepts as input m sequences X_1...X_m (where both m and sequence lengths vary), runs the RNN on each sequence X_i to obtain a representation vector R_i, averages the representations and then runs a fully connected net to compute the output y_hat. Computational graph should look something like this:
Question
Can this be implemented (preferably) in Keras? Otherwise in TensorFlow? I'd very much appreciate if someone can point me to a working implementation of this or something similar.
There isn't a straightforward Keras implementation, as Keras enforces the batch axis (samples dimension, dimension 0) as fixed for the input & output layers (but not all layers in-between) - whereas you seek to collapse it by averaging. There is, however, a workaround - see below:
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input, Dense, GRU, Lambda
from tensorflow.keras.layers import Reshape, GlobalAveragePooling1D
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model
import numpy as np
def make_model(batch_shape):
    ipt = Input(batch_shape=batch_shape)
    x = Lambda(lambda x: K.squeeze(x, 0))(ipt)
    x, s = GRU(4, return_state=True)(x) # s == last returned state
    x = Lambda(lambda x: K.expand_dims(x, 0))(s)
    x = GlobalAveragePooling1D()(x) # averages along axis1 (original axis2)
    x = Dense(32, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile('adam', 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return (np.random.randn(*batch_shape),
            np.random.randint(0, 2, (batch_shape[0], 1)))

m, timesteps = 16, 100
batch_shape = (1, m, timesteps, 1)
model = make_model(batch_shape)
model.summary() # see model structure
plot_model(model, show_shapes=True)
x, y = make_data(batch_shape)
model.train_on_batch(x, y)
Above assumes the task is binary classification, but you can easily adapt it to anything else - the main trick is making Keras treat the m samples as a single sample (batch size 1); the remaining layers can freely take m as their batch dimension, as Keras doesn't enforce the 1 there.
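The shape bookkeeping of this workaround can be traced in plain NumPy. This is only a sketch: fake_rnn is a hypothetical stand-in that reproduces the GRU's output shape and is not a recurrent cell, and the weights are random.

```python
import numpy as np

rng = np.random.default_rng(0)
m, timesteps, units = 16, 100, 4

def fake_rnn(seqs, w):
    """Hypothetical stand-in for the GRU: maps (m, timesteps, 1) -> (m, units)."""
    # Not a real recurrent cell; it only reproduces the output shape.
    return np.tanh(seqs[:, -1, :] @ w)

x = rng.standard_normal((1, m, timesteps, 1))  # what Keras sees: one "sample"
w = rng.standard_normal((1, units))

h = np.squeeze(x, axis=0)      # (m, timesteps, 1): the m sequences act as a batch
r = fake_rnn(h, w)             # (m, units): one representation per sequence
r = np.expand_dims(r, axis=0)  # (1, m, units): back to a single sample
pooled = r.mean(axis=1)        # (1, units): average over the m sequences

print(h.shape, r.shape, pooled.shape)
```

The pooled tensor is what the Dense layers receive, which is why the model output has batch size 1 again.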
Note, however, that I cannot guarantee this'll work as intended per the following:
1. Keras treats all entries along the batch axis as independent, whereas your samples are claimed as dependent.
2. Per (1), the main concern is backpropagation: I'm not really sure how the gradient will flow with all the dimensionality shuffling going on.
3. (1) is also consequential for stateful RNNs, as Keras constructs batch_size number of independent states, which'll still likely behave as intended as all they do is keep memory, but still worth understanding fully - see here
(2) is the "elephant in the room", but aside that, the model fits your exact description. Chances are, if you've planned out forward-prop and all dims agree w/ code's, it'll work as intended - else, and also for sanity-check, I'd suggest opening another question to verify gradients flow as you intend them to per above code.
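For the averaging step on its own, the gradient question in (2) can be checked numerically. The sketch below assumes a linear read-out on top of the averaged representations (a simplification, not the Dense stack above) and compares the analytic gradient w / m with central finite differences. It says nothing about the squeeze and expand_dims steps inside Keras itself.

```python
import numpy as np

rng = np.random.default_rng(0)
m, units = 5, 3
r = rng.standard_normal((m, units))  # hypothetical per-sequence representations
w = rng.standard_normal(units)       # hypothetical linear read-out weights

def y_hat(reps):
    # Average over the m sequences, then apply a linear read-out.
    return float(reps.mean(axis=0) @ w)

# Analytic gradient: d y_hat / d r[i, j] = w[j] / m for every sequence i.
analytic = np.tile(w / m, (m, 1))

# Central finite differences as an independent check.
eps = 1e-6
numeric = np.zeros_like(r)
for i in range(m):
    for j in range(units):
        bump = np.zeros_like(r)
        bump[i, j] = eps
        numeric[i, j] = (y_hat(r + bump) - y_hat(r - bump)) / (2 * eps)

gradients_agree = np.allclose(analytic, numeric, atol=1e-6)
print(gradients_agree)
```

Under these assumptions every sequence receives the same share of the gradient, scaled by 1 / m.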
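model.summary() (this output comes from a run with m = 32, GRU(16) and Dense(8), not the m = 16, GRU(4) and Dense(32) used in the code above):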
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(1, 32, 100, 1)] 0
_________________________________________________________________
lambda (Lambda) (32, 100, 1) 0
_________________________________________________________________
gru (GRU) [(32, 16), (32, 16)] 864
_________________________________________________________________
lambda_1 (Lambda) (1, 32, 16) 0
_________________________________________________________________
global_average_pooling1d (Gl (1, 16) 0
_________________________________________________________________
dense (Dense) (1, 8) 136
_________________________________________________________________
dense_1 (Dense) (1, 1) 9
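The parameter counts in this summary can be reproduced by hand. The helper names below are made up for illustration, and the GRU formula assumes one bias vector per weight group; GRU variants with two bias vectors (reset_after=True) give a larger count.

```python
def gru_params(input_dim, units):
    # Three weight groups (update gate, reset gate, candidate state), each with
    # an input kernel, a recurrent kernel and, by assumption, a single bias.
    return 3 * (input_dim * units + units * units + units)

def dense_params(input_dim, units):
    # Kernel plus bias.
    return input_dim * units + units

print(gru_params(1, 16))    # 864
print(dense_params(16, 8))  # 136
print(dense_params(8, 1))   # 9
```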
On LSTMs: with return_state=True, an LSTM returns two last states, one for the hidden state and one for the cell state - see source code; you should understand what this exactly means if you are to use it. If you do, you'll need to concatenate them:
from tensorflow.keras.layers import LSTM, concatenate
# ...
x, s1, s2 = LSTM(4, return_state=True)(x) # s1 == hidden state, s2 == cell state
x = concatenate([s1, s2], axis=-1)
# ...
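A quick shape check of that concatenation in NumPy, with zero and one arrays standing in for the two LSTM states (the values are placeholders, only the shapes matter):

```python
import numpy as np

m, units = 16, 4
state_h = np.zeros((m, units))  # stand-in for the last hidden state
state_c = np.ones((m, units))   # stand-in for the last cell state

merged = np.concatenate([state_h, state_c], axis=-1)
print(merged.shape)  # (16, 8)
```

The feature axis doubles, so the layers after the concatenation see 2 * units features per sequence.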
I have the following network:
model = Sequential()
model.add(Embedding(400000, 100, weights=[emb], input_length=12, trainable=False))
model.add(Conv2D(256,(2,2),activation='relu'))
The output from the Embedding layer is of shape (batchSize, 12, 100). The Conv2D layer requires an input of shape (batchSize, filter, 12, 100), and I get the following error:
Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=3
So, how can I expand the output from the embedding layer to make it proper for the Conv2D layer?
I'm using Keras with Tensorflow as the back end.
Adding a Reshape layer should be the way to go: https://keras.io/layers/core/#reshape
Depending on the concrete situation, Conv1D could also work.
I managed to add another dimension with the following piece of code:
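# expand_dims is assumed to come from the Keras backend
from keras.backend import expand_dims
from keras.layers import Lambda
model = Sequential()
model.add(Embedding(400000, 100, weights=[emb], input_length=12, trainable=False))
model.add(Lambda(lambda x: expand_dims(x, 3)))
model.add(Conv2D(256,(2,2),activation='relu'))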
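The effect on the shapes can be sketched in NumPy. The batch size of 8 is a made-up value, and conv2d_valid_shape is a hypothetical helper that assumes the Conv2D defaults of stride 1, 'valid' padding and channels-last data.

```python
import numpy as np

batch_size, seq_len, emb_dim = 8, 12, 100  # batch size is a made-up value
embedded = np.zeros((batch_size, seq_len, emb_dim))

# Same effect as expand_dims(x, 3) in the Lambda layer: add a channel axis.
with_channel = np.expand_dims(embedded, 3)
print(with_channel.shape)  # (8, 12, 100, 1)

def conv2d_valid_shape(in_shape, filters, kernel):
    """Output shape of a stride-1, 'valid'-padded, channels-last 2D conv."""
    b, h, w, _ = in_shape
    kh, kw = kernel
    return (b, h - kh + 1, w - kw + 1, filters)

conv_shape = conv2d_valid_shape(with_channel.shape, 256, (2, 2))
print(conv_shape)  # (8, 11, 99, 256)
```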
I want to build a customized layer in keras to do a linear transformation on the output of last layer.
For example, I got an output X from last layer, my new layer will output X.dot(W)+b.
The shape of W is (49,10), and the shape of X should be (64,49), the shape of b is (10,)
However, the shape of X is (?, 7, 7, 64), when I am trying to reshape it, it becomes shape=(64, ?). What is the meaning of question mark? Could you tell me a proper way to do linear transformation on the output of last layer?
The question mark generally represents the batch size, which has no effect on the model architecture.
You should be able to reshape your X with keras.layers.Reshape((64,49))(X).
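One caveat, sketched in NumPy: a plain reshape of (7, 7, 64) to (64, 49) keeps the row-major element order, so its rows are not the individual channels. If the intent is one row per channel (an interpretation, not stated in the question), the grid has to be flattened first and the axes swapped afterwards. In Keras that would presumably be Reshape((49, 64)) followed by Permute((2, 1)). The batch size of 2 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 2                                   # the "?" axis; value is arbitrary
X = rng.standard_normal((batch_size, 7, 7, 64))  # channels-last feature maps
W = rng.standard_normal((49, 10))
b = rng.standard_normal(10)

# Plain reshape keeps row-major order, so rows are NOT individual channels.
naive = X.reshape(batch_size, 64, 49)

# Flatten the 7x7 grid first, then move the channel axis in front of it.
per_channel = X.reshape(batch_size, 49, 64).transpose(0, 2, 1)

# Row c of per_channel is channel c of the feature map, flattened.
channel_5 = X[0, :, :, 5].ravel()
print(np.array_equal(per_channel[0, 5], channel_5))
print(np.array_equal(naive[0, 5], channel_5))

out = per_channel @ W + b  # X.dot(W) + b applied to every sample
print(out.shape)
```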
You can wrap arbitrary tensorflow operations such as tf.matmul in a Lambda layer to include custom layers in your Keras model. Minimal working example that does the trick:
import tensorflow as tf
from keras.layers import Dense, Lambda, Input
from keras.models import Model
# tf.random.normal replaces tf.random_normal, which was removed in TensorFlow 2.x
W = tf.random.normal(shape=(128,20))
b = tf.random.normal(shape=(20,))
inp = Input(shape=(10,))
x = Dense(128)(inp)
y = Lambda(lambda x: tf.matmul(x, W) + b)(x)
model = Model(inp, y)
Finally: refer to the Keras documentation on how to write custom layers with trainable weights.
I have an incredibly simple algorithm that is erroring with, "ValueError: Error when checking input: expected dense_4_input to have shape (None, 5) but got array with shape (5, 1)"....
Here is the code I am running.
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
x = np.array([[1],[2],[3],[4],[5]])
y = np.array([[1],[2],[3],[4],[5]])
x_val = np.array([[6],[7]])
x_val = np.array([[6],[7]])
model = Sequential()
model.add(Dense(1, input_dim=5))
model.compile(optimizer='rmsprop', loss='mse')
model.fit(x, y, epochs=2, validation_data=(x_val, y_val))
There are two problems:
First: As the output already says: "ValueError: Error when checking input: expected dense_4_input to have shape (None, 5) but got array with shape (5, 1)". This means that the neural network expects an array of shape (*, 5), where the asterisk indicates a dimension the user is free to choose (the number of samples). If you have lots of data and every example is a vector of shape (1, 5), you can stack them all underneath each other and pass one big chunk of data to the neural net; it will know how to handle it. Therefore you have to make x a row vector as follows:
x = np.array([[1,2,3,4,5]])
See also the Keras docs: Specifying the input shape.
Second: You specify the output of the first Layer to be one. This means, the 5 dimensional input will be connected to only one neuron. Your output vector y however has 5 values. So your output vector dimension and your neural net output don't fit together.
So you have to go with a scalar y:
y = np.array([1])
Furthermore, your validation data and training data should have the same dimensions. Additionally, there is a typo in your code: y_val is never defined.
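The two shape problems can be reproduced without Keras. In this NumPy sketch, kernel and bias are stand-ins for the weights of Dense(1, input_dim=5) with made-up values.

```python
import numpy as np

x_col = np.array([[1], [2], [3], [4], [5]])  # shape (5, 1): 5 samples, 1 feature
x_row = np.array([[1, 2, 3, 4, 5]])          # shape (1, 5): 1 sample, 5 features

# Stand-in weights for Dense(1, input_dim=5): kernel (5, 1), bias (1,).
kernel = np.full((5, 1), 0.1)
bias = np.zeros(1)

pred = x_row @ kernel + bias
print(pred.shape)  # (1, 1): one prediction for one sample

try:
    x_col @ kernel  # (5, 1) @ (5, 1) is not a valid matrix product
    shapes_match = True
except ValueError:
    shapes_match = False
print(shapes_match)  # False
```

An alternative, if the five values were meant as five separate one-feature samples, would be to keep x and y as they are and use input_dim=1 instead.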
For example: I have a tensor with shape (5,10) and I want back a tensor with shape (5,10) but the first element should now be the last element. So [1,2,3,4,5] becomes [5,4,3,2,1] and [[1,2,3,4,5],[2,3,4,5,6]] becomes [[2,3,4,5,6],[1,2,3,4,5]].
If it matters, I am using the tensorflow backend.
Using the Keras backend, there is the reverse function.
import keras.backend as K
flipped = K.reverse(x,axes=0)
For using it in a layer, you can create a Lambda layer:
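from keras.layers import Lambda
# output_shape is optional with the TensorFlow backend; if given, it is the shape of x without the batch axis
layer = Lambda(lambda x: K.reverse(x, axes=0))
(If it's a Sequential model, use model.add(layer); if it's a functional API model, use output = layer(input).)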
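The two examples from the question can be checked with NumPy, using np.flip as a stand-in for K.reverse along axis 0:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([[1, 2, 3, 4, 5],
              [2, 3, 4, 5, 6]])

flipped_a = np.flip(a, axis=0)
flipped_b = np.flip(b, axis=0)

print(flipped_a)  # [5 4 3 2 1]
print(flipped_b)
# [[2 3 4 5 6]
#  [1 2 3 4 5]]
```

Keep in mind that inside a Keras layer axis 0 is the batch axis, so axes=0 reverses the order of the samples in a batch; to reverse within each sample, axes=1 would be the one to use.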