Build a Keras model that takes a structured array as an input - numpy

I'm doing Reinforcement Learning to teach agents to accomplish tasks in a 2-dimensional world. A big part of that is to figure out how to represent their environment as neurons.
So far I've represented the world as a 3-d grid of shape (10, 10, 7). The first two 10s are there because the grid is 10 cells in each direction, and the 7 is because I have 7 different kinds of things to say about each space (whether it has food, an enemy, a wall...).
I then used convolutional layers in Keras to process this information and learn from it. It worked and the creatures are successfully walking towards the food.
Now I would like to also add more information that the neural network might figure out how to use. For example, I'd like to encode the last action the agent took. I might also encode the distance or angle to the nearest food. Obviously, this is not 3-d data; it's a sequence of 1-d values.
I want Keras to be able to use that as input together with the 3-d input, and learn from it. I've represented that combined data as a structured array in NumPy:
observation = np.zeros((1,), dtype=[('grid', np.float64, (10, 10, 7,)), ('sequential', np.float64, (7,))])
That way it's possible to access the grid data as observation['grid'] and the sequential data as observation['sequential'].
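For illustration, a minimal sketch of that structured array and how its fields come out (shapes follow the dtype above):
import numpy as np
# One observation holding two named fields
observation = np.zeros((1,), dtype=[('grid', np.float64, (10, 10, 7)),
                                    ('sequential', np.float64, (7,))])
print(observation['grid'].shape)        # (1, 10, 10, 7)
print(observation['sequential'].shape)  # (1, 7)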
Unfortunately I don't know how to get Keras to work with this kind of structured array. My reasoning is that I should build a model using the Functional API, with two input "prongs" that get concatenated at some point and merged into a final output layer.
But, I have no idea how to make Keras figure out that the NumPy structured array should be broken down to the subarrays that it's made of. Is that possible?
If I'm going the wrong way with this, please advise.

In Keras you can feed multiple inputs as follows:
from keras.models import Model
from keras.layers import Input, Conv2D, Dense, Flatten, concatenate
first_input = Input(shape=(10, 10, 7))
second_input = Input(shape=(7,))
c1 = Conv2D(32, (3, 3), padding='same')(first_input)
c1 = Flatten()(c1)
d1 = Dense(10)(second_input)
m = concatenate([c1, d1])
m = Dense(5, activation='softmax')(m)
model = Model(inputs=[first_input, second_input], outputs=m)
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit([observation['grid'], observation['sequential']], Y_train)
This is a rough design, but it will take your two inputs, concatenate them, and produce a result.
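In other words, Keras never sees the structured array itself: you index its fields yourself (observation['grid'], observation['sequential']) and pass the resulting plain arrays as a list of inputs to fit/predict, one array per Input layer.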

Related

Keras Conv3D Layer with Discrete Values

I'm trying to build a model that will learn features of a 3D space. Unlike image processing, the values of the 3D matrix are not continuous; they represent some discrete value of what "material" can be found at that specific coordinate (grass with value 1 or stairs with value 2 for example).
Is it possible to train a model to learn the features of the space without interpolating in-between values? For example, I don't want the neural net to deduce 1.5 to be some kind of grass stairs.
You'll want to use one-hot encoding, which represents categorical values as arrays of zeroes with a single value set to one. This means that grass (id = 1) would be [0, 1, 0, 0, ...] and stairs (id = 2) would be [0, 0, 1, 0, ...]. To perform one-hot encoding, look into keras' to_categorical function.
Further reading:
one-hot encoding tutorial
one-hot preprocessing using to_categorical
one-hot on the fly using an embedding layer
As with any categorical model, this should be "one-hot" data.
The "channels" dimension of your data should have a size of n-materials.
Values = 0 mean there is no presence of that material
Values = 1 mean there is presence of that material
So, your input shape will be something like (samples, spatial1, spatial2, spatial3, materials). If your data is currently shaped as (samples, s1, s2, s3) and has the materials as integers as you described, you can use to_categorical to transform the integers to "one-hot".
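As a minimal sketch of that transformation (the shapes and number of materials here are illustrative assumptions, not taken from the question):
import numpy as np
from keras.utils import to_categorical
n_materials = 4  # illustrative: e.g. 0 = empty, 1 = grass, 2 = stairs, ...
grid = np.random.randint(0, n_materials, size=(8, 16, 16, 16))  # (samples, s1, s2, s3)
one_hot = to_categorical(grid, num_classes=n_materials)
# Some Keras versions flatten the leading dimensions, so reshape to be safe:
one_hot = one_hot.reshape(grid.shape + (n_materials,))
print(one_hot.shape)  # (8, 16, 16, 16, 4) -> (samples, s1, s2, s3, materials)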
Although I am not sure if this is exactly what you are asking for, I would imagine that after the bottleneck of the convolutional network one would typically use a Flatten layer, and then the output goes to a dense layer. The output layer, if using a sigmoid activation, will give you probabilities for each of the classes, which have to be one-hot encoded, as others have suggested.
If you want the output of the network itself to be in discrete values, I suppose you could use some sort of step-wise activation function in the output layer. However, you have to take care that your loss remains differentiable throughout the network (which is why such activation functions are not available in Keras). This might be of interest: https://github.com/keras-team/keras/issues/7370

Which dimension of the data does an LSTM model treat as the sequence?

I know that an LSTM layer expects a 3-dimensional input (samples, timesteps, features). But which of these dimensions is treated as the data sequence?
Reading some sites, I understood that it is the timesteps dimension, so I tried to create a simple problem to test this.
In this problem, the LSTM model needs to sum the values along the timesteps dimension. Then, assuming that the model will consider the previous values of the timestep, it should return the sum of the values as output.
I tried to fit with 4 samples and the result was not good. Does my reasoning make sense?
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM
X = np.array([
[5.,0.,-4.,3.,2.],
[2.,-12.,1.,0.,0.],
[0.,0.,13.,0.,-13.],
[87.,-40.,2.,1.,0.]
])
X = X.reshape(4, 5, 1)
y = np.array([[6.],[-9.],[0.],[50.]])
model = Sequential()
model.add(LSTM(5, input_shape=(5, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=1000, batch_size=4, verbose=0)
print(model.predict(np.array([[[0.],[0.],[0.],[0.],[0.]]])))
print(model.predict(np.array([[[10.],[-10.],[10.],[-10.],[0.]]])))
print(model.predict(np.array([[[10.],[20.],[30.],[40.],[50.]]])))
output:
[[-2.2417212]]
[[7.384143]]
[[0.17088854]]
First of all, yes, you're right that the timesteps dimension is the one treated as the data sequence.
Next, I think there is some confusion about what you mean by this line
"assuming that the model will consider the previous values of the
timestep"
In any case, the LSTM doesn't take the previous values of the time steps as such; rather, it takes the activation output from the previous time step.
Also, the reason that your output is wrong is that you're using a very small dataset to train the model. Recall that, no matter what algorithm you use in machine learning, it needs many data points. In your case, 4 data points are not enough to train the model. I used slightly more parameters and a larger generated training set, and the results improved.
However, remember that there is a small problem here. I initialised the training data between 0 and 50, so if you make predictions on any number outside of this range, they won't be accurate anymore. The farther the numbers are from this range, the lower the accuracy. This is because it has become more of a function-mapping problem than addition. By function mapping, I mean that your model will learn to map all values that are in the training set (provided it's trained for enough epochs) to outputs. You can learn more about it here.
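As a rough sketch of that point (the dataset size, layer width and epoch count below are illustrative assumptions, not the answerer's exact setup), you could generate many sum examples in a bounded range and train the same kind of model:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM
# Generate many (sequence, sum) pairs with values kept inside a bounded range
n_samples, timesteps = 5000, 5
X = np.random.uniform(0, 50, size=(n_samples, timesteps, 1))
y = X.sum(axis=1)  # shape (n_samples, 1): the sum over the timesteps axis
model = Sequential()
model.add(LSTM(32, input_shape=(timesteps, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=20, batch_size=64, verbose=0)
# Inputs inside the training range are mapped reasonably well; far outside it,
# accuracy degrades, as discussed above.
print(model.predict(np.array([[[10.], [20.], [5.], [0.], [1.]]])))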

Keras: how to append a vector to a tensor

I have a network with keras / tf where two branches are built:
- one where a short sequence of words is transformed into 300-dim embeddings
- the other where the same sequence of words is transformed into ngrams
I then end up with two data structures:
termwords.shape = (?, 42, 300)
termngrams.shape = (?, 42)
(I make sure that both branches have the same 'length' of 42, i.e. at most 42 words and at most 42 ngrams, padding/cutting where needed.) I'd then need to merge these into one branch to arrive at the prediction layer.
But
merged = merge([termwords, termngrams], mode='concat')
tells me that the ranks don't match. I was hoping concat would allow me to append the 'termngrams' to the 'termwords' such that I end up just with a data structure of shape (?,42,301). But I can't find the proper way to express that.
The "rank" error is telling you that the tensors don't have the same number of dimensions. One is 2D and the other is 3D.
Use a Lambda layer with expand_dims to add an extra dimension to the 2D one.
import keras.backend as K
from keras.layers import Lambda, Concatenate
termngrams = Lambda(lambda x: K.expand_dims(x))(termngrams)  # outputs (?, 42, 1)
Then use a Concatenate() layer (by default it concatenates along the last axis, as you want):
merged = Concatenate()([termwords, termngrams])
(This assumes you're using a functional API Model rather than Sequential models; Sequential models aren't good for branching.)
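Put together, a self-contained sketch of that fix (the input shapes come from the question; the final Dense head is just a placeholder assumption):
import keras.backend as K
from keras.layers import Input, Lambda, Concatenate, Dense, Flatten
from keras.models import Model
termwords = Input(shape=(42, 300))   # word-embedding branch
termngrams = Input(shape=(42,))      # ngram branch, one value per slot
expanded = Lambda(lambda x: K.expand_dims(x))(termngrams)  # (?, 42, 1)
merged = Concatenate()([termwords, expanded])              # (?, 42, 301)
out = Dense(1, activation='sigmoid')(Flatten()(merged))    # placeholder prediction head (assumption)
model = Model(inputs=[termwords, termngrams], outputs=out)
model.summary()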

Time series translation into Keras input

I have time series data (ECG) with annotations for blocks of 30 seconds.
Each block has 1000 data points, and we have 500 of those data blocks.
The targets (annotations) are values, e.g. in the range 1 to 5.
To make this clear, please see the figure.
About X-DATA
How do I translate that into the Keras notation for input data [samples, timesteps, features]?
My guess:
Samples=Blocks (500)
timesteps=values(1000)
features = the ECG signal itself (1)
resulting in [500,1000,1]
About Y-Data(target)
My target or y data would result in
[500,1,1]
after one hot encoding it would be
[500,5,1]
The problem is that Keras expects the X and y data to have the same dimensions. But increasing my y data to 1000 per timestep would not make sense to me.
Thanks for your help
P.S. I cannot answer directly right now, as I am with my parents-in-law. Thanks in advance.
I think you're thinking about y incorrectly, from my understanding based on your graph.
y actually is (500, 5) after one hot encoding. That is, for every block there is a single outcome.
Also there is no need for X and y to have the same dimensions in Keras (unless you have a seq2seq requirement which is not the case here).
What we do want is for the model to give us a probability distribution over the possible labels for each block, and that we'll achieve using a softmax on the last (Dense) layer.
Here is how I simulated your problem:
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, LSTM
series = np.random.rand(500, 1000, 1)
# np.eye doesn't give real one-hot labels, but it works for the example
labels = np.eye(500, 5)
inp = Input(shape=(1000, 1))
lstm = LSTM(128)(inp)
out = Dense(5, activation='softmax')(lstm)
model = Model(inputs=[inp], outputs=[out])
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(series, labels)

Why does Keras to_categorical method not return 3-D tensor when inputting 2-D tensor?

I was trying to build an LSTM neural net with Keras to predict tags for words in a set of sentences.
The implementation is all pretty straightforward, but the surprising thing was that, given exactly the same and otherwise correctly implemented code, using TensorFlow 1.4.0 with Keras running on the TensorFlow backend, on some people's computers it returned tensors with the wrong dimensions, while for others it worked perfectly.
The problem occurred in the following context:
First, we turned the list of training sentences (each sentence being a list of word indices) into a 2-D matrix using the pad_sequences method from Keras (https://keras.io/preprocessing/sequence/):
from keras.preprocessing.sequence import pad_sequences

def do_padding(sequences, length, padding_value):
    return pad_sequences(sequences, maxlen=length, padding='post',
                         truncating='post', value=padding_value)

train_sents_padded = do_padding(train_sents, MAX_LENGTH,
                                word_to_id[PAD_TOKEN])
Next, we used our do_padding method on the corresponding training labels to turn them into a padded matrix. At the same time, we used the Keras to_categorical method (https://keras.io/utils/#to_categorical) to add a one-hot encoded vector to the created label matrix (one one-hot vector for each cell in the matrix, that is, for each word in each training sentence):
train_labels_padded = to_categorical(do_padding(train_labels, MAX_LENGTH,
                                                label_to_id["O"]), NUM_LABELS)
We expected the resulting shape to be 3-D: (len(train_labels), MAX_LENGTH, NUM_LABELS). Yet, we found that the resulting shape was 2-D and basically looked like this: ((len(train_labels) x MAX_LENGTH), NUM_LABELS), meaning the numbers on the two expected dimensions len(train_labels) and MAX_LENGTH were multiplied and flattened into one dimension.
Interestingly, as said before, this problem only occurred for about 50% of the people, all using TensorFlow 1.4.0 and Keras running on the TensorFlow backend.
We managed to solve the problem by reshaping the label matrix manually:
train_labels_padded = np.reshape(train_labels_padded, (len(train_labels),
                                                       MAX_LENGTH, NUM_LABELS))
I was just wondering if any of you have experienced a similar problem and have figured out the reason why this happens.