How to use trained Keras CNN model for prediction with new unlabeled data - tensorflow

The temperature prediction time series tutorial on Google Colab provides a good walkthrough of setting up training, validation, and test performance for various models. How can I use the trained multi_conv_model to run a temperature prediction on new, unlabeled data? Specifically, I am looking for how to call the Keras predict function with a dataframe of inputs only.
https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/structured_data/time_series.ipynb
CONV_WIDTH = 3
multi_conv_model = tf.keras.Sequential([
    # Shape [batch, time, features] => [batch, CONV_WIDTH, features]
    tf.keras.layers.Lambda(lambda x: x[:, -CONV_WIDTH:, :]),
    # Shape => [batch, 1, conv_units]
    tf.keras.layers.Conv1D(256, activation='relu', kernel_size=(CONV_WIDTH)),
    # Shape => [batch, 1, out_steps*features]
    tf.keras.layers.Dense(OUT_STEPS*num_features,
                          kernel_initializer=tf.initializers.zeros()),
    # Shape => [batch, out_steps, features]
    tf.keras.layers.Reshape([OUT_STEPS, num_features])
])
history = compile_and_fit(multi_conv_model, multi_window)
IPython.display.clear_output()
multi_val_performance['Conv'] = multi_conv_model.evaluate(multi_window.val)
multi_performance['Conv'] = multi_conv_model.evaluate(multi_window.test, verbose=0)
multi_window.plot(multi_conv_model)
Here's what I tried, but it does not give a meaningful 5-period forecast:
predict_inputs_df = test_df[:20] # or some other input data points
predict_inputs_df = (predict_inputs_df - train_mean) / train_std
predictions = conv_model(tf.stack([np.array(predict_inputs_df)]))
predictions

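You need to call predict rather than evaluate: evaluate expects labels and returns only the loss and metrics, whereas multi_conv_model.predict(tf.stack([np.array(predict_inputs_df)])) returns the forecast itself. Also note that the snippet in the question calls conv_model; if that is the tutorial's single-step model, it will not produce a multi-step forecast, so call multi_conv_model instead.
That should give you an array of shape [1, OUT_STEPS, num_features] in normalized units; multiply by train_std and add train_mean to get values in the original units.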
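A minimal sketch of the full round trip (normalize, add a batch axis, predict, denormalize) might look like the following. The model here is an untrained stand-in with the same layers as the tutorial's multi_conv_model, and the feature names, OUT_STEPS = 5, the statistics, and the forecast helper are all assumptions made for illustration; in practice you would pass the trained model together with the real train_mean and train_std.

```python
import numpy as np
import pandas as pd
import tensorflow as tf

CONV_WIDTH = 3
OUT_STEPS = 5  # assumed forecast horizon
feature_names = ["p (mbar)", "T (degC)", "rh (%)"]  # hypothetical subset of columns
num_features = len(feature_names)

# Untrained stand-in with the same layers as the tutorial's multi_conv_model.
multi_conv_model = tf.keras.Sequential([
    tf.keras.layers.Lambda(lambda x: x[:, -CONV_WIDTH:, :]),
    tf.keras.layers.Conv1D(256, activation="relu", kernel_size=CONV_WIDTH),
    tf.keras.layers.Dense(OUT_STEPS * num_features),
    tf.keras.layers.Reshape([OUT_STEPS, num_features]),
])

# Hypothetical training statistics and 20 rows of new, unlabeled data.
train_mean = pd.Series([989.0, 9.0, 76.0], index=feature_names)
train_std = pd.Series([8.0, 8.5, 16.0], index=feature_names)
rng = np.random.default_rng(0)
raw = rng.normal(size=(20, num_features)) * train_std.to_numpy() + train_mean.to_numpy()
new_df = pd.DataFrame(raw, columns=feature_names)


def forecast(model, inputs_df, mean, std):
    """Return an OUT_STEPS-row forecast in the original units."""
    normalized = (inputs_df - mean) / std
    # Keras expects [batch, time, features]; add a batch axis of size 1.
    batch = normalized.to_numpy(dtype="float32")[np.newaxis, ...]
    scaled = model.predict(batch, verbose=0)[0]
    # Undo the normalization column by column.
    return pd.DataFrame(scaled, columns=inputs_df.columns) * std + mean


predictions_df = forecast(multi_conv_model, new_df, train_mean, train_std)
print(predictions_df.shape)
```

The input needs at least CONV_WIDTH rows, because the Lambda layer keeps only the last CONV_WIDTH time steps.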

Related

multi gpu mixed_precision in Tensorflow yields NaN prediction value

def get_model():
    with STRATEGY.scope():
        # Set seed for deterministic weights initialization
        seed_everything()
        # Inputs, note the names are equal to the dictionary keys in the dataset
        image = tf.keras.layers.Input(INPUT_SHAPE, name='image', dtype=tf.uint8)
        image_norm = normalize(image)
        backbone = convnext.ConvNeXtV2Tiny(
            input_shape=(IMG_HEIGHT, IMG_WIDTH, 3),
            pretrained='imagenet21k-ft1k',
            num_classes=0,
        )
        # CNN backbone features
        features = backbone(image_norm)
        # Average Pooling BxHxWxC -> BxC
        pooled_features = tf.keras.layers.GlobalAveragePooling2D()(features)
        dropout_features = tf.keras.layers.Dropout(0.30)(pooled_features)
        # Output value between [0, 1] using Sigmoid function
        outputs = tf.keras.layers.Dense(1, activation='sigmoid')(dropout_features)
        # Loss
        loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        model = tf.keras.models.Model(inputs=image, outputs=outputs)
        model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
        return model
When I set mixed_precision.Policy('mixed_float16'), the model's prediction is NaN. However, with mixed_precision.Policy('float32'), the prediction is between 0 and 1.
Is there a reason why mixed precision causes NaN values?
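One thing worth checking (an assumption about a possible cause, not a confirmed diagnosis of this model) is whether the final sigmoid is computed in float16. The TensorFlow mixed-precision guide recommends keeping the output activation in float32 by passing dtype='float32' to that layer. The tiny model below is a hypothetical stand-in, not the ConvNeXt model from the question; it only shows where the dtype override goes.

```python
import tensorflow as tf

mixed_precision = tf.keras.mixed_precision
mixed_precision.set_global_policy("mixed_float16")

# Hypothetical stand-in model: one hidden layer plus a sigmoid output.
inputs = tf.keras.Input(shape=(16,), name="features")
hidden_layer = tf.keras.layers.Dense(8, activation="relu")
hidden = hidden_layer(inputs)
logits = tf.keras.layers.Dense(1)(hidden)
# Override the policy for the last activation so it runs in float32.
outputs = tf.keras.layers.Activation("sigmoid", dtype="float32")(logits)
model = tf.keras.Model(inputs, outputs)

print(hidden_layer.compute_dtype)  # hidden layers still compute in float16
print(outputs.dtype)               # the model output is float32

# Restore the default policy so later code is unaffected.
mixed_precision.set_global_policy("float32")
```

If the NaN persists with a float32 output, the next candidates to inspect would be the input normalization and any custom layers inside the backbone.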

Forecasting a subset of time-series features

I've followed the TensorFlow tutorial on time series forecasting:
https://www.tensorflow.org/tutorials/structured_data/time_series#multi-output_models
The tutorial demonstrates how to forecast a single feature or all features for a single time step using a residual wrapper with an LSTM. How do I predict a subset (two) of the features?
The code for predicting a single feature:
wide_window = WindowGenerator(
    input_width=24, label_width=24, shift=1,
    label_columns=['T (degC)'])
for example_inputs, example_labels in wide_window.train.take(1):
    print(f'\nInputs shape (batch, time, features): {example_inputs.shape}')
    print(f'Labels shape (batch, time, features): {example_labels.shape}\n')
class ResidualWrapper(tf.keras.Model):
    def __init__(self, model):
        super().__init__()
        self.model = model
    def call(self, inputs, *args, **kwargs):
        delta = self.model(inputs, *args, **kwargs)
        # The prediction for each timestep is the input
        # from the previous time step plus the delta
        # calculated by the model.
        return inputs + delta
residual_lstm = ResidualWrapper(
    tf.keras.Sequential([
        # Shape [batch, time, features] => [batch, time, lstm_units]
        tf.keras.layers.LSTM(32, return_sequences=True),
        # Shape => [batch, time, features]
        tf.keras.layers.Dense(
            units=1,
            # The predicted deltas should start small
            # So initialize the output layer with zeros
            kernel_initializer=tf.initializers.zeros)
    ]))
When I try predicting both T (degC) and p (mbar):
wide_window = WindowGenerator(
    input_width=24, label_width=24, shift=1,
    label_columns=['T (degC)','p (mbar)'])
residual_lstm = ResidualWrapper(
    tf.keras.Sequential([
        # Shape [batch, time, features] => [batch, time, lstm_units]
        tf.keras.layers.LSTM(32, return_sequences=True),
        # Shape => [batch, time, features]
        tf.keras.layers.Dense(
            units=2,
            # The predicted deltas should start small
            # So initialize the output layer with zeros
            kernel_initializer=tf.initializers.zeros)
    ]))
I get an error:
ValueError: Dimensions must be equal, but are 19 and 2 for '{{node residual_wrapper/add}} = AddV2[T=DT_FLOAT](IteratorGetNext, residual_wrapper/sequential/dense/BiasAdd)' with input shapes: [?,24,19], [?,24,2].
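The error comes from `inputs + delta`: the inputs carry all 19 features while the Dense layer now emits 2. One possible fix, sketched below, is to add the delta only to the input columns that are being forecast. SubsetResidualWrapper, the column_indices mapping, and the random example batch are hypothetical names and data chosen for illustration; in the tutorial the mapping would be derived from the training dataframe's column order.

```python
import numpy as np
import tensorflow as tf

# Hypothetical column order; derive this from the real dataframe columns.
num_features = 19
column_indices = {"p (mbar)": 0, "T (degC)": 1}
label_columns = ["T (degC)", "p (mbar)"]
label_indices = [column_indices[name] for name in label_columns]


class SubsetResidualWrapper(tf.keras.Model):
    """Adds the predicted delta only to the input columns being forecast."""

    def __init__(self, model, label_indices):
        super().__init__()
        self.model = model
        # Stored as a NumPy array so Keras treats it as plain data.
        self.label_indices = np.asarray(label_indices, dtype="int32")

    def call(self, inputs, *args, **kwargs):
        delta = self.model(inputs, *args, **kwargs)
        # [batch, time, 19] -> [batch, time, len(label_indices)]
        previous = tf.gather(inputs, self.label_indices, axis=-1)
        return previous + delta


residual_lstm = SubsetResidualWrapper(
    tf.keras.Sequential([
        tf.keras.layers.LSTM(32, return_sequences=True),
        # One output per forecast column; zero kernel so deltas start at 0.
        tf.keras.layers.Dense(units=len(label_indices),
                              kernel_initializer="zeros"),
    ]),
    label_indices)

rng = np.random.default_rng(0)
example_inputs = rng.normal(size=(4, 24, num_features)).astype("float32")
predictions = residual_lstm(example_inputs)
print(predictions.shape)
```

The indices are gathered in the same order as label_columns, so the output columns line up with the labels for those columns.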

How to stop training CNN part while continue training ANN part in a Multi-input Model?

I made a multi-input model in Keras that takes images of shape [N, 640, 480, 3] as well as numerical data of shape [N, 19] and predicts one of 12 classes.
The following is the model-definition part of the code:
# # %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# # MODEL === CNN
# # %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#
base_model = keras.applications.ResNet50(
    weights='imagenet',  # Load weights pre-trained on ImageNet.
    input_shape=(640, 480, 3),
    include_top=False)  # Do not include the ImageNet classifier at the top.
base_model.trainable = False
input_Cnn = keras.Input(shape=(640, 480, 3))
x = base_model(input_Cnn, training=False)
# Convert features of shape `base_model.output_shape[1:]` to vectors
x = keras.layers.GlobalAveragePooling2D()(x)
# Dense layers on top of the pooled CNN features (12 outputs, one per class)
x1 = keras.layers.Dense(1024, activation="relu")(x)
out_Cnn = keras.layers.Dense(12, activation="relu")(x1)
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# MODEL === NN
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
inp_num = keras.layers.Input(shape=(19,)) # no. of columns of the numerical data
fc1 = keras.layers.Dense(units=2 ** 6, activation="relu")(inp_num)
fc2 = keras.layers.Dense(units=2 ** 8, activation="relu")(fc1)
fc3 = keras.layers.Dense(units=2 ** 10, activation="relu")(fc2)
fc4 = keras.layers.Dense(units=2 ** 8, activation="relu")(fc3)
fc5 = keras.layers.Dense(units=2 ** 6, activation="relu")(fc4)
out_NN = keras.layers.Dense(12, activation="relu")(fc5)
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# CONCATENATION
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
result = keras.layers.concatenate((out_Cnn, out_NN), axis=-1) # [N, 12] --- concatenate [N, 12] ==> [N, 24]
result = keras.layers.Dense(1024, activation='relu')(result)
result = keras.layers.Dense(units=12, activation="softmax")(result)
model = keras.Model([input_Cnn, inp_num], result)
model.summary()  # summary() prints the table itself and returns None
The problem is that the CNN part (if trained independently) converges in fewer epochs, while the ANN part (if trained independently) takes longer (more epochs). But in this code, where both are combined, accuracy doesn't go beyond 10%. Is there any way to stop gradients from flowing into the CNN part after a certain number of epochs, so that from then on the model trains only the ANN part?
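I'm not using Keras, but after a quick Google search this should be the answer:
You can freeze layers, so that certain parameters are not learnable anymore:
# this freezes the first N layers
for layer in model.layers[:N]:
    layer.trainable = False
where N is the number of layers you want to freeze. In a multi-branch model, check that model.layers[:N] really covers the CNN branch, since the layers of both branches can be interleaved in that list. Also note that in Keras a change to trainable only takes effect after you call compile() again.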
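A two-phase version of this idea can be sketched as follows: train everything for a few epochs, then freeze the CNN branch by layer name, recompile, and continue training. The tiny model, the "cnn_" name prefix, and the random data are hypothetical and only stand in for the ResNet50-based model in the question.

```python
import numpy as np
import tensorflow as tf

# Tiny hypothetical two-branch model; CNN-branch layers share a name prefix.
img_in = tf.keras.Input(shape=(8, 8, 3))
x = tf.keras.layers.Conv2D(4, 3, activation="relu", name="cnn_conv")(img_in)
x = tf.keras.layers.GlobalAveragePooling2D(name="cnn_pool")(x)
out_cnn = tf.keras.layers.Dense(3, activation="relu", name="cnn_dense")(x)

num_in = tf.keras.Input(shape=(5,))
out_nn = tf.keras.layers.Dense(3, activation="relu", name="nn_dense")(num_in)

merged = tf.keras.layers.concatenate([out_cnn, out_nn])
outputs = tf.keras.layers.Dense(3, activation="softmax", name="head")(merged)
model = tf.keras.Model([img_in, num_in], outputs)

rng = np.random.default_rng(0)
images = rng.random((16, 8, 8, 3)).astype("float32")
numbers = rng.random((16, 5)).astype("float32")
labels = rng.integers(0, 3, size=16)

# Phase 1: train both branches together.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit([images, numbers], labels, epochs=1, verbose=0)

# Phase 2: freeze the CNN branch, then recompile so the change takes effect.
for layer in model.layers:
    if layer.name.startswith("cnn_"):
        layer.trainable = False
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

cnn_before = [w.copy() for w in model.get_layer("cnn_conv").get_weights()]
model.fit([images, numbers], labels, epochs=1, verbose=0)
cnn_after = model.get_layer("cnn_conv").get_weights()

# True if the frozen convolution weights were left untouched in phase 2.
print(all(np.array_equal(a, b) for a, b in zip(cnn_before, cnn_after)))
```

Selecting layers by name rather than by position avoids depending on the order of model.layers in a multi-branch graph.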

NaN loss in CNN-LSTM on Keras for Time Series forecasting

I have to predict the time dependence of soil wetness from rainfall and several other time series. I have forecasts for all of those series, so the only thing left to predict is soil wetness.
Following a guide, I built a CNN model, because ARIMA models can't take external stochastic influences into account.
The model works, but not as it should.
If you look at this picture [image not preserved], you'll see that the forecast series (yellow, smsfu_sum) doesn't depend on rain (the aprec series) the way it does in the training set. I want a sharp peak in the forecast, but changing the kernel and pooling sizes doesn't help.
So I tried to train a CNN-LSTM model based on this guide.
Here's the code for the model architecture:
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 1, 20, 32
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='softmax', input_shape=(n_timesteps,n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='softmax'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='softmax')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model
I used a batch size of 32 and split the data with this function:
def to_supervised(train, n_input, n_out=300):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            X.append(data[in_start:in_end, :])
            y.append(data[in_end:out_end, 2])
        # move along one time step
        in_start += 1
    return array(X), array(y)
I use n_input = 1000 and n_output = 480 (the horizon I have to predict).
On the very first iteration, the loss becomes NaN.
How should I fix it? There are no missing values in my data; I dropped every NaN.
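The thread does not include an answer. A common first check, offered here as an assumption rather than a diagnosis of this dataset, is whether the inputs contain non-finite values (inf as well as NaN) and whether the features are on very different scales, since unscaled inputs combined with a relu-activated LSTM can overflow quickly. The sketch below uses NumPy only; check_finite, fit_scaler, and the synthetic data are hypothetical.

```python
import numpy as np


def check_finite(name, values):
    """Raise if the array contains NaN or +/-inf."""
    bad = int(np.count_nonzero(~np.isfinite(values)))
    if bad:
        raise ValueError(f"{name} contains {bad} NaN/inf values")


def fit_scaler(train_2d):
    """Per-feature mean and std computed on the training rows only."""
    mean = train_2d.mean(axis=0)
    std = train_2d.std(axis=0)
    # Guard against constant columns, which would divide by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std


# Hypothetical data: 3 features on very different scales.
rng = np.random.default_rng(0)
raw = rng.normal(loc=[0.0, 500.0, 30.0], scale=[1.0, 200.0, 5.0], size=(2000, 3))

check_finite("raw", raw)
mean, std = fit_scaler(raw)
scaled = (raw - mean) / std
print(scaled.mean(axis=0).round(3), scaled.std(axis=0).round(3))
```

If the loss still diverges on scaled data, other options usually tried are the default tanh activation in the LSTM layer and gradient clipping through the optimizer's clipnorm argument.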

How can I create a `tf.data.Dataset` for a `tf.keras` model that accepts multiple inputs of various shapes?

I have a tf.keras model that needs to accept multiple inputs of multiple shapes. My goal is to build it in such a way that I can train and evaluate it easily using its fit and evaluate API.
So far, the model is built as follows:
class MultipleLSTM(Model):
    def __init__(self, lstm_dims=128, name='multi_lstm', **kwargs):
        super(MultipleLSTM, self).__init__(name=name)
        # initialize encoders for every attribute
        self.encoders = []
        for key, value in kwargs.items():
            self.encoders.append(self._create_encoder(lstm_dims, value))
        # initialize the rest of the network layers
        self.concat = Concatenate(axis=0)
        self.conv_1 = Conv2D(6, 4, activation='relu')
        self.flatten = Flatten()
        self.dense = Dense(128, activation='relu')
        self.out = Dense(1, activation='sigmoid')
    def call(self, inputs):
        x_1 = self.encoders[0](inputs[0])
        x_2 = self.encoders[1](inputs[1])
        x_3 = self.encoders[2](inputs[2])
        x_4 = self.encoders[3](inputs[3])
        x = self.concat([x_1, x_2, x_3, x_4])
        # fix the shape for the convolutions
        x = tf.expand_dims(x, axis=0)
        x = tf.expand_dims(x, axis=3)
        x = self.conv_1(x)
        x = self.flatten(x)
        x = self.dense(x)
        x = self.out(x)
        return x
    def _create_encoder(self, lstm_dims, conf):
        with tf.name_scope(conf['name']) as scope:
            encoder = tf.keras.Sequential(name=scope)
            encoder.add(Embedding(conf['vocab'],
                                  conf['embed_dim'],
                                  input_length=conf['input_length']))
            encoder.add(Bidirectional(LSTM(lstm_dims)))
            return encoder
There are four different inputs, text sentences of different lengths, that are fed to four different Embedding and LSTM layers (encoders). Then the outputs of those layers are concatenated to create a single tensor that is forwarded to the subsequent layers.
To train this network, I pass as input a list of lists, one per tokenized sentence. The label is just a number, 0 or 1 (binary classification). For example, an input could be:
x = [[1, 2, 3, 4],
     [2, 3, 5],
     [3, 5, 6, 7],
     [1, 5, 7]]
y = 0
For now, I have implemented a custom loop that takes such input and trains the network:
def train(data, model, loss_fn, optimizer, metric, epochs=10, print_every=50):
    for epoch in range(epochs):
        print(f'Start of epoch {epoch+1}')
        for step, (x_batch, y_batch) in enumerate(data):
            with GradientTape() as tape:
                output = model(x_batch)
                loss = loss_fn(y_batch, output)
            grads = tape.gradient(loss, model.trainable_weights)
            optimizer.apply_gradients(zip(grads, model.trainable_weights))
            metric(loss)
            if step % print_every == 0:
                print(f'step {step}: mean loss = {metric.result()}')
But this prevents me from using the easy-to-use tf.keras API to fit and evaluate the model, or even to split the dataset into train and test sets. Thus, the question is: how can I create a tf.data.Dataset from such x's and y's and pass it to the fit function of tf.keras?
You can use the Keras functional API to do so. See the Keras documentation page "Multi-input and multi-output models" for details.
You can pass the different inputs directly as a list to the fit and evaluate methods:
model.fit([X_train[:,0], X_train[:,1]], y_train, ...)
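To go one step further and feed a tf.data.Dataset, one option is to pad each attribute to a rectangular array and build the dataset from a dict of arrays keyed by input name. The sketch below is one way to do it under assumptions: the pad helper, the attr_1 to attr_4 names, the toy data, and the small layer sizes are hypothetical, and the functional model only approximates the four-encoder design from the question.

```python
import numpy as np
import tensorflow as tf


def pad(sequences, pad_value=0):
    """Right-pad variable-length token lists to a rectangular int32 array."""
    width = max(len(seq) for seq in sequences)
    padded = np.full((len(sequences), width), pad_value, dtype="int32")
    for row, seq in enumerate(sequences):
        padded[row, :len(seq)] = seq
    return padded


# Hypothetical toy data: 3 samples, each with 4 tokenized text attributes.
samples = [
    [[1, 2, 3, 4], [2, 3, 5], [3, 5, 6, 7], [1, 5, 7]],
    [[4, 1], [2, 2, 2, 6], [7], [3, 3]],
    [[5, 6, 7], [1], [2, 4], [6, 5, 4, 3, 2]],
]
labels = np.array([[0], [1], [1]], dtype="float32")
names = ["attr_1", "attr_2", "attr_3", "attr_4"]

# One padded array per attribute, keyed by the matching Input name.
features = {name: pad([sample[i] for sample in samples])
            for i, name in enumerate(names)}
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

# Functional model with one encoder per attribute (token 0 is padding).
inputs = {name: tf.keras.Input(shape=(None,), dtype="int32", name=name)
          for name in names}
encoded = []
for name in names:
    embedded = tf.keras.layers.Embedding(8, 4, mask_zero=True)(inputs[name])
    encoded.append(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(4))(embedded))
merged = tf.keras.layers.Concatenate(axis=-1)(encoded)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(merged)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

model.compile(optimizer="adam", loss="binary_crossentropy")
history = model.fit(dataset, epochs=1, verbose=0)
loss = model.evaluate(dataset, verbose=0)
predictions = model.predict(dataset, verbose=0)
print(predictions.shape)
```

Because the data is now a tf.data.Dataset, a train/test split can be made with dataset.take and dataset.skip before batching.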