Keras: Dropout with functional API in MLP [closed] - tensorflow

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed last year.
I am using the functional API from Keras and would like to add dropout to my multilayer perceptron.
Do I have to put the dropout before or after the layer, and do I have to connect the next layer to the dropout or to the previous layer?
hidden_Layer_2 = Dense(152, activation='relu')(hidden_Layer_1)
dropout_2 = Dropout(0.4)(hidden_Layer_2)
hidden_Layer_3 = Dense(152, activation='relu')(hidden_Layer_2)
or
hidden_Layer_2 = Dense(152, activation='relu')(hidden_Layer_1)
dropout_2 = Dropout(0.4)(hidden_Layer_2)
hidden_Layer_3 = Dense(152, activation='relu')(dropout_2)

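The second option is the right one. In the functional API you always need to connect the layers in the order you want to use them: each layer must be called on the output of the layer before it. In the first option nothing consumes dropout_2, so the dropout is never applied.
hidden_Layer_2 = Dense(152, activation='relu')(hidden_Layer_1)
dropout_2 = Dropout(0.4)(hidden_Layer_2)
hidden_Layer_3 = Dense(152, activation='relu')(dropout_2)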
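To check which wiring actually puts the dropout into the network, both options can be built side by side and inspected. This is a minimal sketch, not the asker's full model: the input width (20), the single-unit output layer, and the helper has_dropout are assumptions added for illustration.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense, Dropout

# Placeholder input width; the original question does not state it.
inputs = Input(shape=(20,))
hidden_Layer_1 = Dense(152, activation='relu')(inputs)
hidden_Layer_2 = Dense(152, activation='relu')(hidden_Layer_1)
dropout_2 = Dropout(0.4)(hidden_Layer_2)

# Option 2: the next layer consumes the dropout output.
hidden_Layer_3 = Dense(152, activation='relu')(dropout_2)
connected = Model(inputs=inputs, outputs=Dense(1)(hidden_Layer_3))

# Option 1: the next layer bypasses dropout_2 and reads hidden_Layer_2 directly.
bypass_Layer_3 = Dense(152, activation='relu')(hidden_Layer_2)
bypassed = Model(inputs=inputs, outputs=Dense(1)(bypass_Layer_3))


def has_dropout(model):
    """Return True if a Dropout layer lies on the path from inputs to outputs."""
    return any(isinstance(layer, Dropout) for layer in model.layers)


print("option 2 contains dropout:", has_dropout(connected))
print("option 1 contains dropout:", has_dropout(bypassed))
```

A functional model only contains the layers that lie on a path from its inputs to its outputs, which is why the bypassed dropout does not show up in the first option.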

Related

Why predicted values are towards center? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 10 months ago.
It looks like most predicted values are close to 0.5. How can I make the predicted values follow the original values more closely?
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
normalizer = layers.Normalization()
normalizer.adapt(np.array(X_train))
model = keras.Sequential([
    normalizer,
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='linear'),
    layers.Normalization()
])
There might be many issues here, but you definitely cannot normalize data at the output. You are literally saying "on average, I expect my output to be 0 and to have unit variance". This makes sense only if your target follows a standard normal distribution, and the plot clearly shows that it does not. Normalizing inputs or internal activations is fine, because there is always a final layer to apply the last affine mapping. If you normalize at the end of the network, you make it impossible to learn most targets.
Once this is solved, note that a network with 8 hidden neurons is extremely small, and there is no guarantee it can learn anything. Your training loss is very far from 0. Make the model much more expressive and try to drive the training loss towards 0. If you cannot do this, either the model is still not expressive enough or you have a bug somewhere else in the code.
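The two suggestions above (no normalization after the output layer, and a more expressive network) could look roughly like this. It is only a sketch under stated assumptions: the synthetic X_train and y_train, the two hidden layers of 64 units, and the 5 training epochs are placeholders, not values from the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in data (assumption): 256 samples, 4 features,
# with a target that is clearly not zero-mean or unit-variance.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(256, 4)).astype("float32")
y_train = (10.0 * X_train.sum(axis=1, keepdims=True) + 50.0).astype("float32")

# Normalizing the inputs is fine.
normalizer = layers.Normalization()
normalizer.adapt(X_train)

model = keras.Sequential([
    normalizer,
    layers.Dense(64, activation='relu'),   # wider than the original 8 units
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='linear'),  # last layer: nothing after it
])
model.compile(optimizer='adam', loss='mse')
history = model.fit(X_train, y_train, epochs=5, verbose=0)

predictions = model.predict(X_train[:3], verbose=0)
```

Watching history.history['loss'] over more epochs is a simple way to see whether the training loss can be pushed towards 0, as the answer recommends.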

How to use Bahdanau attention for timeseries prediction? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
Can we use Bahdanau attention for a multivariate time-series prediction problem? Using the Bahdanau implementation from here, I have come up with the following code for time-series prediction.
from tensorflow.keras.layers import Input, LSTM, Concatenate, Flatten, Dense
from attention_keras import AttentionLayer
from tensorflow.keras import Model
num_inputs = 5
seq_length = 10
inputs = Input(shape=(seq_length, num_inputs), name='inputs')
lstm_out = LSTM(64, return_sequences=True)(inputs)
lstm_out = LSTM(64, return_sequences=True)(lstm_out)
# Attention layer
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([lstm_out, lstm_out])
# Concat attention input and LSTM output, in original code it was decoder LSTM
concat_out = Concatenate(axis=-1, name='concat_layer')([lstm_out, attn_out])
flat_out = Flatten()(concat_out)
# Dense layer
dense_out = Dense(seq_length, activation='relu')(flat_out)
# Output layer; Dense(1) is assumed in place of the undefined dense_time(1)
predictions = Dense(1)(dense_out)
# Full model
full_model = Model(inputs=inputs, outputs=predictions)
full_model.compile(optimizer='adam', loss='mse')
For my data, the model does perform better than a vanilla LSTM without attention, but I am not sure whether this implementation makes sense.

Multiple output single loss model [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have a model that subclasses the tensorflow.keras.models.Model class. The call method returns [output_1, output_2], where output_1 and output_2 have different shapes. How can I pack both outputs so they can be used by the same loss function? (That is, have y_pred in the custom loss be the list returned by the call method.)
Do you necessarily need the network to produce two separate outputs?
Instead, you could combine them into one output and separate them again later, once you use the data in the rest of your application. To combine them, add a tf.keras.layers.Concatenate layer after your last layers, which merges your two outputs into one. Because the outputs have different shapes, they must first agree on every axis except the one being concatenated, for example by flattening each of them. That way, only one tensor needs to be passed to the loss function.
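A minimal sketch of the concatenate-then-split idea, written with the functional API for brevity rather than a subclassed model. The widths SIZE_1 and SIZE_2, the input width, the function name combined_loss, and the choice of squared error for the first part and absolute error for the second are all assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Concatenate, Dense

# Hypothetical widths of the two outputs along the last axis.
SIZE_1, SIZE_2 = 3, 5

inputs = Input(shape=(8,))
hidden = Dense(16, activation='relu')(inputs)
output_1 = Dense(SIZE_1)(hidden)
output_2 = Dense(SIZE_2)(hidden)

# Pack both outputs into a single tensor of width SIZE_1 + SIZE_2.
combined = Concatenate(axis=-1)([output_1, output_2])
model = Model(inputs=inputs, outputs=combined)


def combined_loss(y_true, y_pred):
    """Split the packed tensors back into two parts and sum one loss per part."""
    true_1, true_2 = tf.split(y_true, [SIZE_1, SIZE_2], axis=-1)
    pred_1, pred_2 = tf.split(y_pred, [SIZE_1, SIZE_2], axis=-1)
    loss_1 = tf.reduce_mean(tf.square(true_1 - pred_1), axis=-1)  # MSE part
    loss_2 = tf.reduce_mean(tf.abs(true_2 - pred_2), axis=-1)     # MAE part
    return loss_1 + loss_2


model.compile(optimizer='adam', loss=combined_loss)
```

The targets have to be packed in the same order as the outputs, so that the split inside the loss lines up with the split of y_pred.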

Is it possible to use tf.contrib.quantize.create_training_graph with Keras model? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Is it possible to use tf.contrib.quantize.create_training_graph for model quantization with an already trained Keras model?
As I understand it, I can import a tf.Graph from a Keras model, but can I fine-tune it after modifying it with tf.contrib.quantize.create_training_graph?
I was able to call tf.contrib.quantize.create_training_graph(input_graph=K.get_session().graph, quant_delay=int(0)) after model definition and model load, but I get:
2019-02-22 14:56:24.216742: W tensorflow/c/c_api.cc:686] Operation '{name:'global_average_pooling2d_1_1/Mean' id:3777 op device:{} def:{global_average_pooling2d_1_1/Mean = Mean[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](conv2d_15_1/act_quant/FakeQuantWithMinMaxVars:0, global_average_pooling2d_1_1/Mean/reduction_indices)}}' was changed by updating input tensor after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
At least I was able to save a model with uint8 weights when converting Keras -> TensorFlow -> TFLite. As I understand it, the model input and inference are still fp32.
converter = tf.contrib.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='tf_model.pb',
    input_arrays=input_node_names,
    output_arrays=output_node_names)
converter.post_training_quantize = True
tflite_model = converter.convert()
https://github.com/keras-team/keras/issues/11105

Trained a model using ssd_inception_v2_coco, what do i do next? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I followed a tutorial about detecting objects using deep learning here: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html
At some point, after training for 4082 steps, I stopped the training using CTRL+C.
Now I have a bunch of files under my training directory, which looks like this:
[Image: list of files in the training directory]
The question is, how do I proceed now? What should I do next? The tutorial doesn't teach you how to use the training results, or even how to test whether the model is recognizing objects correctly.
Thanks in advance.
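The files you obtained are checkpoints. What you want to do now is restore your model from the checkpoints. Following the instructions from CV tricks:
import tensorflow as tf
with tf.Session() as sess:
    # import_meta_graph rebuilds the saved graph and returns a Saver
    saver = tf.train.import_meta_graph('my_test_model-1000.meta')
    # Load the weights of the most recent checkpoint into the session
    saver.restore(sess, tf.train.latest_checkpoint('./'))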
Afterwards, you can evaluate your model on your test set:
# tfe is tf.contrib.eager (TensorFlow 1.x); model must be a callable that returns logits
test_accuracy = tfe.metrics.Accuracy()
for (x, y) in test_dataset:
    logits = model(x)
    prediction = tf.argmax(logits, axis=1, output_type=tf.int32)
    test_accuracy(prediction, y)
print("Test set accuracy: {:.3%}".format(test_accuracy.result()))
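The evaluation loop above relies on tf.contrib, which was removed in TensorFlow 2.x. A rough TensorFlow 2 equivalent of the same loop might look like the sketch below. It is a classification-style accuracy check, not an object-detection metric, and the helper evaluate_accuracy, the toy logits and labels, and the stand-in identity_model are all assumptions made for illustration.

```python
import tensorflow as tf


def evaluate_accuracy(model, dataset):
    """Accumulate classification accuracy over batches of (inputs, labels)."""
    metric = tf.keras.metrics.Accuracy()
    for x, y in dataset:
        logits = model(x)
        prediction = tf.argmax(logits, axis=1, output_type=tf.int32)
        metric.update_state(y, prediction)
    return float(metric.result())


# Toy stand-in data: the "inputs" are already logits for 2 classes.
toy_logits = tf.constant([[2.0, 0.1], [0.2, 1.5], [0.9, 0.3]])
toy_labels = tf.constant([0, 1, 1], dtype=tf.int32)
test_dataset = tf.data.Dataset.from_tensor_slices((toy_logits, toy_labels)).batch(2)


# Stand-in for a restored model: returns its input unchanged as logits.
def identity_model(x):
    return x


accuracy = evaluate_accuracy(identity_model, test_dataset)
print("Test set accuracy: {:.3%}".format(accuracy))
```

With a real restored model, identity_model would be replaced by that model and test_dataset by batches of real test inputs and integer labels.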