Keras input layer - tensorflow

I am unsure if I need to add a Dense input layer before adding LSTM layers in my model. For example, with the following model:
# Model
model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(Dense(5, activation="linear"))
Will the LSTM layer be the input layer, and the Dense layer the output layer (meaning no hidden layers)? Or does Keras create an input layer meaning the LSTM layer will be a hidden layer?

You don't need to. It depends on what you want to accomplish.
See here for some cases.
In your case, yes, the LSTM will be the first layer and the Dense layer will be the output layer.

The current configuration is fine for simple examples. Everything depends on the results you want: the model and its layers can change with the target goal. If the problem is complex, you can build a mixed model with different layers and shapes. See the reference:
Mix model layering
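To illustrate the point about the input layer, here is a minimal sketch. It assumes a hypothetical input of 10 timesteps with 3 features in place of train_x.shape[1] and train_x.shape[2]. keras.Input only declares the input shape and holds no weights, so the LSTM is still the first layer that computes anything and the Dense layer is the output layer.

```python
from tensorflow import keras

# Assumed shape: 10 timesteps, 3 features (stand-ins for train_x.shape[1:]).
model = keras.Sequential([
    keras.Input(shape=(10, 3)),                  # shape declaration only, no weights
    keras.layers.LSTM(128),                      # first layer with trainable weights
    keras.layers.Dense(5, activation="linear"),  # output layer
])

# List the layers that Keras reports for the model.
layer_types = [type(layer).__name__ for layer in model.layers]
print(layer_types)
```

Passing input_shape to the first layer, as in the question, has the same effect as declaring keras.Input explicitly.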


Transform weights from channels_first to channels_last Conv2D layer

I have two models whose structures are exactly the same, except:
The data_format of the Conv2D layers is channels_first in the first model and channels_last in the second.
The first model accepts NCHW tensor input, while the second accepts NHWC.
What I want to do is:
Use get_weights to retrieve the weights from a layer of the first model, transform them, and load them into the corresponding layer of the second model with set_weights.
The model only contains Conv2D and Dense layers. I assume the Dense weights also need to be transformed, since their inputs change.
How should I do it?
P.S. I tried converting through ONNX but failed; that's another question.
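One possible approach, sketched under a few assumptions: as far as I know, Keras stores Conv2D kernels as (kernel_h, kernel_w, in_channels, out_channels) regardless of data_format, so Conv2D kernels and all biases can be copied unchanged. What does change is the order in which Flatten unrolls the feature map, so the first Dense layer after a Flatten needs its kernel rows reordered. The helper name dense_kernel_nchw_to_nhwc and its arguments are hypothetical, and the sketch assumes a Flatten layer sits directly before that Dense layer.

```python
import numpy as np

def dense_kernel_nchw_to_nhwc(kernel, channels, height, width):
    """Reorder the rows of a Dense kernel that follows Flatten.

    `kernel` has shape (channels * height * width, units), with rows in the
    order produced by flattening a (C, H, W) feature map.
    """
    units = kernel.shape[1]
    # Un-flatten the row index into (C, H, W), move C last, flatten again.
    k = kernel.reshape(channels, height, width, units)
    k = k.transpose(1, 2, 0, 3)
    return k.reshape(height * width * channels, units)

# Check on random data: both layouts must give the same Dense pre-activation.
rng = np.random.default_rng(0)
C, H, W, units = 3, 4, 5, 7
x_chw = rng.normal(size=(C, H, W))
kernel_first = rng.normal(size=(C * H * W, units))

out_first = x_chw.reshape(-1) @ kernel_first
kernel_last = dense_kernel_nchw_to_nhwc(kernel_first, C, H, W)
out_last = x_chw.transpose(1, 2, 0).reshape(-1) @ kernel_last
```

With real models, the reordered kernel would be passed together with the unchanged bias, for example dst_layer.set_weights([kernel_last, bias]), while every other layer gets dst_layer.set_weights(src_layer.get_weights()).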

Dropout implementation in tf.Keras

Consider the following model:
model = Sequential()
model.add(Dense(60, input_shape=(60,), activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dense(30, activation='relu', kernel_constraint=MaxNorm(3)))
model.add(Dense(1, activation='sigmoid'))
I understand the idea behind Dropout for regularization. According to my understanding, Dropout is applied per layer with a rate p which determines the probability of a neuron being dropped. In the above example, I cannot understand whether the first dropout layer is applied to the first hidden layer or the second hidden layer. Because as I have mentioned before, the dropout is applied per layer, and what confuses me here is that Keras deals with dropout as a layer on its own. Moreover, if the first dropout layer is applied to the second hidden layer, what about the second dropout layer? Is it applied to the output layer (which is not valid at all to apply dropout to output neurons)? So please can someone clarify these points?
As per the Keras documentation:
Applies Dropout to the input.
So a Dropout layer drops units of its input, that is, the output of the layer placed before it, each with probability p. In your case that means the first layer: with a rate of 20%, each of the 60 neurons of the first layer is dropped with probability 0.2 during training (about 12 per step on average).
It also would not make sense for dropout to act on the layer that follows it, because then you would drop units of the last layer, which in classification can be the result itself.
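What a Dropout layer does to the output of the preceding layer can be sketched in plain NumPy. This is an illustration, not the Keras implementation: it assumes the usual "inverted dropout" scheme, where kept units are scaled by 1 / (1 - rate) during training, and the function name dropout_train is hypothetical. In Keras the layer is only active during training and passes its input through unchanged at inference.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Training-time dropout on activations x (illustrative sketch)."""
    keep = rng.random(x.shape) >= rate           # each unit is kept with prob. 1 - rate
    return np.where(keep, x / (1.0 - rate), 0.0)  # rescale kept units, zero the rest

rng = np.random.default_rng(0)
hidden = np.ones((1, 60))                 # stand-in for the first layer's 60 outputs
dropped = dropout_train(hidden, 0.2, rng)
num_zeroed = int((dropped == 0.0).sum())  # varies per call, about 12 on average
```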

How do I create a deep learning model by concatenating two hidden layers of the same output shape of Resnet and VGG-16 using TensorFlow?

I want to create a CNN model by concatenating hidden layers of two pretrained models, ResNet and VGG16.
After you define the models, check the layers of these pretrained models with model.summary(). To pick a layer, first get it with model.get_layer('layer_name'), then take its output with layer.output. Finally, concatenate the outputs of the layers you selected.
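The steps above could look roughly like the following sketch. Several things here are assumptions: ResNet50 is used as the ResNet variant, the layer names block5_pool and conv5_block3_out are just one pair whose outputs share the same spatial size, the input shape is kept small, and weights=None is used to avoid a download (in practice you would likely pass weights="imagenet").

```python
from tensorflow import keras

shape = (64, 64, 3)  # assumed input size, small to keep the sketch light

vgg = keras.applications.VGG16(include_top=False, weights=None, input_shape=shape)
resnet = keras.applications.ResNet50(include_top=False, weights=None, input_shape=shape)

# Cut each network at the chosen hidden layer via get_layer(...).output.
vgg_part = keras.Model(vgg.input, vgg.get_layer("block5_pool").output)
resnet_part = keras.Model(resnet.input, resnet.get_layer("conv5_block3_out").output)

# Feed one shared input to both branches and join them on the channel axis.
inputs = keras.Input(shape=shape)
merged = keras.layers.Concatenate(axis=-1)([vgg_part(inputs), resnet_part(inputs)])
combined = keras.Model(inputs, merged)
```

Concatenating on the last axis only requires the two outputs to agree in height and width; the channel counts may differ. New layers such as pooling and Dense can then be stacked on top of merged.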

Extracting activations from a specific layer of neural network

I was working on an image recognition problem. After training the model, I saved the architecture as well as weights. Now I want to use the model for extracting features from other images and perform SVM on that. For this, I want to remove the last two layers of my model and get the values calculated by the CNN and fully connected layers till then. How can I do that in Keras?
from tensorflow import keras
# a simple model (Input and Flatten added so the model builds and layers[-2] is a flat vector)
model = keras.models.Sequential([
    keras.Input(shape=(32, 32, 3)),
    keras.layers.Conv2D(16, 3, activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax')
])
# after training
feature_only_model = keras.models.Model(model.inputs, model.layers[-2].output)
feature_only_model takes a (32, 32, 3) image as input and outputs the feature vector.
If your model is subclassed, just change the call() method.
If not:
if your model is complicated, wrap it in a subclassed model and change the forward pass in its call() method, or
if your model is simple, create the model without the last layers and load the weights into each layer separately.
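The last option (rebuild the model without its final layers and copy the weights layer by layer) might be sketched as follows. The architecture, the helper name build, and the layer sizes are made up for illustration, and the untrained `trained` model merely stands in for a model loaded from disk.

```python
import numpy as np
from tensorflow import keras

def build(include_head):
    """Hypothetical architecture; the head is the final classification layer."""
    layers = [
        keras.Input(shape=(32, 32, 3)),
        keras.layers.Conv2D(16, 3, activation="relu"),
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),
    ]
    if include_head:
        layers.append(keras.layers.Dense(10, activation="softmax"))
    return keras.Sequential(layers)

trained = build(include_head=True)    # stand-in for the trained, reloaded model
features = build(include_head=False)  # same layers minus the head

# Copy weights layer by layer; zip stops at the shorter (headless) model.
for src, dst in zip(trained.layers, features.layers):
    dst.set_weights(src.get_weights())

images = np.random.rand(2, 32, 32, 3).astype("float32")
vectors = features.predict(images, verbose=0)  # one feature vector per image
```

The resulting vectors array can then be used as input to a separate classifier such as an SVM.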

Keras Dense Layer Propagate Mask

After checking the official doc here (keras mask tutorial), it is still not clear to me whether a Keras Dense layer can propagate the mask to the following layers, 4 and 5, in the example below.
Another question is: when calculating the loss at the 5th layer, should we apply the mask?
One could say it is not needed, because the LSTM in the 2nd layer already ignores the <pad> tokens in the input sequences. However, I've read somewhere that the LSTM outputs at <pad> positions are NOT zero but repeat the output of the last valid token, which would affect the value of the loss. So do we need to apply the mask at the 5th layer?
Our inputs are padded sequences, and we have a sequential model in Keras:
Embedding layer with mask_zero = True //can generate mask
LSTM layer //can consume mask
Dense layer //Question: can this layer propagate mask to other layers in this model
other layers...
output layer with sigmoid as activation function
Thanks for your kind help!
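One way to probe the first question is to ask the layers directly. The sketch below is only a check under assumptions, not an authoritative answer: the vocabulary size, sequence lengths and layer widths are made up, and it inspects mask propagation only, not how the loss treats masked steps. It relies on the supports_masking attribute and the compute_mask method that Keras layers expose.

```python
import numpy as np
from tensorflow import keras

# Two padded sequences; 0 is the <pad> id (assumed vocabulary of 10 tokens).
ids = np.array([[5, 3, 0, 0],
                [7, 0, 0, 0]])

embedding = keras.layers.Embedding(input_dim=10, output_dim=8, mask_zero=True)
dense = keras.layers.Dense(4)

# Mask generated by the Embedding layer: True where the token is not <pad>.
mask = embedding.compute_mask(ids)

# Ask the Dense layer what mask it would hand on for a (batch, time, features) input.
dummy_lstm_output = np.zeros((2, 4, 16), dtype="float32")
passed_on = dense.compute_mask(dummy_lstm_output, mask)

print(dense.supports_masking)
print(np.asarray(passed_on))
```

If supports_masking is set on the Dense layer, the incoming mask is handed on unchanged. Note that a mask only exists per timestep when the LSTM is created with return_sequences=True; otherwise the LSTM consumes the mask and returns a single vector per sequence.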