How to get hidden node representations of LSTM in keras - tensorflow

I've implemented a model using LSTM program in keras. I am trying to get the representations of hidden nodes of the LSTM layer. Is this the right way to get the representation (stored in activations variable) of hidden nodes?
model = Sequential()
model.add(LSTM(50, input_dim=sample_index))
activations = model.predict(testX)
model.add(Dense(no_of_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adagrad', metrics=['accuracy'])
hist=model.fit(trainX, trainY, validation_split=0.15, nb_epoch=5, batch_size=20, shuffle=True, verbose=1)

Edit: your way to get the hidden representation is also correct.
Reference: https://github.com/fchollet/keras/issues/41
After training your model, you can save your model and the weights. Like this:
from keras.models import model_from_json
json_model = yourModel.to_json()
open('yourModel.json', 'w').write(json_model)
yourModel.save_weights('yourModel.h5', overwrite=True)
And then you can visualize the weights of your LSTM layers. Like this:
from keras.models import model_from_json
import matplotlib.pyplot as plt
model = model_from_json(open('yourModel.json').read())
model.load_weights('yourModel.h5')
layers = model.layers[1] # find the LSTM layer you want to visualize, [1] is just an example
weights, bias = layers.get_weights()
plt.matshow(weights, fignum=100, cmap=plt.cm.gray)
plt.show()

Related

Feeding tensorflow keras architecture with Sparse matrix of type scipy.sparse._csr.csr_matrix

Short Version:
I am trying to feed my data in the form of sparse matrix (of the type scipy.sparse._csr.csr_matrix') into a Tensorflow Keras Neural Network model. I highly appreciate any guidance. todense() and toarray() are not options for me. Also feeding in mini batches is not preferred.
Long version (including my efforts):
The problem is about a deep learning model with text, categorical and numerical features. My TfidfVectorizer creates a huge matrix which cannot be fed into a model as dense format.
text_cols = ['ca_name']
categorical_cols = ['cua_name','ca_category_modified']
numerical_cols = ['vidim1', 'vidim2', 'vidim3', 'vim', 'vid']
title_transformer = TfidfVectorizer()
numerical_transformer = MinMaxScaler()
categorical_transformer = OneHotEncoder(handle_unknown='ignore')
preprocessor = ColumnTransformer(
transformers=[
('title', title_transformer, text_cols[0]),
('num', numerical_transformer, numerical_cols),
('cat', categorical_transformer, categorical_cols)
])
# df['dur_linreg] is my numerical target
X_train, X_test, y_train, y_test = train_test_split(df[text_cols+categorical_cols+numerical_cols], df['dur_linreg'], test_size=0.2, random_state=42)
# fit_transform the preprocessor on X_train, only transform X_test
X_train_transformed = preprocessor.fit_transform(X_train)
X_test_transformed = preprocessor.transform(X_test)
I can build and compile a model as following:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train_transformed.shape[1],)))
modeladd(tf.keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
But cannot fit it:
history = model.fit(X_train_transformed, y_train, epochs=20, batch_size=32, validation_data=(X_test_transformed, y_test))
InvalidArgumentError: Graph execution error: TypeError: 'SparseTensor' object is not subscriptable
Obviously because I am feeding the model with a sparse scipy.sparse._csr.csr_matrix matrix.
The size of my matrix and my resources restrict me to transform it to
dense format:
X_train_transformed.todense()
MemoryError: Unable to allocate 205. GiB for an array with shape (275189, 100074) and data type float64
2) (obviously) array:
X_train_transformed.toarray()
MemoryError: Unable to allocate 205. GiB for an array with shape (275189, 100074) and data type float64
According to a post "https://stackoverflow.com/questions/41538692/using-sparse-matrices-with-keras-and-tensorflow" I there are two approaches
" Keep it as a scipy sparse matrix, then, when giving Keras a minibatch, make it dense
Keep it sparse all the way through, and use Tensorflow Sparse Tensors"
The second approach is preferred for me as well. Therefore, I tried the following as well:
However, again I could only build and compile the model without a problem:
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model
input_layer = Input(shape=(X_train_transformed.shape[1],), sparse=True)
dense1 = Dense(64, activation='relu')(input_layer)
dropout1 = Dropout(0.2)(dense1)
dense2 = Dense(64, activation='relu')(dropout1)
dropout2 = Dropout(0.2)(dense2)
output_layer = Dense(1)(dropout2)
model = Model(input_layer, output_layer)
model.compile(optimizer='adam', loss='mean_squared_error')
But cannot fit it:
history = model.fit(X_train_transformed, y_train, validation_data=(X_test_transformed, y_test), epochs=5, batch_size=32)
InvalidArgumentError: Graph execution error:TypeError: 'SparseTensor' object is not subscriptable
Lastly, in case it is relevant I am using Tensorflow version 2.11.0 installed January 2023.
Many Thanks in advance for your help.

How to find class labels from a keras model

I am predicting classes, but there is something I don't get. In the simplified example below, I train a model to predict MNIST handwritten digits. My test set has an accuracy of 95%, when I use
model.evaluate(test_image, test_label)
However, when I use
model.predict(test_image)
and the extract the predicted labels using np.argmax(), this accuracy drops. When I run all the code again and again, this accuracy changes a lot.
I suspect now that the classes in the model are not ordered 0, 1 ... 9. Is there a way to see the class labels of a model? Or did I make another mistake?
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.datasets.mnist import load_data
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
import numpy as np
# Load data
(train_image, train_label), (test_image, test_label) = load_data()
# Train
model = Sequential([
Flatten(input_shape=(28,28)),
Dense(100, activation="relu"),
Dense(100, activation="relu"),
Dense(10, activation="sigmoid")
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics='accuracy')
history = model.fit(train_image, train_label,
batch_size=32, epochs=50,
validation_data=(test_image, test_label),
verbose = 0)
eval = model.evaluate(test_image, test_label)
print('Accuracy (auto):', eval[1]) # This is always high
# Predict and evaluate manually
predictions = model.predict(test_image)
pred = np.array([np.argmax(pred) for pred in predictions])
true = test_label
print('Accuracy (manually):', np.mean(pred == true)) # This varies a lot

How to unfold Xception layers in TensorFlow

I am following the official Keras transfer learning and fine-tuning tutorial. It consists of loading the Xception model with include_top=False, and adding a new classifier part on top.
I am then saving the model with model.save() and loading with load_model().
So this is what I see when I do model.summary()
My problem is that I would like to iterate through the layers, while now Xception layers are somehow folded (on the picture: xception(Functional)). Is there a way to somehow unfold it, to see all the layers (including those that are creating Xception)?
For model. summary(), you can unfold that as follows:
from tensorflow import keras
base_model = keras.applications.Xception(
weights='imagenet', # Load weights pre-trained on ImageNet.
input_shape=(150, 150, 3),
include_top=False) # Do not include the ImageNet classifier at the top.
inputs = keras.Input(shape=(150, 150, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.summary() # full model
model.layers[1].summary() # only xception model
Note, you cal also see the layer using the plot_model utility.
keras.utils.plot_model(model, expand_nested=True)

Create keras model from another trained model

I am trying to create a new keras model from another trained keras model
Sample Code for model training referred from:
#TF version 2.2.0
from tensorflow.python.keras import models, layers
from tensorflow.python.keras.models import Sequential
from tensorflow.python import keras
import tensorflow as tf
from tensorflow.python.keras.layers import Dense
from tensorflow.keras.datasets import boston_housing
(x_train,y_train), (x_test,y_test) = boston_housing.load_data()
model = Sequential()
model.add(Dense(2, activation='relu', input_shape=(13,)))
model.add(Dense(1, activation='softmax'))
model.compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(x_train, y_train,batch_size= 64,epochs= 1,validation_split=0.2)
Saving the model as json
json_obj = model.to_json()
new_model = keras.models.model_from_json(json_obj)
But after creating the new_model the weights are different:
model.get_weights() != new_model.get_weights()
This is the same case if I create new_model using from_config(). The question is, shouldn't the weight be the same for both model and new_model as I am creating new_model from model or my understanding is wrong? Any suggestions are helpful
to_json doesn't save the model's weights, but only the architecture.
Check here: to_json method
I recommend you to use save_model method.
If you want to copy the model to another directly, do the following:
new_model = tf.keras.models.clone_model(model)
new_model.set_weights(model.get_weights())

Keras model with several inputs and several outputs

I want to build a Keras model with two inputs and two outputs which both use the same architecture/weights. Both outputs are then used to compute a​ single loss.
Here is a picture of my desired architecture.
This is my pseudo code:
model = LeNet(inputs=[input1, input2, input3],outputs=[output1, output2, output3])
model.compile(optimizer='adam',
loss=my_custom_loss_function([output1,outpu2,output3],target)
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
Can this approach work?
Do I need to use a different Keras API?
The architecture is fine. Here is a toy example with training data of how it can be defined using keras' functional API:
from keras.models import Model
from keras.layers import Dense, Input
# two separate inputs
in_1 = Input((10,10))
in_2 = Input((10,10))
# both inputs share these layers
dense_1 = Dense(10)
dense_2 = Dense(10)
# both inputs are passed through the layers
out_1 = dense_1(dense_2(in_1))
out_2 = dense_1(dense_2(in_2))
# create and compile the model
model = Model(inputs=[in_1, in_2], outputs=[out_1, out_2])
model.compile(optimizer='adam', loss='mse')
model.summary()
# train the model on some dummy data
import numpy as np
i_1 = np.random.rand(10, 10, 10)
i_2 = np.random.rand(10, 10, 10)
model.fit(x=[i_1, i_2], y=[i_1, i_2])
Edit given that you want to compute the losses together you can use Concatenate()
output = Concatenate()([out_1, out_2])
Any loss function you pass into model.compile will be applied to output in it's combined state. After you get the output from a prediction you can just split it back up into it's original state:
f = model.predict(...)
out_1, out_2 = f[:n], f[n:]