How to update the hmmlearn learned object when we have new samples? - hidden-markov-models

I have implemented a simple Hidden Markov Model with hmmlearn and it is working well. I used the fit() method to learn the HMM parameters from my data. If I have more data and want to update the previously fitted model without training from scratch, what can I do?
In other words, how can I initialize a new model from what I have learned so far and continue fitting on the new observations/samples to get a better model of my data?

In hmmlearn, calling fit() on a model updates its parameters in place:
import numpy as np
import pickle
from hmmlearn import hmm
np.random.seed(42)
# initialize model
model = hmm.GaussianHMM(n_components=3, covariance_type="full")
model.startprob_ = np.array([0.33, 0.33, 0.34])
model.transmat_ = np.array([[0.1, 0.2, 0.7],
[0.3, 0.5, 0.2],
[0.5, 0.1, 0.4]])
model.means_ = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
model.covars_ = np.tile(np.identity(2), (3, 1, 1))
# generate "fake" training data
emissions1, states1 = model.sample(100)
print("Transition matrix before training: \n", model.transmat_)
# train
model.fit(emissions1)
print("Transition matrix after training: \n", model.transmat_)
# save model
with open("modelname.pkl", "wb") as f: pickle.dump(model, f)
#################################
>>> Transition matrix before training:
[[0.1 0.2 0.7]
[0.3 0.5 0.2]
[0.5 0.1 0.4]]
>>> Transition matrix after training:
[[0.19065325 0.50905216 0.30029459]
[0.41888047 0.39276483 0.18835471]
[0.44558543 0.13767827 0.4167363 ]]
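This means that if you have new training data (i.e. emissions2), you can call fit() on the same model object with the new emission sequence. Note that by default fit() re-initializes every parameter listed in init_params before it runs EM, so to continue from the previously learned values, create the model with init_params="" (or set model.init_params = "" before the second call). You can either save the entire model by pickling (as shown above), or save only the NumPy arrays of the start probabilities, transition matrix, emission parameters, etc.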
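The second option, saving only the parameter arrays, could look roughly like the sketch below. It uses plain NumPy; the arrays are hypothetical stand-ins for model.startprob_, model.transmat_, model.means_ and model.covars_, and the helper names save_params and load_params are made up for this example.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the attributes of a fitted GaussianHMM.
params = {
    "startprob": np.array([0.33, 0.33, 0.34]),
    "transmat": np.array([[0.1, 0.2, 0.7],
                          [0.3, 0.5, 0.2],
                          [0.5, 0.1, 0.4]]),
    "means": np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]),
    "covars": np.tile(np.identity(2), (3, 1, 1)),
}


def save_params(path, params):
    """Write every parameter array into a single .npz archive."""
    np.savez(path, **params)


def load_params(path):
    """Read the archive back into a plain dict of arrays."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}


path = os.path.join(tempfile.mkdtemp(), "hmm_params.npz")
save_params(path, params)
restored = load_params(path)

# The round trip should reproduce every array exactly.
for name, value in params.items():
    assert np.array_equal(restored[name], value)
```

After loading, the arrays can be assigned to a fresh GaussianHMM the same way the code above sets startprob_, transmat_, means_ and covars_. Creating that model with init_params="" should keep fit() from overwriting them before training continues on the new samples.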

Related

How to standardize a tensor

When I have an image, I can standardize it channel-wise as follows:
image[:, :, 0] = ((image[:, :, 0]-mean_1))/std_1
image[:, :, 1] = ((image[:, :, 1]-mean_2))/std_2
image[:, :, 2] = ((image[:, :, 2]-mean_3))/std_3
Here mean_1 and std_1 are the mean and standard deviation of the first channel, and likewise for mean_2, std_2, mean_3 and std_3. But right now the image is a tensor with the following info:
(460, 700, 3) <dtype: 'float32'>
<class 'tensorflow.python.framework.ops.Tensor'>
I am new to TensorFlow and I don't know how to convert the above formulas into code that performs the same task on the tensor image.
Edit: The means and the stds are calculated over all the dataset images by me, so I have their values.
Update 1: I have tried to solve this problem using tf.keras.layers.Normalization embedded into my model:
inputs = keras.Input(shape=(460,700,3))
norm_layer = Normalization(mean=[200.827,160.252,195.008],
variance=[np.square(33.154),
np.square(45.877),
np.square(29.523)])
inputs=norm_layer(inputs)
This raises two new questions:
Does tf.keras.layers.Normalization, used as in the code above, normalize the inputs per channel as I need?
With the code above, will tf.keras.layers.Normalization be applied to the test and validation data as well, or to the training data only? I need it to work on all the datasets.
Please help me guys :( I am so confused.
Update 1: Fixed to show how to use it as a preprocessing layer
import tensorflow as tf
import numpy as np
# Create a random batch of images (float32, values in 0..255)
img = tf.constant(np.random.randint(0, 255, (10, 224, 224, 3)).astype(np.float32))
# Create the preprocessing layer with the per-channel dataset statistics from the question
# Note: works with TensorFlow 2.6 and up
norm_layer = tf.keras.layers.Normalization(mean=[200.827, 160.252, 195.008], variance=[np.square(33.154), np.square(45.877), np.square(29.523)])
# Apply norm_layer to your images
# You need not add it to your model
norm_img = norm_layer(img)
# or
# pass a NumPy array; the output is still a tensor since you're running a preprocessing layer
# norm_img = norm_layer(img.numpy())
# Run model prediction (model is your own trained Keras model)
predictions = model.predict(norm_img)
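The three per-channel formulas from the question collapse into a single broadcast expression. The sketch below shows this with NumPy on a random stand-in image, using the dataset statistics quoted in the question; the same expression (image - mean) / std should also work on a float32 tf.Tensor, because TensorFlow follows the same broadcasting rules.

```python
import numpy as np

# Per-channel statistics taken from the question.
mean = np.array([200.827, 160.252, 195.008], dtype=np.float32)
std = np.array([33.154, 45.877, 29.523], dtype=np.float32)

# Random stand-in for one (height, width, channels) image.
rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(460, 700, 3)).astype(np.float32)

# mean and std have shape (3,), so they broadcast along the last (channel) axis.
standardized = (image - mean) / std

# Cross-check against the explicit per-channel formulas.
expected = np.empty_like(image)
for c in range(3):
    expected[:, :, c] = (image[:, :, c] - mean[c]) / std[c]

assert np.allclose(standardized, expected)
```

Because the statistics are fixed constants, the same transformation applies unchanged to training, validation and test images.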

Cannot convert a symbolic Tensor (dense_2_target_2:0) to a numpy array

I'm trying to implement an SVM as the last layer of a CNN for classification, using this code:
def custom_loss_value(y_true, y_pred):
    print(y_true)
    print(y_pred)
    X = y_pred
    print(X)
    Y = y_true
    Predict = []
    Prob = []
    scaler = StandardScaler()
    # X = scaler.fit_transform(X)
    param_grid = {'C': [0.1, 1, 8, 10], 'gamma': [0.001, 0.01, 0.1, 1]}
    SVM = GridSearchCV(SVC(kernel='rbf',probability=True), cv=3, param_grid=param_grid, scoring='auc', verbose=1)
    SVM.fit(X, Y)
    Final_Model = SVM.best_estimator_
    Predict = Final_Model.predict(X)
    Prob = Final_Model.predict_proba(X)
    return categorical_hinge(tf.convert_to_tensor(Y, dtype=tf.float32), tf.convert_to_tensor(Predict, dtype=tf.float32))
sgd = tf.keras.optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss=custom_loss_value, optimizer=sgd, metrics=['accuracy'])
I'm getting the error: Cannot convert a symbolic Tensor (dense_2_target_2:0) to a numpy array
on the line SVM.fit(X, Y).
I also tried converting y_true and y_pred to NumPy arrays, but that raised an error as well.
To train a neural network with gradient descent, the model has to be differentiable: you need to be able to take the gradient of the loss w.r.t. every trainable parameter.
Some problems arise in your code:
You can't directly train an SVM inside a Keras loss function. The loss function receives TensorFlow tensors, has to be built from TF ops, and must return a TensorFlow tensor. sklearn works with NumPy arrays or lists, but not with symbolic tensors.
It is very hard, and of little practical use, to train an SVM through backpropagation. Something about it can be read here.
Instead, you can train an SVM on the features of a pretrained model, in place of the fully connected layer.
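A minimal sketch of that last suggestion, assuming scikit-learn is available. The feature matrix here is random data standing in for the activations of a pretrained network's penultimate layer; in a real setup it would come from running the images through the network once, outside of any loss function. The variable names are illustrative only.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for CNN features: two well-separated clusters of 8-dimensional vectors.
n_per_class = 60
features = np.vstack([
    rng.normal(loc=-2.0, scale=1.0, size=(n_per_class, 8)),
    rng.normal(loc=2.0, scale=1.0, size=(n_per_class, 8)),
])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# The SVM is fitted on plain NumPy arrays, separately from the network's training loop.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid=param_grid, cv=3, scoring="accuracy")
search.fit(features, labels)

svm = search.best_estimator_
train_accuracy = svm.score(features, labels)
print("best parameters:", search.best_params_)
print("training accuracy:", train_accuracy)
```

The network stays frozen in this arrangement, so no gradient ever has to flow through the SVM.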

Tensor construction with a loop over number of batches

I want to create a tensor that is some kind of transformation matrix (a rotation matrix, for instance).
My model predicts 2 parameters: x1 and x2,
so the output is a tensor of shape (B, 2), where B is the batch size.
However, when I write my loss, I have to know this B, since I want to iterate over it:
def get_rotation_tensor(x):
    roll_mat = K.stack([ [[1, 0, 0],
                          [0, K.cos(x[i, 0]), -K.sin(x[i, 0])],
                          [0, K.sin(x[i, 0]), K.cos(x[i, 0])]] for i in range(BATCH_SIZE)])
    pitch_mat = K.stack([ [[K.cos(x[i, 1]), 0, K.sin(x[i, 1])],
                           [0, 1, 0],
                           [-K.sin(x[i, 1]), 0, K.cos(x[i, 1])]] for i in range(BATCH_SIZE)])
    return K.batch_dot(pitch_mat, roll_mat)
The only solution I could think of is to define BATCH_SIZE in advance, but is there a way to write a general loss function that works for any batch size?
THANKS
I found a solution:
def get_rotation_tensor(x):
    ones = K.ones_like(x[:, 0])
    zeros = K.zeros_like(x[:, 0])
    roll_mat = K.stack([[ones, zeros, zeros],
                        [zeros, K.cos(x[:, 0]), -K.sin(x[:, 0])],
                        [zeros, K.sin(x[:, 0]), K.cos(x[:, 0])]])
    pitch_mat = K.stack([[K.cos(x[:, 1]), zeros, K.sin(x[:, 1])],
                         [zeros, ones, zeros],
                         [-K.sin(x[:, 1]), zeros, K.cos(x[:, 1])]])
    return K.batch_dot(K.permute_dimensions(pitch_mat, (2, 0, 1)),
                       K.permute_dimensions(roll_mat, (2, 0, 1)))
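The same idea can be checked outside of Keras. The NumPy sketch below builds the matrices from whole columns of x, so the batch size never appears in the code, and compares the result with a per-sample loop. The function names rotation_batch and rotation_single are made up for this illustration, and the angles are assumed to be in radians.

```python
import numpy as np


def rotation_batch(x):
    """Map (B, 2) angles [roll, pitch] to (B, 3, 3) matrices pitch @ roll."""
    roll, pitch = x[:, 0], x[:, 1]
    ones, zeros = np.ones_like(roll), np.zeros_like(roll)
    # Nested lists of (B,) vectors become a (3, 3, B) array; move the batch axis first.
    roll_mat = np.moveaxis(np.array([[ones, zeros, zeros],
                                     [zeros, np.cos(roll), -np.sin(roll)],
                                     [zeros, np.sin(roll), np.cos(roll)]]), -1, 0)
    pitch_mat = np.moveaxis(np.array([[np.cos(pitch), zeros, np.sin(pitch)],
                                      [zeros, ones, zeros],
                                      [-np.sin(pitch), zeros, np.cos(pitch)]]), -1, 0)
    # matmul broadcasts over the leading batch axis, so no loop over B is needed.
    return pitch_mat @ roll_mat


def rotation_single(roll, pitch):
    """Reference version for one sample, used only to check the batched one."""
    roll_mat = np.array([[1.0, 0.0, 0.0],
                         [0.0, np.cos(roll), -np.sin(roll)],
                         [0.0, np.sin(roll), np.cos(roll)]])
    pitch_mat = np.array([[np.cos(pitch), 0.0, np.sin(pitch)],
                          [0.0, 1.0, 0.0],
                          [-np.sin(pitch), 0.0, np.cos(pitch)]])
    return pitch_mat @ roll_mat


# The same function handles any batch size.
for batch_size in (1, 4, 7):
    rng = np.random.default_rng(batch_size)
    angles = rng.uniform(-np.pi, np.pi, size=(batch_size, 2))
    batched = rotation_batch(angles)
    looped = np.stack([rotation_single(r, p) for r, p in angles])
    assert batched.shape == (batch_size, 3, 3)
    assert np.allclose(batched, looped)
```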
Perhaps I'm not fully understanding your issue, but can't you just determine the batch size from the shape of the tensors passed into the loss function? Below is an example that shows the idea. I hope this helps.
# Install TensorFlow
try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass
import tensorflow as tf
print(tf.__version__)
print(tf.executing_eagerly())
# Setup repro section from Keras FAQ with TF1 to TF2 adjustments
import numpy as np
import random as rn
# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(42)
# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
rn.seed(12345)
# Force TensorFlow to use single thread.
# Multiple threads are a potential source of non-reproducible results.
# For further details, see: https://stackoverflow.com/questions/42022950/
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1,
inter_op_parallelism_threads=1)
# The below tf.set_random_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/set_random_seed
tf.compat.v1.set_random_seed(1234)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)
# Rest of code follows ...
# Custom Loss
def my_custom_loss(y_true, y_pred):
    tf.print('inside my_custom_loss:')
    tf.print('y_true:')
    tf.print(y_true)
    tf.print('y_true column 0:')
    tf.print(y_true[:,0])
    tf.print('y_true column 1:')
    tf.print(y_true[:,1])
    tf.print('y_pred:')
    tf.print(y_pred)
    # get length/batch size
    batch_size=tf.shape(y_pred)[0]
    tf.print('batch_size:')
    tf.print(batch_size)
    y_zeros = tf.zeros_like(y_pred)
    y_mask = tf.math.greater(y_pred, y_zeros)
    res = tf.boolean_mask(y_pred, y_mask)
    logres = tf.math.log(res)
    finres = tf.math.reduce_sum(logres)
    return finres
# Define model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(1, activation='linear', input_dim=1, name="Dense1"))
model.compile(optimizer='rmsprop', loss=my_custom_loss)
print('model.summary():')
print(model.summary())
# Generate dummy data
data = np.array([[2.0],[1.0],[1.0],[3.0],[4.0]])
labels = np.array([[[2.0],[1.0]],
[[0.0],[3.0]],
[[0.0],[3.0]],
[[0.0],[3.0]],
[[0.0],[3.0]]])
# Train the model.
print('training the model:')
print('-----')
model.fit(data, labels, epochs=1, batch_size=3)
print('done training the model.')
print(data.shape)
print(labels.shape)

Can I use real probability distributions as labels for tf.nn.softmax_cross_entropy_with_logits?

In the TensorFlow manual, the description of labels reads:
labels: Each row labels[i] must be a valid probability distribution.
Does that mean the labels can look like the example below, if I have real probability distributions over the classes for each input?
[[0.1, 0.2, 0.05, 0.007 ... ]
[0.001, 0.2, 0.5, 0.007 ... ]
[0.01, 0.0002, 0.005, 0.7 ... ]]
And, is it more efficient than one-hot encoded labels?
Thank you in advance.
In a word, yes, you can use probabilities as labels.
The documentation for tf.nn.softmax_cross_entropy_with_logits says you can:
NOTE: While the classes are mutually exclusive, their probabilities
need not be. All that is required is that each row of labels is
a valid probability distribution. If they are not, the computation of the
gradient will be incorrect.
If using exclusive labels (wherein one and only
one class is true at a time), see sparse_softmax_cross_entropy_with_logits.
Let's have a short example to be sure it works ok:
import numpy as np
import tensorflow as tf
labels = np.array([[0.2, 0.3, 0.5], [0.1, 0.7, 0.2]])
logits = np.array([[5.0, 7.0, 8.0], [1.0, 2.0, 4.0]])
sess = tf.Session()
ce = tf.nn.softmax_cross_entropy_with_logits(
labels=labels, logits=logits).eval(session=sess)
print(ce) # [ 1.24901222 1.86984602]
# manual check
predictions = np.exp(logits)
predictions = predictions / predictions.sum(axis=1, keepdims=True)
ce_np = (-labels * np.log(predictions)).sum(axis=1)
print(ce_np) # [ 1.24901222 1.86984602]
And if you have exclusive labels, it is better to pass integer class indices to tf.nn.sparse_softmax_cross_entropy_with_logits rather than using tf.nn.softmax_cross_entropy_with_logits with an explicit one-hot probability representation like [1.0, 0.0, ...]. The labels are shorter that way.
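For exclusive labels the two representations give the same loss, which a few lines of NumPy can confirm. This is only a sketch of the arithmetic, not of TensorFlow's implementation; the class indices are made-up example values.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


logits = np.array([[5.0, 7.0, 8.0], [1.0, 2.0, 4.0]])
class_ids = np.array([2, 1])        # one integer class index per row (example values)
one_hot = np.eye(3)[class_ids]      # the equivalent one-hot probability rows

probs = softmax(logits)

# Dense form: sum over all classes of -label * log(probability).
ce_dense = -(one_hot * np.log(probs)).sum(axis=1)
# Sparse form: pick the log-probability of the true class directly.
ce_sparse = -np.log(probs[np.arange(len(class_ids)), class_ids])

assert np.allclose(ce_dense, ce_sparse)
```

The sparse form skips the multiplication by zeros and stores one integer per sample instead of a full row.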

Connect custom input pipeline to tf model

I am currently trying to train a simple TensorFlow model on data provided by a custom input pipeline. It should work as efficiently as possible. Although I've read lots of tutorials, I can't get it to work.
THE DATA
My training data is split over several CSV files. File 'a.csv' has 20 samples and 'b.csv' has 30 samples. They have the same structure with the same header:
feature1; feature2; feature3; feature4
0.1; 0.2; 0.3; 0.4
...
(No labels, as it is for an autoencoder.)
THE CODE
I have written an input pipeline and would like to feed the data from it to the model. My code looks like this:
import tensorflow as tf
def input_pipeline(filenames, batch_size):
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.flat_map(
        lambda filename: (
            tf.data.TextLineDataset(filename)
            .skip(1)
            .shuffle(10)
            .map(lambda csv_row: tf.decode_csv(
                csv_row,
                record_defaults=[[-1.0]]*4,
                field_delim=';'))
            .batch(batch_size)
        )
    )
    return dataset.make_initializable_iterator()
iterator = input_pipeline(['/home/sku/data/a.csv',
                           '/home/sku/data/b.csv'],
                          batch_size=5)
next_element = iterator.get_next()
# Build the autoencoder
x = tf.placeholder(tf.float32, shape=[None, 4], name='in')
z = tf.contrib.layers.fully_connected(x, 2, activation_fn=tf.nn.relu)
x_hat = tf.contrib.layers.fully_connected(z, 4)
# loss function with epsilon for numeric stability
epsilon = 1e-10
loss = -tf.reduce_sum(
    x * tf.log(epsilon + x_hat) + (1 - x) * tf.log(epsilon + 1 - x_hat))
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)
with tf.Session() as sess:
    sess.run(iterator.initializer)
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        batch = sess.run(next_element)
        sess.run(train_op, feed_dict={x: batch, x_hat: batch})
THE PROBLEM
When trying to feed the data to the model, I get an error:
ValueError: Cannot feed value of shape (4, 5) for Tensor 'in:0', which has shape '(?, 4)'
When printing out the shapes of the batched data, I get this for example:
(array([ 4.1, 5.9, 5.5, 6.7, 10. ], dtype=float32), array([0.4, 7.7, 0. , 3.4, 8.7], dtype=float32), array([3.5, 4.9, 8.3, 7.2, 6.4], dtype=float32), array([-1. , -1. , 9.6, -1. , -1. ], dtype=float32))
It makes sense, but where and how do I have to reshape this? Also, the additional dtype info only appears with batching.
I also considered that I did the feeding wrong. Do I need an input_fn or something like that? I remember that feeding dicts is way too slow. If somebody could give me an efficient way to prepare and feed the data, I would be really grateful.
Regards,
I've figured out a solution that requires a second mapping function. You have to add the marked line to the input function:
def input_pipeline(filenames, batch_size):
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.flat_map(
        lambda filename: (
            tf.data.TextLineDataset(filename)
            .skip(1)
            .shuffle(10)
            .map(lambda csv_row: tf.decode_csv(
                csv_row,
                record_defaults=[[-1.0]]*4,
                field_delim=';'))
            .map(lambda *inputs: tf.stack(inputs)) # <-- mapping required
            .batch(batch_size)
        )
    )
    return dataset.make_initializable_iterator()
This stacks the per-column values of each row into a single vector, so the batched output becomes a matrix that can be fed to the network.
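The shape mismatch and the effect of the extra mapping can be reproduced with NumPy alone. In the sketch below, the tuple of arrays is copied from the batch printed in the question; stacking each row before batching mirrors what the added tf.stack mapping does.

```python
import numpy as np

# The batch as printed in the question: a tuple of 4 per-column arrays of length 5.
columns = (
    np.array([4.1, 5.9, 5.5, 6.7, 10.0], dtype=np.float32),
    np.array([0.4, 7.7, 0.0, 3.4, 8.7], dtype=np.float32),
    np.array([3.5, 4.9, 8.3, 7.2, 6.4], dtype=np.float32),
    np.array([-1.0, -1.0, 9.6, -1.0, -1.0], dtype=np.float32),
)

# Feeding the tuple directly is interpreted as a (4, 5) array, hence the ValueError.
without_stack = np.asarray(columns)

# Stacking the 4 values of each row first, then batching the rows, gives (5, 4).
rows = [np.array(row, dtype=np.float32) for row in zip(*columns)]
with_stack = np.stack(rows)

assert without_stack.shape == (4, 5)
assert with_stack.shape == (5, 4)
# The fixed batch is simply the transpose of the rejected one.
assert np.array_equal(with_stack, without_stack.T)
```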
However, I'm still not sure if feeding it via feed_dict is the most efficient way. I'd still appreciate support here!