static value with svm.SVR.predict() - numpy

I have attempted to assess the relevance some predictions based on a dataset (n * 6), but I am wondering about the causes of strange results I am currently facing with svr.SVR.predict. The below code could illustrate my statement:
d = DataReader(...)
a = d.iloc[:,0:5]
b = d.iloc[:,5]
cut = 10
z = d.iloc[len(d.index) - cut :,0:5]
X,y = np.asarray(a[:-10]), np.asarray(b[:-10]) # train set
XT,yT = np.asarray(z), np.asarray(b[-10:]) # test set
clf = svm.SVR(kernel = 'rbf', gamma=0.1, C=1e3)
y_hat = clf.fit(X,y).predict(XT[i]) #, i = 0,1...
yields amazing static values for all i, despite differences in XT[i] (Ps: XT[i].shape = (5,)).
In a nutshell, the goal consisted of comparing y_hat vs yT.
Best

You need to normalize before SVM. Try the following:
from sklearn.preprocessing import StandardScaler
d = DataReader(...)
a = d.iloc[:,0:5]
b = d.iloc[:,5]
cut = 10
z = d.iloc[len(d.index) - cut :,0:5]
X,y = np.asarray(a[:-10]), np.asarray(b[:-10]) # train set
XT,yT = np.asarray(z), np.asarray(b[-10:]) # test set
scl = StandardScaler()
X = scl.fit_transform(X)
XT = scl.transform(XT)
clf = svm.SVR(kernel = 'rbf', gamma=0.1, C=1e3)
y_hat = clf.fit(X,y).predict(XT[i]) #, i = 0,1...

Related

tf.keras.layers.Cropping2D for pytorch

I'm currently working on a gan that generates time series and I want to tailor my generated time series to the desired time steps. In Tensorflow I always used "tf.keras.layers.Cropping2D" for this. Is there an equivalent for this in Pytorch?
class cont_cond_cnn_generator(nn.Module):
def __init__(self, nz=32, dim_embed=DIM_EMBED):
dim_embed=DIM_EMBED
super(cont_cond_cnn_generator, self).__init__()
self.z_dim = nz
self.dim_embed = dim_embed
self.dense = nn.Linear(self.z_dim, 3 * GEN_SIZE*16, bias=True)
self.dense = nn.Linear(self.z_dim, 4 * 4 * GEN_SIZE*16, bias=True)
self.final = nn.Conv1d(GEN_SIZE, channels, 3, stride=1, padding=1, bias=bias)
nn.init.xavier_uniform_(self.dense.weight.data, 1.)
nn.init.xavier_uniform_(self.final.weight.data, 1.)
self.genblock0 = ResBlockGenerator(in_channels=GEN_SIZE*16, out_channels = GEN_SIZE*8 , dim_embed=dim_embed) #4--->8 # 2. parameter original :GEN_SIZE*8
self.genblock1 = ResBlockGenerator(GEN_SIZE*8, GEN_SIZE*4, dim_embed=dim_embed) #8--->16 #2. parameter original :GEN_SIZE*4
self.genblock2 = ResBlockGenerator(GEN_SIZE*4,GEN_SIZE*2, dim_embed=dim_embed) #16--->32 # 2. parameter original :GEN_SIZE*2
self.genblock3 = ResBlockGenerator(GEN_SIZE*2, GEN_SIZE, dim_embed=dim_embed) #32--->64 2. parameter original :GEN_SIZE
self.final = nn.Sequential(
nn.BatchNorm1d(GEN_SIZE),
nn.ReLU(),
self.final,
nn.Sigmoid()
)
def forward(self, z, y):
z = z.view(z.size(0), z.size(1))
out = self.dense(z)
out = out.view(-1, GEN_SIZE*16, 3) #out = out.view(-1, GEN_SIZE*16, 4, 4)
out = self.genblock0(out, y)
out = self.genblock1(out, y)
out = self.genblock2(out, y)
out = self.genblock3(out, y)
out = self.final(out)
out = torchvision.transforms.functional.crop(out, 0,0,3,2201)
return out
I used the torchvision.transforms.functional.crop() and it works. But is it a good methode or would you recommend my another way to do that.

My model gives terrible results when im trying to forecast univariant time series

I am trying to do univariant time series forecasting. My model works Perfectly in different datasets. But for this dataset, the prediction is incredibly bad (tried 50,100,200 epochs, different batch sizes, and learning rates. Nothing has changed. So I think there is something wrong with my dataset.)
Here values of my dataset:
Mean: 49.840000, standard deviation: 31.786387
Here is my architecture
Here is the sample from my dataset
Here is my prediction values
Here is the code for normalization that im using:
def NormalizeMult(data):
#normalize 用于反归一化
data = np.array(data)
normalize = np.arange(2*data.shape[1],dtype='float64')
normalize = normalize.reshape(data.shape[1],2)
print(normalize.shape)
for i in range(0,data.shape[1]):
#第i列
list = data[:,i]
listlow,listhigh = np.percentile(list, [0, 100])
# print(i)
normalize[i,0] = listlow
normalize[i,1] = listhigh
delta = listhigh - listlow
if delta != 0:
#第j行
for j in range(0,data.shape[0]):
data[j,i] = (data[j,i] - listlow)/delta
#np.save("./normalize.npy",normalize)
return data,normalize
Here is the code that I'm using dataset and normalizing it:
INPUT_DIMS = 1
TIME_STEPS = 4
lstm_units = 64
#归一化
series = read_csv('/content/logs.csv')
series = series.drop(["timestamp"],axis=1)
series= series.dropna()
series = series.head(100)
data=series
data,normalize = NormalizeMult(data[0:50])
pollution_data = data[:,0].reshape(len(data[0:50]),1)
train_X, _ = split_sequence(data,TIME_STEPS)
_ , train_Y = split_sequence(data,TIME_STEPS)
optimizer = tf.keras.optimizers.Adam(lr=0.001)
m = attention_model()
m.summary()
m.compile(optimizer, loss='mse')
m.fit(train_X, train_Y, epochs=500, batch_size=2, validation_split=0.1)

Stratify batch in Tensorflow 2

I have minibatches that I get from an sqlite database with data of integer and float type, x, and a binary label in 0 and 1, y. I am looking for something like X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(y, x, test_size=0.1, random_state=1, stratify=True) from scikit-learn, where a keyword could stratify the data (i.e. the same number of class-0 and class-1 instances).
In Tensorflow 2, stratification seems not straightforwardly possible. My very complicated solution works for me, but takes a lot of time because of all the reshaping and transposing:
def stratify(x, y):
# number of positive instances (the smaller class)
pos = np.sum(y).item() # how many positive bonds there are
x = np.transpose(x)
# number of features
f = np.shape(x)[1]
# filter only class 1
y = tf.transpose(y)
x_pos = tf.boolean_mask(x,
y_pos = tf.boolean_mask(y, y)
# filter only class 1
x_neg = tf.boolean_mask(x, tf.bitwise.invert(y)-254)
x_neg = tf.reshape(x_neg, [f,-1])
y_neg = tf.boolean_mask(y, tf.bitwise.invert(y)-254)
# just take randomy as many class-0 as there are class-1
x_neg = tf.transpose(tf.random.shuffle(tf.transpose(x_neg)))
x_neg = x_neg[:,0:pos]
y_neg = y_neg[0:pos]
# concat the class-1 and class-0 together, then shuffle, and concat back together
x = tf.concat([x_pos,tf.transpose(x_neg)],0)
y = tf.concat([y_pos, tf.transpose(y_neg)],0)
xy = tf.concat([tf.transpose(x), tf.cast(np.reshape(y,[1, -1]), tf.float64)],0)
xy = tf.transpose((tf.random.shuffle(tf.transpose(xy)))) # because there is no axis arg in shuffle
x = xy[0:f,:]
x = tf.transpose(x)
y = xy[f,:]
return x, y
I am happy to see some feedback/improvement on my own function or novel, easier ideas.
Data division is best if it is done in raw format only or before you transform it into tensors. If there is a strong requirement to do it in TensorFlow only, then I will suggest you to make use of tf.data.Dataset class. I have added the demo code with relevant comments explaining the steps.
import tensorflow as tf
import numpy as np
TEST_SIZE = 0.1
DATA_SIZE = 1000
# Create data
X_data = np.random.rand(DATA_SIZE, 28, 28, 1)
y_data = np.random.randint(0, 2, [DATA_SIZE])
samples1 = np.sum(y_data)
print('Percentage of 1 = ', samples1 / len(y_data))
# Create TensorFlow dataset
dataset = tf.data.Dataset.from_tensor_slices((X_data, y_data))
# Gather data with 0 and 1 labels separately
class0_dataset = dataset.filter(lambda x, y: y == 0)
class1_dataset = dataset.filter(lambda x, y: y == 1)
# Shuffle them
class0_dataset = class0_dataset.shuffle(DATA_SIZE)
class1_dataset = class1_dataset.shuffle(DATA_SIZE)
# Split them
class0_test_samples_len = int((DATA_SIZE - samples1) * TEST_SIZE)
class0_test = class0_dataset.take(class0_test_samples_len)
class0_train = class0_dataset.skip(class0_test_samples_len)
class1_test_samples_len = int(samples1 * TEST_SIZE)
class1_test = class1_dataset.take(class1_test_samples_len)
class1_train = class1_dataset.skip(class1_test_samples_len)
print('Train Class 0 = ', len(list(class0_train)), ' Class 1 = ', len(list(class1_train)))
print('Test Class 0 = ', len(list(class0_test)), ' Class 1 = ', len(list(class1_test)))
# Gather datasets
train_dataset = class0_train.concatenate(class1_train).shuffle(DATA_SIZE)
test_dataset = class0_test.concatenate(class1_test).shuffle(DATA_SIZE)
print('Train dataset size = ', len(list(train_dataset)))
print('Test dataset size = ', len(list(test_dataset)))
Sample output:
Percentage of 1 = 0.474
Train Class 0 = 474 Class 1 = 427
Test Class 0 = 52 Class 1 = 47
Train dataset size = 901
Test dataset size = 99

How to extract cell state from a LSTM at each timestep in Keras?

Is there a way in Keras to retrieve the cell state (i.e., c vector) of a LSTM layer at every timestep of a given input?
It seems the return_state argument returns the last cell state after the computation is done, but I need also the intermediate ones. Also, I don't want to pass these cell states to the next layer, I only want to be able to access them.
Preferably using TensorFlow as backend.
Thanks
I was looking for a solution to this issue and after reading the guidance for creating your own custom RNN Cell in tf.keras (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AbstractRNNCell), I believe the following is the most concise and easy to read way of doing this for Tensorflow 2:
import tensorflow as tf
from tensorflow.keras.layers import LSTMCell
class LSTMCellReturnCellState(LSTMCell):
def call(self, inputs, states, training=None):
real_inputs = inputs[:,:self.units] # decouple [h, c]
outputs, [h,c] = super().call(real_inputs, states, training=training)
return tf.concat([h, c], axis=1), [h,c]
num_units = 512
test_input = tf.random.uniform([5,100,num_units])
rnn = tf.keras.layers.RNN(LSTMCellReturnCellState(num_units),
return_sequences=True, return_state=True)
whole_seq_output, final_memory_state, final_carry_state = rnn(test_input)
print(whole_seq_output.shape)
>>> (5,100,1024)
# Hidden state sequence
h_seq = whole_seq_output[:,:,:num_units] # (5,100,512)
# Cell state sequence
c_seq = whole_seq_output[:,:,num_units:] # (5,100,512)
As mentioned in an above solution, you can see the advantage of this is that it can be easily wrapped into tf.keras.layers.RNN as a drop-in for the normal LSTMCell.
Here is a Colab Notebook with the code running as expected for tensorflow==2.6.0
I know it's pretty late, I hope this can help.
what you are asking, technically, is possible by modifying the LSTM-cell in call method. I modify it and make it return 4 dimension instead of 3 when you give return_sequences=True.
Code
from keras.layers.recurrent import _generate_dropout_mask
class Mod_LSTMCELL(LSTMCell):
def call(self, inputs, states, training=None):
if 0 < self.dropout < 1 and self._dropout_mask is None:
self._dropout_mask = _generate_dropout_mask(
K.ones_like(inputs),
self.dropout,
training=training,
count=4)
if (0 < self.recurrent_dropout < 1 and
self._recurrent_dropout_mask is None):
self._recurrent_dropout_mask = _generate_dropout_mask(
K.ones_like(states[0]),
self.recurrent_dropout,
training=training,
count=4)
# dropout matrices for input units
dp_mask = self._dropout_mask
# dropout matrices for recurrent units
rec_dp_mask = self._recurrent_dropout_mask
h_tm1 = states[0] # previous memory state
c_tm1 = states[1] # previous carry state
if self.implementation == 1:
if 0 < self.dropout < 1.:
inputs_i = inputs * dp_mask[0]
inputs_f = inputs * dp_mask[1]
inputs_c = inputs * dp_mask[2]
inputs_o = inputs * dp_mask[3]
else:
inputs_i = inputs
inputs_f = inputs
inputs_c = inputs
inputs_o = inputs
x_i = K.dot(inputs_i, self.kernel_i)
x_f = K.dot(inputs_f, self.kernel_f)
x_c = K.dot(inputs_c, self.kernel_c)
x_o = K.dot(inputs_o, self.kernel_o)
if self.use_bias:
x_i = K.bias_add(x_i, self.bias_i)
x_f = K.bias_add(x_f, self.bias_f)
x_c = K.bias_add(x_c, self.bias_c)
x_o = K.bias_add(x_o, self.bias_o)
if 0 < self.recurrent_dropout < 1.:
h_tm1_i = h_tm1 * rec_dp_mask[0]
h_tm1_f = h_tm1 * rec_dp_mask[1]
h_tm1_c = h_tm1 * rec_dp_mask[2]
h_tm1_o = h_tm1 * rec_dp_mask[3]
else:
h_tm1_i = h_tm1
h_tm1_f = h_tm1
h_tm1_c = h_tm1
h_tm1_o = h_tm1
i = self.recurrent_activation(x_i + K.dot(h_tm1_i,
self.recurrent_kernel_i))
f = self.recurrent_activation(x_f + K.dot(h_tm1_f,
self.recurrent_kernel_f))
c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1_c,
self.recurrent_kernel_c))
o = self.recurrent_activation(x_o + K.dot(h_tm1_o,
self.recurrent_kernel_o))
else:
if 0. < self.dropout < 1.:
inputs *= dp_mask[0]
z = K.dot(inputs, self.kernel)
if 0. < self.recurrent_dropout < 1.:
h_tm1 *= rec_dp_mask[0]
z += K.dot(h_tm1, self.recurrent_kernel)
if self.use_bias:
z = K.bias_add(z, self.bias)
z0 = z[:, :self.units]
z1 = z[:, self.units: 2 * self.units]
z2 = z[:, 2 * self.units: 3 * self.units]
z3 = z[:, 3 * self.units:]
i = self.recurrent_activation(z0)
f = self.recurrent_activation(z1)
c = f * c_tm1 + i * self.activation(z2)
o = self.recurrent_activation(z3)
h = o * self.activation(c)
if 0 < self.dropout + self.recurrent_dropout:
if training is None:
h._uses_learning_phase = True
return tf.expand_dims(tf.concat([h,c],axis=0),0), [h, c]
Sample code
# create a cell
test = Mod_LSTMCELL(100)
# Input timesteps=10, features=7
in1 = Input(shape=(10,7))
out1 = RNN(test, return_sequences=True)(in1)
M = Model(inputs=[in1],outputs=[out1])
M.compile(keras.optimizers.Adam(),loss='mse')
ans = M.predict(np.arange(7*10,dtype=np.float32).reshape(1, 10, 7))
print(ans.shape)
# state_h
print(ans[0,0,0,:])
# state_c
print(ans[0,0,1,:])
First, this is not possible do with the tf.keras.layers.LSTM. You have to use LSTMCell instead or subclass LSTM. Second, there is no need to subclass LSTMCell to get the sequence of cell states. LSTMCell already returns a list of the hidden state (h) and cell state (c) everytime you call it.
For those not familiar with LSTMCell, it takes in the current [h, c] tensors, and the input at the current timestep (it cannot take in a sequence of times) and returns the activations, and the updated [h,c].
Here is an example of showing how to use LSTMCell to process a sequence of timesteps and to return the accumulated cell states.
# example inputs
inputs = tf.convert_to_tensor(np.random.rand(3, 4), dtype='float32') # 3 timesteps, 4 features
h_c = [tf.zeros((1,2)), tf.zeros((1,2))] # must initialize hidden/cell state for lstm cell
h_c = tf.convert_to_tensor(h_c, dtype='float32')
lstm = tf.keras.layers.LSTMCell(2)
# example of how you accumulate cell state over repeated calls to LSTMCell
inputs = tf.unstack(inputs, axis=0)
c_states = []
for cur_inputs in inputs:
out, h_c = lstm(tf.expand_dims(cur_inputs, axis=0), h_c)
h, c = h_c
c_states.append(c)
You can access the states of any RNN by setting return_sequences = True in the initializer. You can find more information about this parameter here.

stacked sigmoids: why training the second layer alters the first layer?

I am training a NN with sigmoid layers stacked one on top of the other. I have labels associated with each layer and I would like to alternate between training towards minimizing the loss for the first layer and minimizing the loss on the second layer. I expect that the result I get on the first layer would not change regardless whether I train for the second layer or not. However, I do get significant difference. What am I missing?
Here is the code:
dim = Xtrain.shape[1]
output_dim = Ytrain.shape[1]
categories_dim = Ctrain.shape[1]
features = C.input_variable(dim, np.float32)
label = C.input_variable(output_dim, np.float32)
categories = C.input_variable(categories_dim, np.float32)
b = C.parameter(shape=(output_dim))
w = C.parameter(shape=(dim, output_dim))
adv_w = C.parameter(shape=(output_dim, categories_dim))
adv_b = C.parameter(shape=(categories_dim))
pred_parameters = (w, b)
adv_parameters = (adv_w, adv_b)
z = C.tanh(C.times(features, w) + b)
adverse = C.tanh(C.times(z, adv_w) + adv_b)
pred_loss = C.cross_entropy_with_softmax(z, label)
pred_error = C.classification_error(z, label)
adv_loss = C.cross_entropy_with_softmax(adverse, categories)
adv_error = C.classification_error(adverse, categories)
pred_learning_rate = 0.5
pred_lr_schedule = C.learning_rate_schedule(pred_learning_rate, C.UnitType.minibatch)
pred_learner = C.adam(pred_parameters, pred_lr_schedule, C.momentum_as_time_constant_schedule(0.9))
pred_trainer = C.Trainer(adverse, (pred_loss, pred_error), [pred_learner])
adv_learning_rate = 0.5
adv_lr_schedule = C.learning_rate_schedule(adv_learning_rate, C.UnitType.minibatch)
adv_learner = C.adam(adverse.parameters, adv_lr_schedule, C.momentum_as_time_constant_schedule(0.9))
adv_trainer = C.Trainer(adverse, (adv_loss, adv_error), [adv_learner])
minibatch_size = 50
num_of_epocs = 40
# Run the trainer and perform model training
training_progress_output_freq = 50
def permute (x, y, c):
rr = np.arange(x.shape[0])
np.random.shuffle(rr)
x = x[rr, :]
y = y[rr, :]
c = c[rr, :]
return (x, y, c)
for e in range(0,num_of_epocs):
(x, y, c) = permute(Xtrain, Ytrain, Ctrain)
for i in range (0, x.shape[0], minibatch_size):
m_features = x[i:min(i+minibatch_size, x.shape[0]),]
m_labels = y[i:min(i+minibatch_size, x.shape[0]),]
m_cat = c[i:min(i+minibatch_size, x.shape[0]),]
if (e % 2 == 0):
pred_trainer.train_minibatch({features : m_features, label : m_labels, categories : m_cat, diagonal : m_diagonal})
else:
adv_trainer.train_minibatch({features : m_features, label : m_labels, categories : m_cat, diagonal : m_diagonal})
I am surprised that if I comment out the last two lines (else: adv_training.train...) the train and test error of z in predicting the label alters. Since adv_trainer is supposed to modify only adv_w and adv_b which are not used in computing z or its loss, I can't see why should that happen. I appreciate the assistance.
You shouldn’t do
adv_learner = C.adam(adverse.parameters, adv_lr_schedule, C.momentum_as_time_constant_schedule(0.9))
but:
adv_learner = C.adam(adv_parameters, adv_lr_schedule, C.momentum_schedule(0.9))
adverse.parameters contains all the parameters and you don’t want that. On a different note, you will want to replace momentum_as_time_constant_schedule with momentum_schedule. The former takes as parameter the number of samples after which the contribution of the gradient will have decayed by exp(-1).