How to normalize a specific dimension of a 3D array - numpy

sklearn.preprocessing.normalize only supports a 2D array normalization.
However, I currently have a 3D array for LSTM model training (batch, step, features) and I wish to normalize the features.
I have tried tf.keras.utils.normalize(X_train, axis=-1, order=2 ) But it is not correct.
Another way is to fold the 3D array into a 2D array
print(X_train.shape)
print(max(X_train[0][0]))
output
(1883, 100, 68)
6.028588763956215
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train.reshape(X_train.shape[0], -1)).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(X_test.shape[0], -1)).reshape(X_test.shape)
print(X_train.shape)
print(max(X_train[0][0]))
print(min(X_train[0][0]))
output
(1883, 100, 68)
3.2232538993444533
-1.9056918449890343
The value is still not within 1 and -1.
How should I approach it?

As suggested in the comments, I provided the answer
you can scale a 3D array with sklearn preprocessing methods. you simply have to reconduct to 2D data to fit them and then reverse back to 3D. This can be done easily with a few lines of code.
if you want the scaled data to be in range (-1,1), you can simply use MinMaxScaler specifying feature_range=(-1,1)
X_train = np.random.uniform(-20,100, (1883, 100, 68))
X_test = np.random.uniform(-20,100, (100, 100, 68))
print(X_train.shape)
print(X_train.min().round(5), X_train.max().round(5)) # -20, 100
scaler = MinMaxScaler(feature_range=(-1,1))
X_train = scaler.fit_transform(X_train.reshape(X_train.shape[0], -1)).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(X_test.shape[0], -1)).reshape(X_test.shape)
print(X_train.shape)
print(X_train.min().round(5), X_train.max().round(5)) # -1, 1

Related

Tensorflow Macro F1 Score for multiclass and also for binary classification

I am trying to train 2 1D Conv neural networks - one for a multiclass classification problem and second for a binary classification problem. One of my metrics has to be Macro F1 score for both problems. However I am having a problem using tfa.metrics.F1Score from tensorflow addons.
Multiclass classification
I have 3 classes encoded as 0, 1, 2.
The last layer of the network and the compile method look like this (int_sequeces_input is the input layer):
preds = layers.Dense(3, activation="softmax")(x)
model = keras.Model(int_sequences_input, preds)
f1_macro = F1Score(num_classes=3, average='macro')
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy',f1_macro])
However when I run model.fit(), I get the following error:
ValueError: Dimension 0 in both shapes must be equal, but are 3 and 1. Shapes are [3] and [1]. for '{{node AssignAddVariableOp_7}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_7/resource, Sum_6)' with input shapes: [], [1].
shapes of data:
X_train - (23658, 150)
y_train - (23658,)
Binary classification
I have 2 classes encoded as 0,1
The last layer of the network and the compile method look like this (int_sequeces_input is the input layer):
preds = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(int_sequences_input, preds)
print(model.summary())
f1_macro = F1Score(num_classes=2, average='macro')
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy',f1_macro])
Again, when I run model.fit() I get error:
ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1. Shapes are [2] and [1]. for '{{node AssignAddVariableOp_4}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_4/resource, Sum_3)' with input shapes: [], [1].
shapes of data:
X_train - (15770, 150)
y_train - (15770,)
So my question is: how to evaluate both of my models using macro F1 score? How can I fix my implementation to make it work with tfa.metrics.F1Score? Or is there any other way to calculate macro F1 score without using tfa.metrics.F1Score? Thanks.
Have a look at the usage example from its doc page.
metric = tfa.metrics.F1Score(num_classes=3, threshold=0.5)
y_true = np.array([[1, 1, 1],
[1, 0, 0],
[1, 1, 0]], np.int32)
y_pred = np.array([[0.2, 0.6, 0.7],
[0.2, 0.6, 0.6],
[0.6, 0.8, 0.0]], np.float32)
metric.update_state(y_true, y_pred)
You can see that it expects the label to be in one-hot format.
But given the shapes you mentioned above:
shapes of data:
X_train - (23658, 150)
y_train - (23658,)
It looks like your labels are in index format. Try converting them to one hot with tf.one_hot(y_train, num_classes). You'll also need to change your loss to loss='categorical_crossentropy'.

Prediction in non classified answer

I have create neuronetwork in Kerars, program is runing but there is problem of result, it is Forexforcast network in forcast it should return 0 or 1 , as provided in traing dataset but result is showing in between 0 and 1 in float like "[[0.47342286]]"
I have tried to use numpy athmax but it only result in 1 answer
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import datetime
from sklearn.preprocessing import MinMaxScaler
from ta import *
dataset = pd.read_csv('C:/Users/SIGMA COM/PycharmProjects/deep/GBP_JPY Historical Data.csv',index_col="Date",parse_dates=True)
dataset = dataset[::-1]
print(dataset.head())
print(dataset.isna().any())
print(dataset.info())
dataset['Open'].plot(figsize=(16,6))
# initial value
step_size = 4
batch_sizes = 1
dataset['Diff'] = dataset['Open'] - dataset['Price']
dataset['Range'] = dataset['High'] - dataset['Low']
dataset['Rsi'] = rsi(close=dataset['Price'],n=4,fillna=True)
dataset['Macd'] = macd(close=dataset['Price'],n_fast=12,n_slow=26,fillna=True)
dataset['Cci'] = cci(high=dataset['High'],low=dataset['Low'],close=dataset['Price'],n=20,fillna=True)
# dataset['Rsi'] = dataset['Rsi'] /100.0
# # dataset['Macd'] = dataset['Macd'] /2.0
# dataset['Cci'] = dataset['Cci'] / 500.0
training_set = dataset[['Rsi','Macd','Cci','Price','Low','High','Open','Signal']]
sc = MinMaxScaler()
training_set_scaled = sc.fit_transform(training_set)
# Creating a data structure with 60 timesteps and 1 output
X_train = []
y_train = []
for i in range(60, 1258):
X_train.append(training_set_scaled[i-60:i, 0])
y_train.append(training_set_scaled[i, -1:])
X_train, y_train = np.array(X_train), np.array(y_train)
# Reshaping
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
print(X_train.shape)
print(X_train)
plt.show()
# Part 2 - Building the RNN
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
print((X_train.shape[1], 1))
print(X_train.shape)
# Initialising the RNN
regressor = Sequential()
# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))
# Adding the output layer
regressor.add(Dense(units = 1,activation='sigmoid'))
# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, epochs = 10, batch_size = 32)
result = regressor.predict(np.reshape(X_train[100],(1,60,1)))
print(result)
I want to make model to make predication in class 0 and 1
This behavior is expected, because the sigmoid function is going to return a number between zero and one, like so:
So if your class labels are either 0 or 1, which seems to be the case here, for a binary classification problem you can just round the resultant output for your class prediction. Let's make a distinction between a classification vs. a regression problem here: regression is like finding the "line of best fit;" that is, the model is being trained to approximate the data. This appears to be what you're doing here: you're minimizing the mean squared error and searching for the model that best approximates your data, but that doesn't make a prediction.
If you want to actually make a classification, you can just round all elements of the result of regressor.predict to 0 or 1, and then compare your predictions with the true labels. This can actually be done easily in numpy like so: numpy.around(your_predictions, decimals=0). Note the decimals argument is not strictly required since it defaults to a value of 0, it's nice for clarity.
As for using numpy.argmax (I'm going to assume that's what you meant by athmax since I can't find a function with that spelling), it will give you the same label for everything because it returns the index of the largest element in an array. Since your output array has length one (because it's simply a single neuron that calculates the logistic function), it will always return index zero! However, you're sort of on the right track: if your last layer was instead Dense(units=n_classes, activation='softmax') — softmax outputs a probability distribution that a particular row of data will produce each label. In that case, numpy.argmax is correct.
Here's a Tensorflow tutorial on classification that I found super helpful when I was just learning it myself. It uses softmax instead of sigmoid like you, but I think it's fairly adaptable to your needs: https://www.tensorflow.org/tutorials/keras/basic_classification
Hope this helps!

Link prediction with input data

I have a list of files, an I use the KNN algorithm to classify these files.
dataset = pd.read_csv(file)
training_samples = get_sample_number(dataset)
X_train = dataset.iloc[:training_samples, 5:9]
y_train = dataset.iloc[:training_samples, 9]
X_test = dataset.iloc[training_samples:, 5:9]
# Feature Scaling
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)
# Fitting classifier to the training set
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
Now I have my categories in my y_pred array. But I want to save the result in the file where I read the dataset. How can I link a prediction to the right row in the file (or dataset)?
Your predictions in y_pred have a length of X_test.shape[0], which is obviously less than the length of the original dataset. If you want to attach the predictions to the original dataset that you read from file, you would need to make predictions on the whole dataset, and then do a simple concat to get it all together:
y_pred_all = classifier.predict(dataset.iloc[:, 5:9])
dataset = pd.concat([dataset, y_pred_all], axis=1)

Resize MNIST in Tensorflow

I have been working on MNIST dataset to learn how to use Tensorflow and Python for my deep learning course.
I want to resize MNIST as 22 & 22 using tensorflow, then I train it, but I do not how to do?
Could you help me?
TheRevanchist's answer is correct. However, for the mnist dataset, you first need to reshape the mnist array before you send it to tf.image.resize_images():
import tensorflow as tf
import numpy as np
import cv2
mnist = tf.contrib.learn.datasets.load_dataset("mnist")
batch = mnist.train.next_batch(10)
X_batch = batch[0]
batch_tensor = tf.reshape(X_batch, [10, 28, 28, 1])
resized_images = tf.image.resize_images(batch_tensor, [22,22])
The code above takes out a batch of 10 mnist images and reshapes them from 28x28 images to 22x22 tensorflow images.
If you want to display the images, you can use opencv and the code below. The resized_images.eval() converts the tensorflow image to a numpy array!
with tf.Session() as sess:
numpy_imgs = resized_images.eval(session=sess) # mnist images converted to numpy array
for i in range(10):
cv2.namedWindow('Resized image #%d' % i, cv2.WINDOW_NORMAL)
cv2.imshow('Resized image #%d' % i, numpy_imgs[i])
cv2.waitKey(0)
Did you try tf.image.resize_image?
The method:
resize_images(images, size, method=ResizeMethod.BILINEAR,
align_corners=False)
where images is a batch of images, and size is a vector tensor which determines the new height and width. You can look at the full documentation here: https://www.tensorflow.org/api_docs/python/tf/image/resize_images
Updated: TensorFlow 2.4.1
Short Answer
Use tf.image.resize (instead of resize_images). The link other provided no longer exits. Updated link.
Long Answer
MNIST in tf.keras.datasets.mnist is the following shape
(batch_size, 28 , 28)
Here is the full implementation. Please read the comment which attach with the code.
(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
# expand new axis, channel axis
x_train = np.expand_dims(x_train, axis=-1)
# [optional]: we may need 3 channel (instead of 1)
x_train = np.repeat(x_train, 3, axis=-1)
# it's always better to normalize
x_train = x_train.astype('float32') / 255
# resize the input shape , i.e. old shape: 28, new shape: 32
x_train = tf.image.resize(x_train, [32,32]) # if we want to resize
print(x_train.shape)
# (60000, 32, 32, 3)
You can use cv2.resize() function of opencv
Use a for loop to go iterate through every image
And inside for loop for every image add this line cv2.resize(source_image, (22, 22))
def resize(mnist):
train_data = []
for img in mnist.train._images:
resized_img = cv2.resize(img, (22, 22))
train_data.append(resized_img)
return train_data

Visualization of Keras Convolution Layer Outputs

I have written the following code for this question where there are two convolution layers (Conv1 and Conv2 for short) and I would like to plot all the outputs of each layer (it's self-contained). Everything is fine for Conv1, but I am missing something about Conv2.
I am feeding a 1x1x25x25 (num images, num channels, height, width (my convention, neither TF or Theano convention)) image to Conv1 which has FOUR 5x5 filters. That means its output shape is 4x1x1x25x25 (num filters, num images, num channels, height, width), resulting in 4 plots.
Now, this output is being fed to Conv1 which has SIX 3x3 filters. Hence, the output of Conv2 should be 6x(4x1x1x25x25), but it is not! It's rather 6x1x1x25x25. That means, there are only 6 plots rather than 6x4, but why? The following functions also prints the shape of each output which they are
(1, 1, 25, 25, 4)
-------------------
(1, 1, 25, 25, 6)
-------------------
but should be
(1, 1, 25, 25, 4)
-------------------
(1, 4, 25, 25, 6)
-------------------
Right?
import numpy as np
#%matplotlib inline #for Jupyter ONLY
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Conv2D
from keras import backend as K
model = Sequential()
# Conv1
conv1_filter_size = 5
model.add(Conv2D(nb_filter=4, nb_row=conv1_filter_size, nb_col=conv1_filter_size,
activation='relu',
border_mode='same',
input_shape=(25, 25, 1)))
# Conv2
conv2_filter_size = 3
model.add(Conv2D(nb_filter=6, nb_row=conv2_filter_size, nb_col=conv2_filter_size,
activation='relu',
border_mode='same'))
# The image to be sent through the model
img = np.array([
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.]],
[[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[0.],[1.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.]],
[[1.],[1.],[0.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[1.],[1.]],
[[1.],[1.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.]],
[[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[0.],[0.],[0.],[0.],[0.],[0.],[0.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]],
[[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.],[1.]]])
def get_layer_outputs(image):
'''This function extracts the numerical output of each layer.'''
outputs = [layer.output for layer in model.layers]
comp_graph = [K.function([model.input] + [K.learning_phase()], [output]) for output in outputs]
# Feeding the image
layer_outputs_list = [op([[image]]) for op in comp_graph]
layer_outputs = []
for layer_output in layer_outputs_list:
print(np.array(layer_output).shape, end='\n-------------------\n')
layer_outputs.append(layer_output[0][0])
return layer_outputs
def plot_layer_outputs(image, layer_number):
'''This function handels plotting of the layers'''
layer_outputs = get_layer_outputs(image)
x_max = layer_outputs[layer_number].shape[0]
y_max = layer_outputs[layer_number].shape[1]
n = layer_outputs[layer_number].shape[2]
L = []
for i in range(n):
L.append(np.zeros((x_max, y_max)))
for i in range(n):
for x in range(x_max):
for y in range(y_max):
L[i][x][y] = layer_outputs[layer_number][x][y][i]
for img in L:
plt.figure()
plt.imshow(img, interpolation='nearest')
plot_layer_outputs(img, 1)
The output of a convolution layer is bundled as one image with multiple channels. These could be thought as feature channels, in contrast with color channels. For example, if a convolution layer has F number of filters, it will output an image with F number of channels, no matter how many (color or feature) channels the input image had. This is why Conv2 produces 6 feature maps rather than 6x4.
In more details, a convolution filter will convolve over all input channels and the linear combination of its convolution would be fed to its activation function.