Wrong Number of LSTM Dimensions in Keras - tensorflow

I am trying to classify texts using recurrent neural networks and am facing an issue while running the following Python code (the backend is TensorFlow):
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Embedding
import numpy
import pandas as pd
seed = 7
numpy.random.seed(seed)
raw_data = pd.read_excel('Data.xlsx',sep=',')
df = raw_data.iloc[:,0:2]
df['Input']=df['Input'].astype(str)
X = df['Input'].values
X = pd.get_dummies(X).as_matrix() # convert DataFrame to NumPy array
Y = df['Output'].values
Y = pd.get_dummies(Y).as_matrix() # convert DataFrame to NumPy array
model = Sequential()
model.add(LSTM(450, input_shape =(X.shape[0], X.shape[1])))
model.add(Dense(Y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, nb_epoch=100, batch_size=100,verbose=0)
loss, accuracy = model.evaluate(X, Y)
print("\nLoss: %.2f, Accuracy: %.2f%%" % (loss, accuracy*100))
My input X is an ND array with 411 rows and 407 columns, while Y has 411 rows and 7 columns. The following is the error I am getting:
expected lstm_input_43 to have 3 dimensions, but got array with shape (411, 407)
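An LSTM layer expects 3-D input of shape (samples, timesteps, features), and input_shape should describe a single sample, i.e. (timesteps, features), not (samples, features). A minimal sketch of one way to fix it, assuming each row is treated as a one-timestep sequence (whether that interpretation suits the data is a separate question):
# an LSTM consumes 3-D input: (samples, timesteps, features)
X = X.reshape((X.shape[0], 1, X.shape[1]))  # (411, 407) -> (411, 1, 407)
model = Sequential()
# input_shape describes a single sample: (timesteps, features)
model.add(LSTM(450, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(Y.shape[1], activation='softmax'))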

Related

Why is the use of return_sequences giving different results across different environments?

When I use return_sequences=True on an LSTM layer before adding a Dense layer, it sometimes results in an error, depending on the environment; I believe this mainly depends on the versions of TensorFlow and Keras. With TensorFlow 2.1.0 and Keras 2.3.0, I get the following error:
standardize_input_data 'with shape ' + str(data_shape)) ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (7000, 1)
However, with TensorFlow 2.9.1 and Keras 2.9.0, I do not get any error.
Here is a minimal sample that reproduces it:
import os
import pandas as pd
from sklearn import preprocessing
from collections import deque
import random
import numpy as np
import time
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, BatchNormalization, Input
from keras.models import load_model
import keras
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import pickle
epochs = 10
batch_size = 64
X, y = make_classification(n_samples=10000, n_features=3, n_classes=3, n_informative=3, n_redundant=0, n_repeated=0 ,weights=[0.5,0.5,0.5])
X = X.reshape(X.shape[0], 1, 3)
y = y.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
inputs = Input(shape=(X_train.shape[1:]))
outputs = LSTM(128, input_shape=(X_train.shape[1:]), return_sequences=True)(inputs)
outputs = Dropout(0.2)(outputs)
outputs = BatchNormalization()(outputs)
outputs = LSTM(128, input_shape=(X_train.shape[1:]), return_sequences=True)(outputs)
outputs = Dropout(0.2)(outputs)
outputs = BatchNormalization()(outputs)
outputs = Dense(32, activation="relu", kernel_initializer="glorot_uniform")(outputs)
outputs = Dropout(0.2)(outputs)
outputs = Dense(3, activation="softmax", kernel_initializer="glorot_uniform")(outputs)
model = keras.Model(inputs, outputs)
opt = keras.optimizers.Adam(lr=0.0001, decay=1e-6)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_test, y_test))
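For what it's worth, the shape mismatch comes from the final LSTM keeping return_sequences=True: the Dense layers are then applied per timestep, so the model's output is 3-D, (batch, timesteps, 3), while y_train is (batch, 1). Keras 2.3 validated the target's rank strictly and raised, while newer versions appear to accept it because sparse_categorical_crossentropy can be computed over the single timestep. A sketch that is shape-stable across versions lets the last LSTM return only its final output:
inputs = Input(shape=X_train.shape[1:])
outputs = LSTM(128, return_sequences=True)(inputs)  # intermediate layer keeps the sequence
outputs = Dropout(0.2)(outputs)
outputs = BatchNormalization()(outputs)
outputs = LSTM(128)(outputs)  # final LSTM returns only the last step: (batch, 128)
outputs = Dropout(0.2)(outputs)
outputs = BatchNormalization()(outputs)
outputs = Dense(32, activation="relu", kernel_initializer="glorot_uniform")(outputs)
outputs = Dropout(0.2)(outputs)
outputs = Dense(3, activation="softmax", kernel_initializer="glorot_uniform")(outputs)  # (batch, 3)
model = keras.Model(inputs, outputs)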

Time-Series LSTM Model wrong prediction

I am practicing how to create an LSTM model on a univariate series using this dataset from Kaggle: https://www.kaggle.com/sumanthvrao/daily-climate-time-series-data
My issue is that I am unable to get an accurate prediction of the temperature, and my loss seems to be going all over the place. I have tried multiple methods, including:
Ensuring that time series data is stationary
Changing the time steps
Changing the hyperparameters
Using a stacked LSTM model
I am really curious as to what is wrong with my code, although I do have a few hypotheses:
I made an error when preprocessing the data
I introduced stationarity wrongly
This dataset requires a multivariate approach
# this magic line is only required in a Colab notebook
%tensorflow_version 2.x
import tensorflow as tf
from numpy import array
from numpy import argmax
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import os
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
# preparing independent and dependent features
def prepare_data(timeseries_data, n_features):
    X, y = [], []
    for i in range(len(timeseries_data)):
        # find the end of this pattern
        end_ix = i + n_features
        # check if we are beyond the sequence
        if end_ix > len(timeseries_data) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = timeseries_data[i:end_ix], timeseries_data[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
# preparing the most recent input window for prediction
def prepare_x_input(timeseries_data, n_features):
    x = []
    for i in range(len(timeseries_data)):
        # find the end of this pattern
        end_ix = i + n_features
        # check if we are beyond the sequence
        if end_ix > len(timeseries_data):
            break
        # gather the input part of the pattern
        seq_x = timeseries_data[i:end_ix]
        x.append(seq_x)
    x = x[-1:]  # keep only the last window
    # remove non-stationarity
    # x = np.log(x)
    return np.array(x)
#read data and filter temperature column
df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Weather Parameter/DailyDelhiClimateTrain.csv')
df.head()
temp_df = df.pop('meantemp')
plt.plot(temp_df)
# make the data stationary
sta_temp_df = np.log(temp_df).diff()
plt.figure(figsize=(15,5))
plt.plot(sta_temp_df)
print(sta_temp_df)
time_step = 7
x, y = prepare_data(sta_temp_df, time_step)
n_features = 1
x = x.reshape((x.shape[0], x.shape[1], n_features))
model = Sequential()
model.add(LSTM(10, return_sequences=True, input_shape=(time_step, n_features)))
model.add(LSTM(10))
model.add(Dense(16, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.summary()
result = model.fit(x, y, epochs=800)
n_days = 113
pred_temp_df = list(temp_df)
test = sta_temp_df.copy()
sta_temp_df = list(sta_temp_df)
i = 0
while i < n_days:
    x_input = prepare_x_input(sta_temp_df, time_step)
    print(x_input)
    x_input = x_input.reshape((1, time_step, n_features))
    # pass the window into the model
    yhat = model.predict(x_input, verbose=0)
    yhat = yhat.flatten()
    print(yhat[0])
    sta_temp_df.append(yhat[0])
    i = i + 1
sta_temp_df[0] = np.log(temp_df[0])
cum_temp_df = np.exp(np.cumsum(sta_temp_df))
print(cum_temp_df)
My code is shown above. I would really appreciate it if someone could identify what I did wrong here!
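One diagnostic worth trying (a sketch, not a definitive fix): the model is trained on the entire series and then asked to forecast 113 days recursively, so there is no held-out data to measure the forecast against, and np.log(temp_df).diff() leaves a NaN in the first element that ends up inside the first training windows. Holding out the tail of the series before fitting makes the prediction error measurable:
# hold out the last n_days of the stationary series for evaluation
n_days = 113
clean = np.log(temp_df).diff().dropna().values  # drop the NaN produced by .diff()
train_series = clean[:-n_days]
x, y = prepare_data(train_series, time_step)
x = x.reshape((x.shape[0], x.shape[1], n_features))
# a validation split guards against simply memorising the series over 800 epochs
model.fit(x, y, epochs=100, validation_split=0.1)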

ValueError: Data cardinality is ambiguous. Make sure all arrays contain the same number of samples

This is a regression problem, where I want to generate 5 float values from each image of size 224 x 224. So I use a fully connected network with 5 nodes in the last layer. But doing so in Keras gives me the error shown below the code:
import keras, os
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.applications.inception_v3 import InceptionV3
## data_list = list of four 224x224 numpy arrays
inception = InceptionV3(weights='imagenet', include_top=False)
x = inception.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(5, activation='relu')(x)
y = [np.random.random(5),np.random.random(5),np.random.random(5),np.random.random(5)]
model = Model(inputs=inception.input, outputs=predictions)
opt = Adam(lr=0.001)
model.compile(optimizer=opt, loss="mae")
model.fit(data_list, y, verbose=0, epochs=100)
Error:
ValueError: Data cardinality is ambiguous:
     x sizes: 224, 224, 224, 224
     y sizes: 5, 5, 5, 5
Make sure all arrays contain the same number of samples.
What could be going wrong?
Convert data_list and y to numpy arrays or tensors.
In your code the list of four arrays is treated as four separate inputs, while your model has a single input; see https://keras.io/api/models/model_training_apis/
Add these lines:
import tensorflow as tf
data_list = tf.stack(data_list)
y = tf.stack(y)
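tf.stack turns each Python list into a single tensor whose first dimension is the sample axis, so x and y then agree on four samples.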
Try this:
model.fit(np.array(data_list), np.array(y), verbose=0, epochs=100)

ValueError: Input 0 is incompatible with layer conv1d_1: expected ndim=3, found ndim=2

When I try to feed the output of the ELMo embedding layer into a Conv1D layer, I get the error
ValueError: Input 0 is incompatible with layer conv1d_1: expected ndim=3, found ndim=2
I want to add a convolution layer on top of the output of the ELMo embedding layer:
import tensorflow as tf
import tensorflow_hub as hub
import keras.backend as K
from keras import Model
from keras.layers import Input, Lambda, Conv1D, Flatten, Dense
from keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv("/home/raju/Desktop/spam.csv", encoding='latin-1')
X = df['v2']
Y = df['v1']
le = LabelEncoder()
le.fit(Y)
Y = le.transform(Y)
Y = to_categorical(Y)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.25)
elmo = hub.Module('/home/raju/models/elmo')
def embeddings(x):
    return elmo(tf.squeeze(tf.cast(x, dtype=tf.string)), signature='default', as_dict=True)['default']
input_layer = Input(shape=(1,), dtype=tf.string)
embed_layer = Lambda(embeddings, output_shape=(1024,))(input_layer)
conv_layer = Conv1D(4, 2, activation='relu')(embed_layer)
fcc_layer = Flatten()(conv_layer)
output_layer = Dense(2, activation='softmax')(fcc_layer)
model = Model(inputs=[input_layer], outputs=output_layer)
A Conv1D layer expects input of the shape (batch, steps, channels). The channels dimension is missing in your case, and you need to include it even if it is equal to 1. So the output shape of your elmo module should be (1024, 1) (this does not include the batch size). You can add a dimension to the output of the elmo module with tf.expand_dims(x, axis=-1).
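Applied to the code above, the change would look roughly like this (only the Lambda layer and its output shape change):
def embeddings(x):
    emb = elmo(tf.squeeze(tf.cast(x, dtype=tf.string)), signature='default', as_dict=True)['default']
    # add a channels dimension: (batch, 1024) -> (batch, 1024, 1)
    return tf.expand_dims(emb, axis=-1)

embed_layer = Lambda(embeddings, output_shape=(1024, 1))(input_layer)
conv_layer = Conv1D(4, 2, activation='relu')(embed_layer)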

Categorical crossentropy and label encoding

I'm trying to code a multiclass output; the classes are ['A','B','C','D','E','F','G'].
Could someone elaborate on the following error message?
"ValueError: You are passing a target array of shape (79, 1) while using as loss categorical_crossentropy. categorical_crossentropy expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:
from keras.utils.np_utils import to_categorical
y_binary = to_categorical(y_int)
Alternatively, you can use the loss function sparse_categorical_crossentropy instead, which does expect integer targets."
My code:
# Part 1 - Data Preprocessing
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataa = pd.read_csv('test_out.csv')
XX = dataa.iloc[:, 0:4].values
yy = dataa.iloc[:, 4].values
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_Y_1 = LabelEncoder()
yy = labelencoder_Y_1.fit_transform(yy)
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(XX, yy, test_size = 0.2, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Part 2 - Now let's make the ANN!
# Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Dense
# Initialising the ANN
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu', input_dim = 4))
# Adding the second hidden layer
classifier.add(Dense(output_dim = 6, init = 'uniform', activation = 'relu'))
# Adding the output layer
classifier.add(Dense(output_dim = 1, init = 'uniform', activation = 'softmax'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, nb_epoch = 50)
# Part 3 - Making the predictions and evaluating the model
# Predicting the Test set results
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
The problem lies in this portion of your code:
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_Y_1 = LabelEncoder()
yy = labelencoder_Y_1.fit_transform(yy)
You forgot to one-hot encode yy. Note that LabelEncoder only transforms your categorical data to numerical values, i.e. [A, B, C, D, E, F, G] to [0, 1, 2, 3, 4, 5, 6]. You have to one-hot encode it since you want to use softmax activation and categorical_crossentropy (I'm over-simplifying, but that's the gist).
So it should have been like this:
# Encoding categorical data
from keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder
labelencoder_Y_1 = LabelEncoder()
yy = labelencoder_Y_1.fit_transform(yy)
yy = to_categorical(yy)
If the target is binary, i.e. there are only 2 possible values, the last layer of the model should be a single unit with a sigmoid activation, and the model should be compiled with binary_crossentropy.
If the target is multi-class, i.e. there are more than 2 possible values (here there are 7 classes), you must convert the target to categorical with to_categorical from Keras, compile the model with categorical_crossentropy, and give the last layer a softmax activation. The last layer also needs one unit per class: Dense(7) rather than output_dim = 1, since a softmax over a single unit always outputs 1.0.
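Putting the two answers together, a minimal sketch of the corrected target encoding and output layer (replacing the corresponding lines of the question's code):
from keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder

labelencoder_Y_1 = LabelEncoder()
yy = labelencoder_Y_1.fit_transform(yy)  # 'A'..'G' -> 0..6
yy = to_categorical(yy)                  # one-hot matrix of shape (samples, 7)

# the output layer needs one unit per class
classifier.add(Dense(7, init = 'uniform', activation = 'softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])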