Accuracy/Loss doesn't change - sequence

This might duplicate previous posts, but here is my code.
My inputs X are sequences of characters, each of length 10, encoded as numbers 1-26 with added random noise. The output is the next word in the sequence.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import LSTM
import keras.optimizers
in_out_neurons = 1
hidden_neurons = 20
model = Sequential()
# n_prev = 100, 2 values per x axis
model.add(LSTM(hidden_neurons, input_shape=(10, 1)))
model.add(Activation('relu'))
model.add(Dense(in_out_neurons))
model.add(Activation("sigmoid"))
model.add(Activation("softmax"))
rms = keras.optimizers.RMSprop(lr=5, rho=0.9, epsilon=1e-08, decay=0.0)
sgd = keras.optimizers.SGD(lr=0.01, momentum=0.0, decay=0.001, nesterov=False)
model.compile(loss="binary_crossentropy",
optimizer='adam',
metrics=['accuracy'])
(X_train, y_train), (X_test, y_test) = train_test_split(data)
model.fit(X_train, y_train, batch_size=100, nb_epoch=50, validation_data=(X_test, y_test), verbose=1)
score = model.evaluate(X_test, y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
predicted = model.predict(X_test, batch_size=700)
# and maybe plot it
pd.DataFrame(predicted).to_csv("predicted.csv")
pd.DataFrame(y_test).to_csv("test_data.csv")
I tried different loss functions and optimizers. No luck.

Encoding characters as numbers is not a good approach. The network will interpret them as magnitudes, which is like saying that Y and Z are close together, and that doesn't make sense. This is why Embedding() layers exist. Alternatively, you might consider one-hot encoding, where each character becomes a one-hot vector of length 26.
"a" would become [1 0 0 0 0 0 0 0 0 ... 0] for example.
That being said, the reason it's not working is that you put a softmax on a layer which has only one value. Softmax over a single value always outputs 1, so your network can't learn: the output is 1 whatever happens before it.
Softmax turns a tensor into a probability distribution; if there is only one possible value, it gets probability 1. If you want that one neuron to output a probability (between 0 and 1), use only the sigmoid, not the softmax.
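To make the point about softmax concrete, here is a minimal sketch in plain Python (no Keras involved). The `softmax` and `sigmoid` helpers are written out by hand for illustration and are not taken from the question's code.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Logistic sigmoid, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# With a single output unit, softmax ignores the input entirely,
# while sigmoid still depends on it.
for logit in (-3.0, 0.0, 5.0):
    print(logit, softmax([logit]), round(sigmoid(logit), 4))
```

Whatever the single logit is, the softmax column stays at 1.0, so the loss has nothing to push against; the sigmoid column changes with the input.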
I hope this helps :)
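On the encoding point made at the start of this answer, a rough sketch of one-hot encoding in plain Python might look like the following. It assumes the sequences contain only lowercase letters a-z; the helper names `one_hot` and `encode_sequence` are made up for this example.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a'..'z', 26 symbols

def one_hot(char):
    """Return a length-26 list with a single 1 at the character's index."""
    vec = [0] * len(ALPHABET)
    vec[ALPHABET.index(char)] = 1
    return vec

def encode_sequence(text):
    """Encode a string as a list of one-hot vectors, one per character."""
    return [one_hot(c) for c in text]

# A 10-character sequence becomes 10 timesteps of 26 features each,
# which would correspond to input_shape=(10, 26) instead of (10, 1).
encoded = encode_sequence("helloworld")
print(len(encoded), len(encoded[0]))
print(one_hot("a")[:5])
```

With this encoding no two letters are "closer" than any other pair, which is the property the numeric 1-26 encoding lacks.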

Keras Model for Float Input

If I wanted to make a model that would take a single number and then just output a single number (not a linear relationship, not sure what kind), how would I shape the input and output layers, and what kind of loss/optimizer/activation functions should I use? Thanks.
Your question covers several things at once. I highly recommend that you first understand the difference between:
Regression-based problems
Classification-based problems
The activation function, loss function, and optimizer you need depend on which of the two you have, because they differ between regression and classification. Work through these one at a time.
For input/output see THIS
You have only one feature as input, so the model depends on the problem type.
Classification-based problem:
Loss function - categorical_crossentropy (one-hot labels) or sparse_categorical_crossentropy (integer labels)
optimizer - Adam
output layer - one unit per class to predict
output activation - softmax
import tensorflow as tf
from tensorflow.keras import layers
model = tf.keras.Sequential()
model.add(layers.Dense(8, activation='relu', input_shape = (1, ))) #input shape as 1
model.add(layers.Dense(3, activation='softmax')) #3 is number of class
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
Regression-based problem:
Loss function - mean_squared_error
optimizer - Adam
output layer - 1
output activation - default (linear, i.e. no activation)
model = tf.keras.Sequential()
model.add(layers.Dense(8, activation='relu', input_shape = (1, ))) #input shape as 1
model.add(layers.Dense(1)) #1 is number of output
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mean_squared_error', metrics=['mae']) #accuracy is not meaningful for regression
Binary problem (0 or 1):
Loss Function - binary_crossentropy
optimizer - Adam
output activation - sigmoid
output layer - 1
model = tf.keras.Sequential()
model.add(layers.Dense(8, activation='relu', input_shape = (1, ))) #input shape as 1
model.add(layers.Dense(1, activation='sigmoid')) #1 is number of output
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])
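For the classification case, the choice between the two cross-entropy losses comes down to how the labels are stored. The NumPy sketch below illustrates this with made-up labels and made-up predicted probabilities (none of these numbers come from the question); it computes the loss by hand rather than calling Keras.

```python
import numpy as np

num_classes = 3

# Integer labels: the format sparse_categorical_crossentropy expects.
y_int = np.array([0, 2, 1, 2])

# One-hot labels: the format categorical_crossentropy expects.
y_onehot = np.eye(num_classes)[y_int]

# Hypothetical softmax outputs for the four samples (rows sum to 1).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
])

# Both label formats describe the same loss: -log(p of the true class).
loss_sparse = -np.log(probs[np.arange(len(y_int)), y_int]).mean()
loss_onehot = -(y_onehot * np.log(probs)).sum(axis=1).mean()

print(y_onehot.shape, loss_sparse, loss_onehot)
```

The two values agree, so the choice is about convenience and memory, not about the model itself.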

How to Structure Three-Dimensional Lag TimeSteps for an LSTM in Keras?

I understand that LSTMs require a three-dimensional dataset in the format N_samples x TimeSteps x Variables. I want to restructure my data from a single timestep for all of my rows into lagged timesteps by hour. The idea is that the LSTM would then batch train from hour to hour (from 310033 rows x 1 timestep x 83 variables to 310033 rows x 60 timesteps x 83 variables).
However, the losses of my model were weird (training loss increased with epochs), and training accuracy decreased when going from the single timestep to the lagged timesteps. This makes me believe I did this transformation wrong. Is this the correct way to restructure the data, or is there a better way to do so?
The data is time series data recorded at 1-second intervals and has already been preprocessed to be within a range of 0-1, one-hot encoded, cleaned, etc.
Current Transformation in Python:
X_train, X_test, y_train, y_test = train_test_split(scaled, target, train_size=.7, shuffle = False)
#reshape input to be 3D [samples, timesteps, features]
#X_train = X_train.reshape((X_train.shape[0], 1, X_train.shape[1])) - Old method for 1 timestep
#X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1])) - Old method for 1 timestep
#Generate Lag time Steps 3D framework for LSTM
#As required for LSTM networks, we must reshape the input data into N_samples x TimeSteps x Variables
hours = len(X_train)/3600
hours = math.floor(hours) #Most 60 min hours available in subset of data
temp = []
# Pull hours into the three dimensional field
for hr in range(hours, len(X_train) + hours):
    temp.append(scaled[hr - hours:hr, 0:scaled.shape[1]])
X_train = np.array(temp) #Export Train Features
hours = len(X_test)/3600
hours = math.floor(hours) #Most 60 min hours available in subset of data
temp = []
# Pull hours into the three dimensional field
for hr in range(hours, len(X_test) + hours):
    temp.append(scaled[hr - hours:hr, 0:scaled.shape[1]])
X_test = np.array(temp) #Export Test Features
Data Shape after Transformation:
Model Injection:
model.add(LSTM(128, return_sequences=True,
input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dropout(0.15)) #15% drop out layer
#model.add(BatchNormalization())
#Layer 2
model.add(LSTM(128, return_sequences=False))
model.add(Dropout(0.15)) #15% drop out layer
#Layer 3 - return a single vector
model.add(Dense(32))
#Output of 2 because we have 2 classes
model.add(Dense(2, activation= 'sigmoid'))
# Define optimiser
opt = tf.keras.optimizers.Adam(learning_rate=1e-5, decay=1e-6)
# Compile model
model.compile(loss='sparse_categorical_crossentropy', # Mean Square Error Loss = 'mse'; Mean Absolute Error = 'mae'; sparse_categorical_crossentropy
optimizer=opt,
metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=epoch, batch_size=batch, validation_data=(X_test, y_test), verbose=2, shuffle=False)
Any input on how to improve performance or fix the Lag Timesteps?
Since you are predicting y from the current and lagged values of the x variables, y_train must start after the first full set of lagged values, i.e. it needs to be y_train[59:]. X_train must also end within the training period, and the last observation of y_train should correspond to the X_train window whose most recent time point is the same as that of y_train. So keep only the first y_train[59:].shape[0] windows, which gives X_train the shape (y_train[59:].shape[0], 60, 83).
To elaborate a bit more, you need to fit:
X(t), X(t-1), X(t-2), ..., X(t-59) ---- > y(t)
X(t+1), X(t), X(t-1),..., X(t-58) ------> y(t+1)
The code you have written, if I am not wrong, is probably fitting the opposite:
X(t), X(t-1), X(t-2), ..., X(t-59) ---- > y(t-59)
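The alignment described above can be sketched with NumPy as follows. This is only an illustration on toy data: `make_windows` is a hypothetical helper, not part of the question's code, and it assumes the rows of X and y are already in time order and that each window is stored oldest-to-newest.

```python
import numpy as np

def make_windows(X, y, n_lags):
    """Build (samples, n_lags, features) windows and matching targets.

    Window i covers rows i .. i+n_lags-1 of X, and its target is the
    y value at the window's last (most recent) row.
    """
    n_samples = X.shape[0] - n_lags + 1
    windows = np.stack([X[i:i + n_lags] for i in range(n_samples)])
    targets = y[n_lags - 1:]
    return windows, targets

# Toy data: 10 time steps, 2 features; y[t] is simply t.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)

Xw, yw = make_windows(X, y, n_lags=3)
print(Xw.shape, yw.shape)      # (8, 3, 2) (8,)
print(Xw[0, -1], yw[0])        # last row of the first window is X[2], target is y[2]
```

With n_lags=60 this corresponds to using y_train[59:]. Applying the helper separately to the training and test portions would keep windows from straddling the split.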

ValueError: expected dense_1_input to have shape (None, 4) but got (78,2)

I don't fundamentally understand the shapes of arrays or how to determine the epochs and batch sizes of training data. My data has 6 columns, column 0 is the independent variable - a string, columns 1-4 are the Deep Neural Network inputs and column 5 is the binary outcome due to the inputs. I have 99 rows of data.
I want to understand how to get rid of this error.
#Importing Datasets
dataset=pd.read_csv('TestDNN.csv')
x = dataset.iloc[:,[1,5]].values # lower bound independent variable to upper bound in a matrix (in this case up to not including column5)
y = dataset.iloc[:,5].values # dependent variable vector
#Splitting data into Training and Test Data
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2, random_state=0)
#Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test=sc.transform(x_test)
# PART2 - Making ANN, deep neural network
#Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Dense
#Initialising ANN
classifier = Sequential()
#Adding the input layer and first hidden layer
classifier.add(Dense(activation= 'relu', input_dim =4, units=2,
kernel_initializer="uniform"))#rectifier activation function
#Adding second hidden layer
classifier.add(Dense(activation= 'relu', units=2,
kernel_initializer="uniform")) #rectifier activation function
#Adding the Output Layer
classifier.add(Dense(activation= 'sigmoid', units=1,
kernel_initializer="uniform"))
#Compiling ANN - stochastic gradient descent
classifier.compile(optimizer='adam', loss='binary_crossentropy',metrics=
['accuracy'])
#Fit ANN to training set
#PART 3 - Making predictions and evaluating the model
#Fitting classifier to the training set
classifier.fit(x_train, y_train, batch_size=32, epochs=5) #original batch is 10 and epoch is 100
The problem is with the definition of x. This line:
x = dataset.iloc[:,[1,5]].values
... tells pandas to take columns 1 and 5 only, so it has shape [78, 2] (and column 5 is the outcome itself). You probably meant to take columns 1 through 4, which a slice from 1 up to but not including 5 gives you:
x = dataset.iloc[:,1:5].values
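The difference between a list of positions and a slice can be checked on a small stand-in array. The sketch below uses NumPy indexing, which follows the same positional rules as `iloc` for these two cases; the all-zeros array is just a placeholder for the 99 x 6 dataset.

```python
import numpy as np

data = np.zeros((99, 6))   # stand-in for 99 rows x 6 columns

picked = data[:, [1, 5]]   # a list selects exactly columns 1 and 5
sliced = data[:, 1:5]      # a slice selects columns 1, 2, 3, 4 (5 is excluded)

print(picked.shape, sliced.shape)   # (99, 2) (99, 4)
```

Only the sliced version has the 4 columns that `input_dim=4` expects.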

How to calculate input_dim for a keras sequential model?

Keras Dense layer needs an input_dim or input_shape to be specified. What value do I put in there?
My input is a matrix of 1,000,000 rows and only 3 columns. My output is 1,600 classes.
What do I put there?
dimensionality of the inputs (1000000, 1600)
2 because it's a 2D matrix
input_dim is the number of features per sample, which in your case is just 3. The equivalent input_shape, which is an actual shape tuple, is (3,).
In your case,
let's assume x and y (the target variable) look as follows after feature engineering:
x.shape
(1000000, 3)
y.shape
(1000000, 1600)
# as first layer in a sequential model:
model = Sequential()
model.add(Dense(32, input_shape=(x.shape[1],))) # Input layer; input_shape must be a tuple
# now the model will take as input arrays of shape (*, 3)
# and output arrays of shape (*, 32)
...
...
model.add(Dense(y.shape[1],activation='softmax')) # Output layer
Here y.shape[1] = 1600 is the number of outputs, which equals the number of classes you have, since you are dealing with classification.
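One way to see why the input dimension is the number of columns and not the number of rows is to write the dense layers as plain matrix products. The NumPy sketch below does that with random weights; the batch of 5 rows is an arbitrary stand-in for the 1,000,000 samples, and no activation is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=(5, 3))          # 5 samples, 3 features each

# A Dense(32) layer on 3 inputs holds a (3, 32) kernel and a (32,) bias.
W1 = rng.normal(size=(3, 32))
b1 = np.zeros(32)
hidden = x @ W1 + b1                 # shape (5, 32)

# The output layer maps 32 hidden values to 1600 class scores.
W2 = rng.normal(size=(32, 1600))
b2 = np.zeros(1600)
scores = hidden @ W2 + b2            # shape (5, 1600)

print(hidden.shape, scores.shape)    # (5, 32) (5, 1600)
```

The number of rows never appears in any weight shape, which is why it is left out of `input_dim` and `input_shape`.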
X = dataset.iloc[:, 3:13]
means that X contains all the rows and columns 3 through 12 (column 13 is excluded), i.e. 10 feature columns.
Keras adds the bias term (the X0 parameter) to each layer by itself, so it is not counted as an input: the input dimension is 10, not 10+1 = 11.
Dense(units=6, input_dim = 10, activation = 'relu', kernel_initializer = 'he_uniform') # units=6 is an arbitrary example size

keras RNN 3d tensor input and 2d tensor output: Error when checking model target

I am trying to use LSTM to model multi-sample time series data. My input data has shape (100, 93, 6) - 100 independent time series (from the same/similar process), 93 time steps, 6 dimensions at each observation. Output shape is (100, 93) - one bool output per time step for each independent time series. (This is a small sample of real data, of course). However, I can't figure out how to construct such a Network in Keras:
from keras.models import Sequential
from keras.layers import LSTM, core, Activation, Dense
import numpy as np
data = np.load('sample.npz')
X = data['x']
y = data['y']
print('X shape: ',X.shape)
print('{} samples, {} time steps, {} observations at each time step, per sample\n'.format(*X.shape))
print('y shape: ',y.shape)
print('{} samples, {} time steps, boolean outcome per observation\n'.format(*y.shape))
print(X[0][2], X[0][55])
print(y[0][2], y[0][92])
X shape: (100, 93, 6) 100 samples, 93 time steps, 6 observations at
each time step, per sample
y shape: (100, 93) 100 samples, 93 time steps, boolean outcome per
observation
[ 1.80000000e+01 1.56000000e+05 2.00000000e+03 1.00000000e+04
3.00000000e+00 5.94000000e+04] [ 0. 0. 0. 0. 0. 0.]
1.0 0.0
model = Sequential()
model.add(LSTM(output_dim=4, input_shape=(93, 6), return_sequences=False))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X, y, verbose=2)
Exception: Error when checking model target: expected dense_2 to have
shape (None, 1) but got array with shape (100, 93)
I believe Keras assumes that I have one output (Y) per timeseries, while I have one output per time step per time series. How do I make it work in Keras?
I was missing the TimeDistributed layer.
This works:
from keras.layers import TimeDistributed
model = Sequential()
model.add(LSTM(output_dim=4, input_shape=(93, 6), return_sequences=True))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam')
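To illustrate what the time-distributed dense layer does to the shapes, here is a NumPy-only sketch. The random array stands in for the LSTM output with return_sequences=True, and the weights are random placeholders. Depending on the Keras version, the target may need a trailing axis so that it matches the (samples, timesteps, 1) output; that reshaping step is an assumption and is not shown in the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the LSTM output: 100 series, 93 steps, 4 units per step.
lstm_out = rng.normal(size=(100, 93, 4))

# One shared Dense(1) kernel and bias, applied at every time step.
W = rng.normal(size=(4, 1))
b = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

per_step = sigmoid(lstm_out @ W + b)     # shape (100, 93, 1)

# Boolean targets of shape (100, 93), with a trailing axis added to match.
y = rng.integers(0, 2, size=(100, 93))
y_3d = y[..., np.newaxis]                # shape (100, 93, 1)

print(per_step.shape, y_3d.shape)
```

With return_sequences=False the LSTM would instead emit a single (100, 4) array, which is why the original model expected a target of shape (None, 1).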