I am trying to use the Functional API in TensorFlow (https://keras.io/guides/functional_api/) to build a deep learning model. So, this is my model:
first_inputs = Input(shape=(100, ))
first_dense = Dense(1, )(first_inputs)
second_input = Input(shape=(1, ))
merge = concatenate([first_dense, second_input])
output = Dense(1, )(merge)
model = Model(inputs=[first_inputs, second_input], outputs=output)
model.compile(optimizer=ada_grad, loss='binary_crossentropy',
metrics=['accuracy'])
I use train_test_split as you see:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.01, random_state=42)
How can I use model.fit here and tell it which columns of x_train go to first_inputs and which go to second_input? And how can I do the same with model.evaluate on x_test?
You cannot specify that by column name. Multiple inputs should be passed to fit as a list of arrays, e.g.:
X = np.random.randn(1234, 101)
X1, X2 = X[:, :100], X[:, 100]   # first 100 columns for first_inputs, last column for second_input
Y = np.random.randn(1234, 1)
model.fit([X1, X2], Y)
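evaluate and predict follow the same list-of-arrays convention, so with your own train_test_split you slice the columns yourself. A minimal sketch, assuming x is a NumPy array whose first 100 columns feed first_inputs and whose last column feeds second_input:
x_train_1, x_train_2 = x_train[:, :100], x_train[:, 100]
x_test_1, x_test_2 = x_test[:, :100], x_test[:, 100]
model.fit([x_train_1, x_train_2], y_train, epochs=10, batch_size=32)
model.evaluate([x_test_1, x_test_2], y_test)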
I have a machine learning model that I feel is still producing too many false positives. It can largely detect attacks that I produce separately from the training/test set, maybe at an 80% rate, but for me that is not enough. I also tried dropping columns with high correlation. My biggest problem is my understanding of whether to use one-hot encoding or not: I can switch between one-hot and sparse and I don't notice any difference at all on my dataset.
The dataset is like this:
column 1 - column 2 - column 3 - etc., all containing things like packet properties, and then the class at the end: class 1, class 2, or class 3. Any one row can belong to only one class; it can't be two attack types. The model has to distinguish between all attack types and assign each row the single best-matching attack type. This is different from one-hot, where, if I understand it right, a row can belong to multiple attack types. However, I notice that nobody ever seems to use sparse_categorical_crossentropy, even on the iris dataset, which is very similar to mine except that mine has more than 3 classes.
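For reference, this is how I understand the two options in code (a minimal sketch; n_features and the layer sizes are placeholders, 6 is my number of classes):
# Option A: one-hot encoded labels + categorical_crossentropy
# y_onehot has shape (n_samples, 6), e.g. [0, 0, 1, 0, 0, 0] for the third class
model_a = Sequential([Dense(32, activation='relu', input_dim=n_features),
                      Dense(6, activation='softmax')])
model_a.compile(loss='categorical_crossentropy', optimizer='adam')

# Option B: integer labels + sparse_categorical_crossentropy
# y_int has shape (n_samples,), e.g. 2 for the third class -- same model, only the label format and loss change
model_b = Sequential([Dense(32, activation='relu', input_dim=n_features),
                      Dense(6, activation='softmax')])
model_b.compile(loss='sparse_categorical_crossentropy', optimizer='adam')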
I'll paste my code here in case somebody can spot where I am going wrong!
label_encoder = preprocessing.LabelEncoder()
y = ConcatenateAttackList['Label']
encoded_y = label_encoder.fit_transform(y)
y = np_utils.to_categorical(encoded_y)
x = ConcatenateAttackList.drop(['Label', ], axis = 1).astype(float)
sc = MinMaxScaler()
print('x_train, y_train, fitting and transforming.')
x = sc.fit_transform(x)
x,y = oversample.fit_resample(x,y)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42,stratify=y,
shuffle=True)
len(x_train)
len(y_train)
X = pd.DataFrame(x_train)
print('x_train, y_train, fitted and transformed.')
with tf.device("CPU"):
train = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(4*256).batch(256)
validate = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(256)
model = Sequential()
print('Model initialized.')
model.add(Dense(64,input_dim=len(X.columns),activation='relu')) # input layer
model.add(tf.keras.layers.BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(Dense(6, activation='softmax'))
print('Nodes added to layers.')
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics='categorical_accuracy')
print('Compiled.')
callback=tf.keras.callbacks.EarlyStopping(monitor='val_loss', mode='auto',
patience=50, min_delta=0, restore_best_weights=True, verbose=2)
print('EarlyStopping CallBack executed.')
print('Beginning fitting...')
model_hist = model.fit(x_train, y_train, epochs=231, batch_size=256, verbose=1,
callbacks=[callback],validation_data=validate)
print('Fitting completed.')
model.save("sets/mymodel5.h5")
dump(sc, 'sets/scaler_transformTCPDCV5.joblib')
print('Model saved.')
# loss history
plt.plot(model_hist.history['loss'], label="Training Loss")
plt.plot(model_hist.history['val_loss'], label="Validation Loss")
plt.legend()
#------------PREDICTION
tester = pd.read_csv('AttackTestFile.csv', sep=r'\s*,\s*', engine='python')
ColumnsForWindowsCIC = pd.read_csv('ColumnsForWindowsCIC.csv')
tester.columns = ColumnsForWindowsCIC.columns
tester = deleteRedudancy(tester)
x = tester.drop(['Label', ], axis = 1)
fit_new_input = sc.transform(x)
predict_y=model.predict(fit_new_input)
predict_y
classes_y=np.argmax(predict_y,axis=1)
classes_y
predict = label_encoder.inverse_transform(classes_y)
predict
I wanted to fit a simple LSTM model to perform binary classification on multivariate time series data. Since my data is severely imbalanced, I have integrated the class_weight argument (computed with sklearn) into my model. However, I got a pretty high loss value, and it was not decreasing with each epoch. My F1 score was 0.018, which is extremely low as well. I'd appreciate your suggestions!
Sample data:
sequence_length = 10

def generate_data(X, y, sequence_length=10, step=1):
    X_local = []
    y_local = []
    # slide a window of length sequence_length over the series
    for start in range(0, len(X) - sequence_length, step):   # len(X) instead of the global len(data)
        end = start + sequence_length
        X_local.append(X[start:end])
        y_local.append(y[end - 1])   # label of the last step in the window
    return np.array(X_local), np.array(y_local)
X_sequence, y = generate_data(data.loc[:, "V1":"V4"].values, data.Class)
model = keras.Sequential()
model.add(LSTM(100, input_shape = (10, 4)))
model.add(Dropout(0.5))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy"
, metrics=[keras.metrics.binary_accuracy]
, optimizer="adam")
model.summary()
training_size = int(len(X_sequence) * 0.7)
X_train, y_train = X_sequence[:training_size], y[:training_size]
X_test, y_test = X_sequence[training_size:], y[training_size:]
from sklearn.utils import class_weight
class_weights = dict(zip(np.unique(y_train),
                         class_weight.compute_class_weight(class_weight='balanced',
                                                           classes=np.unique(y_train),
                                                           y=y_train)))
model.fit(X_train, y_train, batch_size=64, epochs=50,class_weight=class_weights)
model.evaluate(X_test, y_test)
y_test_prob = model.predict(X_test, verbose=1)
y_test_pred = np.where(y_test_prob > 0.5, 1, 0)
from sklearn.metrics import f1_score
f1_score(y_test, y_test_pred)
Is it possible to use Keras Tuner to tune a NN using a time series split, similar to sklearn.model_selection.TimeSeriesSplit in sklearn?
For example, consider a sample tuner class from https://towardsdatascience.com/hyperparameter-tuning-with-keras-tuner-283474fbfbe
from kerastuner import HyperModel

class SampleModel(HyperModel):
    def __init__(self, input_shape):
        self.input_shape = input_shape

    def build(self, hp):
        model = Sequential()
        model.add(
            layers.Dense(
                units=hp.Int('units', 8, 64, 4, default=8),
                activation=hp.Choice(
                    'dense_activation',
                    values=['relu', 'tanh', 'sigmoid'],
                    default='relu'),
                input_shape=self.input_shape
            )
        )
        model.add(layers.Dense(1))
        model.compile(
            optimizer='rmsprop', loss='mse', metrics=['mse']
        )
        return model
And the tuner:
tuner_rs = RandomSearch(
hypermodel,
objective='mse',
seed=42,
max_trials=10,
executions_per_trial=2)
tuner_rs.search(x_train_scaled, y_train, epochs=10, validation_split=0.2, verbose=0)
So, instead of validation_split=0.2 in the line above, is it possible to do the following?
from sklearn.model_selection import TimeSeriesSplit
#defining a time series split object
tscv = TimeSeriesSplit(n_splits = 5)
#using that in Keras Tuner
tuner_rs.search(x_train, y_train, epochs=10, validation_split=tscv, verbose=0)
I solved it in this way:
First, I instantiated a class that performs a Blocking Time Series Split (BTSS). I found that it can be better to use this kind of split rather than sklearn's TimeSeriesSplit, because the model is never trained on instances containing already-seen data. As you can see from the picture, if the number of splits is 5, BTSS divides your training data into 5 parts, with only the validation data in common across the splits. (Since Stack Overflow doesn't allow me to upload images, I'll post a reference link: https://hub.packtpub.com/cross-validation-strategies-for-time-series-forecasting-tutorial/)
class BlockingTimeSeriesSplit():
    def __init__(self, n_splits):
        self.n_splits = n_splits

    def get_n_splits(self, X, y, groups):
        return self.n_splits

    def split(self, X, y=None, groups=None):
        n_samples = len(X)
        k_fold_size = n_samples // self.n_splits
        indices = np.arange(n_samples)
        margin = 0
        for i in range(self.n_splits):
            start = i * k_fold_size
            stop = start + k_fold_size
            # first 80% of each block for training, the remaining 20% for validation
            mid = int(0.8 * (stop - start)) + start
            yield indices[start: mid], indices[mid + margin: stop]
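To sanity-check the splitter, you can iterate it over a dummy array and print the index ranges of each block (the array here is just an illustration):
import numpy as np

dummy_X = np.arange(100)                      # 100 fake samples
btss = BlockingTimeSeriesSplit(n_splits=5)
for train_idx, val_idx in btss.split(dummy_X):
    # each block uses its first 80% of indices for training and the last 20% for validation
    print(train_idx[0], train_idx[-1], val_idx[0], val_idx[-1])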
Then you will proceed by creating your own model:
def build_model(hp):
    pass
Finally, you can create your CVTuner as a class that calls back into BlockingTimeSeriesSplit.
class CVTuner(kt.engine.tuner.Tuner):
    def run_trial(self, trial, x, y, *args, **kwargs):
        cv = BlockingTimeSeriesSplit(n_splits=5)
        val_accuracy_list = []
        batch_size = trial.hyperparameters.Int('batch_size', 8, 64, step=8)   # start at 8 so batch_size is never 0
        epochs = trial.hyperparameters.Int('epochs', 10, 100, step=10)
        for train_indices, test_indices in cv.split(x):
            x_train, x_test = x[train_indices], x[test_indices]
            y_train, y_test = y[train_indices], y[test_indices]
            model = self.hypermodel.build(trial.hyperparameters)
            model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs)
            val_loss, val_accuracy, val_auc = model.evaluate(x_test, y_test)
            val_accuracy_list.append(val_accuracy)
        self.oracle.update_trial(trial.trial_id, {'val_accuracy': np.mean(val_accuracy_list)})
        self.save_model(trial.trial_id, model)
tuner = CVTuner(oracle=kt.oracles.BayesianOptimization(objective='val_accuracy', max_trials=1), hypermodel=build_model)  # the build function defined above
stop_early = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=10)
tuner.search(X, Y, callbacks=[stop_early])
best_model = tuner.get_best_models()[0]
best_model.summary()
best_model.evaluate(x_out_of_sample, y_out_of_sample)
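If you also want to inspect the winning hyperparameter values, not just the rebuilt model, Keras Tuner exposes them through get_best_hyperparameters (the names queried below are the ones defined in run_trial above):
best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hp.values)                                    # dict of all tuned values
print(best_hp.get('batch_size'), best_hp.get('epochs'))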
I followed the guide found here (regression):
https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/
using this dataset:
https://drive.google.com/file/d/1mVmGNx6cbfvRHC_DvF12ZL3wGLSHD9f_/view
and ended up with this code:
data = pd.read_csv(r'path')
X = data.iloc[:, 0:4].values
y = data.iloc[:, 4].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(100, activation='relu')(input_layer)
dense_layer_2 = Dense(50, activation='relu')(dense_layer_1)
dense_layer_3 = Dense(25, activation='relu')(dense_layer_2)
output = Dense(1)(dense_layer_3)
model = Model(inputs=input_layer, outputs=output)
model.compile(loss="mean_squared_error" , optimizer="adam", metrics=["mean_squared_error"])
history = model.fit(X_train, y_train, batch_size=2, epochs=100, verbose=1, validation_split=0.2)
from sklearn.metrics import mean_squared_error
from math import sqrt
pred_train = model.predict(X_train)
print(np.sqrt(mean_squared_error(y_train,pred_train)))
pred = model.predict(X_test)
print(np.sqrt(mean_squared_error(y_test,pred)))
Everything works and the model gets trained, but how do I actually use it? I want to input 4 values and get the prediction in return. So, for example, take the array [9, 4554, 1950, 0.634] and then get the predicted value. No matter what I do, the model won't accept the data I am using.
Thanks for the help!
The main problem you are facing, as I understand it, is dimensionality. You pass [9, ..., 0.634], which has shape (4,), i.e. it is 1D, while X_test and X_train need to be 2D as per the documentation, so you have to convert the 1D array to 2D.
How to convert:
import numpy as np
X_test = [9, ..., 0.634]        # your 4 input values
X_test = np.array(X_test)
X_test = X_test.reshape(1, 4)   # one sample with 4 features -> shape (1, 4)
model.predict(X_test)
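One more thing to watch out for: since the training data went through a StandardScaler, the new sample presumably needs the same transform before calling predict, roughly like this:
import numpy as np
new_sample = np.array([[9, 4554, 1950, 0.634]])    # shape (1, 4): one sample, four features
new_sample_scaled = sc.transform(new_sample)       # sc is the StandardScaler fitted on X_train
model.predict(new_sample_scaled)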
I am building a model to classify sequences. First I built the model with the Keras API. As we know, the Keras API wraps TensorFlow functions, but when I converted the Keras code to the TensorFlow API, I found that the results of the two frameworks are different. Below is the key code.
TensorFlow code:
x = tf.placeholder(tf.int32, shape=[None, time_steps], name='x_input')
y = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_label')
# define the network structure
def rnn_model(x):
    x = tf.one_hot(x, api_vob_size)
    rnn_cell_fw = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
    rnn_cell_bw = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
    # feed the input to the RNN to get outputs and states; outputs have shape [batch_size, time_steps, rnn_size] per direction
    outputs, states = tf.nn.bidirectional_dynamic_rnn(rnn_cell_fw, rnn_cell_bw, x, dtype=tf.float32)
    # take the output of the last time step, shape [batch_size, 2*rnn_size]
    outputs1 = tf.concat(outputs, 2)
    output = tf.transpose(outputs1, [1, 0, 2])[-1]
    # fully connected layer, final output shape [batch_size, num_classes]
    fc_w = tf.Variable(tf.random_normal([2*rnn_size, num_classes]))
    fc_b = tf.Variable(tf.random_normal([num_classes]))
    return tf.matmul(output, fc_w) + fc_b

# build the network
logits = rnn_model(x)
prediction = tf.nn.softmax(logits)
# define the loss function and optimizer
loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits, name='cross_entropy'))
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = optimizer.minimize(loss_op, name='optimizer_min')
# Keras API
model = Sequential()
model.add(Bidirectional(LSTM(units=150), merge_mode='concat'))
model.add(Dense(9, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=64)
So why do the two code blocks produce different results? Thank you for your answers!