I've been studying machine learning and I've become stuck writing code for multivariate linear regression.
Here's my training set and the code I have at the moment:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
import matplotlib.pyplot as plt
import numpy as np
# Training set
train_x = np.array([[400, 180, 200], [430, 140, 305], [405, 255, 300],
[180, 180, 180], [220, 100, 160], [405, 255, 300],
[500, 350, 440], [1500, 900, 200], [1500, 900, 900],
[1000, 1000, 1000]], dtype=float)
train_y = np.array([4.20, 4.85, 6, 3.50, 2.70, 6.50, 11, 20.5, 39.8, 35.3], dtype=float)
# Create Keras model
model = Sequential()
model.add(Dense(1, input_dim=3))
# Gradient descent algorithm
sgd = SGD(0.00000005)
model.compile(loss='mse', optimizer=sgd)
history = model.fit(train_x, train_y, epochs=20000)
plt.plot(history.history['loss'])
plt.xlabel("No. of Iterations")
plt.ylabel("J(Theta1 Theta0)/Cost")
plt.show()
predict = np.array([[100, 100, 100]])
print(model.predict(predict))
When running this, the cost function does decrease but doesn't seem to converge. The prediction also seems to be quite far off: the predict array has lower numbers than everything in the training set, yet it gets a price that's higher than some within the training set. Also, for some reason I've had to lower my learning rate to a ridiculously small number.
I have a feeling that maybe I'm creating my train_x array wrongly?...
Looking at this code, I can see two problems that might result in bad predictions and the lack of convergence:
Lack of Layers:
A neural network works by optimising weights that are applied to its inputs. With so few weights to update, it has low flexibility and is unable to learn. In this case, there is only one neuron in a single layer. I suggest adding more layers, such as the one below:
model.add(Dense(25))
Low learning rate:
In your example, you used stochastic gradient descent with a learning rate of 0.00000005. I believe this value is too small for convergence, especially for an optimiser such as SGD. I suggest Adam with a learning rate of 0.1.
Putting all of this together, I have the following program:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
import matplotlib.pyplot as plt
import numpy as np
# Training set
train_x = np.array([[400, 180, 200], [430, 140, 305], [405, 255, 300],
[180, 180, 180], [220, 100, 160], [405, 255, 300],
[500, 350, 440], [1500, 900, 200], [1500, 900, 900],
[1000, 1000, 1000]], dtype=float)
train_y = np.array([4.20, 4.85, 6, 3.50, 2.70, 6.50, 11, 20.5, 39.8, 35.3], dtype=float)
# Create Keras model
model = Sequential()
model.add(Dense(1, input_dim=3))
model.add(Dense(25))
model.add(Dense(25))
model.add(Dense(25))
model.add(Dense(1))
# Gradient descent algorithm
adam = Adam(0.1)
model.compile(loss='mse', optimizer=adam)
history = model.fit(train_x, train_y, epochs=1000)
plt.plot(history.history['loss'])
plt.xlabel("No. of Iterations")
plt.ylabel("J(Theta1 Theta0)/Cost")
plt.show()
predict = np.array([[100, 100, 100]])
print(model.predict(predict))
This program allowed for faster convergence (only 1000 epochs) and a lower final loss value compared to the original post.
I have an input TensorFlow ragged tensor structured as [batch, num_images, width, height, channels], and I need to iterate over the num_images dimension to extract some features relevant for downstream applications.
Example code is the following:
from tensorflow.keras.applications.efficientnet import EfficientNetB7
from tensorflow.keras.layers import Input
import tensorflow as tf
eff_net = EfficientNetB7(weights='imagenet', include_top=False)
input_claim = Input(shape=(None, 600, 600, 3), name='input_1', ragged=True)
eff_out = tf.map_fn(fn=eff_net,
                    elems=input_claim,
                    fn_output_signature=tf.float32)
The first Input dimension is set to None as it can differ across data points, and for this reason the input receives instances of tf.RaggedTensor.
This code breaks with the following error: TypeError: Could not build a TypeSpec for KerasTensor(type_spec=RaggedTensorSpec(TensorShape([None, None, 600, 600, 3]), tf.float32, 1, tf.int64), name='input_1', description="created by layer 'input_1'") of unsupported type <class 'keras.engine.keras_tensor.RaggedKerasTensor'>.
I suspect there is a better way to perform this type of preprocessing, though.
Update: num_images is needed because (although not described here) I am performing a subsequent reduce operation over this dimension.
You can use tf.ragged.map_flat_values to achieve the same result.
Create a model like:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def eff_net(x):  # dummy eff_net for testing that returns [batch, dim]
    return tf.random.normal(shape=tf.shape(x)[:2])

input_claim = keras.Input(shape=(None, 600, 600, 3), name='input_1', ragged=True)

class RaggedMapLayer(layers.Layer):
    def call(self, x):
        return tf.ragged.map_flat_values(eff_net, x)

outputs = RaggedMapLayer()(input_claim)
model = keras.Model(inputs=input_claim, outputs=outputs)
Testing:
inputs = tf.RaggedTensor.from_row_splits(
    tf.random.normal(shape=(10, 600, 600, 3)), row_splits=[0, 2, 5, 10])
# shape [3, None, 600, 600, 3]
model(inputs).shape
# [3, None, 600]
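To plug the real backbone from the question in place of the dummy eff_net, one option (an untested sketch; pooling='avg' is assumed to be what you want, since it returns one feature vector per image and so matches the [num_images, dim] shape the dummy produces) is:
from tensorflow.keras.applications.efficientnet import EfficientNetB7

# pooling='avg' collapses each image's feature map to a single vector,
# so map_flat_values yields a ragged tensor of shape [batch, None, features]
eff_net = EfficientNetB7(weights='imagenet', include_top=False, pooling='avg')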
I'm trying to use an SVM as the last layer of a CNN for classification. This is the code I'm trying to implement:
import tensorflow as tf
from tensorflow.keras.losses import categorical_hinge
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def custom_loss_value(y_true, y_pred):
    print(y_true)
    print(y_pred)
    X = y_pred
    print(X)
    Y = y_true
    Predict = []
    Prob = []
    scaler = StandardScaler()
    # X = scaler.fit_transform(X)
    param_grid = {'C': [0.1, 1, 8, 10], 'gamma': [0.001, 0.01, 0.1, 1]}
    SVM = GridSearchCV(SVC(kernel='rbf', probability=True), cv=3, param_grid=param_grid, scoring='auc', verbose=1)
    SVM.fit(X, Y)
    Final_Model = SVM.best_estimator_
    Predict = Final_Model.predict(X)
    Prob = Final_Model.predict_proba(X)
    return categorical_hinge(tf.convert_to_tensor(Y, dtype=tf.float32), tf.convert_to_tensor(Predict, dtype=tf.float32))

sgd = tf.keras.optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss=custom_loss_value, optimizer=sgd, metrics=['accuracy'])
I'm getting the error Cannot convert a symbolic Tensor (dense_2_target_2:0) to a numpy array on the line SVM.fit(X, Y).
I also tried converting y_true and y_pred to NumPy arrays, but I got an error then as well.
To train a neural network with gradient descent, you need the model to be differentiable; that is, you need to be able to take a gradient with respect to every trainable parameter.
Some problems arise in your code:
You can't directly train an SVM inside a Keras loss function. A loss function receives TensorFlow tensors and must use TF ops, and its output is also a TensorFlow tensor; sklearn can work with NumPy arrays or lists, but not with tensors.
It is very hard, and practically not useful, to train an SVM through backpropagation. Something about it can be read here.
Instead, you can train an SVM on top of the features from a pretrained model, in place of the final fully-connected layer, as sketched below.
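A minimal sketch of that approach (assuming an already-trained Keras model named base_model whose penultimate layer produces the features, and NumPy arrays x_train, y_train, x_test, y_test; all of these names are illustrative, not from the original post):
import tensorflow as tf
from sklearn.svm import SVC

# Feature extractor: everything up to the penultimate layer of the
# trained CNN `base_model` (assumed to exist in your code).
feature_extractor = tf.keras.Model(inputs=base_model.input,
                                   outputs=base_model.layers[-2].output)

# Extract fixed features as NumPy arrays -- no gradients flow through the SVM.
train_features = feature_extractor.predict(x_train)
test_features = feature_extractor.predict(x_test)

# Fit an ordinary sklearn SVM on top of those features.
svm = SVC(kernel='rbf', C=1.0, probability=True)
svm.fit(train_features, y_train)
print(svm.score(test_features, y_test))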
I'm building and testing a simple MLP model, but I'm running into an issue with Keras reproducibility. I am trying to set up my neural network so that the prediction outputs won't change between runs.
I have already followed the Keras guide online as well as this post (Reproducible results using Keras with TensorFlow backend). I am running Keras on my local machine with the TensorFlow backend and the following versions:
tensorflow 2.0.0-alpha0,
keras 2.2.4-tf,
numpy 1.16.0
import os
os.environ['PYTHONHASHSEED']=str(0)
import random
random.seed(0)
from numpy.random import seed
seed(1)
import tensorflow as tf
tf.compat.v1.set_random_seed(2)
from keras import backend as K
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)
import numpy as np
from tensorflow.python.keras.layers import Dropout, BatchNormalization
from tensorflow.python.keras.optimizers import Adam
class Machine_Learning_Classifier_Keras(object):
    @classmethod
    def _get_classifier(cls, n_input_features=None, **params):
        KerasClassifier = tf.keras.wrappers.scikit_learn.KerasClassifier
        Dense = tf.keras.layers.Dense
        Sequential = tf.keras.models.Sequential

        sk_params = {"epochs": 200, "batch_size": 128, "shuffle": False}

        def create_model(optimizer='adam', init='he_normal'):
            # create model
            model = Sequential()
            model.add(BatchNormalization())
            model.add(Dropout(0.2))
            model.add(Dense(500, input_dim=4, kernel_initializer=init, activation='relu'))
            model.add(BatchNormalization())
            model.add(Dropout(0.2))
            model.add(Dense(250, kernel_initializer=init, activation='relu'))
            model.add(BatchNormalization())
            model.add(Dropout(0.2))
            model.add(Dense(500, kernel_initializer=init, activation='relu'))
            model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
            # Compile model
            model.compile(loss='binary_crossentropy', optimizer=Adam(lr=3e-3, decay=0.85), metrics=['accuracy'])
            return model

        return KerasClassifier(build_fn=create_model, **sk_params)

if __name__ == "__main__":
    X = np.asarray([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [1.5, 1.6]])
    y = np.asarray([0, 0, 1, 1])

    nn = Machine_Learning_Classifier_Keras._get_classifier()
    nn.fit(X, y, sample_weight=np.asarray([0, 0, 1, 1]))

    values = np.asarray([[0.5, 0.5], [0.6, 0.5], [0.8, 1.0], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])

    probas = nn.predict_proba(values)
    print(probas)
I would expect the predict_proba outputs to stay the same between runs; however, I am getting the following from two successive runs (results will vary):
Run 1:
[[0.9439231 0.05607685]
[0.91351616 0.08648387]
[0.06378722 0.9362128 ]
[0.9439231 0.05607685]
[0.9439231 0.05607685]
[0.9439231 0.05607685]
[0.94392323 0.05607677]
[0.94392323 0.05607677]]
Run 2:
[[0.94391584 0.05608419]
[0.91350436 0.08649567]
[0.06378281 0.9362172 ]
[0.94391584 0.05608419]
[0.94391584 0.05608419]
[0.94391584 0.05608419]
[0.94391584 0.05608416]
[0.94391584 0.05608416]]
I ended up figuring out what the issue is, but I'm not sure how to resolve it -- it has something to do with the first BatchNormalization() layer, which is supposed to standardize the inputs. If you remove that layer, the results are entirely reproducible, but something in the BatchNormalization() implementation leads to non-reproducible behavior.
If you run the mentioned code twice, it will show the behavior you have just described, because the model is retrained each time and training is not guaranteed to reach the same local minimum every time.
However, if you train your model only once, save the weights, and use those weights to predict, you will get the same results every time for the same data, as sketched below.
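A minimal sketch of that train-once, reuse-the-weights workflow with a plain Keras model (the architecture, data, and file name here are illustrative, not taken from the question):
import numpy as np
import tensorflow as tf

X = np.asarray([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [1.5, 1.6]])
y = np.asarray([0, 0, 1, 1])

def build_model():
    # The same architecture must be rebuilt to load the saved weights into.
    m = tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_dim=2),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    m.compile(loss='binary_crossentropy', optimizer='adam')
    return m

model = build_model()
model.fit(X, y, epochs=10, verbose=0)
model.save_weights("mlp_weights.h5")     # train once, persist the weights

model2 = build_model()                   # later, or in a separate run
model2.load_weights("mlp_weights.h5")    # restore instead of retraining
print(model2.predict(X))                 # same output on every run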
I am trying to make a regression model for symmetrical input, hoping to model a function with f(x, y) = f(y, x) = F. However, I found that the trained neural network gives different outputs for f(x, y) and f(y, x).
I am using a dense neural network with multiple layers, trained with Adagrad on the entire training set.
Part of the problem occurs because of the random (non-symmetrical) weight initialization.
But it looks like making the weights symmetrical on each neuron would lose the benefits of using a DNN.
Is it possible to solve this with a DNN, or what is the right way to do it?
Example:
from __future__ import absolute_import, division, print_function
import pathlib
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import tensorflow as tf
from tensorflow.keras import layers
print(tf.__version__)
train = pd.DataFrame([[0, 0], [0, 1], [1, 0], [1, 1]])
labels = pd.DataFrame([[0], [1], [1], [3]])
def build_model4():
    model4 = tf.keras.Sequential([
        layers.Dense(4, activation=tf.nn.elu, input_shape=(2,)),
        layers.Dense(4, activation=tf.nn.elu),
        layers.Dense(4, activation=tf.nn.elu),
        layers.Dense(1, activation=tf.nn.relu)
    ])
    optimizer = tf.keras.optimizers.Adagrad(lr=0.05, epsilon=None, decay=0.0)
    model4.compile(loss='mean_squared_error',
                   optimizer=optimizer,
                   metrics=['mean_absolute_error', 'mean_squared_error'])
    return model4
model4 = build_model4()
model4.summary()
EPOCHS = 500
history = model4.fit(
    train, labels, epochs=EPOCHS, batch_size=4, verbose=0)
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()
plt.plot(history.history['mean_squared_error'], label='train')
test=pd.DataFrame([[1, 2], [2, 1]])
o=model4.predict(test)
print(o)
If your model is inherently asymmetrical, there is a simple way to force symmetry explicitly:
g(x, y) = g(y, x) = 1/2 * (f(x, y) + f(y, x))
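A minimal sketch of this averaging trick applied at prediction time, wrapping the question's trained model4 (the helper name is illustrative):
import numpy as np

def symmetric_predict(model, x, y):
    # g(x, y) = 0.5 * (f(x, y) + f(y, x)) is symmetric by construction,
    # no matter how asymmetrical the underlying network is.
    f_xy, f_yx = model.predict(np.array([[x, y], [y, x]], dtype=float)).ravel()
    return 0.5 * (f_xy + f_yx)

print(symmetric_predict(model4, 1, 2))
print(symmetric_predict(model4, 2, 1))  # same value as the line above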
Suppose I have a tensor of shape [None, 80, 80]. This is a batch of 80x80 images for stochastic gradient descent.
Suppose I choose a minibatch size of 50 (so None will be 50), and I want to factor the None dimension into two dimensions (5, 10), resulting in [?, ?, 80, 80].
How do I achieve this when forming the graph with None value?
You can do it with tf.reshape:
import numpy as np
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 80, 80], name='x')
y = tf.reshape(x, shape=[-1, 10, 80, 80], name='y')
data = np.zeros([50, 80, 80])
with tf.Session() as session:
    result = session.run(y, feed_dict={x: data})
    print(result.shape)
Result output:
(5, 10, 80, 80)
Of course, keep in mind that passing an unsuitable batch size will result in an exception at runtime.
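For reference, the same reshape can be done eagerly in TensorFlow 2, without placeholders or sessions (a small sketch assuming TF 2.x):
import numpy as np
import tensorflow as tf

data = np.zeros([50, 80, 80], dtype=np.float32)
y = tf.reshape(data, [-1, 10, 80, 80])  # -1 lets TensorFlow infer the leading 5
print(y.shape)  # (5, 10, 80, 80)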