I'm new to TensorFlow. I'm trying to train a simple neural network on Newton's second law: given mass and acceleration values, predict the force. The input layer consists of two neurons, one for mass and one for acceleration. The output layer is the force.
The program just gives a warning, prints some data (which I guess are the outputs), and then exits with code 1. I don't know what to try to solve this, because, as I said, I'm new to TensorFlow and there is no error message.
Here is the code:
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
import numpy as np
import pickle
X = pickle.load(open("Newton2_X.pickle", "rb"))
y = pickle.load(open("Newton2_y.pickle", "rb"))
model = Sequential()
# model.add(Flatten())
model.add(Dense(2, activation="relu"))
model.add(Dense(128, activation="relu"))
model.add(Dense(1, activation="softmax"))
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(X, y, epochs=3, validation_split=0.1, batch_size=100)
Here are the pickle files:
https://drive.google.com/drive/folders/1FkKmY4px8oQJkbHYb_Z4y4Lnb1EazkvP?usp=sharing
After this part of the code I have some additional lines that make the network predict a new value, plus some print statements. These lines are never executed. In fact, I've found that the problem must be in the model.fit(...) call, because no lines after it are executed.
Here is the full warning message that I got from the program:
WARNING: Logging before flag parsing goes to stderr.
W0816 07:02:05.292823 17652 deprecation.py:506] From C:\Users\SABA\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
6, 0.2142802901764338, 0.26114980919201514, 0.2451221454091551, 0.19920049739052853, ...
A couple of things to tweak.
Firstly, I don't think the data is the shape that you think it is. You have:
X.shape # (45000, 2, 2, 1)
y is a flat list with 90,000 elements.
Secondly, you are predicting a number (so a regression) but you were trying to use 'sparse_categorical_crossentropy' as a loss function which is for classification problems.
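A related detail, sketched here in plain NumPy rather than Keras: softmax normalises across the units of a layer, so a layer with a single unit always outputs 1.0 whatever its input is. For a regression target, a linear output (no activation) is the usual choice. The softmax helper below is written for illustration only and is not the Keras implementation.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax along the last axis (illustrative helper)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Three samples, one output unit each: each "distribution" has a single entry.
single_unit_logits = np.array([[-3.2], [0.0], [17.5]])
single_unit_probs = softmax(single_unit_logits)

print(single_unit_probs)  # every entry is 1.0, regardless of the logit
```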
I can get your code to run by simply slicing the data down to the shape we need, but obviously it won't train properly, as I haven't paired up the correct Xs and ys. You'll need to sort this out properly in the data.
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
import numpy as np
import pickle
### TODO - sort this out!
X = pickle.load(open("Newton2_X.pickle", "rb"))[:,0,:,0]
y = np.array(pickle.load(open("Newton2_y.pickle", "rb")))[:45000]
####
model = Sequential()
# model.add(Flatten())
model.add(Dense(2, activation="relu"))
model.add(Dense(128, activation="relu"))
# Linear output for regression; softmax over a single unit would always output 1.0.
model.add(Dense(1))
model.compile(optimizer='adam',
              loss='mse')
model.fit(X, y, epochs=3, validation_split=0.1, batch_size=100)
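For reference, here is a minimal sketch of what correctly paired data could look like. I don't know how the pickle files were meant to be laid out, so this generates synthetic data instead; the value ranges, the seed and the variable names are assumptions made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for repeatability
n_samples = 45000

mass = rng.uniform(0.0, 1.0, size=n_samples)          # hypothetical range
acceleration = rng.uniform(0.0, 1.0, size=n_samples)  # hypothetical range

# One row per sample, one column per input feature.
X = np.column_stack([mass, acceleration])
# One target per row of X, in the same order: F = m * a.
y = mass * acceleration

print(X.shape, y.shape)  # (45000, 2) (45000,)
```

Arrays shaped like this, with row i of X matching element i of y, are what model.fit(X, y, ...) expects for a two-input, one-output regression.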
Related
I am trying to solve the XOR problem using the following code:
import numpy as np
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Input, Concatenate
from tensorflow.keras.utils import plot_model
from tensorflow.keras.optimizers import SGD, Adam
# input data
x = np.array([[0,0], [0,1], [1,0], [1,1]], 'float32')
y = np.array([[0], [1], [1], [0]], 'float32')
### Model
model = Sequential()
# add layers (architecture)
model.add(Dense(2, activation = 'relu'))
model.add(Dense(1, activation = 'sigmoid'))
# compile
model.compile(loss = 'mean_squared_error',
optimizer = SGD(learning_rate = 0.1, momentum=0.8),
metrics = ['accuracy'])
# train
model.fit(x, y, epochs = 25000, batch_size = 1)
# evaluate
ev = model.evaluate(x, y)
I already tested:
using different activation functions in the hidden layer (sigmoid and tanh)
using different learning rates and momentum
Also, I am running with a high number of epochs (25000). Still, it only predicts all outputs correctly in a few runs. Most of the time the accuracy is 0.5 or 0.75.
I have read that this is the minimum configuration to solve this problem. However, it also seems that the error surface presents a number of regions with local minima.
My question is:
Should I assume that the model is correct and can learn the problem, although sometimes it gets 'stuck' in a local minimum, OR do I still need to improve my model somehow to solve the XOR more accurately and consistently?
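One way to check the "can the model represent XOR at all" half of the question is to set the weights by hand. The sketch below uses plain NumPy with the same 2-2-1 layout (ReLU hidden layer, sigmoid output). The weight values are chosen by hand for the illustration and are not what training would find. If such weights exist, runs that end at 0.5 or 0.75 are more likely stuck during optimisation than limited by the architecture.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
y = np.array([[0], [1], [1], [0]], dtype="float32")

# Hidden layer: h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])

# h1 - 2*h2 equals XOR exactly; scale and shift it so that the sigmoid
# lands close to 0 or 1.
W2 = np.array([[10.0], [-20.0]])
b2 = np.array([-5.0])

hidden = relu(x @ W1 + b1)
output = sigmoid(hidden @ W2 + b2)
predictions = (output > 0.5).astype("float32")

print(np.array_equal(predictions, y))  # True
```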
I'm working on a project where I have 3 inputs (v, f, n) and 1 output (delta(t)).
I'm trying to test the effect of the inputs on the output and to figure out which input has the most effect in different situations. To do that, I would like to predict new output values for new input values.
I have been testing this system and I got the following data table:
This table contains 1000 rows.
I'm new to this whole Neural Network thing, so I don't know what should be the Activation function, the loss function, etc.
I've been trying to use some Keras models, but I'm getting wrong predictions when calling model.predict() on some input values.
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(16, activation='relu', input_shape=(3,)))
model.add(Dense(16, activation='relu'))
model.add(Dense(1))
model.compile(optimizer=Adam(), loss='mse')
data = np.array(pd.read_excel(r'Data.xlsx'))
x = data[:, :3]
y = data[:, 3]
target = model.fit(x, y, validation_split=0.2, epochs=15000,
batch_size=256)
# check some predictions:
print(model.predict([[0.9, 840370875, 240]]))
I am using optimizer.get_config() to get the final state of my adam optimizer (as in https://stackoverflow.com/a/60077159/607528) however .get_config() is returning the initial state. I assume this means one of the following
.get_config() is supposed to return the initial state
my optimizer is not updating because I've set something up wrong
my optimizer is not updating because tf's adam is broken (highly unlikely)
my optimizer is updating but is being reset somewhere before I call .get_config()
something else?
Of course I originally noticed the issue in a proper project with training and validation sets etc, but here is a really simple snippet that seems to reproduce the issue:
import tensorflow as tf
import numpy as np
x=np.random.rand(100)
y=(x*3).round()
model = tf.keras.models.Sequential([
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(x, y, epochs=500)
model.evaluate(x, y)
model.optimizer.get_config()
If you want to resume your training, you should save the optimizer weights, not the config:
weight_values = optimizer.get_weights()
with open(self.output_path+'optimizer.pkl', 'wb') as f:
    pickle.dump(weight_values, f)
And then load them:
model.fit(dummy_x, dummy_y, epochs=500) # build the optimizer by fitting the model with dummy input, e.g. random tensors with the simplest shape
with open(self.path_to_saved_model+'optimizer.pkl', 'rb') as f:
    weight_values = pickle.load(f)
optimizer.set_weights(weight_values)
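The save/load round trip itself can be tried without TensorFlow. The sketch below uses a made-up list of NumPy arrays as a stand-in for whatever optimizer.get_weights() returns, and a temporary directory instead of self.output_path. Whether get_weights() and set_weights() exist on the optimizer depends on the TensorFlow version installed, so treat those two calls as an assumption of this answer.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical stand-in for the list of arrays from optimizer.get_weights().
weight_values = [np.array(500), np.random.rand(1, 128), np.random.rand(128)]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "optimizer.pkl")

    # Save the list of arrays.
    with open(path, "wb") as f:
        pickle.dump(weight_values, f)

    # Load it back, as you would before calling optimizer.set_weights(...).
    with open(path, "rb") as f:
        restored_values = pickle.load(f)

same = all(np.array_equal(a, b) for a, b in zip(weight_values, restored_values))
print(same)  # True
```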
WARNING:tensorflow:Sequential models without an input_shape passed to the first layer cannot reload their optimizer state. As a result, your model is starting with a freshly initialized optimizer.
While trying to load a saved model, I encountered this warning from TensorFlow.
import tensorflow.keras as keras
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)
model.save('epic_num_reader.model')
new_model = tf.keras.models.load_model('epic_num_reader.model')
predictions = new_model.predict(x_test)
I had the same problem after upgrading to TF 1.14. I fixed it by changing the definition of the first layer from this:
model.add(tf.keras.layers.Flatten())
to this:
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
where 28 is the size of the input map to be flattened (the 28 x 28 MNIST pixels in our case).
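One way to see why the input shape matters, as a rough sketch in plain NumPy rather than a description of what Keras does internally: the shape of the first Dense layer's weights depends on the flattened input size, and Adam keeps two extra arrays (first and second moment estimates) per weight with the same shape. Without a known input shape, none of these shapes can be worked out when the model is loaded.

```python
import numpy as np

input_shape = (28, 28)                      # one MNIST image
flattened_size = int(np.prod(input_shape))  # what Flatten would produce

units = 128  # size of the first Dense layer in the question
kernel_shape = (flattened_size, units)
bias_shape = (units,)

# Adam tracks a first-moment and a second-moment array for every weight,
# each with the same shape as the weight it belongs to.
adam_slot_shapes = {
    "kernel": [kernel_shape, kernel_shape],
    "bias": [bias_shape, bias_shape],
}

print(kernel_shape, bias_shape)  # (784, 128) (128,)
```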
As the warning suggests, your first layer needs the input_shape argument. In your case, this is the Flatten layer.
In the keras Documentation there is an extra section about the sequential API. See here for further information.
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
From TF 1.14 onwards, the first layer needs the input_shape argument, which gives the dimensions of a single image.
Otherwise, you may get a warning when loading the model, and the optimizer state will not be restored.
I'm trying to train a classifier on Google QuickDraw drawings using Keras:
import numpy as np
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
from tensorflow.keras.models import Sequential
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=5, data_format="channels_last", activation="relu", input_shape=(28, 28, 1)))
model.add(MaxPooling2D(data_format="channels_last"))
model.add(Conv2D(filters=16, kernel_size=3, data_format="channels_last", activation="relu"))
model.add(MaxPooling2D(data_format="channels_last"))
model.add(Flatten(data_format="channels_last"))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=64, activation="relu"))
model.add(Dense(units=4, activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
x = np.load("./x.npy")
y = np.load("./y.npy")
model.fit(x=x, y=y, batch_size=100, epochs=40, validation_split=0.2)
The input data is a 4d array with 12000 normalized images (28 x 28 x 1) per class. The output data is an array of one hot encoded vectors.
If I train this model on four classes, it produces convincing results:
(red is training data, blue is validation data)
I know the model is slightly overfitted. However, I want to keep the architecture as simple as possible, so I accepted that.
My problem is that as soon as I add just one arbitrary class, the model starts to overfit extremely:
I tried many different things to prevent it from overfitting such as Batch Normalization, Dropout, Kernel Regularizers, much more training data and different batch sizes, none of which caused any significant improvement.
What could be the reason why my CNN overfits so much?
EDIT: This is the code I used to create x.npy and y.npy:
import numpy as np
from tensorflow.keras.utils import to_categorical
files = ['cat.npy', 'dog.npy', 'apple.npy', 'banana.npy', 'flower.npy']
SAMPLES = 12000
x = np.concatenate([np.load(f'./data/{f}')[:SAMPLES] for f in files]) / 255.0
y = np.concatenate([np.full(SAMPLES, i) for i in range(len(files))])
# (samples, rows, cols, channels)
x = x.reshape(x.shape[0], 28, 28, 1).astype('float32')
y = to_categorical(y)
np.save('./x.npy', x)
np.save('./y.npy', y)
The .npy files come from here.
The problem lies with how the data split is done. Notice that there are 5 classes and you do 0.2 validation split. By default there's no shuffling and in your code you feed the data in a sequential order. What that means:
Training data consists entirely of 4 classes: 'cat.npy', 'dog.npy', 'apple.npy', 'banana.npy'. That's the 0.8 training split.
Test data is 'flower.npy'. That's your 0.2 validation split. The model was never trained on this so it gets terrible accuracy.
Such results are only possible because validation_split=0.2 lines up exactly with the class boundary: with 5 equally sized classes, the last 20% of the data is exactly one class, so you get close to perfect class separation.
Solution
x = np.load("./x.npy")
y = np.load("./y.npy")
# Shuffle the data!
p = np.random.permutation(len(x))
x = x[p]
y = y[p]
model.fit(x=x, y=y, batch_size=100, epochs=40, validation_split=0.2)
If my hypothesis is correct, setting the validation_split to e.g. 0.5 should also get you much better results (though it's not a solution).
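To see the effect without training anything, here is a small NumPy sketch that mimics how the validation set is taken from the end of the arrays. The helper last_fraction is a made-up name for the illustration, and the labels are rebuilt the same way as in the question's data preparation code (before one-hot encoding).

```python
import numpy as np

n_classes = 5
samples_per_class = 12000
labels = np.concatenate([np.full(samples_per_class, i) for i in range(n_classes)])


def last_fraction(values, fraction):
    """Return the trailing `fraction` of `values` (illustrative helper)."""
    n_val = int(round(len(values) * fraction))
    return values[len(values) - n_val:]


# Without shuffling, the last 20% is exactly the fifth class.
unshuffled_val = last_fraction(labels, 0.2)
print(np.unique(unshuffled_val))  # [4]

# After a random permutation, the last 20% contains a mix of classes.
rng = np.random.default_rng(seed=0)  # arbitrary seed
shuffled_val = last_fraction(labels[rng.permutation(len(labels))], 0.2)
print(np.unique(shuffled_val))
```

Remember to apply the same permutation to x and y, as in the solution above, so that images and labels stay paired.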