How do I discover the values for variables of an equation with keras/tensorflow? - tensorflow

I have an equation that describes a curve in two dimensions. The equation has 5 variables. How can I find their values for a set of data with keras/tensorflow? Is it possible? Does anyone know of a tutorial for something similar?
I generated some data to train the network, in the following format:
sample => [150, 66, 2]: 150 sets of 66*2 values, something like "time" x "acceleration"
targets => [150, 5]: 150 sets of 5 variable values
Note: I know the range of each variable. I also know that 150 sets of data are too few, but once the code works I need to train a new network with experimental data, and that is limited too. Visually, the curve is simple: it starts with a descending linear part, and at the end it drops "like an exponential".
My code is as follows:
def build_model():
    model = models.Sequential()
    model.add(layers.Dense(512, activation='relu', input_shape=(66*2,)))
    model.add(layers.Dense(5, activation='softmax'))
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['mae'])
    return model
def smooth_curve(points, factor=0.9):
    [...]
    return smoothed_points
#load the generated data
train_data = np.load('samples00.npy')
test_data = np.load('samples00.npy')
train_targets = np.load('labels00.npy')
test_targets = np.load('labels00.npy')
#normalizing the data
mean = train_data.mean()
train_data -= mean
std = train_data.std()
train_data /= std
test_data -= mean
test_data /= std
#k-fold validation:
k = 3
num_val_samples = len(train_data)//k
num_epochs = 100
all_mae_histories = []
for i in range(k):
    val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
    val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]
    partial_train_data = np.concatenate(
        [train_data[:i * num_val_samples],
         train_data[(i + 1) * num_val_samples:]],
        axis=0)
    partial_train_targets = np.concatenate(
        [train_targets[:i * num_val_samples],
         train_targets[(i + 1) * num_val_samples:]],
        axis=0)
    model = build_model()
    #reshape the data to get the format (100, 66*2)
    partial_train_data = partial_train_data.reshape(100, 66 * 2)
    val_data = val_data.reshape(50, 66 * 2)
    history = model.fit(partial_train_data,
                        partial_train_targets,
                        validation_data = (val_data, val_targets),
                        epochs = num_epochs,
                        batch_size = 1,
                        verbose = 1)
    mae_history = history.history['val_mean_absolute_error']
    all_mae_histories.append(mae_history)
average_mae_history = [
np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)]
smooth_mae_history = smooth_curve(average_mae_history[10:])
plt.plot(range(1, len(smooth_mae_history) + 1), smooth_mae_history)
plt.xlabel('Epochs')
plt.ylabel('Validation MAE')
plt.show()
Obviously, I need to get the best accuracy possible, but as it is I am getting a "mean absolute error (MAE)" of around 96%, which is unacceptable.

I see some basic problems with this approach. The final layer of your network is a softmax, so it outputs 5 values that sum to 1 and behave like a probability distribution. What you actually want to predict are real numbers (floating-point values, up to some fixed precision).
If you know the range, using a sigmoid on the final layer and rescaling its output to match the range (just multiply by the max value) would probably help. By default, a sigmoid gives you 5 numbers between 0 and 1.
You should also replace the cross-entropy loss with a loss like RMS error (for example mean squared error, 'mse' in Keras), so that the network is trained to predict the numbers themselves. You could also use 1D convolutions instead of fully connected layers.
There has been some related work here: https://julialang.org/blog/2017/10/gsoc-NeuralNetDiffEq which tries to solve DEs and might be relevant to your work.
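The rescaling idea in this answer can be sketched in plain Python. This is only an illustration under stated assumptions, not the answerer's code: the bounds in `lows` and `highs` are made-up placeholders for the known variable ranges, and `rescale_sigmoid` is a hypothetical helper name. It maps raw outputs into each range with `low + (high - low) * sigmoid(z)`, which reduces to "multiply by the max value" when the lower bound is 0, and scores predictions with a mean squared error.

```python
import math

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def rescale_sigmoid(raw_outputs, lows, highs):
    """Map raw network outputs into the known [low, high] range of each variable."""
    return [lo + (hi - lo) * sigmoid(z)
            for z, lo, hi in zip(raw_outputs, lows, highs)]

def mean_squared_error(predicted, target):
    """Regression loss: average squared difference between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Hypothetical ranges for the 5 variables (assumed values, not from the question).
lows = [0.0, 0.0, 1.0, -5.0, 10.0]
highs = [1.0, 50.0, 2.0, 5.0, 20.0]

# Stand-in for the raw (pre-activation) outputs of the last layer.
raw = [0.0, 2.0, -2.0, 0.0, 10.0]
params = rescale_sigmoid(raw, lows, highs)
print(params)
```

In Keras terms this would roughly correspond to ending the model with `layers.Dense(5, activation='sigmoid')`, compiling with `loss='mse'`, and scaling the targets into [0, 1] before training.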

Related

setting bias for multiclass classification python tensorflow keras

The attached model shows how to set the bias for an unbalanced binary classification problem: initial_bias = np.log([pos/neg]). Is there a way to set the bias for multi-class classification with unbalanced data, say 5 classes with the distribution (0.4, 0.3, 0.2, 0.08 and 0.02)?
2) Also, how do I calculate and use class weights in such a case?
update 1
I found a way to apply weights, still not sure how to use bias
#####adding weights 20 Feb
weight_for_0 = ( 1/ 370)*(370+ 977+ 795)/3
weight_for_1 = ( 1/ 977)*(370+ 977+ 795)/3
weight_for_2 = (1 / 795)*(370+ 977+ 795)/3
#array([0, 1, 2]), array([370, 977, 795])
class_weights_dict = {0: weight_for_0, 1: weight_for_1, 2:weight_for_2}
class_weights_dict
Dcnn.fit(train_dataset,
         epochs=NB_EPOCHS,
         callbacks=[MyCustomCallback()], verbose=2,
         validation_data=test_dataset, class_weight=class_weights_dict)
Considering that you're using 'softmax':
softmax = exp(neurons) / sum(exp(neurons))
And that you want the results of the classes to be:
frequency = [0.4 , 0.3 , 0.2 , 0.08 , 0.02]
Biases should be given by the equation (elementwise):
frequency = exp(biases) / sum(exp(biases))
This forms a system of equations:
f1 = e^b1 / (e^b1 + e^b2 + ... + e^b5)
f2 = e^b2 / (e^b1 + e^b2 + ... + e^b5)
...
f5 = e^b5 / (e^b1 + e^b2 + ... + e^b5)
If you can solve this system of equations, you get the biases you want.
I used Excel and trial and error to determine that, for the frequencies you wanted, your biases should be respectively:
[1.1 , 0.81 , 0.4 , -0.51 , -1.9]
I don't really know how to solve that system easily, but you can keep experimenting with Excel or another tool until you reach the solution.
Adding the biases to the layer - method 1.
Use a name when defining the layer, like:
self.last_dense = layers.Dense(units=3, activation="softmax", name='last_layer')
You may need to build the model first, so:
dummy_predictions = model.predict(np.zeros((1,) + input_shape))
Then you get the weights:
weights_and_biases = model.get_layer('last_layer').get_weights()
w, b = weights_and_biases
new_biases = np.array([-0.45752, 0.51344, 0.30730])
model.get_layer('last_layer').set_weights([w, new_biases])
Method 2
def bias_init(bias_shape):
    return K.variable([-0.45752, 0.51344, 0.30730])
self.last_dense = layers.Dense(units=3, activation="softmax", bias_initializer=bias_init)
Just in addition to @Daniel Möller's answer, to solve the system of equations
f1 = e^b1 / (e^b1 + e^b2 + ... + e^b5)
...
f5 = e^b5 / (e^b1 + e^b2 + ... + e^b5)
You don't need Excel or anything. Just compute bi = ln(fi).
To see why, given fi = e^bi / (sum of e^bj), note that fi/fj = e^(bi-bj). Suppose the lowest frequency is fk. You can set bk = 0 and then compute every other class bias as bi = bk + ln(fi/fk).
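The closed-form claim can be checked numerically with a few lines of standard-library Python. This is a minimal sketch: the `softmax` helper is written here for illustration and is not part of any answer. It takes the frequencies from the question, sets each bias to the log of its frequency, and confirms that softmax recovers the frequencies, also after shifting all biases so that the smallest one is 0.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

frequency = [0.4, 0.3, 0.2, 0.08, 0.02]

# Closed-form solution: bias_i = ln(frequency_i).
biases = [math.log(f) for f in frequency]
recovered = softmax(biases)

# Softmax is invariant to adding a constant to every bias, so the solution
# can be shifted so that the rarest class has bias 0.
shifted = [b - min(biases) for b in biases]
recovered_shifted = softmax(shifted)

print(biases)
print(recovered)
```

Because of that shift invariance the system has infinitely many solutions, which is why the trial-and-error values quoted earlier differ from `ln(fi)` by a roughly constant offset.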
A complete answer is here:
### To solve that set of nonlinear equations, use scipy fsolve
from scipy.optimize import fsolve
from math import exp
# define the frequency of different classes
f=(0.4, 0.3, 0.2, 0.08, 0.02)
# define the equation
def eqn(x, frequency):
    sum_exp = sum([exp(x_i) for x_i in x])
    return [exp(x[i])/sum_exp - frequency[i] for i in range(len(frequency))]
# calculate bias init
bias_init = fsolve(func=eqn,
                   x0=[0]*len(f),
                   args=(f,),  # pass the frequencies through to eqn
                   ).tolist()
bias_init
To put all things together
def init_imbalanced_class_weight_bias(df:pd.DataFrame, label:str):
    """To handle imbalanced classification, provide initial bias list and class weight dictionary to 2 places in a tf classifier
    1) In the last layer of classifier: tf.keras.layers.Dense(..., bias_initializer = bias_init)
    2) model.fit(train_ds, #x=dict(X_train), y=y_train,
                 batch_size=batch_size,
                 validation_data= valid_ds, #(dict(X_test), y_test),
                 epochs=epochs,
                 callbacks=callbacks,
                 class_weight=class_weight,
                 )
    Args:
        df:pd.DataFrame=train_df
        label:str
    Returns:
        class_weight:dict, e.g. {0: 1.6282051282051282, 1: 0.7604790419161677, 2: 0.9338235294117647}
        bias_init:list e.g. [0.3222079660508266, 0.1168690393701237, -0.43907701967633633]
    Examples:
        class_weight, bias_init = init_imbalanced_class_weight_bias(df=train_df, label=label)
    References:
        1. https://www.tensorflow.org/tutorials/structured_data/imbalanced_data
        2. https://stackoverflow.com/questions/60307239/setting-bias-for-multiclass-classification-python-tensorflow-keras#new-answer
    """
    from scipy.optimize import fsolve
    from math import exp
    # to deal with imbalance classification, calculate class_weight
    d = dict(df[label].value_counts())
    m = np.mean(list(d.values()))
    class_weight = {k:m/v for (k,v) in d.items()} #e.g. {0: 1.6282051282051282, 1: 0.7604790419161677, 2: 0.9338235294117647}
    # define classes frequency list
    frequency = list(list(d.values())/sum(d.values()))
    # define equations to solve initial bias
    def eqn(x, frequency=frequency):
        sum_exp = sum([exp(x_i) for x_i in x])
        return [exp(x[i])/sum_exp - frequency[i] for i in range(len(frequency))]
    # calculate init bias
    bias_init = fsolve(func=eqn,
                       x0=[0]*len(frequency),
                       ).tolist()
    return class_weight, bias_init
class_weight, bias_init = init_imbalanced_class_weight_bias(df=train_df, label=label)
I will post a colab notebook if anyone is interested.
In case your tf classifier complains with ValueError: ('Could not interpret initializer identifier:', wrap bias_init in tf.keras.initializers.Constant():
def init_imbalanced_class_weight_bias(...):
    ...
    return class_weight, tf.keras.initializers.Constant(bias_init)

How to display the convolution filters used on a CNN with Tensorflow?

I would like to produce figures similar to this one:
To do that with Tensorflow, I load my model and then use this code to select the variable holding the filters of one layer:
# search for the name of the specific layer with the filters I want to display
for v in tf.trainable_variables():
    print(v.name)
# store the filters into a variable
var = [v for v in tf.trainable_variables() if v.name == "model/center/kernel:0"][0]
By calling var.eval() I can get var as a numpy array.
This numpy array has the shape (3, 3, 512, 512), which corresponds to the kernel size (3x3) and the number of filters (512).
My problem is the following: how can I extract one filter from this (3, 3, 512, 512) array to display it? Once I understand how to do that, I can work out how to display all 512 filters.
Since you are using Tensorflow, you might be using tf.keras.Sequential for building the CNN Model, and model.summary() gives the names of all the Layers, along with Shapes, as shown below:
Once you have the Layer Name, you can Visualize the Convolutional Filters of that Layer of CNN as shown in the code below:
#-------------------------------------------------
#Utility function for displaying filters as images
#-------------------------------------------------
def deprocess_image(x):
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1
    x += 0.5
    x = np.clip(x, 0, 1)
    x *= 255
    x = np.clip(x, 0, 255).astype('uint8')
    return x
#---------------------------------------------------------------------------------------------------
#Utility function for generating patterns for given layer starting from empty input image and then
#applying Stochastic Gradient Ascent for maximizing the response of particular filter in given layer
#---------------------------------------------------------------------------------------------------
def generate_pattern(layer_name, filter_index, size=150):
    layer_output = model.get_layer(layer_name).output
    loss = K.mean(layer_output[:, :, :, filter_index])
    grads = K.gradients(loss, model.input)[0]
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
    iterate = K.function([model.input], [loss, grads])
    input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.
    step = 1.
    for i in range(80):
        loss_value, grads_value = iterate([input_img_data])
        input_img_data += grads_value * step
    img = input_img_data[0]
    return deprocess_image(img)
#------------------------------------------------------------------------------------------
#Generating convolution layer filters for intermediate layers using above utility functions
#------------------------------------------------------------------------------------------
layer_name = 'conv2d_4'
size = 299
margin = 5
results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))
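for i in range(8):
    for j in range(8):
        filter_img = generate_pattern(layer_name, i + (j * 8), size=size)
        horizontal_start = i * size + i * margin
        horizontal_end = horizontal_start + size
        vertical_start = j * size + j * margin
        vertical_end = vertical_start + size
        results[horizontal_start: horizontal_end, vertical_start: vertical_end, :] = filter_img
plt.figure(figsize=(20, 20))
# results holds 0-255 pixel values stored as floats, so cast before drawing
plt.imshow(results.astype('uint8'))
# savefig expects a file name, not the image array; 'filters.png' is just an example name
plt.savefig('filters.png')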
The above code visualizes only the first 64 filters of a layer. You can change it accordingly.
For more information, you can refer to this article.
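For the literal question of pulling one filter out of the (3, 3, 512, 512) array, plain NumPy indexing is enough. The sketch below uses a random array as a stand-in for the result of `var.eval()`, and the helper names `get_filter`, `get_filter_slice` and `to_uint8` are hypothetical. It relies on TensorFlow storing 2D convolution kernels as [height, width, in_channels, out_channels], so the last axis selects the filter and the third axis selects which input channel that filter is looking at.

```python
import numpy as np

# Stand-in for var.eval(): a kernel with the same shape as in the question.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 512, 512))

def get_filter(kernel, out_index):
    """All weights of one filter: shape (3, 3, in_channels)."""
    return kernel[:, :, :, out_index]

def get_filter_slice(kernel, in_index, out_index):
    """One displayable 3x3 patch: a single filter applied to a single input channel."""
    return kernel[:, :, in_index, out_index]

def to_uint8(patch):
    """Rescale a patch to 0..255 so it can be shown as a grayscale image."""
    lo, hi = patch.min(), patch.max()
    scaled = (patch - lo) / (hi - lo + 1e-12)
    return (scaled * 255).astype(np.uint8)

one_filter = get_filter(kernel, out_index=0)
patch = to_uint8(get_filter_slice(kernel, in_index=0, out_index=0))
print(one_filter.shape, patch.shape)
```

A patch like this can be passed to `plt.imshow(patch, cmap='gray')`. With 512 input channels, each filter is a stack of 512 such 3x3 patches, which is one reason the gradient-ascent visualization above is often more informative for deep layers.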

How to implement Gaussian Mixture for VAE?

I feel like I don't really know what I'm doing so I will describe what I think I'm doing and what I want to do and where that fails.
Given a normal variational autoencoder:
...
net = tf.layers.dense(net, units=code_size * 2, activation=None)
mean = net[:, :code_size]
std = net[:, code_size:]
posterior = tfd.MultivariateNormalDiagWithSoftplusScale(mean, std)
net = posterior.sample()
net = tf.layers.dense(net, units=input_size, ...)
...
What I think I'm doing: Let the neural network find a "mean" and "std" value and use it to create a Normal distribution (Gaussian).
Sample from that distribution and use that for the decoder.
In other words: learn a Gaussian distribution of the encoding
Now I would like to do the same for a mixture of Gaussians.
...
net = tf.layers.dense(net, units=code_size * 2 * code_size, activation=None)
means, stds = tf.split(net, 2, axis=-1)
means = tf.split(means, code_size, axis=-1)
stds = tf.split(stds, code_size, axis=-1)
components = [tfd.MultivariateNormalDiagWithSoftplusScale(means[i], stds[i]) for i in range(code_size)]
probs = [1.0 / code_size] * code_size
gauss_mix = tfd.Mixture(cat=tfd.Categorical(probs=probs), components=components)
net = gauss_mix.sample()
net = tf.layers.dense(net, units=input_size, ...)
...
That seemed relatively straightforward to me, except that it fails with the following error:
Shapes () and (?,) are not compatible
This seems to come from probs, which doesn't have the batch dimension (I didn't think it would need one).
I thought that probs defines the probabilities of the components.
If I define probs with a batch dimension as well, I get the following cryptic error, and I don't know what it means:
Dimension -1796453376 must be >= 0
Do I generally misunderstand some concepts?
Or what do I need to do differently?

How to set weights in multi-class classification in xgboost for imbalanced data?

I know that you can set scale_pos_weight for an imbalanced dataset. However, how do you deal with a multi-class classification problem on an imbalanced dataset? I have gone through https://datascience.stackexchange.com/questions/16342/unbalanced-multiclass-data-with-xgboost/18823 but don't quite understand how to set the weight parameter in DMatrix.
Can anyone please explain in detail?
For imbalanced dataset, I used the "weights" parameter in Xgboost where weights is an array of weight assigned according to the class the data belongs to.
def CreateBalancedSampleWeights(y_train, largest_class_weight_coef):
    classes = np.unique(y_train, axis = 0)
    classes.sort()
    class_samples = np.bincount(y_train)
    total_samples = class_samples.sum()
    n_classes = len(class_samples)
    weights = total_samples / (n_classes * class_samples * 1.0)
    class_weight_dict = {key : value for (key, value) in zip(classes, weights)}
    class_weight_dict[classes[1]] = class_weight_dict[classes[1]] * largest_class_weight_coef
    sample_weights = [class_weight_dict[y] for y in y_train]
    return sample_weights
Just pass the target column and the occurrence rate of the most frequent class (if the most frequent class has 75 out of 100 samples, then it is 0.75):
largest_class_weight_coef = max(df_copy['Category'].value_counts().values)/df.shape[0]
#pass y_train as numpy array
weight = CreateBalancedSampleWeights(y_train, largest_class_weight_coef)
#And then use it like this. XGBClassifier takes per-sample weights in fit(), not in the constructor;
#X_train is assumed to be the feature matrix that matches y_train
xg = XGBClassifier(n_estimators=1000, max_depth=20)
xg.fit(X_train, y_train, sample_weight=weight)
That's it :)
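The core of the weighting scheme, without the extra `largest_class_weight_coef` adjustment, can be sketched with the standard library alone. This is an illustrative simplification rather than the answerer's function: `balanced_sample_weights` is a hypothetical name and the labels are made up. Each class gets the weight `total / (n_classes * class_count)`, so that every class contributes the same total weight.

```python
from collections import Counter

def balanced_sample_weights(labels):
    """Return (per-sample weights, per-class weights) that balance class totals."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    class_weight = {c: total / (n_classes * n) for c, n in counts.items()}
    sample_weight = [class_weight[y] for y in labels]
    return sample_weight, class_weight

# Made-up, imbalanced labels: 6 samples of class 0, 3 of class 1, 1 of class 2.
labels = [0] * 6 + [1] * 3 + [2] * 1
sample_weight, class_weight = balanced_sample_weights(labels)

# Total weight carried by each class after reweighting.
per_class_total = {
    c: sum(w for w, y in zip(sample_weight, labels) if y == c)
    for c in class_weight
}
print(class_weight)
print(per_class_total)
```

The resulting list has one entry per training row, which is the shape expected both by the `weight` argument of `xgboost.DMatrix` and by the `sample_weight` argument of `XGBClassifier.fit`.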

Visualizing output of convolutional layer in tensorflow

I'm trying to visualize the output of a convolutional layer in tensorflow using the function tf.image_summary. I'm already using it successfully in other instances (e. g. visualizing the input image), but have some difficulties reshaping the output here correctly. I have the following conv layer:
img_size = 256
x_image = tf.reshape(x, [-1,img_size, img_size,1], "sketch_image")
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
So the output of h_conv1 would have the shape [-1, img_size, img_size, 32]. Just using tf.image_summary("first_conv", tf.reshape(h_conv1, [-1, img_size, img_size, 1])) doesn't account for the 32 different kernels, so I'm basically slicing through different feature maps here.
How can I reshape them correctly? Or is there another helper function I could use for including this output in the summary?
I don't know of a helper function but if you want to see all the filters you can pack them into one image with some fancy uses of tf.transpose.
So if you have a tensor that's images x ix x iy x channels
>>> V = tf.Variable()
>>> print V.get_shape()
TensorShape([Dimension(-1), Dimension(256), Dimension(256), Dimension(32)])
So in this example ix = 256, iy=256, channels=32
first slice off 1 image, and remove the image dimension
V = tf.slice(V,(0,0,0,0),(1,-1,-1,-1)) #V[0,...]
V = tf.reshape(V,(iy,ix,channels))
Next add a couple of pixels of zero padding around the image
ix += 4
iy += 4
V = tf.image.resize_image_with_crop_or_pad(V, iy, ix)
Then reshape so that instead of 32 channels you have 4x8 channels; let's call them cy=4 and cx=8.
V = tf.reshape(V,(iy,ix,cy,cx))
Now the tricky part. tf seems to return results in C-order, numpy's default.
The current order, if flattened, would list all the channels for the first pixel (iterating over cx and cy) before listing the channels of the second pixel (incrementing ix), going across the rows of pixels (ix) before incrementing to the next row (iy).
We want the order that would lay out the images in a grid.
So you go across a row of an image (ix) before stepping along the row of channels (cx); when you hit the end of the row of channels you step to the next row in the image (iy), and when you run out of rows in the image you increment to the next row of channels (cy). So:
V = tf.transpose(V,(2,0,3,1)) #cy,iy,cx,ix
Personally I prefer np.einsum for fancy transposes, for readability, but it's not in tf yet.
newtensor = np.einsum('yxYX->YyXx',oldtensor)
anyway, now that the pixels are in the right order, we can safely flatten it into a 2d tensor:
# image_summary needs 4d input
V = tf.reshape(V,(1,cy*iy,cx*ix,1))
try tf.image_summary on that, you should get a grid of little images.
Below is an image of what one gets after following all the steps here.
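The reshape, transpose, reshape sequence described in this answer can be checked on a small array in NumPy. This is a sketch with made-up sizes (3x2 feature maps, 32 channels) that leaves out the zero-padding step; `tile_channels` is a hypothetical helper name. It confirms that channel `c` ends up as the tile in grid row `c // cx` and grid column `c % cx`, and that the transpose matches the `einsum` spelling quoted in the answer.

```python
import numpy as np

iy, ix = 3, 2          # rows and columns of each feature map (made-up sizes)
cy, cx = 4, 8          # grid layout: 32 channels = 4 rows x 8 columns of tiles
channels = cy * cx

# Fake feature maps with distinct values so every element can be tracked.
feature_maps = np.arange(iy * ix * channels, dtype=np.float32).reshape(iy, ix, channels)

def tile_channels(v, cy, cx):
    """Lay out the channels of an (iy, ix, cy*cx) array as a (cy*iy, cx*ix) grid."""
    iy, ix, _ = v.shape
    v = v.reshape(iy, ix, cy, cx)
    v = v.transpose(2, 0, 3, 1)        # -> cy, iy, cx, ix
    return v.reshape(cy * iy, cx * ix)

grid = tile_channels(feature_maps, cy, cx)

# Same permutation written with einsum, as in the answer.
via_einsum = np.einsum(
    'yxYX->YyXx', feature_maps.reshape(iy, ix, cy, cx)
).reshape(cy * iy, cx * ix)

print(grid.shape)
```

Reshaping `grid` to `(1, cy * iy, cx * ix, 1)` gives the 4D layout that the image summary op expects.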
In case someone would like to "jump" to numpy and visualize there, here is an example of how to display both the weights and the processing result. All transformations are based on the previous answer by mdaoust.
# to visualize 1st conv layer Weights
vv1 = sess.run(W_conv1)
# to visualize 1st conv layer output
vv2 = sess.run(h_conv1,feed_dict = {img_ph:x, keep_prob: 1.0})
vv2 = vv2[0,:,:,:] # in case of bunch out - slice first img
def vis_conv(v,ix,iy,ch,cy,cx, p = 0) :
    v = np.reshape(v,(iy,ix,ch))
    ix += 2
    iy += 2
    npad = ((1,1), (1,1), (0,0))
    v = np.pad(v, pad_width=npad, mode='constant', constant_values=p)
    v = np.reshape(v,(iy,ix,cy,cx))
    v = np.transpose(v,(2,0,3,1)) #cy,iy,cx,ix
    v = np.reshape(v,(cy*iy,cx*ix))
    return v
# W_conv1 - weights
ix = 5 # data size
iy = 5
ch = 32
cy = 4 # grid from channels: 32 = 4x8
cx = 8
v = vis_conv(vv1,ix,iy,ch,cy,cx)
plt.figure(figsize = (8,8))
plt.imshow(v,cmap="Greys_r",interpolation='nearest')
# h_conv1 - processed image
ix = 30 # data size
iy = 30
v = vis_conv(vv2,ix,iy,ch,cy,cx)
plt.figure(figsize = (8,8))
plt.imshow(v,cmap="Greys_r",interpolation='nearest')
You may try to get the convolution layer activation image this way:
h_conv1_features = tf.unpack(h_conv1, axis=3)
h_conv1_imgs = tf.expand_dims(tf.concat(1, h_conv1_features), -1)
This gets one vertical stripe with all images concatenated vertically.
If you want them padded (in my case of relu activations, padded with a white line):
h_conv1_features = tf.unpack(h_conv1, axis=3)
h_conv1_max = tf.reduce_max(h_conv1)
h_conv1_features_padded = map(lambda t: tf.pad(t-h_conv1_max, [[0,0],[0,1],[0,0]])+h_conv1_max, h_conv1_features)
h_conv1_imgs = tf.expand_dims(tf.concat(1, h_conv1_features_padded), -1)
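The same stripe layout can be reproduced in NumPy, which may help when checking shapes or when the old `tf.unpack` and `tf.concat(1, ...)` call forms are not available (in later TensorFlow releases they became `tf.unstack` and `tf.concat(values, axis)`). This is a sketch on random data; `stack_feature_maps` is a hypothetical helper name. It splits the channels, appends one separator row filled with the maximum activation below each map, and concatenates everything along the height axis.

```python
import numpy as np

def stack_feature_maps(h, pad_value=None):
    """Turn (batch, height, width, channels) into (batch, channels*(height+1), width, 1)."""
    if pad_value is None:
        pad_value = h.max()            # a "white" separator line for relu activations
    maps = [h[..., c] for c in range(h.shape[-1])]          # each (batch, height, width)
    padded = [
        np.pad(m, ((0, 0), (0, 1), (0, 0)), mode='constant', constant_values=pad_value)
        for m in maps
    ]
    stripe = np.concatenate(padded, axis=1)                 # stack along the height axis
    return stripe[..., np.newaxis]                          # add a single image channel

rng = np.random.default_rng(0)
h_conv1 = rng.random((2, 4, 5, 3))     # made-up activations: 2 images, 4x5 maps, 3 channels
imgs = stack_feature_maps(h_conv1)
print(imgs.shape)
```

Each feature map keeps its original values, and every (height + 1)-th row of the stripe is a separator line.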
I personally try to tile every 2d-filter in a single image.
For doing this -if i'm not terribly mistaken since I'm quite new to DL- I found out that it could be helpful to exploit the depth_to_space function, since it takes a 4d tensor
[batch, height, width, depth]
and produces an output of shape
[batch, height*block_size, width*block_size, depth/(block_size*block_size)]
Where block_size is the number of "tiles" along each side of the output image. The only limitation is that the depth must be the square of block_size, with block_size an integer; otherwise it cannot "fill" the resulting image correctly.
A possible solution could be to pad the depth of the input tensor up to a depth that is accepted by the method, but I still haven't tried this.
Another way, which I think is very easy, is to use the get_operation_by_name function. I had a hard time visualizing the layers with other methods, but this helped me.
#first, find out the operations, many of those are micro-operations such as add etc.
graph = tf.get_default_graph()
graph.get_operations()
#choose relevant operations
op_name = '...'
op = graph.get_operation_by_name(op_name)
out = sess.run([op.outputs[0]], feed_dict={x: img_batch, is_training: False})
#img_batch is a single image whose dimensions are (1,n,n,1).
# out is the output of the layer, do whatever you want with the output
#in my case, I wanted to see the output of a convolution layer
out2 = np.array(out)
print(out2.shape)
# determine, row, col, and fig size etc.
for each_depth in range(out2.shape[4]):
    fig.add_subplot(rows, cols, each_depth+1)
    plt.imshow(out2[0,0,:,:,each_depth], cmap='gray')
For example below is the input(colored cat) and output of the second conv layer in my model.
Note that I am aware this question is old and there are easier methods with Keras but for people who use an old model from other people (such as me), this may be useful.