What is the difference between functions from keras.backend and TensorFlow?

I am trying to customize the loss function of keras. I saw the example:
import tensorflow as tf
import keras.backend as K
def mean_pred(y_true, y_pred):
    return K.mean(y_pred)
Can I use something like:
def mean_pred(y_true, y_pred):
    return tf.mean(y_pred)
Is there any difference?

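There is no tf.mean; the TensorFlow function is tf.math.reduce_mean (also available as tf.reduce_mean). Apart from the name, K.mean and tf.math.reduce_mean do the same thing: both compute the mean of a tensor's elements across the given dimensions, equivalent to NumPy's np.mean. With the TensorFlow backend, K.mean is a thin wrapper around tf.reduce_mean.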
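As a rough sketch of the TensorFlow-only version of the function from the question, assuming TensorFlow 2.x with eager execution (the toy arrays below are made up for illustration):

```python
import numpy as np
import tensorflow as tf

def mean_pred(y_true, y_pred):
    # tf.reduce_mean is an alias of tf.math.reduce_mean
    return tf.reduce_mean(y_pred)

# Toy inputs, invented for this example
y_true = np.zeros((4, 3), dtype=np.float32)
y_pred = np.arange(12, dtype=np.float32).reshape(4, 3)

result = mean_pred(tf.constant(y_true), tf.constant(y_pred))
print(float(result), float(np.mean(y_pred)))
```

Both printed values should agree, since reduce_mean with no axis argument averages over all elements, just like np.mean.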

Related

How to dilate y_true inside a custom metric in keras/tensorflow?

I am trying to write a custom metric for a U-Net model implemented in Keras/TensorFlow. In the metric, I need to apply the OpenCV function cv2.dilate to the ground truth. When I tried, it raised an error because y_true is a tensor and cv2.dilate expects a NumPy array.
Any idea how to implement this?
I tried to convert the tensor to a NumPy array, but that did not work.
I searched for a TensorFlow implementation of cv2.dilate but couldn't find one.
One possibility, if you are using a simple rectangular kernel in your dilation, is to use tf.nn.max_pool2d as a replacement.
import numpy as np
import tensorflow as tf
import cv2
image = np.random.random((28,28))
kernel_size = 3
# OpenCV dilation works with grayscale image, with H,W dimensions
dilated_cv = cv2.dilate(image, np.ones((kernel_size, kernel_size), np.uint8))
# TensorFlow max pooling works with batch and channels: B,H,W,C dimensions
image_w_batch_and_channels = image[None,...,None]
dilated_tf = tf.nn.max_pool2d(image_w_batch_and_channels, kernel_size, 1, "SAME")
# checking that the results are equal
np.allclose(dilated_cv, dilated_tf[0,...,0])
However, since you mention that you are applying the dilation to the ground truth, it does not need to be differentiable. In that case, you can wrap your dilation in a tf.numpy_function:
from functools import partial
# Be sure to pass the correct output type as Tout. tf.float64 works in this specific case because NumPy defaults to float64, but it might be different in your case.
dilated_tf_npfunc = tf.numpy_function(
    partial(cv2.dilate, kernel=np.ones((kernel_size, kernel_size), np.uint8)),
    [image],
    Tout=tf.float64,
)
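For the max-pooling route, a metric could be sketched along these lines. The names dilate and tolerant_precision, the 0.5 threshold and the toy masks are all hypothetical choices for illustration, not part of the original answer; the sketch assumes TensorFlow 2.x and masks shaped (batch, height, width, channels).

```python
import numpy as np
import tensorflow as tf

def dilate(mask, kernel_size=3):
    # Max pooling with stride 1 and SAME padding acts as dilation
    # with a square kernel of ones.
    return tf.nn.max_pool2d(mask, ksize=kernel_size, strides=1, padding="SAME")

def tolerant_precision(y_true, y_pred, kernel_size=3):
    # Hypothetical metric: fraction of predicted positives that fall
    # inside the dilated ground truth.
    y_true = tf.cast(y_true, tf.float32)
    pred_pos = tf.cast(y_pred > 0.5, tf.float32)
    hits = tf.reduce_sum(pred_pos * dilate(y_true, kernel_size))
    return hits / (tf.reduce_sum(pred_pos) + 1e-7)

# Toy data: one ground-truth pixel in the centre of a 5x5 mask
y_true = np.zeros((1, 5, 5, 1), dtype=np.float32)
y_true[0, 2, 2, 0] = 1.0
# Two predicted pixels: one next to the centre, one in a far corner
y_pred = np.zeros((1, 5, 5, 1), dtype=np.float32)
y_pred[0, 1, 1, 0] = 0.9
y_pred[0, 4, 4, 0] = 0.9

dilated = dilate(tf.constant(y_true))
score = tolerant_precision(tf.constant(y_true), tf.constant(y_pred))
print(float(tf.reduce_sum(dilated)), float(score))
```

Because everything stays in TensorFlow ops, a function like this can be passed to model.compile(metrics=[...]) without converting y_true to NumPy.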

Learning a Categorical Variable with TensorFlow Probability

I would like to use TFP to write a neural network where the output are the probabilities of a categorical variable with 3 classes, and train it using the negative log-likelihood.
As I'm taking my first steps with TF and TFP, I started with a toy model where the input layer has only 1 unit receiving a null input, and the output layer has 3 units with a softmax activation function. The idea is that the biases should learn (up to an additive constant) the log of the probabilities.
My code is below: true_p holds the true parameters I use to generate the data and would like to learn, while learned_p is what I get from the NN.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from functions import nll
from tensorflow.keras.optimizers import SGD
import tensorflow.keras.layers as layers
import tensorflow_probability as tfp
tfd = tfp.distributions
# params
true_p = np.array([0.1, 0.7, 0.2])
n_train = 1000
# training data
x_train = np.array(np.zeros(n_train)).reshape((n_train,))
y_train = np.array(np.random.choice(len(true_p), size=n_train, p=true_p)).reshape((n_train,))
# model
input_layer = layers.Input(shape=(1,))
p_layer = layers.Dense(len(true_p), activation=tf.nn.softmax)(input_layer)
p_y = tfp.layers.DistributionLambda(tfd.Categorical)(p_layer)
model_p = keras.models.Model(inputs=input_layer, outputs=p_y)
model_p.compile(SGD(), loss=nll)
# training
hist_p = model_p.fit(x=x_train, y=y_train, batch_size=100, epochs=3000, verbose=0)
# check result
learned_p = np.round(model_p.layers[1].call(tf.constant([0], shape=(1, 1))).numpy(), 3)
learned_p
With this setup, I get the result:
>>> learned_p
array([[0.005, 0.989, 0.006]], dtype=float32)
I over-estimate the second category and can't really distinguish between the first and the third. What's worse, if I plot the probabilities at the end of each epoch, they appear to converge monotonically to the vector [0,1,0], which doesn't make sense (it seems to me the gradient should push in the opposite direction once I start to over-estimate).
I really can't figure out what's going on here, but have the feeling I'm doing something plain wrong. Any idea? Thank you for your help!
For the record, I also tried using other optimizers like Adam or Adagrad playing with the hyper-params, but with no luck.
I'm using Python 3.7.9, TensorFlow 2.3.1 and TensorFlow probability 0.11.1
I believe the default argument to Categorical is not the vector of probabilities, but the vector of logits (values you'd take softmax of to get probabilities). This is to help maintain precision in internal Categorical computations like log_prob. I think you can simply eliminate the softmax activation function and it should work. Please update if it doesn't!
EDIT: alternatively you can replace the tfd.Categorical with
lambda p: tfd.Categorical(probs=p)
but you'll lose the aforementioned precision gains. Just wanted to clarify that passing probs is an option, just not the default.
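To illustrate why treating softmax outputs as logits could produce the behaviour in the question, here is a NumPy-only sketch. The softmax helper is written here for illustration and is not TFP code; it mimics the normalization that Categorical applies to its logits.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax (illustrative helper)
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()

true_p = np.array([0.1, 0.7, 0.2])

# The question's setup: the Dense layer outputs probabilities in [0, 1],
# which Categorical then treats as logits and normalizes again.
# Even the extreme layer output [0, 1, 0] gives a fairly flat distribution.
most_peaked = softmax(np.array([0.0, 1.0, 0.0]))

# Passing real logits (log-probabilities up to an additive constant)
# recovers the target distribution exactly.
recovered = softmax(np.log(true_p) + 3.0)

print(most_peaked)
print(recovered)
```

Under this reading, the second class can never get more than about 0.58 probability, which is below the true 0.7, so the optimizer keeps pushing the layer output towards [0, 1, 0].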

How to get the scalar value of a function parameter which is a tensor in Tensorflow?

I want to get the scalar value of a function parameter, as the following code tries to do:
import tensorflow as tf
@tf.function
def test(key_value):
    tf.print(key_value.numpy())
a = tf.constant(0)
test(a)
But there is no numpy function when running in autograph.
numpy is only available outside of tf.function, where Tensors have actual values. Within tf.function, you have access to a restricted API. As long as you pass the tensor to a TensorFlow op, you don't need to call numpy:
import tensorflow as tf
@tf.function
def test(key_value):
    tf.print(key_value)
a = tf.constant(0)
test(a)
Have a look at this guide for more info.
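A small sketch of the pattern, assuming TensorFlow 2.x: keep the tensor symbolic inside the decorated function and call numpy() only on the eager result outside it. The function name double is made up for this example.

```python
import tensorflow as tf

@tf.function
def double(key_value):
    # Inside tf.function the argument is a symbolic tensor,
    # so pass it to TensorFlow ops instead of calling .numpy().
    tf.print(key_value)
    return key_value * 2

a = tf.constant(3)
result = double(a)

# Outside tf.function the result is an eager tensor with a concrete value.
value = result.numpy()
print(value)
```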

Keras model.evaluate and model.predict give different results

When I build a model using Lambda layer to calculate the sum along an axis, model.evaluate() gives different result in comparison to manually calculating from model.predict(). I have checked the output data types and shapes and it did not help. The problem only occurs when I use tensorflow as Keras backend. It works fine when I use CNTK as backend.
Here is a minimal example that reproduces this behavior.
from keras.layers import Input,Lambda
import keras.backend as K
from keras.models import Model
import numpy as np
inp=Input((2,))
out=Lambda(lambda x:K.sum(x,axis=-1),output_shape=(1,))(inp)
model=Model(input=inp,output=out)
model.compile(loss="mse",optimizer="sgd")
xs=np.random.random((3,2))
ys=np.sum(xs,axis=1)
print(np.mean((model.predict(xs)-ys)**2)) # This is zero
print(model.evaluate(xs,ys)) # This is not zero
I build a network with no parameters that should simply calculate the sum for each x, and ys is constructed so that it matches the output of the model. model.predict() gives the same results as ys, but model.evaluate() gives something non-zero when I use TensorFlow as the backend.
Any idea about why this should happen? Thanks!

Getting TypeError while training a classifier for iris flower dataset

I am experimenting with a linear output layer for classifying the iris flower dataset, treating it as regression with target values 0, 1 and 2.
I am using one hidden tanh activation layer followed by a linear layer. I deliberately chose this instead of one-hot encoding the labels because I want to compare the scores from the 'model' function of my code, as I am new to TensorFlow. On running the code below...
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import tensorflow as tf
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
data=load_iris()
X=data['data']
Y=data['target']
pca=PCA(n_components=2)
X=pca.fit_transform(X)
#visualise the data
#plt.figure(figsize=(12,12))
#plt.scatter(X[:,0],X[:,1],c=Y,alpha=0.4)
#plt.show()
labels=Y.reshape(-1,1)
x_train,x_test,y_train,y_test=train_test_split(X,Y,test_size=0.3,random_state=42)
y_train=y_train.reshape(-1,1)
y_test=y_test.reshape(-1,1)
hidden_nodes=5
batch_size=100
num_features=2
lr=0.01
g=tf.Graph()
with g.as_default():
    tf_train_dataset=tf.placeholder(tf.float32,shape=[None,num_features])
    tf_train_labels=tf.placeholder(tf.float32,shape=[None,1])
    tf_test_dataset=tf.constant(x_test,dtype=tf.float32)
    layer1_weights=tf.Variable(tf.truncated_normal([num_features,hidden_nodes]),dtype=tf.float32)
    layer1_biases=tf.Variable(tf.zeros([hidden_nodes]),dtype=tf.float32)
    layer2_weights=tf.Variable(tf.truncated_normal([hidden_nodes,1]),dtype=tf.float32)
    layer2_biases=tf.Variable(tf.zeros([1]),dtype=tf.float32)
    def model(data):
        Z1=tf.matmul(data,layer1_weights)+layer1_biases
        A1=tf.nn.relu(Z1)
        Z2=tf.matmul(A1,layer2_weights)+layer2_biases
        return Z2
    model_scores=model(tf_train_dataset)
    loss=tf.reduce_mean(tf.losses.mean_squared_error(model_scores,tf_train_labels))
    optimizer=tf.train.GradientDescentOptimizer(lr).minimize(loss)
    #train_prediction=model_scores
    test_prediction=(tf_test_dataset)
    num_steps=10001
    with tf.Session() as sess:
        init=tf.global_variables_initializer()
        sess.run(init)
        for step in range(num_steps):
            offset=(step*batch_size)%(y_train.shape[0]-batch_size)
            minibatch_data=x_train[offset:(offset+batch_size),:]
            minibatch_labels=y_train[offset:(offset+batch_size)]
            feed_dict={tf_train_dataset:minibatch_data,tf_train_labels:minibatch_labels}
            ll,loss,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
            if step%1000==0:
                print('Minibatch loss at step {}:{}'.format(step,loss))
I get an error on line
ll,loss,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
TypeError: Fetch argument 14.686994 has invalid type <class 'numpy.float32'>, must be a string or Tensor. (Can not convert a float32 into a Tensor or Operation.)
Why does this error occur? Is it because of this line:
model_scores=model(tf_train_dataset)
How should I go about solving this issue? Can't the return value of the model function be a tensor, or be cast to a tensor?
Thanks.
That is because of this line:
ll,loss,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
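On the first iteration, sess.run returns the loss as a NumPy float, and assigning it to loss overwrites the loss tensor. On the next iteration, sess.run receives that float as a fetch and raises the TypeError. Just use a different variable name to store the loss value.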
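A minimal sketch of the fix, assuming TensorFlow 2.x with the tf.compat.v1 API standing in for the TF 1.x calls in the question. The tiny linear model and the names loss_value and train_op are invented for illustration; the point is only that the fetched value gets its own name.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API

g = tf.Graph()
with g.as_default():
    x = tf1.placeholder(tf.float32, shape=[None, 2])
    y = tf1.placeholder(tf.float32, shape=[None, 1])
    w = tf1.Variable(tf.zeros([2, 1]))
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))
    train_op = tf1.train.GradientDescentOptimizer(0.1).minimize(loss)
    init = tf1.global_variables_initializer()

# Toy data, made up for this example
data = np.ones((4, 2), dtype=np.float32)
labels = np.ones((4, 1), dtype=np.float32)

losses = []
with tf1.Session(graph=g) as sess:
    sess.run(init)
    for step in range(3):
        # Store the fetched float in loss_value so that the name
        # `loss` keeps pointing at the tensor on the next iteration.
        _, loss_value = sess.run([train_op, loss], feed_dict={x: data, y: labels})
        losses.append(float(loss_value))

print(losses)
```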