I'm trying to calculate the gradients of the samples from a Bernoulli distribution w.r.t. the probabilities p (of a sample being 1).
I tried using both the implementation of the Bernoulli distribution provided in tensorflow.contrib.distributions and my own simple implementation based on this discussion. However, both methods fail when I try to calculate the gradients.
Using the Bernoulli implementation:
import tensorflow as tf
from tensorflow.contrib.distributions import Bernoulli
p = tf.constant([0.2, 0.6])
b = Bernoulli(p=p)
s = b.sample()
g = tf.gradients(s, p)
with tf.Session() as session:
    print(session.run(g))
The above code gives me the following error:
TypeError: Fetch argument None has invalid type <class 'NoneType'>
Using my implementation:
import tensorflow as tf
p = tf.constant([0.2, 0.6])
shape = [1, 2]
s = tf.select(tf.random_uniform(shape) - p > 0.0, tf.ones(shape), tf.zeros(shape))
g = tf.gradients(s, p)
with tf.Session() as session:
    print(session.run(g))
Same error:
TypeError: Fetch argument None has invalid type <class 'NoneType'>
Is there a way to calculate the gradients of Bernoulli samples?
(My TensorFlow version is 0.12).
You cannot backprop through a discrete stochastic node, because the gradients of the sampling operation are not defined.
However, if you approximate the Bernoulli with a continuous distribution controlled by a temperature parameter, then you can.
This idea is called the reparameterization trick, and it is implemented by the RelaxedBernoulli distribution in TensorFlow Probability (and also in tf.contrib.distributions):
RelaxedBernoulli
You can specify the probability p of your Bernoulli, which is your random variable, et voilà.
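As an illustration, here is a minimal sketch using tensorflow_probability in TF 2.x eager mode (unlike the asker's 0.12 setup; the temperature of 0.5 is an arbitrary choice):
import tensorflow as tf
import tensorflow_probability as tfp
p = tf.Variable([0.2, 0.6])
temperature = 0.5  # lower values approximate the discrete Bernoulli more closely
with tf.GradientTape() as tape:
    dist = tfp.distributions.RelaxedBernoulli(temperature, probs=p)
    s = dist.sample()  # continuous values in (0, 1), reparameterized
    loss = tf.reduce_sum(s)
print(tape.gradient(loss, p))  # well-defined gradients instead of None
The samples are no longer exactly 0 or 1, but as the temperature approaches 0 they concentrate near those values while the gradients stay usable.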
Related
I had already implemented a gradient-descent optimization by using the TensorFlow Probability built-in KL divergence as a loss function. In theory it worked well, but then I found out that the list of registered distribution pairs you are able to compare with the KL divergence is quite limited. I tried to minimize the KL divergence between a Gaussian mixture model (as the true distribution) and a normal distribution (optimizing mean and std such that the KL divergence becomes minimal), which was not possible.
So I tried to implement my own approach, which did not work:
import numpy as np
from scipy.stats import norm
import tensorflow as tf
The idea I had was to create the densities of the needed distributions via scipy.stats (let's say normal distributions) and convert the density variables to tensors:
x = np.arange(-10,10,0.001)
mu_train = tf.Variable(2.0)
p_pdf = norm.pdf(x, 0, 1)
q_pdf = norm.pdf(x, mu_train, 1)
p = tf.convert_to_tensor(p_pdf)
q = tf.convert_to_tensor(q_pdf, dtype=tf.float64)
Now I defined the KL-Divergence as a function that only depends on q.
def kl_loss(q):
    return tf.reduce_sum(
        tf.where(p == 0, tf.zeros(p.shape, tf.float64), p * tf.math.log(p / q))
    )
Then I calculated the gradient of kl_loss with respect to mu_train, but the output I get from this is None.
with tf.GradientTape() as tape:
    tape.watch(mu_train)
    loss = kl_loss(q)
d_loss_d_mu = tape.gradient(loss, mu_train)
print(d_loss_d_mu)
Now that I have thought about it, getting None as output makes sense to me: kl_loss(q) only depends on the values q(x) produced by the density q, not on mu_train directly. mu_train is a parameter of the normal distribution, but the input to kl_loss is just an array/tensor of precomputed density values.
Does anyone know a workaround for this, or does anyone have a completely different way to get the KL divergence as a loss function for arbitrary distributions, such that I can compute gradients with respect to the parameters and run a gradient-descent minimizer?
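For what it's worth, the diagnosis above suggests one possible workaround: build q from mu_train with TensorFlow ops inside the tape, so that autodiff can see the dependence. A minimal sketch under that assumption, replacing scipy's norm with tfp.distributions.Normal:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
x = np.arange(-10, 10, 0.001).astype(np.float32)
p = tfd.Normal(0.0, 1.0).prob(x)  # fixed target density values
mu_train = tf.Variable(2.0)
with tf.GradientTape() as tape:
    q = tfd.Normal(mu_train, 1.0).prob(x)  # built inside the tape, so it depends on mu_train
    loss = tf.reduce_sum(
        tf.where(p == 0.0, tf.zeros_like(p), p * tf.math.log(p / q)))
print(tape.gradient(loss, mu_train))  # a real gradient instead of None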
The following code:
import tensorflow as tf
tfd = tf.contrib.distributions
mean = [0.0, 0.0]
scale = [1.0, 1.0]
dist = tfd.MultivariateNormalDiag(loc=mean, scale_diag=scale)
samp = dist.sample([None])
Gives the error:
TypeError: Expected int32, got None of type '_Message' instead.
But it generates n samples from the distribution if None is replaced with an integer n. Is there any way to get an unknown number of samples from the distribution?
EDIT: The original question may be badly phrased; I want to sample a tensor of shape (None, ...) to combine with other tensors of this shape. Clearly somewhere in there an input is needed to fix the size at runtime.
You could do
num_samples = tf.placeholder(dtype=tf.int32, shape=())
sampl = dist.sample(num_samples)
and then feed in the number of samples. Likewise, if you have a scalar tensor representing the number of samples, you can pass that in.
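For example, a minimal sketch of the feed, reusing the question's tf.contrib.distributions setup (the value 5 is arbitrary):
import tensorflow as tf
tfd = tf.contrib.distributions
dist = tfd.MultivariateNormalDiag(loc=[0.0, 0.0], scale_diag=[1.0, 1.0])
num_samples = tf.placeholder(dtype=tf.int32, shape=())
samples = dist.sample(num_samples)  # static shape (?, 2)
with tf.Session() as sess:
    out = sess.run(samples, feed_dict={num_samples: 5})
    print(out.shape)  # (5, 2)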
I'm a beginner in TensorFlow and I want to apply the Frobenius norm to a tensor, but when I searched I didn't find any function for it in TensorFlow, and I couldn't implement it using TensorFlow ops. I can implement it with NumPy operations, but how can I do this using TensorFlow ops only?
My implementation using numpy in python
def Frobenius_Norm(tensor):
    x = np.power(tensor, 2)
    x = np.sum(x)
    x = np.sqrt(x)
    return x
def frobenius_norm_tf(M):
    return tf.reduce_sum(M ** 2) ** 0.5
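If you are on a reasonably recent TensorFlow, tf.norm with its default arguments computes the same quantity (the Euclidean norm over all entries of the flattened tensor), so the following should agree with the hand-rolled version:
import tensorflow as tf
M = tf.constant([[3.0, 4.0], [0.0, 0.0]])
print(tf.norm(M))  # 5.0, the Frobenius norm of M
print(frobenius_norm_tf(M))  # same value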
This is very similar to the question skflow regression predict multiple values. However, later versions of TensorFlow seem to have rendered the answer from this question obsolete.
I would like to be able to have multiple output neurons in a TensorFlow Learn regression neural network (DNNRegressor). I upgraded the code from the referenced question to account for breaking changes in TensorFlow, but still get an error.
import numpy as np
import tensorflow.contrib.learn as skflow
import tensorflow as tf
from sklearn.metrics import mean_squared_error
# Create random dataset.
rng = np.random.RandomState(1)
X = np.sort(200 * rng.rand(100, 1) - 100, axis=0)
y = np.array([np.pi * np.sin(X).ravel(), np.pi * np.cos(X).ravel()]).T
# Fit regression DNN model.
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=X.shape[0])]
regressor = skflow.DNNRegressor(hidden_units=[5, 5], feature_columns=feature_columns)
regressor.fit(X, y)
score = mean_squared_error(regressor.predict(X), y)
print("Mean Squared Error: {0:f}".format(score))
But this results in:
ValueError: Shapes (?, 1) and (?, 2) are incompatible
I don't see any release notes about breaking changes indicating that the method for multiple outputs has changed. Is there another way to do this?
As mentioned in the tf.contrib.learn.DNNRegressor docs, you may use the label_dimension parameter, which is exactly what you are looking for.
Your code line with this param will do what you want:
regressor = skflow.DNNRegressor(hidden_units=[5, 5],
                                feature_columns=feature_columns,
                                label_dimension=2)
The standard predict() returns a generator object. To get an array, you have to add as_iterable=False:
score = mean_squared_error(regressor.predict(X, as_iterable=False), y)
I'm trying to write my own cost function in TensorFlow; however, apparently I cannot 'slice' the tensor object?
import tensorflow as tf
import numpy as np
# Establish variables
x = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.zeros([3,6]))
b = tf.Variable(tf.zeros([6]))
# Establish model
y = tf.nn.softmax(tf.matmul(x,W) + b)
# Truth
y_ = tf.placeholder("float", [None,6])
def angle(v1, v2):
    return np.arccos(np.sum(v1 * v2, axis=1))
def normVec(y):
    return np.cross(y[:, [0, 2, 4]], y[:, [1, 3, 5]])
angle_distance = -tf.reduce_sum(angle(normVec(y_),normVec(y)))
# This is the example code they give for cross entropy
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
I get the following error:
TypeError: Bad slice index [0, 2, 4] of type <type 'list'>
At present, tensorflow can't gather on axes other than the first - this feature has been requested.
But for what you want to do in this specific situation, you can transpose, then gather 0,2,4, and then transpose back. It won't be crazy fast, but it works:
tf.transpose(tf.gather(tf.transpose(y), [0,2,4]))
This is a useful workaround for some of the limitations in the current implementation of gather.
(It is also correct that you can't use a numpy slice on a tensorflow node - you can only run the node and slice the output - and that you need to initialize your variables before you run.) You're mixing tf and np in a way that doesn't work.
x = tf.Something(...)
is a tensorflow graph object. Numpy has no idea how to cope with such objects.
foo = sess.run(x)
is back to an object python can handle.
You typically want to keep your loss calculation in pure tensorflow, so do the cross and other functions in tf. You'll probably have to do the arccos the long way, as tf doesn't have a function for it.
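Putting that together, a rough sketch of the loss in pure TF ops, assuming a TensorFlow version that provides tf.cross and tf.acos (in older versions you would have to approximate arccos yourself); y and y_ are as defined in the question:
def norm_vec_tf(y):
    # gather columns 0,2,4 and 1,3,5 via the transpose workaround, then take the cross product
    a = tf.transpose(tf.gather(tf.transpose(y), [0, 2, 4]))
    b = tf.transpose(tf.gather(tf.transpose(y), [1, 3, 5]))
    return tf.cross(a, b)
def angle_tf(v1, v2):
    return tf.acos(tf.reduce_sum(v1 * v2, axis=1))
angle_distance = -tf.reduce_sum(angle_tf(norm_vec_tf(y_), norm_vec_tf(y)))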
Just realized that the following failed:
cross_entropy = -tf.reduce_sum(y_*np.log(y))
You can't use numpy functions on tf objects, and the indexing may be different too.
I think you can use the "Wraps Python function" method (tf.py_func) in tensorflow. Here's the link to the documentation.
And as for the people who answered "Why don't you just use tensorflow's built-in functions to construct it?" - sometimes the cost function people are looking for cannot be expressed with tf's functions, or is extremely difficult to express.
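For instance, a minimal sketch with tf.py_func; the NumPy cost below is a stand-in for whatever hard-to-express function you need (note that TensorFlow cannot compute gradients through py_func unless you register them yourself):
import numpy as np
import tensorflow as tf
def np_cost(y):
    # arbitrary NumPy-only computation standing in for a custom cost
    return np.sum(np.square(y)).astype(np.float32)
y = tf.placeholder(tf.float32, [None, 6])
cost = tf.py_func(np_cost, [y], tf.float32)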
This is because you have not initialized your variables, and because of this it does not have your Tensor values there right now (you can read more in my answer here).
Just do something like this:
def normVec(y):
    print(y)
    return np.cross(y[:, [0, 2, 4]], y[:, [1, 3, 5]])
t1 = normVec(y_)
# and comment everything after it.
You will see that you do not have a concrete Tensor value yet, only Tensor("Placeholder_1:0", shape=TensorShape([Dimension(None), Dimension(6)]), dtype=float32).
Try initializing your variables
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
and evaluate your variable with sess.run(y). P.S.: you have not fed your placeholders so far.
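For completeness, a small sketch of evaluating y with the placeholder fed, reusing sess, x, and y from above (the zero batch of shape (4, 3) is an arbitrary example matching the placeholder's shape):
import numpy as np
print(sess.run(y, feed_dict={x: np.zeros((4, 3))}))  # x must be fed before y can be evaluated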