I'm trying to find a way to update a slice of a TensorFlow variable iteratively. An example is the computation of all Fibonacci numbers up to N. I want to end up with a tensor of size N such that F[0] = 0, F[1] = 1 and F[n] = F[n-1] + F[n-2].
Below is one way to implement the Fibonacci sequence in TensorFlow. The following code prints each step of the sequence as a 2x1 2-D tensor Variable holding two consecutive Fibonacci numbers.
Hope this helps!
import tensorflow as tf

with tf.Graph().as_default() as g:
    fib_matrix = tf.constant([[0.0, 1.0],
                              [1.0, 1.0]])
    fib_sequence = tf.Variable([[0.0], [1.0]])
    # Multiply fib_matrix and fib_sequence.
    next_fib = tf.matmul(fib_matrix, fib_sequence)
    # Assign result back to fib_sequence.
    assign_op = tf.assign(fib_sequence, next_fib)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for step in range(10):
            sess.run(assign_op)
            print(sess.run(fib_sequence))
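The loop above only ever holds the latest pair of values. To get the full list F[0], ..., F[N-1] that the question asks for, the same matrix recurrence can be sketched in plain Python, recording the first component at each step. This is only an illustration of the recurrence, not TensorFlow code, and the name fib_list is made up for this sketch:

```python
def fib_list(n):
    """Return [F[0], ..., F[n-1]] using the 2x2 matrix recurrence."""
    fib_matrix = [[0, 1],
                  [1, 1]]
    state = [0, 1]  # (F[i], F[i+1]), same starting column as fib_sequence
    out = []
    for _ in range(n):
        out.append(state[0])
        # state <- fib_matrix @ state
        state = [fib_matrix[0][0] * state[0] + fib_matrix[0][1] * state[1],
                 fib_matrix[1][0] * state[0] + fib_matrix[1][1] * state[1]]
    return out


print(fib_list(10))
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

In a graph-based version, the equivalent would be to collect the first entry of the state after every step instead of overwriting it.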
My goal is to build a script to change an operation into another one using TF's graph editor. So far I tried making a script that just changes the input kernel weights of a Conv2D, but to no avail, as the interface is pretty confusing.
with tf.Session() as sess:
    model_filename = sys.argv[1]
    with gfile.FastGFile(model_filename, 'r') as f:
        graph_def = graph_pb2.GraphDef()
        text_format.Merge(f.read(), graph_def)
        importer.import_graph_def(graph_def)
        #my_sgv = ge.sgv("Conv2D", graph=tf.get_default_graph())
        #print my_sgv
        convs = find_conv2d_ops(tf.get_default_graph())
        print convs
        my_sgv = ge.sgv(convs)
        print my_sgv
        conv_tensor = tf.get_default_graph().get_tensor_by_name(convs[0].name + ':0')
        conv_weights_input = tf.get_default_graph().get_tensor_by_name(convs[0].inputs[1].name)
        weights_new = tf.Variable(tf.truncated_normal([1, 1, 1, 8], stddev=0.03),
                                  name='Wnew')
        ge.graph_replace(conv_tensor, {conv_weights_input: weights_new})
The error is "input needs to be a Tensor: ". Can someone please provide some insights?
Since you are dealing with a tf.Variable, you don't need to use the graph editor; tf.assign will be sufficient.
You can use it like the following:
assign_op = tf.assign(conv_weights_input, weights_new)
with tf.Session() as sess:
    sess.run(assign_op)
If you are looking to swap out operations rather than weights, consider the following example (modified from this example):
import tensorflow as tf
import tensorflow.contrib.graph_editor as ge

def build():
    a_pl = tf.placeholder(dtype=tf.float32, name="a")
    b_pl = tf.placeholder(dtype=tf.float32, name="b")
    c = tf.add(a_pl, b_pl, name="c")

build() #or load graph from disk

a = tf.constant(1.0, shape=[2, 3], name="a_const")
b = tf.constant(2.0, shape=[2, 3], name="b_const")
a_pl = tf.get_default_graph().get_tensor_by_name("a:0")
b_pl = tf.get_default_graph().get_tensor_by_name("b:0")
c = tf.get_default_graph().get_tensor_by_name("c:0")
c_ = ge.graph_replace(c, {a_pl: a, b_pl: b})

with tf.Session() as sess:
    #no need for placeholders
    print(sess.run(c_))
    #will give error since a_pl and b_pl have no value
    print(sess.run(c))
The issue with your code is that you're dealing with weights, and not tensors. The crux of the above example is that the first argument is the target tensor (output tensor) that has the to-be-replaced tensors as dependencies. The second argument maps each tensor you want to replace to its replacement.
It's also worth noting that conv_weights_input is actually a tensor, whereas weights_new is a tf.Variable. I believe what you want is to replace weights_new with a new conv operation with random weight initialisation.
Is it possible to get samples from a tensor that depends on a random variable in tensorflow? I need to get an approximate sample distribution to use in a loss function to be optimized. Specifically, in the example below, I want to be able to obtain samples of Y_output in order to be able to calculate the mean and variance of the output distribution and use these parameters in a loss function.
def sample_weight(mean, phi, seed=1):
    P_epsilon = tf.distributions.Normal(loc=0., scale=1.0)
    epsilon_s = P_epsilon.sample([1])
    s = tf.multiply(epsilon_s, tf.log(1.0+tf.exp(phi)))
    weight_sample = mean + s
    return weight_sample

X = tf.placeholder(tf.float32, shape=[None, 1], name="X")
Y_labels = tf.placeholder(tf.float32, shape=[None, 1], name="Y_labels")
sw0 = sample_weight(u0,p0)
sw1 = sample_weight(u1,p1)
Y_output = sw0 + tf.multiply(sw1,X)
loss = tf.losses.mean_squared_error(labels=Y_labels, predictions=Y_output)
train_op = tf.train.AdamOptimizer(0.5e-1).minimize(loss)
init_op = tf.global_variables_initializer()
losses = []
predictions = []
Fx = lambda x: 0.5*x + 5.0
xrnge = 50
xs, ys = build_toy_data(funcx=Fx, stdev=2.0, num=xrnge)

with tf.Session() as sess:
    sess.run(init_op)
    iterations=1000
    for i in range(iterations):
        stat = sess.run(loss, feed_dict={X: xs, Y_labels: ys})
Not sure if this answers your question, but: when you have a Tensor downstream from a sampling Op (e.g., the Op created by your call to P_epsilon.sample([1])), any time you call sess.run on the downstream Tensor, the sample op will be re-run and produce a new random value. Example:
import tensorflow as tf
from tensorflow_probability import distributions as tfd
n = tfd.Normal(0., 1.)
s = n.sample()
y = s**2
sess = tf.Session() # Don't actually do this -- use context manager
print(sess.run(y))
# ==> 0.13539088
print(sess.run(y))
# ==> 0.15465781
print(sess.run(y))
# ==> 4.7929106
If you want a bunch of samples of y, you could do
import tensorflow as tf
from tensorflow_probability import distributions as tfd
n = tfd.Normal(0., 1.)
s = n.sample(100)
y = s**2
sess = tf.Session() # Don't actually do this -- use context manager
print(sess.run(y))
# ==> vector of 100 squared random normal values
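Once you have a batch of samples, the mean and variance the question asks about are just statistics over that batch. Here is a minimal sketch of the idea in plain Python (standard library only, no TensorFlow), using the same toy quantity y = s**2; the helper name sample_y, the seed, and the sample count are arbitrary choices for this illustration:

```python
import random
import statistics


def sample_y(rng):
    """One draw of y = s**2 with s ~ Normal(0, 1); a fresh s on every call."""
    s = rng.gauss(0.0, 1.0)
    return s ** 2


rng = random.Random(0)  # seeded only to make the sketch repeatable
samples = [sample_y(rng) for _ in range(10000)]

mean_y = statistics.mean(samples)      # Monte Carlo estimate of E[y]
var_y = statistics.variance(samples)   # sample variance of y
print(mean_y, var_y)
```

Since y here is a squared standard normal, the estimates should land near 1 and 2 respectively. In the graph version, the analogous step would be to reduce over the sample dimension of y.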
We also have some cool tools in tensorflow_probability to do the kind of thing you're driving at here. Namely the Bijector API and, somewhat simpler, the trainable_distributions API.
(Another minor point: I'd suggest using tf.nn.softplus, or at a minimum tf.log1p(tf.exp(x)) instead of tf.log(1.0 + tf.exp(x)). The latter has poor numerical properties due to floating point imprecision, which the former are optimized for).
Hope this is some help!
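To make the numerical point about log(1 + exp(x)) concrete, here is a small plain-Python comparison. It is only an illustration of the floating point issue, not a description of how tf.nn.softplus is implemented, and the function names are made up for this sketch:

```python
import math


def softplus_naive(x):
    # Direct transcription of log(1 + exp(x)).
    return math.log(1.0 + math.exp(x))


def softplus_stable(x):
    # max(x, 0) + log1p(exp(-|x|)) is algebraically equal to log(1 + exp(x)),
    # but never exponentiates a large positive number.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Very negative input: 1 + exp(x) rounds to exactly 1, so the naive form is 0.
print(softplus_naive(-40.0))
print(softplus_stable(-40.0))

# Large positive input: exp(x) overflows in the naive form.
try:
    softplus_naive(1000.0)
except OverflowError as err:
    print("naive form failed:", err)
print(softplus_stable(1000.0))
```

The naive form silently loses the tiny positive value for very negative inputs and fails outright for large positive ones, while the rearranged form handles both.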
I have some very simple tensorflow code to rotate a vector:
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=(2, 1))
angle = tf.placeholder(tf.float32)
s_a = tf.sin(angle)
c_a = tf.cos(angle)
R = tf.Variable([[c_a, s_a], [-s_a, c_a]], tf.float32, expected_shape=(2,2))
#R = tf.Variable([[1.0, 0.0], [0.0, 1.0]], tf.float32)
rotated_v = tf.matmul(R,x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    res = sess.run([init,rotated_v], feed_dict={x:np.array([[1.0],[1.0]]), angle:1.0})
    print(res)
The code works fine when I hand-code the identity matrix. However, in its current form I get this error:
ValueError: initial_value must have a shape specified: Tensor("Variable/initial_value:0", dtype=float32)
I've tried specifying the shape in multiple ways, but I can't make this work.
What am I doing wrong?
I have figured out a way to achieve this (might not be the best way, but it works).
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=(2, 1))
angle = tf.placeholder(tf.float32)
s_a = tf.sin(angle)
c_a = tf.cos(angle)
R = tf.Variable([[1.0, 0.0], [0.0, 1.0]], tf.float32)
assignR = tf.assign(R, [[c_a, s_a], [-s_a, c_a]])
rotated_v = tf.matmul(R,x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    newR = sess.run(assignR, feed_dict={angle:1.0})
    print(newR)
    print()
    res = sess.run([rotated_v], feed_dict={x:np.array([[1.0],[1.0]])})
    print(res)
The approach in the question won't work, because s_a and c_a are op outputs whose values are uniquely determined by angle. You can't assign or update these nodes, so training them doesn't make any sense.
This line, on the other hand...
R = tf.Variable([[1.0, 0.0], [0.0, 1.0]], tf.float32)
... is a definition of an independent variable with an initial value equal to the identity matrix. This is perfectly valid. Since this variable is independent, you can assign a new value to it, built from s_a and c_a. Note that you can't initialize it with s_a and c_a, because the initializer is run before the values are fed into a session (so angle is unknown).
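Because R is fully determined by angle, it does not need to be stored state at all; it can be an ordinary computed value. Here is a minimal sketch of that idea in plain Python (no TensorFlow; the function name rotate is made up for this illustration), using the same matrix layout as the question:

```python
import math


def rotate(v, angle):
    """Rotate a 2-vector with R = [[cos, sin], [-sin, cos]], as in the question."""
    c_a, s_a = math.cos(angle), math.sin(angle)
    R = [[c_a, s_a],
         [-s_a, c_a]]
    # Plain matrix-vector product R @ v.
    return [R[0][0] * v[0] + R[0][1] * v[1],
            R[1][0] * v[0] + R[1][1] * v[1]]


print(rotate([1.0, 1.0], 1.0))
```

In TensorFlow the analogous move would presumably be to build R as an ordinary tensor from c_a and s_a rather than as a tf.Variable, so that it is recomputed from whatever angle is fed in.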
Below is the code that I am running, in which I am implementing a paper. I take two matrices, multiply them and then perform clustering. What am I doing wrong?
import tensorflow as tf
from sklearn.cluster import KMeans
import numpy as np

a = np.random.rand(10,10)
b = np.random.rand(10,5)
F = tf.placeholder("float", [None, 10], name='F')
mask = tf.placeholder("float", [None, 5], name='mask')

def getZfeature(F,mask):
    return tf.matmul(F,mask)

def cluster(zfeature):
    #km = KMeans(n_clusters=3)
    #km.fit(x)
    #mu = km.cluster_centers_
    return zfeature

def computeQ(zfeature,mu):
    print "computing q matrix"
    print type(zfeature), type(mu)

#construct model
zfeature = getZfeature(F,mask)
mu = cluster(zfeature)
q = computeQ(zfeature,mu)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    sess.run(q, feed_dict={F: a, mask: b})
Working code below. Your problem is that q and mu don't do anything. q is None because computeQ doesn't return anything, so sess.run(q) has nothing to fetch. mu is just zfeature returned unchanged, so in this answer I have evaluated zfeature instead. You can do more tensor operations in these two functions, but each one needs to return a tensor for it to work.
import tensorflow as tf
from sklearn.cluster import KMeans
import numpy as np

a = np.random.rand(10,10)
b = np.random.rand(10,5)
F = tf.placeholder("float", [None, 10], name='F')
mask = tf.placeholder("float", [None, 5], name='mask')

def getZfeature(F,mask):
    return tf.matmul(F,mask)

def cluster(zfeature):
    #km = KMeans(n_clusters=3)
    #km.fit(x)
    #mu = km.cluster_centers_
    return zfeature

def computeQ(zfeature,mu):
    print ("computing q matrix")
    print (type(zfeature), type(mu))

#construct model
zfeature = getZfeature(F,mask)
mu = cluster(zfeature)
q = computeQ(zfeature,mu)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    result=sess.run(zfeature, feed_dict={F: a, mask: b})
    print(result)
Here is the code for k-means clustering using tensorflow from https://codesachin.wordpress.com/2015/11/14/k-means-clustering-with-tensorflow/
import tensorflow as tf
from random import choice, shuffle
from numpy import array

def TFKMeansCluster(vectors, noofclusters):
    """
    K-Means Clustering using TensorFlow.
    'vectors' should be a n*k 2-D NumPy array, where n is the number
    of vectors of dimensionality k.
    'noofclusters' should be an integer.
    """
    noofclusters = int(noofclusters)
    assert noofclusters < len(vectors)

    #Find out the dimensionality
    dim = len(vectors[0])

    #Will help select random centroids from among the available vectors
    vector_indices = list(range(len(vectors)))
    shuffle(vector_indices)

    #GRAPH OF COMPUTATION
    #We initialize a new graph and set it as the default during each run
    #of this algorithm. This ensures that as this function is called
    #multiple times, the default graph doesn't keep getting crowded with
    #unused ops and Variables from previous function calls.
    graph = tf.Graph()

    with graph.as_default():

        #SESSION OF COMPUTATION
        sess = tf.Session()

        ##CONSTRUCTING THE ELEMENTS OF COMPUTATION

        ##First lets ensure we have a Variable vector for each centroid,
        ##initialized to one of the vectors from the available data points
        centroids = [tf.Variable((vectors[vector_indices[i]]))
                     for i in range(noofclusters)]
        ##These nodes will assign the centroid Variables the appropriate
        ##values
        centroid_value = tf.placeholder("float64", [dim])
        cent_assigns = []
        for centroid in centroids:
            cent_assigns.append(tf.assign(centroid, centroid_value))

        ##Variables for cluster assignments of individual vectors(initialized
        ##to 0 at first)
        assignments = [tf.Variable(0) for i in range(len(vectors))]
        ##These nodes will assign an assignment Variable the appropriate
        ##value
        assignment_value = tf.placeholder("int32")
        cluster_assigns = []
        for assignment in assignments:
            cluster_assigns.append(tf.assign(assignment,
                                             assignment_value))

        ##Now lets construct the node that will compute the mean
        #The placeholder for the input
        mean_input = tf.placeholder("float", [None, dim])
        #The Node/op takes the input and computes a mean along the 0th
        #dimension, i.e. the list of input vectors
        mean_op = tf.reduce_mean(mean_input, 0)

        ##Node for computing Euclidean distances
        #Placeholders for input
        v1 = tf.placeholder("float", [dim])
        v2 = tf.placeholder("float", [dim])
        #tf.subtract replaces tf.sub, which was removed in TensorFlow 1.0
        euclid_dist = tf.sqrt(tf.reduce_sum(tf.pow(tf.subtract(
            v1, v2), 2)))

        ##This node will figure out which cluster to assign a vector to,
        ##based on Euclidean distances of the vector from the centroids.
        #Placeholder for input
        centroid_distances = tf.placeholder("float", [noofclusters])
        cluster_assignment = tf.argmin(centroid_distances, 0)

        ##INITIALIZING STATE VARIABLES

        ##This will help initialization of all Variables defined with respect
        ##to the graph. The Variable-initializer should be defined after
        ##all the Variables have been constructed, so that each of them
        ##will be included in the initialization.
        init_op = tf.global_variables_initializer()

        #Initialize all variables
        sess.run(init_op)

        ##CLUSTERING ITERATIONS

        #Now perform the Expectation-Maximization steps of K-Means clustering
        #iterations. To keep things simple, we will only do a set number of
        #iterations, instead of using a Stopping Criterion.
        noofiterations = 100
        for iteration_n in range(noofiterations):

            ##EXPECTATION STEP
            ##Based on the centroid locations till last iteration, compute
            ##the _expected_ centroid assignments.
            #Iterate over each vector
            for vector_n in range(len(vectors)):
                vect = vectors[vector_n]
                #Compute Euclidean distance between this vector and each
                #centroid. Remember that this list cannot be named
                #'centroid_distances', since that is the input to the
                #cluster assignment node.
                distances = [sess.run(euclid_dist, feed_dict={
                    v1: vect, v2: sess.run(centroid)})
                             for centroid in centroids]
                #Now use the cluster assignment node, with the distances
                #as the input
                assignment = sess.run(cluster_assignment, feed_dict = {
                    centroid_distances: distances})
                #Now assign the value to the appropriate state variable
                sess.run(cluster_assigns[vector_n], feed_dict={
                    assignment_value: assignment})

            ##MAXIMIZATION STEP
            #Based on the expected state computed from the Expectation Step,
            #compute the locations of the centroids so as to maximize the
            #overall objective of minimizing within-cluster Sum-of-Squares
            for cluster_n in range(noofclusters):
                #Collect all the vectors assigned to this cluster
                assigned_vects = [vectors[i] for i in range(len(vectors))
                                  if sess.run(assignments[i]) == cluster_n]
                #Compute new centroid location
                new_location = sess.run(mean_op, feed_dict={
                    mean_input: array(assigned_vects)})
                #Assign value to appropriate variable
                sess.run(cent_assigns[cluster_n], feed_dict={
                    centroid_value: new_location})

        #Return centroids and assignments
        centroids = sess.run(centroids)
        assignments = sess.run(assignments)
        return centroids, assignments
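The two alternating steps in the function above (assign each vector to its nearest centroid, then move each centroid to the mean of its vectors) can be easier to follow without the graph machinery. Below is a minimal plain-Python sketch of the same loop, under a few assumptions that differ from the code above: it uses the first k vectors as initial centroids instead of a random shuffle (to keep the result deterministic), compares squared distances (which gives the same nearest centroid), and leaves a centroid in place if its cluster ends up empty. The function name kmeans is made up for this illustration:

```python
def kmeans(vectors, k, iterations=10):
    """Plain-Python k-means sketch: returns (centroids, assignments)."""
    # Assumption: first k vectors as initial centroids (no random shuffle).
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            dists = [sum((a - b) ** 2 for a, b in zip(v, c))
                     for c in centroids]
            assignments[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == j]
            if members:  # keep the old centroid if the cluster is empty
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, assignments


points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
final_centroids, final_assignments = kmeans(points, 2)
print(final_assignments)
# [0, 0, 0, 1, 1, 1]
```

With these six points the two groups separate after the second iteration, even though both initial centroids start inside the same group.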
I am running the following code:
import tensorflow as tf
sess = tf.InteractiveSession()
y = tf.Variable(initial_value=[1,2])
sess.run(y, feed_dict={y: [100,2]})
Gives:
[100,2]
However, after that:
sess.run(y)
Gives the original value of y: [1,2].
Why doesn't
sess.run(y, feed_dict={y: [100,2]})
update the value of y and save it?
Because feed_dict overrides the values of the keys of the dictionary.
With the statement:
sess.run(y, feed_dict={y: [100,2]})
you're telling tensorflow to replace the values of y with [100, 2] for the current computation. This is not an assignment.
Therefore, the next call
sess.run(y)
fetches the original variable and uses it.
If you want to assign a value to a variable, you have to define this operation in the computational graph, using tf.assign.
If you want to use a feed dictionary, initialize a placeholder instead of a variable and define the output.
As an example (in the same style as your question code),
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
inputs = tf.placeholder(tf.int32, shape = (2,2))
output = tf.matmul(inputs, tf.transpose(inputs))
test_input = np.array([[10,2], [4,4]])
print(test_input.shape)
# (2,2)
sess.run(output, feed_dict = {inputs : test_input})
# array([[104, 48], [48, 32]], dtype=int32)
If you just want to change the value of a variable, see nessuno's answer.
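The difference between feeding a value and assigning one can be mimicked with a toy model in plain Python. To be clear, ToySession below is a made-up stand-in for illustration and not TensorFlow's API; it only captures the idea that a fed value shadows the stored one for a single call, while an assignment changes the stored state:

```python
class ToySession:
    """Toy stand-in for a session holding variable state (not TensorFlow)."""

    def __init__(self, variables):
        self._variables = dict(variables)  # persistent state

    def run(self, name, feed_dict=None):
        # A fed value shadows the stored one for this call only.
        if feed_dict and name in feed_dict:
            return feed_dict[name]
        return self._variables[name]

    def assign(self, name, value):
        # An assignment changes the stored state, so later runs see it.
        self._variables[name] = value
        return value


sess = ToySession({"y": [1, 2]})
print(sess.run("y", feed_dict={"y": [100, 2]}))  # [100, 2] (fed value)
print(sess.run("y"))                             # [1, 2] (state unchanged)
sess.assign("y", [100, 2])
print(sess.run("y"))                             # [100, 2] (state updated)
```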