How to create a recurrent variable with TensorFlow - tensorflow

This sounds super easy but I cannot find any info on the internet. I am probably lacking some fundamental understanding.
I would like to do something simple: a recurrent variable. Say:
Z(t) = W * Z(t-1)
with some fixed (but trainable) W.
I tried things like:
initializer = tf.random_uniform_initializer(0., 1.)
with tf.variable_scope('recurrent', initializer=initializer):
Z = tf.get_variable('Z', shape=[...])
Z = tf.matmul(W, Z)
But of course, within a session, if I do Z.eval(), it gives a coherent value of Z, but Z itself is not updated.
Hence my question: how do you create a recurrent variable that gets updated when running the graph with TensorFlow?
Thank you very much for your help!

When you write a statement like
Z = tf.matmul(W, Z)
you are updating the python variable Z and not the TensorFlow's internal storage associated with the TensorFlow variable Z. Please have a look at the section on stateful operations in TensorFlow to get an idea of how TensorFlow manages state. To answer your specific question, you have to use the tf.assign operation to update TensorFlow's Z variable as follows :
Z = tf.assign(Z, tf.matmul(W, Z))

Related

pyomo matrix product

I would like to use pyomo to solve a multiple linear regression under constraint in pyomo.
to do so I have 3 matrices :
X (noted tour1 in the following code) My inputs (600x13)(bureaux*t1 in pyomo)
Y (noted tour2 in the following code) the matrix I want to predict (6003)(bureauxt2 inpyomo)
T (noted transfer In the code) (13x3)(t1*t2 in pyomo)
I would like to do the following
ypred = XT
minimize (ypred-y)**2
subject to
0<T<1
and Sum_i(Tij)=1
To that effect, I started the following code
from pyomo.environ import *
tour1=pd.DataFrame(np.random.random(size=(60,13)),columns=["X"+str(i) for i in range(13)],index=["B"+str(i) for i in range(60)])
tour2=pd.DataFrame(np.random.random(size=(60,3)),columns=["Y"+str(i) for i in range(3)],index=["B"+str(i) for i in range(60)])
def gettour1(model,i,j):
return tour1.loc[i,j]
def gettour2(model,i,j):
return tour2.loc[i,j]
def cost(model):
return sum((sum(model.tour1[i,k] * model.transfer[k,j] for k in model.t1) - model.tour2[i,j] )**2 for i in model.bureaux for j in model.tour2)
model = ConcreteModel()
model.bureaux = Set(initialize=tour1.index.tolist())
model.t1 = Set(initialize=tour1.columns)
model.t2 = Set(initialize=tour2.columns)
model.tour1 = Param(model.bureaux, model.t1,initialize=gettour1)
model.tour2 = Param(model.bureaux, model.t2,initialize=gettour2)
model.transfer = Var(model.t1,model.t2,bounds=[0,1])
model.obj=Objective(rule=cost, sense=minimize)
I unfortunately get an error at this stage :
KeyError: "Index '('X0', 'B0', 'Y0')' is not valid for indexed component 'transfer'"
anyone knows how I can calculate the objective ?
furthermore any help for the constrains would be appreciated :-)
A couple things...
First, the error you are getting. There is information in that error statement that should help identify the problem. The construction appears to be trying to index transfer with a 3-part index (x, b, y). That clearly is outside of t1 x t2. If you look at the sum equation you have, you are mistakenly using model.tour2 instead of model.t2.
Also, your bounds parameter needs to be a tuple.
While building the model, you should be pprint()-ing the model very frequently to look for these types of issues. That only works well if you have "small" data. 60 x 13 may be the normal problem size, but it is a total pain to troubleshoot. So, start with something tiny, maybe 3 x 4. Make a Set, pprint(). Make a Constraint, pprint()... Once the model computes/solves with "tiny" data, just pop in the real stuff.

tf.function property in pytorch

I'm a beginner in pytorch, and I have some functions that are needed to implement in network.
My question is: is there any way like tf.function, or should I use "class(nn.Module)" with variable?
For example, let X be a 10x2 matrix . In pseudo-code:
a = Variable(1.0)
b = Variable(1.0)
Y = a*X[:,0]**2 + b*X[:,1]
In PyTorch you don't need things like tf.function, you just use normal Python code (because of the dynamic graph).
Please give more detailed example (with code) of what you're trying to do if the above doesn't answer your question.

Tensorflow: iterating over a Tensor for embedding lookup?

Suppose I have a matrix of N users, and each user is associated with a vector of words (translated to integers). So for example for N = 2 I'd have:
user 0 corresponds to words['20','56']
user 1 corresponds to words ['58','10','105']
So I have a list
user_words = [['20','56'],['58','10','105']]
Suppose further I created a 100-column embedding matrix (word_emb) for these words. I'd like to look up the (mean) embeddings of each of the user vectors and create a new Tensor, whose shape I would expect to be [2,100]. I tried doing this:
word_vec = []
for word_sequence_i in tf.map_fn(lambda x: x, user_words):
all_word_vecs = tf.nn.embedding_lookup(word_emb, word_sequence_i)
word_vec.append( tf.reduce_mean(all_word_vecs, 1))
But this gives me an error:
TypeError: `Tensor` objects are not iterable when eager execution is not enabled. To iterate over this tensor use `tf.map_fn`.
I thought I already was using tf.map_fn above! So what is Tensorflow complaining about? Is there even a way to do what I am trying to do?
Thanks so much!
tf.map_fn returns a Tensor object itself, which is a symbolic reference to a value that will be computed at Session.run() time. You can see this with type(tf.map_fn(lambda x: x, user_words)). So, it's the iteration implied in for word_sequence_i in tf.map_fn(...) that is generating the error.
Perhaps what you're looking for is something like:
all_word_vecs = tf.map_fn(lambda x: tf.nn.embedding_lookup(word_emb, x), user_words)
word_vec = tf.reduce_mean(all_word_vecs, axis=1)
On a related note, if this distinction between graph construction and execution is getting bothersome, you might want to give TensorFlow's eager execution a spin. See getting started and the programmer's guide.
Hope that helps.

Does tensorflow recompute these values?

If I call x,y = sess.run([X,f(X)]), is X computed once or twice? I'm asking because in my case the value of X is not deterministic, and it's necessary that f be evaluated on the same 'instance' of X.
To make sure that f uses the current X you can set up dependencies.
with tf.control_dependencies([X]):
y = f(X)
x, y_ = sess.run([X, y])
It will only compute it once. It would not make sense if it recomputed the dependent variables. Just about all variables in a tensorflow model are dependent on one another.

How to fetch gradients with respect to certain occurrences of variables in tensorflow?

Since tensorflow supports variable reuse, some part of computing graph may occur multiple times in both forward and backward process. So my question is, is it possible to update variables with respect their certain occurrences in the compute graph?
For example, in X_A->Y_B->Y_A->Y_B, Y_B occurs twice, how to update them respectively? I mean, at first, we take the latter occurrence as constant, and update the previous one, then do opposite.
A more simple example is, say X_A, Y_B, Y_A are all scalar variable, then let Z = X_A * Y_B * Y_A * Y_B, here the gradient of Z w.r.t both occurrences of Y_B is X_A * Y_B * Y_A, but actually the gradient of Z to Y_B is 2*X_A * Y_B * Y_A. In this example computing gradients respectively may seems unnecessary, but not always are those computation commutative.
In the first example, gradients to the latter occurrence may be computed by calling tf.stop_gradient on X_A->Y_B. But I could not think of a way to fetch the previous one. Is there a way to do it in tensorflow's python API?
Edit:
#Seven provided an example on how to deal with it when reuse a single variable. However often it's a variable scope that is reused, which contains many variables and functions that manage them. As far as I know, their is no way to reuse a variable scope with applying tf.stop_gradient to all variables it contains.
With my understanding, when you use A = tf.stop_gradient(A), A will be considered as a constant. I have an example here, maybe it can help you.
import tensorflow as tf
wa = tf.get_variable('a', shape=(), dtype=tf.float32,
initializer=tf.constant_initializer(1.5))
b = tf.get_variable('b', shape=(), dtype=tf.float32,
initializer=tf.constant_initializer(7))
x = tf.placeholder(tf.float32, shape=())
l = tf.stop_gradient(wa*x) * (wa*x+b)
op_gradient = tf.gradients(l, x)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print sess.run([op_gradient], feed_dict={x:11})
I have a workaround for this question. Define a custom getter for the concerning variable scope, which wraps the default getter with tf.stop_gradient. This could set all variables returned in this scope as a Tensor contributing no gradients, though sometimes things get complicated because it returns a Tensor instead of a variable, such as when using tf.nn.batch_norm.