Initialise tensors with ones in tensorflow 1 - tensorflow

I want to initialise a variable declared using get_variable function with 1s .
I tried the following methods :
1.tf.get_variable(name = 'yd1', shape = shape_t, dtype = tf.float32,initializer = tf.ones())
Error received -> TypeError: ones() takes at least 1 argument (0 given)
tf.get_variable(name = 'yd1', shape = shape_t ,dtype = tf.float32,initializer = tf.ones(shape=shape_t))
Error received -> ValueError("If initializer is a constant, do not specify shape.")
What is the best way to initialise a variable with ones?
tf.zeros_initializer can be used to initialise with 0s, but there is no equivalent for ones in tf 1

You need to use tf.ones_initializer:
tf.get_variable(name='yd1', shape=shape_t, dtype=tf.float32,
initializer=tf.ones_initializer())
Alternatively, as the second error message says, you can use a constant value, but then do not pass a shape:
tf.get_variable(name='yd1', initializer=tf.ones(shape=shape_t, dtype=tf.float32))

tf.ones_initializer is not available in tf 1, It was introduced in tf 2.
The following code does the work
tf.get_variable(name = 'yd1', shape = shape_t ,dtype = tf.float32,initializer = tf.constant_initializer(1))

Related

How to apply a function to the value of a tensor and then assigning the output to the same tensor

I want to project the updated weights of my network (after performing optimization) to a special space in which I need the value of that tensor to be passed. The function which applies projection gets a numpy array as an input. Is there a way I can do this?
I used tf.assign() as a solution but since my function accepts arrays and not tensors it failed.
Here is a sketch of what I want to do:
W = tf.Variable(...)
...
opt = tf.train.AdamOptimizer(learning_rate).minimize(loss, var_list=['W'])
W = my_function(W)
It seems that tf.control_dependencies is what you need
one simple exmaple:
import tensorflow as tf
var = tf.get_variable('var', initializer=0.0)
# replace `tf.add` with your custom function
addop = tf.add(var, 1)
with tf.control_dependencies([addop]):
updateop = tf.assign(var, addop)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True # pylint: disable=no-member
with tf.Session(config=config) as sess:
sess.run(tf.global_variables_initializer())
updateop.eval()
print(var.eval())
updateop.eval()
print(var.eval())
updateop.eval()
print(var.eval())
output:
1.0
2.0
3.0

Store RNN states using graph collections

I frequently use tf.add_to_collection to have Tensorflow automatically serialize intermediary results into a checkpoint. I find this the most convenient way to later fetch pointers to interesting tensors when a model was restored from a checkpoint. However, I realized that RNN state tuples cannot easily be added to a graph collection. Consider the following dummy example in TF 1.3:
import tensorflow as tf
import numpy as np
in_ = tf.placeholder(tf.float32, shape=[None, 5, 1])
batch_size = tf.shape(in_)[0]
cell1 = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
cell2 = tf.nn.rnn_cell.BasicLSTMCell(num_units=256)
cell = tf.nn.rnn_cell.MultiRNNCell([cell1, cell2])
outputs, last_state = tf.nn.dynamic_rnn(cell=cell,
inputs=in_,
initial_state=cell.zero_state(batch_size, dtype=tf.float32))
tf.add_to_collection('states', last_state)
loss = tf.reduce_mean(in_ - outputs)
loss_s = tf.summary.scalar('loss', loss)
writer = tf.summary.FileWriter('.', tf.get_default_graph())
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
l, s = sess.run([loss, loss_s], feed_dict={in_: np.ones([1, 5, 1])})
writer.add_summary(s)
This will produce the following warning:
WARNING:tensorflow:Error encountered when serializing states.
Type is unsupported, or the types of the items don't match field type in CollectionDef.
'tuple' object has no attribute 'name'
It seems that the serialization cannot handle tuples, and of course the last_state variable is a tuple. May be one could loop through the tuple and add each element individually to the collection, but that seems too complicated. What's a better way of handling this? In the end, I would like to access last_state again when the model is restored, ideally without needing access to the original code that created the model.
Actually, looping through every element of the state is not too complicated, and straight-forward to implement:
def add_to_collection_rnn_state(name, rnn_state):
for layer in rnn_state:
tf.add_to_collection(name, layer.c)
tf.add_to_collection(name, layer.h)
And then to load it:
def get_collection_rnn_state(name):
layers = []
coll = tf.get_collection(name)
for i in range(0, len(coll), 2):
state = tf.nn.rnn_cell.LSTMStateTuple(coll[i], coll[i+1])
layers.append(state)
return tuple(layers)
Note that this assumes that one collection only stores on state, i.e. use a different collection for every state you want to store, e.g. like this:
add_to_collection_rnn_state('states', last_state)
add_to_collection_rnn_state('init_state', init_state)
Edit
As pointed out correctly in the comments, the proposed solution only works for LSTMCells (that are represented as tuples as well). A more general solution that can handle GRU cells or potentially custom cells and mixes thereof, could look like this:
import tensorflow as tf
import numpy as np
def add_to_collection_rnn_state(name, rnn_state):
# store the name of each cell type in a different collection
coll_of_names = name + '__names__'
for layer in rnn_state:
n = layer.__class__.__name__
tf.add_to_collection(coll_of_names, n)
try:
for l in layer:
tf.add_to_collection(name, l)
except TypeError:
# layer is not iterable so just add it directly
tf.add_to_collection(name, layer)
def get_collection_rnn_state(name):
layers = []
coll = tf.get_collection(name)
coll_of_names = tf.get_collection(name + '__names__')
idx = 0
for n in coll_of_names:
if 'LSTMStateTuple' in n:
state = tf.nn.rnn_cell.LSTMStateTuple(coll[idx], coll[idx+1])
idx += 2
else: # add more cell types here
state = coll[idx]
idx += 1
layers.append(state)
return tuple(layers)
in_ = tf.placeholder(tf.float32, shape=[None, 5, 1])
batch_size = tf.shape(in_)[0]
cell1 = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
cell2 = tf.nn.rnn_cell.GRUCell(num_units=256)
cell3 = tf.nn.rnn_cell.BasicRNNCell(num_units=256)
cell = tf.nn.rnn_cell.MultiRNNCell([cell1, cell2, cell3])
outputs, last_state = tf.nn.dynamic_rnn(cell=cell,
inputs=in_,
initial_state=cell.zero_state(batch_size, dtype=tf.float32))
add_to_collection_rnn_state('last_state', last_state)
last_state_r = get_collection_rnn_state('last_state')
Comparing last_state and last_state_r reveals that both are identical (which they should be). Note that I am using a different collection to store the names because tensorflow can only serialize a collection when all elements in the collection are of the same type. E.g. mixing strings with Tensors in the same collection does not work.

Tensorflow: Bug when using `tf.contrib.metrics.streaming_mean_iou`

I'm getting a strange error when trying to compute the intersection over union using tensorflows tf.contrib.metrics.streaming_mean_iou.
This was the code I was using before which works perfectly fine
tensorflow as tf
label = tf.image.decode_png(tf.read_file('/path/to/label.png'),channels=1)
label_lin = tf.reshape(label, [-1,])
weights = tf.cast(tf.less_equal(label_lin, 10), tf.int32)
mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(label_lin, label_lin,num_classes = 11,weights = weights)
init = tf.local_variables_initializer()
sess.run(init)
sess.run([update_op])
However when I use a mask like this
mask = tf.image.decode_png(tf.read_file('/path/to/mask_file.png'),channels=1)
mask_lin = tf.reshape(mask, [-1,])
mask_lin = tf.cast(mask_lin,tf.int32)
mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(label_lin, label_lin,num_classes = 11,weights = mask_lin)
init = tf.local_variables_initializer()
sess.run(init)
sess.run([update_op])
It keeps on failing after an irregular number of iterations showing this error:
*** Error in `/usr/bin/python': corrupted double-linked list: 0x00007f29d0022fd0 ***
I checked the shape and data type of both mask_lin and weights. They are the same, so I cannot really see what is going wrong here.
Also the fact that the error comes after calling update_op an irregular number of times is strange. Maybe TF empties the mask_lin object after calling several sess.run()'s ?
Or is this some TF bug? But then again why would it work with weights...

Tensorflow: constructing the params tensor for tf.map_fn

import tensorflow as tf
import numpy as np
def lineeqn(slope, intercept, y, x):
return np.sign(y-(slope*x) - intercept)
# data size
DS = 100000
N = 100
x1 = tf.random_uniform([DS], -1, 0, dtype=tf.float32, seed=0)
x2 = tf.random_uniform([DS], 0, 1, dtype=tf.float32, seed=0)
# line representing the target function
rand1 = np.random.randint(0, DS)
rand2 = np.random.randint(0, DS)
T_x1 = x1[rand1]
T_x2 = x1[rand2]
T_y1 = x2[rand1]
T_y2 = x2[rand2]
slope = (T_y2 - T_y1)/(T_x2 - T_x1)
intercept = T_y2 - (slope * T_x2)
# extracting training samples from the data set
training_indices = np.random.randint(0, DS, N)
training_x1 = tf.gather(x1, training_indices)
training_x2 = tf.gather(x2, training_indices)
training_x1_ex = tf.expand_dims(training_x1, 1)
training_x2_ex = tf.expand_dims(training_x2, 1)
slope_tensor = tf.fill([N], slope)
slope_ex = tf.expand_dims(slope_tensor, 1)
intercept_tensor = tf.fill([N], intercept)
intercept_ex = tf.expand_dims(intercept_tensor, 1)
params = tf.concat(1, [slope_ex, intercept_ex, training_x2_ex, training_x1_ex])
training_y = tf.map_fn(lineeqn, params)
The lineeqn function requires 4 parameters, so params should be a tensor where each element is 4-element tensor. When I try to run the above code, I get the error TypeError: lineeqn() takes exactly 4 arguments (1 given). Can someone please explain what is wrong with the way I have constructed the params tensor? What does tf.map_fn do to the params tensor?
A similar question has been asked here. The reason you are getting this error is because the function called by map_fn - lineeqn in your case - is required to take exactly one tensor argument.
Rather than a list of arguments to the function, the parameter elems is expected to be a list of items, where the mapped function is called for each item contained in the list.
So in order to take multiple arguments to your function, you would have to unpack them yourself from each item, e.g.
def lineeqn(item):
slope, intercept, y, x = tf.unstack(item, num=4)
return np.sign(y - (slope * x) - intercept)
and call it as
training_y = tf.map_fn(lineeqn, list_of_parameter_tensors)
Here, you call the line equation for each tensor in the list_of_parameter_tensors, where each tensor would describe a tuple (slope, intercept, y, x) of packed arguments.
(Note that depending on the shape of the actual argument tensors, it might also be that instead of tf.concat you could have to use tf.pack.)

TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')

Strange error from numpy via matplotlib when trying to get a histogram of a tiny toy dataset. I'm just not sure how to interpret the error, which makes it hard to see what to do next.
Didn't find much related, though this nltk question and this gdsCAD question are superficially similar.
I intend the debugging info at bottom to be more helpful than the driver code, but if I've missed something, please ask. This is reproducible as part of an existing test suite.
if n > 1:
return diff(a[slice1]-a[slice2], n-1, axis=axis)
else:
> return a[slice1]-a[slice2]
E TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')
../py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py:1567: TypeError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) bt
[...]
py2.7.11-venv/lib/python2.7/site-packages/matplotlib/axes/_axes.py(5678)hist()
-> m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(606)histogram()
-> if (np.diff(bins) < 0).any():
> py2.7.11-venv/lib/python2.7/site-packages/numpy/lib/function_base.py(1567)diff()
-> return a[slice1]-a[slice2]
(Pdb) p numpy.__version__
'1.11.0'
(Pdb) p matplotlib.__version__
'1.4.3'
(Pdb) a
a = [u'A' u'B' u'C' u'D' u'E']
n = 1
axis = -1
(Pdb) p slice1
(slice(1, None, None),)
(Pdb) p slice2
(slice(None, -1, None),)
(Pdb)
I got the same error, but in my case I am subtracting dict.key from dict.value. I have fixed this by subtracting dict.value for corresponding key from other dict.value.
cosine_sim = cosine_similarity(e_b-e_a, w-e_c)
here I got error because e_b, e_a and e_c are embedding vector for word a,b,c respectively. I didn't know that 'w' is string, when I sought out w is string then I fix this by following line:
cosine_sim = cosine_similarity(e_b-e_a, word_to_vec_map[w]-e_c)
Instead of subtracting dict.key, now I have subtracted corresponding value for key
I had a similar issue where an integer in a row of a DataFrame I was iterating over was of type numpy.int64. I got the
TypeError: ufunc 'subtract' did not contain a loop with signature matching types dtype('<U1') dtype('<U1') dtype('<U1')
error when trying to subtract a float from it.
The easiest fix for me was to convert the row using pd.to_numeric(row).
Why is it applying diff to an array of strings.
I get an error at the same point, though with a different message
In [23]: a=np.array([u'A' u'B' u'C' u'D' u'E'])
In [24]: np.diff(a)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-24-9d5a62fc3ff0> in <module>()
----> 1 np.diff(a)
C:\Users\paul\AppData\Local\Enthought\Canopy\User\lib\site-packages\numpy\lib\function_base.pyc in diff(a, n, axis)
1112 return diff(a[slice1]-a[slice2], n-1, axis=axis)
1113 else:
-> 1114 return a[slice1]-a[slice2]
1115
1116
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'numpy.ndarray'
Is this a array the bins parameter? What does the docs say bins should be?
I am fairly new to this myself, but I had a similar error and found that it is due to a type casting issue. I was trying to concatenate rather than take the difference but I think the principle is the same here. I provided a similar answer on another question so I hope that is OK.
In essence you need to use a different data type cast, in my case I needed str not float, I suspect yours is the same so my suggested solution is. I am sorry I cannot test it before suggesting but I am unclear from your example what you were doing.
return diff(str(a[slice1])-str(a[slice2]), n-1, axis=axis)
Please see my example code below for the fix to my code, the change occurs on the third to last line. The code is to produce a basic random forest model.
import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation
Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Ues the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"
npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay = npArray.shape
names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)
XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)
# Predictions results initialised
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain) # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)
with open(test_name,'a') as fpred :
lenpredictions = len(RFpreds)
lentrue = yTest.shape[0]
if lenpredictions == lentrue :
fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
for i in range(0,lenpredictions) :
fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
else :
print "ERROR - names, prediction and true value array size mismatch."
This leads to an error of;
Traceback (most recent call last):
File "min_example.py", line 40, in <module>
fpred.write(RFpreds[i]+",,"+yTest[i]+",\n")
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('S32') dtype('S32') dtype('S32')
The solution is to make each variable a str() type on the third to last line then write to file. No other changes to then code have been made from the above.
import scipy
import math
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn import preprocessing, metrics, cross_validation
Data = pd.read_csv("Free_Energy_exp.csv", sep=",")
Data = Data.fillna(Data.mean()) # replace the NA values with the mean of the descriptor
header = Data.columns.values # Ues the column headers as the descriptor labels
Data.head()
test_name = "Test.csv"
npArray = np.array(Data)
print header.shape
npheader = np.array(header[1:-1])
print("Array shape X = %d, Y = %d " % (npArray.shape))
datax, datay = npArray.shape
names = npArray[:,0]
X = npArray[:,1:-1].astype(float)
y = npArray[:,-1] .astype(float)
X = preprocessing.scale(X)
XTrain, XTest, yTrain, yTest = cross_validation.train_test_split(X,y, random_state=0)
# Predictions results initialised
RFpredictions = []
RF = RandomForestRegressor(n_estimators = 10, max_features = 5, max_depth = 5, random_state=0)
RF.fit(XTrain, yTrain) # Train the model
print("Training R2 = %5.2f" % RF.score(XTrain,yTrain))
RFpreds = RF.predict(XTest)
with open(test_name,'a') as fpred :
lenpredictions = len(RFpreds)
lentrue = yTest.shape[0]
if lenpredictions == lentrue :
fpred.write("Names/Label,, Prediction Random Forest,, True Value,\n")
for i in range(0,lenpredictions) :
fpred.write(str(RFpreds[i])+",,"+str(yTest[i])+",\n")
else :
print "ERROR - names, prediction and true value array size mismatch."
These examples are from a larger code so I hope the examples are clear enough.
I think #James is right. I got stuck by same error while working on Polyval(). And yeah solution is to use the same type of variabes. You can use typecast to cast all variables in the same type.
BELOW IS A EXAMPLE CODE
import numpy
P = numpy.array(input().split(), float)
x = float(input())
print(numpy.polyval(P,x))
here I used float as an output type. so even the user inputs the INT value (whole number). the final answer will be typecasted to float.
I ran into the same issue, but in my case it was just a Python list instead of a Numpy array used. Using two Numpy arrays solved the issue for me.