TensorFlow: Bug when using `tf.contrib.metrics.streaming_mean_iou`

I'm getting a strange error when trying to compute the intersection over union using TensorFlow's tf.contrib.metrics.streaming_mean_iou.
This is the code I was using before, which works perfectly fine:
import tensorflow as tf
label = tf.image.decode_png(tf.read_file('/path/to/label.png'),channels=1)
label_lin = tf.reshape(label, [-1,])
weights = tf.cast(tf.less_equal(label_lin, 10), tf.int32)
mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(label_lin, label_lin,num_classes = 11,weights = weights)
init = tf.local_variables_initializer()
sess = tf.Session()
sess.run(init)
sess.run([update_op])
However, when I use a mask like this:
mask = tf.image.decode_png(tf.read_file('/path/to/mask_file.png'),channels=1)
mask_lin = tf.reshape(mask, [-1,])
mask_lin = tf.cast(mask_lin,tf.int32)
mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(label_lin, label_lin,num_classes = 11,weights = mask_lin)
init = tf.local_variables_initializer()
sess.run(init)
sess.run([update_op])
It keeps failing after an irregular number of iterations with this error:
*** Error in `/usr/bin/python': corrupted double-linked list: 0x00007f29d0022fd0 ***
I checked the shape and data type of both mask_lin and weights. They are the same, so I cannot see what is going wrong here.
It is also strange that the error appears only after calling update_op an irregular number of times. Maybe TF empties the mask_lin object after several sess.run() calls?
Or is this a TF bug? But then why would it work with weights?
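To make the role of the weights argument concrete, here is a minimal pure-Python sketch of a weighted mean IoU built from a confusion matrix. It follows the general definition of the metric, not TensorFlow's internal implementation, and the function name `mean_iou` and the sample data are made up for illustration. It does not explain the crash; it only shows that a 0/1 mask and a 0/255 mask (a common encoding in PNG masks, which is an assumption about the mask file here) give the same ratio while accumulating very different totals, so the value range of mask_lin may be worth checking.

```python
def mean_iou(labels, predictions, num_classes, weights=None):
    """Weighted mean IoU from flat lists of integer class ids."""
    if weights is None:
        weights = [1] * len(labels)
    # confusion[t][p] accumulates the weight of pixels with true class t
    # that were predicted as class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p, w in zip(labels, predictions, weights):
        confusion[t][p] += w
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        row = sum(confusion[c])                                  # all true c
        col = sum(confusion[r][c] for r in range(num_classes))   # all predicted c
        union = row + col - tp
        if union > 0:  # classes that never occur are left out of the mean
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical sample data
labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]
mask_0_1 = [1, 1, 1, 1, 1, 0]             # last pixel is ignored
mask_0_255 = [255 * m for m in mask_0_1]  # same mask, 0/255 encoding

miou_unweighted = mean_iou(labels, predictions, 3)
miou_masked = mean_iou(labels, predictions, 3, mask_0_1)
miou_masked_255 = mean_iou(labels, predictions, 3, mask_0_255)
print(miou_unweighted, miou_masked, miou_masked_255)
```

If the mask does turn out to hold 0/255, converting it to 0/1 before passing it as weights (as the working snippet does with tf.less_equal) would at least make the two code paths comparable.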

Related

How to avoid memory leakage in an autoregressive model within tensorflow

Recently, I have been training an LSTM with an attention mechanism for regression in TensorFlow 2.9, and I ran into a problem during training with model.fit():
At the beginning the training time is acceptable, around 7 s/step. However, it keeps increasing during training, and after about 1000 steps it can reach 50 s/step. Below is part of the code for my model:
class AttentionModel(tf.keras.Model):
    def __init__(self, encoder_output_dim, dec_units, dense_dim, batch):
        super().__init__()
        self.dense_dim = dense_dim
        self.batch = batch
        encoder = Encoder(encoder_output_dim)
        decoder = Decoder(dec_units,dense_dim)
        self.encoder = encoder
        self.decoder = decoder
    def call(self, inputs):
        # Create a list to record the result of each decoding step
        tempt = list()
        encoder_output, encoder_state = self.encoder(inputs)
        new_features = np.zeros((self.batch, 1, 1))
        dec_initial_state = encoder_state
        for i in range(6):
            dec_inputs = DecoderInput(new_features=new_features, enc_output=encoder_output)
            dec_result, dec_state = self.decoder(dec_inputs, dec_initial_state)
            tempt.append(dec_result.logits)
            new_features = dec_result.logits
            dec_initial_state = dec_state
        result=tf.concat(tempt,1)
        return result
In the official documentation for tf.function, I noticed: "Don't rely on Python side effects like object mutation or list appends".
Since I use a dynamic Python list with append() to record the intermediate variables, I guess a new tf.Graph is added at each training step. Is this the reason my training is getting slower and slower?
Additionally, what should I use instead of a Python list to avoid this? I have tried a numpy.zeros matrix, but it leads to another problem:
tempt = np.zeros(shape=(1,6))
...
for i in range(6):
    dec_inputs = DecoderInput(new_features=new_features, enc_output=encoder_output)
    dec_result, dec_state = self.decoder(dec_inputs, dec_initial_state)
    tempt[i]=(dec_result.logits)
...
Cannot convert a symbolic tf.Tensor (decoder/dense_3/BiasAdd:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported.

Why doesn't Keras layer initialization work?

When I run my small Keras model I get this error:
FailedPreconditionError: Attempting to use uninitialized value bn6/beta
[[{{node bn6/beta/read}} = IdentityT=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
full traceback error
code:
"input layer"
command_input = keras.layers.Input(shape=(1,1))
image_measurements_features = keras.layers.Input(shape=(1, 640))
"command module"
command_module_layer1=keras.layers.Dense(128,activation='relu')(command_input)
command_module_layer2=keras.layers.Dense(128,activation='relu')(command_module_layer1)
"concatenation layer"
j=keras.layers.concatenate([command_module_layer2,image_measurements_features])
"desicion module"
desicion_module_layer1=keras.layers.Dense(512,activation='relu')(j)
desicion_module_layer2=keras.layers.Dense(256,activation='relu')(desicion_module_layer1)
desicion_module_layer3=keras.layers.Dense(128,activation='relu')(desicion_module_layer2)
desicion_module_layer4=keras.layers.Dense(3,activation='relu')(desicion_module_layer3)
initt = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(initt)
big_hero_4=keras.models.Model(inputs=[command_input, image_measurements_features], outputs=desicion_module_layer4)
big_hero_4.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])
"train the model"
historyy=big_hero_4.fit([x, y],z,batch_size=None, epochs=1,steps_per_epoch=1000)
Do you have any solutions for this error?
Why doesn't Keras initialize the layers automatically, without the global variables initializer? (The error occurs both before and after adding the global initializer.)
You initialize the variables first and only then build and compile the model. That's the wrong order: first define your model, compile it, and then initialize. It's the same code, just in a different order.
I got this to work. Forget about the session when using Keras; it only complicates things.
import keras
import tensorflow as tf
import numpy as np
command_input = keras.layers.Input(shape=(1,1))
image_measurements_features = keras.layers.Input(shape=(1, 640))
command_module_layer1 = keras.layers.Dense(128 ,activation='relu')(command_input)
command_module_layer2 = keras.layers.Dense(128 ,activation='relu')(command_module_layer1)
j = keras.layers.concatenate([command_module_layer2, image_measurements_features])
desicion_module_layer1 = keras.layers.Dense(512,activation='relu')(j)
desicion_module_layer2 = keras.layers.Dense(256,activation='relu')(desicion_module_layer1)
desicion_module_layer3 = keras.layers.Dense(128,activation='relu')(desicion_module_layer2)
desicion_module_layer4 = keras.layers.Dense(3,activation='relu')(desicion_module_layer3)
big_hero_4 = keras.models.Model(inputs=[command_input, image_measurements_features], outputs=desicion_module_layer4)
big_hero_4.compile(optimizer='adam',loss='mean_squared_error',metrics=['accuracy'])
# Mock data
x = np.zeros((1, 1, 1))
y = np.zeros((1, 1, 640))
z = np.zeros((1, 1, 3))
historyy=big_hero_4.fit([x, y], z, batch_size=None, epochs=1,steps_per_epoch=1000)
This code should start training with no issues. If you still have the same error it might be due to some other part of your code if there is more.

TensorFlow error - you must feed a value for placeholder tensor 'in'

I'm trying to use queues for my TensorFlow prediction but get the following error:
you must feed a value for placeholder tensor 'in' with dtype float and shape [1024,1024,3]
The program works fine if I use feed_dict; I am trying to replace feed_dict with queues.
The program takes a list of positions and passes the corresponding image crop (a NumPy array) to the input tensor.
for each in positions:
    y,x = each
    images = img[y:y+1024,x:x+1024,:]
    a = images.astype('float32')
q = tf.FIFOQueue(capacity=200,dtypes=dtypes)
enqueue_op = q.enqueue(a)
qr = tf.train.QueueRunner(q, [enqueue_op] * 1)
tf.train.add_queue_runner(qr)
data = q.dequeue()
graph=load_graph('/home/graph/frozen_graph.pb')
with tf.Session(graph=graph,config=tf.ConfigProto(log_device_placement=True)) as sess:
    p_boxes = graph.get_tensor_by_name("cat:0")
    p_confs = graph.get_tensor_by_name("sha:0")
    y = [p_confs, p_boxes]
    x = graph.get_tensor_by_name("in:0")
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord,sess=sess)
    confs, boxes = sess.run(y)
    coord.request_stop()
    coord.join(threads)
How can I make sure that the input data I put into the queue is picked up when the graph runs in the session?
In my original run I call:
confs, boxes = sess.run([p_confs, p_boxes], feed_dict=feed_dict_testing)
I'd suggest not using queues for this problem, and switching to the new tf.data API. In particular tf.data.Dataset.from_generator() makes it easier to feed in data from a Python function. You can rewrite your code to be much simpler, as follows:
def generator():
    for y, x in positions:
        images = img[y:y+1024,x:x+1024,:]
        yield images.astype('float32')
# img is indexed with three axes above, so the channel count is img.shape[2]
dataset = tf.data.Dataset.from_generator(
    generator, tf.float32, [1024, 1024, img.shape[2]])
# Add any extra transformations in here, like `dataset.batch()` or
# `dataset.repeat()`.
# ...
iterator = dataset.make_one_shot_iterator()
data = iterator.get_next()
Note that in your program, there's no connection between the data tensor and the graph you loaded in load_graph() (at least, assuming that load_graph() doesn't grab data from the global state!). You will probably need to use tf.import_graph_def() and the input_map argument to associate data with one of the tensors in your frozen graph (possibly "in:0"?) to complete the task.
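The generator in this answer does not depend on TensorFlow, so its cropping logic can be checked on its own before it is passed to from_generator(). Below is a minimal sketch under stated assumptions: nested lists stand in for the NumPy image, the tile size is shrunk from 1024 to 2, and the name `crop_tiles` is hypothetical.

```python
def crop_tiles(img, positions, size):
    """Yield size x size crops of a row-major nested-list image."""
    for y, x in positions:
        # Rows y..y+size, then columns x..x+size within each row
        yield [row[x:x + size] for row in img[y:y + size]]


# 4x4 stand-in "image" whose pixel value is its own (row, col) coordinate
img = [[(r, c) for c in range(4)] for r in range(4)]
positions = [(0, 0), (2, 2)]

tiles = list(crop_tiles(img, positions, size=2))
for tile in tiles:
    print(tile)
```

Storing the coordinates as pixel values makes it easy to confirm that each crop starts at the requested (y, x) position, which is the same check one would want for the 1024 x 1024 crops in the real program.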

RNN slow-down phenomenon in TensorFlow

I found a peculiar property of TensorFlow's LSTM cell (probably not limited to LSTM, but that is the only cell I examined) which, as far as I know, has not been reported.
I don't know whether it actually has been reported, so I am posting it on SO. Below is toy code that reproduces the problem:
import tensorflow as tf
import numpy as np
import time
def network(input_list):
    input,init_hidden_c,init_hidden_m = input_list
    cell = tf.nn.rnn_cell.BasicLSTMCell(256, state_is_tuple=True)
    init_hidden = tf.nn.rnn_cell.LSTMStateTuple(init_hidden_c, init_hidden_m)
    states, hidden_cm = tf.nn.dynamic_rnn(cell, input, dtype=tf.float32, initial_state=init_hidden)
    net = [v for v in tf.trainable_variables()]
    return states, hidden_cm, net
def action(x, h_c, h_m):
    t0 = time.time()
    outputs, output_h = sess.run([rnn_states[:,-1:,:], rnn_hidden_cm], feed_dict={
        rnn_input:x,
        rnn_init_hidden_c: h_c,
        rnn_init_hidden_m: h_m
    })
    dt = time.time() - t0
    return outputs, output_h, dt
rnn_input = tf.placeholder("float", [None, None, 512])
rnn_init_hidden_c = tf.placeholder("float", [None,256])
rnn_init_hidden_m = tf.placeholder("float", [None,256])
rnn_input_list = [rnn_input, rnn_init_hidden_c, rnn_init_hidden_m]
rnn_states, rnn_hidden_cm, rnn_net = network(rnn_input_list)
feed_input = np.random.uniform(low=-1.,high=1.,size=(1,1,512))
feed_init_hidden_c = np.zeros(shape=(1,256))
feed_init_hidden_m = np.zeros(shape=(1,256))
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(10000):
    _, output_hidden_cm, deltat = action(feed_input, feed_init_hidden_c, feed_init_hidden_m)
    if i % 10 == 0:
        print('Running time: ' + str(deltat))
    (feed_init_hidden_c, feed_init_hidden_m) = output_hidden_cm
    feed_input = np.random.uniform(low=-1.,high=1.,size=(1,1,512))
[Not important] This code produces an output from the 'network()' function, which contains an LSTM. The input's temporal dimension is 1, so the output's is also 1, and the state is fed in and read back out at every step.
[Important] Look at the 'sess.run()' part. For reasons specific to my real code, I put [:,-1:,:] on 'rnn_states' there. The result is that the time spent in each 'sess.run()' keeps increasing. From my own inspection, the slowdown stems from that [:,-1:,:]; I just wanted the output at the last time step. If you instead do 'outputs, output_h = sess.run([rnn_states, rnn_hidden_cm], feed_dict{~' without [:,-1:,:] and take 'last_output = outputs[:,-1:,:]' after the 'sess.run()', the slowdown does not occur.
I do not know why the running time grows exponentially when [:,-1:,:] is inside the run call. Is this undocumented TensorFlow behavior that causes the slowdown (maybe it adds more nodes to the graph on its own)?
Thank you. I hope this post keeps other users from making the same mistake.
I encountered the same problem, with TensorFlow slowing down for each iteration I ran it, and found this question while trying to debug it. Here's a short description of my situation and how I solved it for future reference. Hopefully it can point someone in the right direction and save them some time.
In my case the problem was mainly that I didn't make use of feed_dict to supply the network state when executing sess.run(). Instead I redeclared outputs, final_state and prediction every iteration. The answer at https://github.com/tensorflow/tensorflow/issues/1439#issuecomment-194405649 made me realize how stupid that was... I was constantly creating new graph nodes in every iteration, making it all slower and slower. The problematic code looked something like this:
# defining the network
lstm_layer = rnn.BasicLSTMCell(num_units, forget_bias=1)
outputs, final_state = rnn.static_rnn(lstm_layer, input, initial_state=rnn_state, dtype='float32')
prediction = tf.nn.softmax(tf.matmul(outputs[-1], out_weights)+out_bias)
for input_data in data_seq:
    # redeclaring, stupid stupid...
    outputs, final_state = rnn.static_rnn(lstm_layer, input, initial_state=rnn_state, dtype='float32')
    prediction = tf.nn.softmax(tf.matmul(outputs[-1], out_weights)+out_bias)
    p, rnn_state = sess.run((prediction, final_state), feed_dict={x: input_data})
The solution was of course to declare the nodes only once at the beginning and supply the new data with feed_dict. The code went from being fairly slow (> 15 ms per iteration at the start) and getting slower with every iteration to running each iteration in around 1 ms. My new code looks something like this:
out_weights = tf.Variable(tf.random_normal([num_units, n_classes]), name="out_weights")
out_bias = tf.Variable(tf.random_normal([n_classes]), name="out_bias")
# placeholder for the network state
state_placeholder = tf.placeholder(tf.float32, [2, 1, num_units])
rnn_state = tf.nn.rnn_cell.LSTMStateTuple(state_placeholder[0], state_placeholder[1])
x = tf.placeholder('float', [None, 1, n_input])
input = tf.unstack(x, 1, 1)
# defining the network
lstm_layer = rnn.BasicLSTMCell(num_units, forget_bias=1)
outputs, final_state = rnn.static_rnn(lstm_layer, input, initial_state=rnn_state, dtype='float32')
prediction = tf.nn.softmax(tf.matmul(outputs[-1], out_weights)+out_bias)
# actual network state, which we input with feed_dict
_rnn_state = tf.nn.rnn_cell.LSTMStateTuple(np.zeros((1, num_units), dtype='float32'), np.zeros((1, num_units), dtype='float32'))
it = 0
for input_data in data_seq:
    encl_input = [[input_data]]
    p, _rnn_state = sess.run((prediction, final_state), feed_dict={x: encl_input, rnn_state: _rnn_state})
    print("{} - {}".format(it, p))
    it += 1
Moving the declaration out of the for loop also got rid of the problem the OP sdr2002 had, which came from slicing the output (outputs[-1]) in sess.run() inside the for loop.
As mentioned above, in this case it is best not to slice the output inside 'sess.run()' and to slice the returned array afterwards instead:
def action(x, h_c, h_m):
    t0 = time.time()
    outputs, output_h = sess.run([rnn_states, rnn_hidden_cm], feed_dict={
        rnn_input:x,
        rnn_init_hidden_c: h_c,
        rnn_init_hidden_m: h_m
    })
    outputs = outputs[:,-1:,:]
    dt = time.time() - t0
    return outputs, output_h, dt
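The mechanism described in the answers above (slicing a symbolic tensor builds a new graph op every time, so doing it inside the loop makes the graph grow) can be illustrated with a toy model in plain Python. This is only a sketch: `ToyGraph` and `ToyTensor` are hypothetical stand-ins and say nothing about how TensorFlow implements its graph. The `finalize()` method imitates the real `tf.Graph.finalize()`, which makes a graph read-only so that accidental op creation raises an error instead of silently slowing things down.

```python
class ToyGraph:
    """Stand-in for a TF1 graph: it only records every op that is created."""

    def __init__(self):
        self.ops = []
        self.finalized = False

    def add_op(self, name):
        if self.finalized:
            raise RuntimeError("graph is finalized; cannot add op " + name)
        self.ops.append(name)
        return ToyTensor(self, name)

    def finalize(self):
        self.finalized = True


class ToyTensor:
    def __init__(self, graph, name):
        self.graph = graph
        self.name = name

    def __getitem__(self, index):
        # Slicing a symbolic tensor returns no data; it adds a new op
        return self.graph.add_op("slice_of_" + self.name)


def count_ops(steps, slice_inside_loop):
    graph = ToyGraph()
    rnn_states = graph.add_op("rnn_states")
    last = rnn_states[-1]  # slice op built once, before the loop
    for _ in range(steps):
        # Stand-in for the fetch passed to sess.run() at each step
        fetch = rnn_states[-1] if slice_inside_loop else last
    return len(graph.ops)


ops_inside = count_ops(100, slice_inside_loop=True)
ops_outside = count_ops(100, slice_inside_loop=False)
print(ops_inside, ops_outside)

# After finalize(), building an op inside the loop fails loudly
graph = ToyGraph()
states = graph.add_op("rnn_states")
graph.finalize()
try:
    states[-1]
    caught = False
except RuntimeError:
    caught = True
print(caught)
```

In real TF1 code, calling finalize() on the default graph after the model is built is a cheap way to find out whether a training or inference loop is still adding ops.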

In what order does TensorFlow evaluate nodes in a computation graph?

I am having a strange bug in TensorFlow. Consider the following code, part of a simple feed-forward neural network:
output = (tf.matmul(layer_3,w_out) + b_out)
prob = tf.nn.sigmoid(output);
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits = output, targets = y_, name=None))
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(loss, var_list = model_variables)
(Notice that prob is not used to define the loss function. This is because sigmoid_cross_entropy_with_logits applies the sigmoid internally.)
I later run the optimizer in the following line:
result,step_loss,_ = sess.run(fetches = [output,loss,optimizer],feed_dict = {x_ : np.array([[x,y,x*x,y*y,x*y]]), y_ : [[1,0]]});
The above works just fine. However, if I instead run the following line to run the code, the network seems to perform terribly, even though there shouldn't be any difference!
result,step_loss,_ = sess.run(fetches = [prob,loss,optimizer],feed_dict = {x_ : np.array([[x,y,x*x,y*y,x*y]]), y_ : [[1,0]]});
I have a feeling it has something to do with the order in which TF computes the nodes in the graph during a session, but I'm not sure. What could the issue be?
It's not an issue with the graph, it's just that you are looking at different things.
In the first example you provide:
result,step_loss,_ = sess.run(fetches = [output,loss,optimizer],feed_dict = {x_ : np.array([[x,y,x*x,y*y,x*y]]), y_ : [[1,0]]})
you are saving the result of running the output op in the result python variable.
In the second one:
result,step_loss,_ = sess.run(fetches = [prob,loss,optimizer],feed_dict = {x_ : np.array([[x,y,x*x,y*y,x*y]]), y_ : [[1,0]]})
you are saving the result of the prob op in the result python variable.
Since both ops are different it is to be expected that the values returned by them would be different.
You could run
logits, activation, step_loss, _ = sess.run(fetches = [output, prob, loss, optimizer], ...)
to check your results.
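The difference between the two fetches can be reproduced without TensorFlow. The sketch below uses made-up logits and targets; the loss follows the numerically stable formula given in the documentation of tf.nn.sigmoid_cross_entropy_with_logits, max(x, 0) - x * z + log(1 + exp(-abs(x))). It shows that `output` (logits) and `prob` (sigmoid of the logits) are different numbers for the same network state, while the loss is the same either way. If the evaluation code thresholds `result` at 0, which is an assumption about code not shown in the question, the matching threshold for `prob` is 0.5.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy_with_logits(logit, target):
    # Stable form: max(x, 0) - x * z + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


# Hypothetical values for one training example with two outputs
logits = [2.0, -1.0]                   # what fetching `output` returns
targets = [1.0, 0.0]
probs = [sigmoid(x) for x in logits]   # what fetching `prob` returns

# Loss computed from the logits, as in the question
loss = sum(sigmoid_cross_entropy_with_logits(x, z)
           for x, z in zip(logits, targets)) / len(logits)

# Same loss computed the naive way from the probabilities
naive = sum(-(z * math.log(p) + (1 - z) * math.log(1 - p))
            for p, z in zip(probs, targets)) / len(probs)

# The two decision rules agree: logit > 0 is equivalent to prob > 0.5
decisions_from_logits = [x > 0 for x in logits]
decisions_from_probs = [p > 0.5 for p in probs]

print(logits, probs, loss, naive)
```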