How to print a full (not truncated) tensor in tensorflow?

Whenever I try printing, I always get truncated results:
import tensorflow as tf
import numpy as np
np.set_printoptions(threshold=np.nan)
tensor = tf.constant(np.ones(999))
tensor = tf.Print(tensor, [tensor])
sess = tf.Session()
sess.run(tensor)
As you can see, I've followed the approach from "Print full value of tensor into console or write to file in tensorflow".
But the output is simply
...\core\kernels\logging_ops.cc:79] [1 1 1...]
I want to see the full tensor, thanks.

This is solved easily by checking the TensorFlow API docs for tf.Print: pass summarize=n, where n is the number of elements you want displayed.
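Applied to the TF 1.x code from the question, that looks like this (a minimal sketch; summarize=999 matches the length of the tensor):
import tensorflow as tf
import numpy as np
tensor = tf.constant(np.ones(999))
# summarize=999 makes tf.Print show all 999 elements instead of truncating
tensor = tf.Print(tensor, [tensor], summarize=999)
sess = tf.Session()
sess.run(tensor)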

You can do it as follows in TensorFlow 2.x:
import tensorflow as tf
import numpy as np
tensor = tf.constant(np.ones(999))
tf.print(tensor, summarize=-1)
From TensorFlow docs -> summarize: The first and last summarize elements within each dimension are recursively printed per Tensor. If set to -1, it will print all elements of every tensor.
https://www.tensorflow.org/api_docs/python/tf/print

To print all tensors without truncation in TensorFlow 2.x:
import numpy as np
import sys
np.set_printoptions(threshold=sys.maxsize)
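This sets NumPy's own print threshold, so it takes effect wherever NumPy does the formatting, e.g. when printing a tensor's .numpy() value (a small sketch of that usage):
import sys
import numpy as np
import tensorflow as tf
np.set_printoptions(threshold=sys.maxsize)
tensor = tf.constant(np.ones(999))
print(tensor.numpy())  # the full array, no truncation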


Is there a way to find the batch size for a tf.data.Dataset

I understand you can assign a batch size to a Dataset and return a new dataset object. Is there an API to interrogate the batch size given a dataset object?
I am trying to find the calls at:
https://www.tensorflow.org/api_docs/python/tf/data/Dataset
When you call the .batch(32) method, it returns a tensorflow.python.data.ops.dataset_ops.BatchDataset object. This kind of object has a private attribute called ._batch_size, which contains a tensor holding the batch size.
In tensorflow 2.X you just need to call the .numpy() method of this tensor to convert it to numpy.int64.
In tensorflow 1.X you need to call its .eval() method.
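A minimal sketch of the TF 2.x case (keeping in mind that _batch_size is a private attribute and may change between versions):
import tensorflow as tf
ds = tf.data.Dataset.range(100).batch(32)
print(ds._batch_size.numpy())  # 32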
I do not know if you can just get it as an attribute, but you could just iterate through the dataset once and print the shape:
# create a simple tf.data.Dataset with batchsize 3
import tensorflow as tf
f = tf.data.Dataset.range(10).batch(3) # Dataset with batch_size 3
# iterating once
for one_batch in f:
    print('batch size:', one_batch.shape[0])
    break
If you know your dataset has targets/labels as well, you have to iterate as follows:
# iterating once
for one_batch_x, one_batch_y in f:
    print('batch size:', one_batch_x.shape[0])
    break
In both cases, it will print:
batch size: 3
In Tensorflow 1.* access batch_size via dataset._dataset._batch_size:
import tensorflow as tf
import numpy as np
print(tf.__version__) # 1.14.0
dataset = tf.data.Dataset.from_tensor_slices(np.random.randint(0, 2, 100)).batch(10)
with tf.compat.v1.Session() as sess:
    batch_size = sess.run(dataset._dataset._batch_size)
print(batch_size) # 10
In Tensorflow 2 you can access via dataset._batch_size:
import tensorflow as tf
import numpy as np
print(tf.__version__) # 2.0.1
dataset = tf.data.Dataset.from_tensor_slices(np.random.randint(0, 2, 100)).batch(10)
batch_size = dataset._batch_size.numpy()
print(batch_size) # 10

Keras model.evaluate and model.predict give different results

When I build a model using a Lambda layer to calculate the sum along an axis, model.evaluate() gives a different result from what I calculate manually from model.predict(). I have checked the output data types and shapes, and it did not help. The problem only occurs when I use tensorflow as the Keras backend. It works fine when I use CNTK as the backend.
Here is a minimal code that can reproduce this behavior.
from keras.layers import Input,Lambda
import keras.backend as K
from keras.models import Model
import numpy as np
inp=Input((2,))
out=Lambda(lambda x:K.sum(x,axis=-1),output_shape=(1,))(inp)
model=Model(input=inp,output=out)
model.compile(loss="mse",optimizer="sgd")
xs=np.random.random((3,2))
ys=np.sum(xs,axis=1)
print(np.mean((model.predict(xs)-ys)**2)) # This is zero
print(model.evaluate(xs,ys)) # This is not zero
I set up a network with no parameters that should simply calculate the sum for each x, and ys is constructed so that it should be the same as the output from the model. model.predict() gives the same results as ys, but model.evaluate() gives something non-zero when I use tensorflow as backend.
Any idea about why this should happen? Thanks!

Getting TypeError while training a classifier for iris flower dataset

I am trying an experiment: taking the output layer as a linear layer for classifying the iris flower dataset and using regression, with target values 0, 1, and 2.
I am using one hidden tanh activation layer and then a linear layer. I have deliberately used this instead of one-hot encoding for the labels, as I want to compare the scores from the 'model' function of my code, since I am new to tensorflow. On running the code below...
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import tensorflow as tf
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
data=load_iris()
X=data['data']
Y=data['target']
pca=PCA(n_components=2)
X=pca.fit_transform(X)
#visualise the data
#plt.figure(figsize=(12,12))
#plt.scatter(X[:,0],X[:,1],c=Y,alpha=0.4)
#plt.show()
labels=Y.reshape(-1,1)
x_train,x_test,y_train,y_test=train_test_split(X,Y,test_size=0.3,random_state=42)
y_train=y_train.reshape(-1,1)
y_test=y_test.reshape(-1,1)
hidden_nodes=5
batch_size=100
num_features=2
lr=0.01
g=tf.Graph()
with g.as_default():
    tf_train_dataset=tf.placeholder(tf.float32,shape=[None,num_features])
    tf_train_labels=tf.placeholder(tf.float32,shape=[None,1])
    tf_test_dataset=tf.constant(x_test,dtype=tf.float32)
    layer1_weights=tf.Variable(tf.truncated_normal([num_features,hidden_nodes]),dtype=tf.float32)
    layer1_biases=tf.Variable(tf.zeros([hidden_nodes]),dtype=tf.float32)
    layer2_weights=tf.Variable(tf.truncated_normal([hidden_nodes,1]),dtype=tf.float32)
    layer2_biases=tf.Variable(tf.zeros([1]),dtype=tf.float32)
    def model(data):
        Z1=tf.matmul(data,layer1_weights)+layer1_biases
        A1=tf.nn.relu(Z1)
        Z2=tf.matmul(A1,layer2_weights)+layer2_biases
        return Z2
    model_scores=model(tf_train_dataset)
    loss=tf.reduce_mean(tf.losses.mean_squared_error(model_scores,tf_train_labels))
    optimizer=tf.train.GradientDescentOptimizer(lr).minimize(loss)
    #train_prediction=model_scores
    test_prediction=(tf_test_dataset)
    num_steps=10001
    with tf.Session() as sess:
        init=tf.global_variables_initializer()
        sess.run(init)
        for step in range(num_steps):
            offset=(step*batch_size)%(y_train.shape[0]-batch_size)
            minibatch_data=x_train[offset:(offset+batch_size),:]
            minibatch_labels=y_train[offset:(offset+batch_size)]
            feed_dict={tf_train_dataset:minibatch_data,tf_train_labels:minibatch_labels}
            ll,loss,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
            if step%1000==0:
                print('Minibatch loss at step {}:{}'.format(step,loss))
I get an error on the line
ll,loss,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
TypeError: Fetch argument 14.686994 has invalid type <class 'numpy.float32'>, must be a string or Tensor. (Can not convert a float32 into a Tensor or Operation.)
Why is this error occurring? Is it because of this line:
model_scores=model(tf_train_dataset)
How should I go about solving this issue, and can't the return value of the model function be a tensor or be cast to a tensor?
Thanks.
That is because of this line:
ll,loss,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
You overwrite the loss tensor with the loss value (a Python float) returned by sess.run, so on the next iteration loss is no longer a tensor and cannot be fetched. Just use a different variable to store the loss value.
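A minimal fix: only the Python variable name changes, so the loss tensor is no longer shadowed:
_,loss_value,scores=sess.run([optimizer,loss,model_scores],feed_dict=feed_dict)
if step%1000==0:
    print('Minibatch loss at step {}:{}'.format(step,loss_value))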

How to give multiple inputs at each time step of sequential data to a recurrent neural network using tensorflow?

Suppose I have a data set with: number of observations = 1000, each observation being a sequence of fixed length = 10 (let's say), and each point in the sequence having 2 (numerical) features. How can we input such data to an RNN in tensorflow?
Any small suggestions are also accepted. Thanks
According to your description, your dataset is 1000x10x2, which looks something like this:
import numpy as np
data=np.random.randint(0,10,[1000,10,2])
Now, as you said, your sequences are of fixed size, so you don't need padding. You just have to decide the batch_size and then the number of iterations.
Suppose the batch size is 5:
batch_size=5
iterations=len(data)//batch_size  # data as defined above; // already yields an int
Now feed your input to the TensorFlow LSTM cell. Your model would be something like this.
Here is an example without batching:
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

data=np.random.randint(0,10,[1000,10,2])
input_x=tf.placeholder(tf.float32,[1000,10,2])

with tf.variable_scope('encoder') as scope:
    cell=rnn.LSTMCell(150)
    model=tf.nn.dynamic_rnn(cell,inputs=input_x,dtype=tf.float32)
    output_,(fs,fc)=model

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output = sess.run(model, feed_dict={input_x: data})
    print(output)
If you want to use batches, you have to either reshape the data for the LSTM or use an embedding, because the LSTM takes rank-3 input.
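A sketch of one batched variant (my assumption: give the placeholder a None batch dimension so the input stays rank 3, and slice the data per iteration):
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

data=np.random.randint(0,10,[1000,10,2])
batch_size=5
iterations=len(data)//batch_size

# None lets the same placeholder accept any batch size
input_x=tf.placeholder(tf.float32,[None,10,2])

with tf.variable_scope('encoder') as scope:
    cell=rnn.LSTMCell(150)
    output_,(fs,fc)=tf.nn.dynamic_rnn(cell,inputs=input_x,dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(iterations):
        batch=data[i*batch_size:(i+1)*batch_size]  # shape (5, 10, 2)
        out=sess.run(output_,feed_dict={input_x: batch})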

Tensorflow: How to feed a placeholder variable with a tensor?

I have a placeholder variable that expects a batch of input images:
input_placeholder = tf.placeholder(tf.float32, [None] + image_shape, name='input_images')
Now I have 2 sources for the input data:
1) a tensor and
2) some numpy data.
For the numpy input data, I know how to feed data to the placeholder variable:
sess = tf.Session()
mLoss, = sess.run([loss], feed_dict = {input_placeholder: myNumpyData})
How can I feed a tensor to that placeholder variable?
mLoss, = sess.run([loss], feed_dict = {input_placeholder: myInputTensor})
gives me an error:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
I don't want to convert the tensor into a numpy array using .eval(), since that would slow my program down. Is there any other way?
This has been discussed in a GitHub issue in 2016. Here is the key point by concretevitamin:
One key thing to note is that Tensor is simply a symbolic object. The values of your feed_dict are the actual values, e.g. a NumPy ndarray.
The tensor, as a symbolic object, flows inside the graph while the actual values stay outside of it; so we can only pass actual values into the graph, and the symbolic object cannot exist outside the graph.
You can use feed_dict to feed data into non-placeholders. So, first, wire up your dataflow graph directly to your myInputTensor tensor data source (i.e. don't use a placeholder). Then, when you want to run with your numpy data, you can effectively mask myInputTensor with myNumpyData, like this:
mLoss, = sess.run([loss], feed_dict={myInputTensor: myNumpyData})
[I'm still trying to figure out how to do this with multiple tensor data sources, however.]
One way of solving the problem is to actually remove the placeholder tensor and replace it with your myInputTensor.
You then use myInputTensor as the source for the other operations in the graph, and when you want to run the graph with your np array as input data, you feed a value to this tensor directly.
Here is a quick example:
import tensorflow as tf
import numpy as np

# Input Tensor
myInputTensor = tf.ones(dtype=tf.float32, shape=1)  # In your case, this would be the result of some ops
output = myInputTensor * 5.0

with tf.Session() as sess:
    print(sess.run(output))  # == 5.0, using the Tensor value
    myNumpyData = np.zeros(1)
    print(sess.run(output, {myInputTensor: myNumpyData}))  # == 0.0 * 5.0 = 0.0, using the np value
This works for me in the latest version... maybe you have an older version of TF?
a = tf.Variable(1)
with tf.Session() as sess:
    print(sess.run(2*a, feed_dict={a: 5}))  # prints 10
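As the earlier answer notes, feed_dict is not limited to placeholders; a small sketch building on the snippet above, feeding through an intermediate tensor:
b = 3 * a  # an intermediate tensor derived from the variable
with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: 4}))  # prints 12, since the fed value overrides a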