PyTorch tensor elements differ from numpy arrays after conversion [duplicate]

I'm trying to print a torch.FloatTensor like this:
a = torch.FloatTensor(3,3)
print(a)
This gives me values like:
0.0000e+00 0.0000e+00 3.2286e-41
1.2412e-40 1.2313e+00 1.6751e-37
2.6801e-36 3.5873e-41 9.4463e+21
But I want more precise values, say 10 decimal digits, like:
0.1234567891e+01
With other python numerical objects, I could get it with:
print('{:.10f}'.format(a))
but in the case of a tensor, I get this error:
TypeError: unsupported format string passed to torch.FloatTensor.__format__
How can I print more precise values of tensors?

You can set the precision options:
torch.set_printoptions(precision=10)
There are more formatting options on the documentation page; they are very similar to numpy's.
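For example (a minimal sketch; the digits shown are for float32 1/3):
import torch
torch.set_printoptions(precision=10)
a = torch.tensor([1.0 / 3.0])
print(a)  # tensor([0.3333333433])
You can also format a single element by extracting it as a Python float with .item():
print('{:.10f}'.format(a[0].item()))  # 0.3333333433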

Just as a side note, this functionality was taken from numpy. One reason PyTorch works so well is that it borrowed many of the good ideas from numpy.
However, in numpy the default precision is 8 and in PyTorch the default is 4.
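You can see the different defaults directly (a small sketch; run it in a fresh session where no print options have been changed):
import numpy as np
import torch
x = np.array([1 / 3], dtype=np.float32)
print(x)                # [0.33333334]    <- numpy's default precision of 8
print(torch.tensor(x))  # tensor([0.3333])  <- PyTorch's default precision of 4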

Related

Does the sklearn.ensemble.GradientBoostingRegressor support sparse input samples?

I’m using sklearn.ensemble.GradientBoostingRegressor on data that is sometimes missing values. I can’t easily impute these values because they have high variance and the estimate is very sensitive to them. They are also almost never 0.
The documentation of the fit method says about the first parameter X:
The input samples. Internally, it will be converted to dtype=np.float32 and if a sparse matrix is provided to a sparse csr_matrix.
This has led me to think that the GradientBoostingRegressor can work with sparse input data.
But internally it calls check_array with implicit force_all_finite=True (the default), so that I get the following error if I put in a csr_matrix with NaN values:
ValueError: Input contains NaN, infinity or a value too large for dtype('float32')
Does the GradientBoostingRegressor not actually support sparse data?
Update:
I’m lucky in that I don’t have any meaningful zeros. My calling code now looks like this:
predictors['foobar'] = predictors['foobar'].fillna(0)  # for columns that contain NaNs
predictor_matrix = scipy.sparse.csr_matrix(
    predictors.values.astype(np.float64)  # np.float is deprecated; np.float64 is the same type
)
predictor_matrix.eliminate_zeros()
model.fit(predictor_matrix, regressands)
This avoids the exception above. Unfortunately there is no eliminate_nans() method. (When I print a sparse matrix with NaNs, it lists them explicitly, so sparseness must be something other than containing NaNs.)
But the prediction performance hasn’t (noticeably) changed.
Perhaps you could try using LightGBM. Here is a discussion on Kaggle about how it handles missing values:
https://www.kaggle.com/c/home-credit-default-risk/discussion/57918
Good luck
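For illustration, a minimal sketch of LightGBM's native NaN handling (the toy data is made up; assumes the lightgbm Python package):
import numpy as np
import lightgbm as lgb
X = np.array([[1.0, np.nan], [2.0, 3.0], [np.nan, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])
# LightGBM treats NaN as "missing" by default (use_missing=true),
# so no zero-filling or imputation is needed.
model = lgb.LGBMRegressor(n_estimators=10)
model.fit(X, y)
print(model.predict(X))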

How to implement tf.nn.sigmoid_cross_entropy_with_logits

I am currently learning tensorflow, and I have run into an issue with
tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits). The function description says that both labels and logits must be of the same type. I am using the function below to classify MNIST images. Here are the key sections of my code:
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int32, shape=(None), name="y")

def neuron_layer(X, W, b, n_neurons, name, activation=None):
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])
        stddev = 2 / np.sqrt(n_inputs)
        z = tf.matmul(X, W) + b
        if activation == "sigmoid":
            return tf.math.sigmoid(z)
        else:
            return z

with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, W1, b1, n_hidden1, "hidden", activation="sigmoid")
    logits = neuron_layer(hidden1, W2, b2, n_outputs, "outputs", activation="sigmoid")

with tf.name_scope("loss"):
    xentropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")
I get the error: input 'y' of 'Mul' Op has type int32 that does not match type float32 of argument 'x'.
If I change the placeholder to
y = tf.placeholder(tf.float32, shape=(None), name="y")
I get the error:
Value passed to parameter 'targets' has DataType float32 not in list of allowed values: int32, int64
Yet logits can only be float32 or float64. Please help me fix this issue. Thanks.
As mentioned in the comments, tf.nn.sigmoid_cross_entropy_with_logits is the wrong function. In your case you should use tf.nn.softmax_cross_entropy_with_logits instead (actually, that one yields a deprecation warning, so tf.nn.softmax_cross_entropy_with_logits_v2 is the correct one). Also note, again as mentioned in the comments, that the point of these two functions is that they have a sigmoid (or softmax, respectively) built in, so your model shouldn't have any activation function on the last layer.
Regarding the issue: I just tried it with tensorflow version 1.14.0. There, the issue still occurs if y has type int32. However, it works smoothly if both y (the labels) and the logits have type float32.
It's kind of inconsistent that tf.nn.sigmoid_cross_entropy_with_logits does not perform this cast itself, while tf.nn.softmax_cross_entropy_with_logits has no issue with y being int32.
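For illustration, a hedged sketch of the suggested fix (TF 1.x graph mode; W1, b1, W2, b2, n_hidden1, and n_outputs are the question's own placeholders): drop the activation on the output layer and one-hot encode the integer labels so both arguments to the loss are float32:
with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, W1, b1, n_hidden1, "hidden", activation="sigmoid")
    logits = neuron_layer(hidden1, W2, b2, n_outputs, "outputs")  # no sigmoid here

with tf.name_scope("loss"):
    # tf.one_hot produces float32, matching the float32 logits
    xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(
        labels=tf.one_hot(y, n_outputs), logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")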

Tensorflow Object Detection API Kitti dataset Error: Tensor had NaN values

I am struggling with this problem for a couple of days.
Basically, when I start training with the TensorFlow Object Detection API, it does one iteration and then errors out; if I use the data from the raccoon detection tutorial, it works perfectly.
I have already tried using only one class or multiple classes, different images, only checked images, and keeping everything identical to the raccoon tutorial.
Thank you for your time.
Error:
InvalidArgumentError (see above for traceback): LossTensor is inf or nan. : Tensor had NaN values
[[Node: CheckNumerics = CheckNumerics[T=DT_FLOAT, message="LossTensor is inf or nan.",
_device="/job:localhost/replica:0/task:0/cpu:0"]]]
The NaN error means that some value in the analyzed tensor is not a number. Maybe some of your images have different sizes and the input is getting null values because of that. It's just a guess; I don't even know whether you are using images or video to train the system, but if the code works with one sample and doesn't work with another, the problem must be in the samples.
You might want to check that the object annotations are correct; the NaN error is most likely caused by an incorrect calculation involving the annotations. Check the following (a small validation sketch follows the reference link below):
No NaN values in annotations
No bounding box is outside the image boundaries
Annotations are in pixel values (i.e. not normalized)
XMin < XMax and YMin < YMax
There are no bounding boxes that are too small (e.g. 1% of the image)
There is no problem due to data augmentation.
Reference: https://github.com/tensorflow/models/issues/1881
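As a rough illustration, a small validation sketch (check_box and the (xmin, ymin, xmax, ymax) pixel-coordinate format are assumptions, not part of the Object Detection API):
import math

def check_box(box, img_w, img_h, min_frac=0.01):
    # box is a hypothetical (xmin, ymin, xmax, ymax) tuple in pixel values
    xmin, ymin, xmax, ymax = box
    assert not any(math.isnan(v) for v in box), "NaN in annotation"
    assert 0 <= xmin < xmax <= img_w and 0 <= ymin < ymax <= img_h, "box outside image or inverted"
    assert (xmax - xmin) >= min_frac * img_w and (ymax - ymin) >= min_frac * img_h, "box too small"

check_box((10, 20, 200, 180), img_w=640, img_h=480)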

How do I output the shape in CNTK?

I wrote this code:
matrix = C.softmax(model).eval(data)
But matrix.shape and matrix.size give me errors. So I'm wondering: how can I output the shape of a CNTK variable?
First, note that eval() will not give you a CNTK variable; it will give you a numpy array (or a list of numpy arrays, see the next point).
Second, depending on the nature of the model it is possible that what comes out of eval() is not a numpy array but a list. The reason for this is that if the output is a sequence then CNTK cannot guarantee that all sequences will be of the same length, and it therefore returns a list of arrays, each array being one sequence.
Finally, if you truly have a CNTK variable, you can get the dimensions with .shape
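For example (a minimal sketch, assuming CNTK 2.x):
import cntk as C
import numpy as np

x = C.input_variable(3)
out = C.softmax(x).eval({x: np.array([[1.0, 2.0, 3.0]], dtype=np.float32)})
print(type(out))  # <class 'numpy.ndarray'>
print(out.shape)  # (1, 3)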

Does tensorflow 0.10.0rc support float16?

In order to reduce the tensor size, I defined all the variables with dtype=tf.float16 in my model, and then defined the optimizer:
optimizer = tf.train.AdamOptimizer(self.learning_rate)
self.compute_gradients = optimizer.compute_gradients(self.mean_loss_reg)
train_adam_op = optimizer.apply_gradients(self.compute_gradients, global_step=self.global_step)
Everything works OK! But after I run train_adam_op, the gradients and variables are nan in Python. I wonder whether the apply_gradients() API supports tf.float16. Why do I get nan after apply_gradients() is called by session.run()?
The dynamic range of fp16 is fairly limited compared to that of 32-bit floats. As a result, it's pretty easy to overflow or underflow them, which often results in the NaN that you've encountered.
You can insert a few check_numerics operations in your model to help pinpoint the specific operation(s) that becomes unstable when performed on fp16.
For example, you can wrap an L2 loss operation as follows to check that its result fits in an fp16:
A = tf.nn.l2_loss(some_tensor)
becomes
A = tf.check_numerics(tf.nn.l2_loss(some_tensor), "found the root cause")
The most common sources of overflow and underflow are exp(), log(), and the various classification primitives, so I would start looking there.
Once you've figured out which sequence of operations is problematic, you can update your model to perform that sequence in 32-bit floats: use tf.cast() to convert the inputs of the sequence to 32-bit floats, and cast the result back to fp16.
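For example, a hedged sketch of such a 32-bit "island" (the placeholder and the exp() are illustrative, not from the question):
import tensorflow as tf
x16 = tf.placeholder(tf.float16, shape=(None, 10))
x32 = tf.cast(x16, tf.float32)        # upcast the inputs of the fragile sequence
loss32 = tf.nn.l2_loss(tf.exp(x32))   # exp() overflows fp16 easily
loss16 = tf.cast(loss32, tf.float16)  # cast the result back to fp16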