Pytorch tensor to numpy conversion imprecision - numpy

I am using pytorch to build a network model that outputs a probabilistic distribution, after obtaining predictions i want to transform them into a numpy array for further processing, although, when doing that, the converted array is not precise, that is, the values differ from the original (see example below). This is specially bad in my case because, considering the imprecision, the rows does not sum exactly 1 anymore, which is causing me issues further into the code, any ideas on why this is happening? Any help would be much appreciated.
>> predictions
Out[34]:
tensor([[0.0155, 0.0260, 0.2671, 0.6820, 0.0093],
[0.1231, 0.1076, 0.1660, 0.5376, 0.0658],
[0.0734, 0.0501, 0.1683, 0.6602, 0.0480],
...,
[0.0260, 0.0287, 0.0465, 0.8830, 0.0159],
[0.0251, 0.0327, 0.3148, 0.5643, 0.0632],
[0.0105, 0.0116, 0.2723, 0.6702, 0.0354]], device='cuda:0',
grad_fn=<CopySlices>)
>> predictions.cpu().detach().numpy()
Out[33]:
array([[0.01549727, 0.02597333, 0.26714763, 0.68203545, 0.00934631],
[0.12310678, 0.10758312, 0.16596784, 0.53758585, 0.06575638],
[0.07338995, 0.05007047, 0.16834012, 0.6601813 , 0.04801816],
...,
[0.02603309, 0.02868567, 0.046453 , 0.8829591 , 0.01586908],
[0.02508399, 0.03266559, 0.31480333, 0.564255 , 0.06319207],
[0.01048695, 0.01161059, 0.2722554 , 0.6702131 , 0.03543395]],
dtype=float32)

Related

tf.reshape with the tensor size raises mismatched number of values

I have the following code:
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
nelems = tf.size(tensor, out_type=tf.int64, name='num_elements')
indices = tf.transpose(
tf.unravel_index(tf.range(nelems, dtype=tf.int64), shape),
name='sparse_indices')
values = tf.reshape(tensor, [nelems], name='sparse_values')
This code snippet is simply transforming a dense tensor into a sparse tensor. However I found that the reshape op sometimes raises an error in runtime:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 906 values, but the requested shape has 1024
It's hard to write a simple demo to reproduce this bad case. So please understand that I cannot provide a reproducible demo.
But notice that my code is very simple. The reshape op is simply reshaping the tensor into a 1D tensor with the dimension size as the tensor's size, which is the number of elements of the tensor (illustrated in TensorFlow's doc). And in my mind, the number of elements here simply means the number of of values in the error message. Thus the above error should never appear.
I tried to use production of the shape as the target dimension size instead of tf.size but it was no use:
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
# use production as the number of elements
nelems = tf.reduce_prod(shape, name='num_elements')
....
values = tf.reshape(tensor, [nelems], name='sparse_values')
So my question is, why is there a possibility that, for a certain tensor tensor, tf.size(tensor) or tf.shape(tensor) does not tell the actual number of elements of tensor? Can anyone remind if I have missed anything? Thanks.
I have figured out the problem on myself.
Problem:
In my project, the problem is that, tensor is produced by a third-party library. The library called tensor.set_shape([1024]) before returning tensor. While it can't ensure that there must be 1024 elements in tensor.
According to these codes, in TensorFlow's python frontend implementation, when the shape is fully determined, tf.shape and tf.size can go a fast way to get the result without really running the ShapeOp or SizeOp, and returning a constant tensor of the determined shape dimensions as the result.
As a result, in my case, the shape is obviously fully determined as [1024], so the code goes in the fast way and returned tf.constant([1024]). However the real shape of the Tensor object in the backend is [906].
Solution
According to the previously mentioned codes, we can see that tf.shape and tf.size actually calls shape_internal and size_internal defined in tensorflow.python.ops.array_ops. The latter functions takes one more argument optimize with default value True. And if optimize is false, the fast way will be ignored.
So the solution is to replace the tf.shape or tf.size with shape_internal or size_internal, and pass optimize=False.
# internal functions are not exposed by `tensorflow` root package
# so we have to import the `array_ops` package manualy
from tensorflow.python.ops import array_ops
....
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
#nelems = tf.size(tensor, out_type=tf.int64, name='num_elements')
nelems = array_ops.size_internal(tensor, optimize=False, out_type=tf.int64, name='num_elements')
....
values = tf.reshape(tensor, [nelems], name='sparse_values')

TFP Linear Regression yhat=model(x_tst) - doesn't work for other data

I cannot see the difference between what I am doing and the working Google TFP example, whose structure I am following. What am I doing wrong/should I be doing differently?
[Setup: Win 10 Home 64-bit 20H2, Python 3.7, TF2.4.1, TFP 0.12.2, running in Jupyter Lab]
I have been building a model step by step following the example of TFP Probabilistic Layers Regression. The Case 1 code runs fine, but my parallel model doesn't and I cannot see the difference that might cause this
yhat = model(x_tst)
to fail with message Input 0 of layer sequential_14 is incompatible with the layer: : expected min_ndim=2, found ndim=1. Full shape received: (2019,) (which is the correct 1D size of x_tst)
For comparison: Google's load_dataset function for the TFP example returns y, x, x_tst, which are all np.ndarray of size 150, whereas I read data from a csv file with pandas.read_csv, split it into train_ and test_datasets and then take 1 col of data as independent variable 'g' and dependent variable 'redz' from the training dataset.
I know x, y, etc. need to be np.ndarray, but one does not create ndarray directly, so I have...
x = np.array(train_dataset['g'])
y = np.array(train_dataset['redz'])
x_tst = np.array(test_dataset['g'])
where x, y, x_tst are all 1-dimensional - just like the TFP example.
The model itself runs
model = tf.keras.Sequential([
tf.keras.layers.Dense(1),
tfp.layers.DistributionLambda(lambda t: tfd.Normal(loc=t, scale=1)),
])
# Do inference.
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.01), loss=negloglik)
model.fit(x, y, epochs=1, verbose=False);
(and when plotted gives the expected output for the google data - I don't get this far):
But, per the example when I try to "profit" by doing yhat = model(x_tst) I get the dimensions error given above.
What's wrong?
(If I try mode.predict I think I hit a known bug/gap in TFP; then it fails the assert)
Update - Explicit Reshape Resolves Issue
The hint from Frightera led to further investigation: x_tst had shape (2019,)
Reshaping by x_tst = x_tst.rehape(2019,1) resolved the issue. Is TF inconsistent in its requirements or is there some good reason that the explicit final dimension 1 was required? Who knows. At least predictions can be made now.
In this question Difference between numpy.array shape (R, 1) and (R,), the OP asked for the difference between (R,) and (R,1) but the answers given did not address this specific point.
Similarly in this question Difference between these array shapes in numpy
I believe the answer lies in the numpy glossary, where it says of (n,) that
A parenthesized number followed by a comma denotes a tuple with one
element. The trailing comma distinguishes a one-element tuple from a
parenthesized n.
Which, naturally, echoes the Python statements concerning tuples here
Thus an array of shape (R,) is a tuple describing an array as being 1D of a certain extent R, where the comma is appended to distinguish the tuple (R,) from the non-tuple (R).
However, for a 1D array, there is no sense of row or column ordering; (R,1) is R rows by 1 column, but (1, R) would be 1 row of R columns, and though it shouldn't matter to a 1D iterator either it does or the iterator doesn't correctly recognise ( ,) and thinks it is 2D. (i.e. I don't know the technical details of that part, but these seem to be the only options that account for the behaviour.)
This issue is unrelated to the indeterminacy of size that occurs in tensor definition in Tensorflow. In the context of Tensorflow, Tensors (arrays) may have indeterminate shapes, so that more data may be added along a certain axis as processing occurs, e.g. in batches, in which case the initial Tensor shape includes a leading None to indicate where array expansion is expected to occur. (See e.g. tensor's shape here)

dimension of a tensor created by tf.zeros(n)

I'm confused by the dimension of a tensor created with tf.zeros(n). For instance, if I write: tf.zeros(6).eval.shape, this will return me (6, ). What dimension is this? is this a matrix of 6 rows and arbitrary # of columns? Or is this a matrix of 6 columns with arbitrary # of rows?
weights = tf.random_uniform([3, 6], minval=-1, maxval=1, seed=1)- this is 3X6 matrix
b=tf.zeros(6).eval- I'm not sure what dimension this is.
Why I am able to add the two like weights+b? If I understand correctly, in order for the two to be added, b needs to be 3X1 dimension.
why i am able to add the two like weights+b?
Operator + is the same as using tf.add() (<obj>.__add__() calls the tf.add() or tf.math.add()) and if you read the documentation it says:
NOTE: math.add supports broadcasting. AddN does not. More about broadcasting here
Now I'm quoting from numpy broadcasting rules (which are the same for tensorflow):
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when
they are equal, or
one of them is 1
So you're able to add two tensors with different shapes because they have the same trailing dimensions. If you change the dimension of your weights tensor to, let's say [3, 5], you will get InvalidArgumentError exception because trailing dimensions differ.
(6,) is python syntax for a tuple with 6 as a single element. Hence the shape here is a uni-dimensional vector of length 6.

Complex convolution in tensorflow

I'm trying to run a simple convolution but with complex numbers:
r = np.random.random([1,10,10,10])
i = np.random.random([1,10,10,10])
x = tf.complex(r,i)
conv_layer = tf.layers.conv2d(
inputs=x,
filters=10,
kernel_size=[3,3],
kernel_initializer=utils.truncated_normal_complex(),
activation=tf.nn.sigmoid)
However I get this error:
TypeError: Value passed to parameter 'input' has DataType complex128 not in list of allowed values: float16, float32
Does anyone know how to implement such a convolution in Tensorflow?
Will I need to implement a custom op, or is there some better option here?
Frustratingly, complex matrix multiplication is possible, e.g. the following runs fine:
def r():
return np.random.random([10,10])
A = tf.complex(r(),r())
B = tf.complex(r(),r())
C = tf.multiply(A,B)
sess.run(C)
So there's no real reason convolution shouldn't work, I would think (as convolution is essentially just matrix multiplication).
Thanks
Probably too late but for anyone who still is interested: applying convolutions to complex valued data is not as straightforward as your usual data types, like float32. There are studies that investigat different network structures for this purpose (for example see this link for "Deep Complex U-Net"). There are implementations of these structures in pytorch and tensorflow.
All complex-valued features are split into either Cartesian (real, imaginary) or polar (modulus, angle) representations. Nobody is really trying to use a single feature that is purely complex; I would love to be proven wrong!

TensorFlow: Contracting a dimension of two tensors via dot product

I have two tensors, a of rank 4 and b of rank 1. I'd like to produce aprime, of rank 3, by "contracting" the last axis of a away, by replacing it with its dot product against b. In numpy, this is as easy as np.tensordot(a, b, 1). However, I can't figure out a way to do this in Tensorflow.
How can I replace the last axis of a tensor with a value equal to that axis's dot product against another tensor (of course, of the same shape)?
UPDATE:
I see in Wikipedia that this is called the "Tensor Inner Product" https://en.wikipedia.org/wiki/Dot_product#Tensors aka tensor contraction. It seems like this is a common operation, I'm surprised that there's no explicit support for it in Tensorflow.
I believe that this may be possible via tf.einsum; however, I have not been able to find a generalized way to do this that works for tensors of any rank (this is probably because I do not understand einsum and have been reduced to trial and error)
Aren't you just using tensor in the sense of a multidimensional array? Or in some disciplines a tensor is 3d (vector 1d, matrix 2d, etc). I haven't used tensorflow but I don't think it has much to do with tensors in that linear algebra sensor. They talk about data flow graphs. I'm not sure where the tensor part of the name comes from.
I assume you are talking about an expression like:
In [293]: A=np.tensordot(np.ones((5,4,3,2)),np.arange(2),1)
resulting in a (5,4,3) shape array. The einsum equivalent is
In [294]: B=np.einsum('ijkl,l->ijk',np.ones((5,4,3,2)),np.arange(2))
np.einsum implements Einstine Notation, as discussed here: https://en.wikipedia.org/wiki/Einstein_notation. I got this link from https://en.wikipedia.org/wiki/Tensor_contraction
You seem to be talking about straight forward numpy operations, not something special in tensorflow.
I would first add 3 dimensions of size 1 to b so that it can be broadcast along the 4'th dimension of a.
b = tf.reshape(b, (1, 1, 1, -1))
Then you can multiply b and a and it will broadcast b along all of the other dimensions.
a_prime = a * b
Finally, reduce the sum along the 4'th dimension to get rid of that dimension and replace it with the dot product.
a_prime = tf.reduce_sum(a_prime, [3])
This seems like it would work (for the first tensor being of any rank):
tf.einsum('...i,i->...', x, y)