Trouble Understanding broadcasting behavior for tensors - numpy

I am trying to do element-wise multiplication of two tensors of dimensions (1,5,64) and (1,5). As far as I know, in spite of their dimension mismatch, broadcasting should allow this to work. So, I use this code:
x = tf.range(0,64*5)
x = tf.reshape(x, [1,5, 64])
y = tf.range(0,5)
y = tf.reshape(y, [1, 5])
prodct = x*y
This causes this error:
InvalidArgumentError: Incompatible shapes: [1,5,64] vs. [1,5] [Op:Mul]
However, if I reshape the first tensor to shape (1,64,5), it works. Code:
x = tf.range(0,64*5)
x = tf.reshape(x, [1,64, 5])
y = tf.range(0,5)
y = tf.reshape(y, [1, 5])
prodct = x*y
I do not understand why the first code does not work.

From the General Broadcasting Rules: when operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when
they are equal, or
one of them is 1
If these conditions are not met, a ValueError: operands could not be broadcast together exception is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is the size that is not 1 along each axis of the inputs.
TensorFlow follows the same rules. Check the documentation for more examples and details. In your case, the rightmost dimensions (64 and 5) are neither equal nor 1, so the shapes are incompatible and the multiplication raises an error.
1, 5, 64
1, 5
But this works because it obeys the rules: aligned from the right, the dimensions are 5 and 5 (equal), then 64 and 1.
1, 64, 5
1, 5
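The right-to-left comparison described above can be sketched in plain Python. This is only an illustration of the rule, not how NumPy or TensorFlow implement it; the function name `broadcast_shape` is made up for this example.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the rightmost dimension to the left.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1  # a missing dimension acts like 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"incompatible dimensions {dim_a} and {dim_b}")
        # Take the size that is not 1 (or the common size if both are equal).
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))


print(broadcast_shape((1, 64, 5), (1, 5)))  # (1, 64, 5)
try:
    broadcast_shape((1, 5, 64), (1, 5))
except ValueError as err:
    print(err)  # incompatible dimensions 64 and 5
```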
Code
In numpy, and in tensorflow for reference.
import numpy as np
a = np.arange(64*5).reshape(1, 64, 5)
b = np.arange(5).reshape(1,5)
(a*b).shape
(1, 64, 5)
import tensorflow as tf
x = tf.reshape(tf.range(0,64*5), [1, 64, 5])
y = tf.reshape(tf.range(0,5), [1, 5])
(x*y).shape
TensorShape([1, 64, 5])
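If the (1, 5, 64) layout has to be kept, one option is to give the smaller array a trailing axis of length 1 so that the shapes line up from the right. A minimal NumPy sketch, assuming each of the 5 values should scale one row of 64 elements:

```python
import numpy as np

a = np.arange(64 * 5).reshape(1, 5, 64)
b = np.arange(5).reshape(1, 5)

# Add a trailing axis of length 1 so that b has shape (1, 5, 1).
b_expanded = b[:, :, np.newaxis]
product = a * b_expanded
print(product.shape)  # (1, 5, 64)
```

In TensorFlow the same idea should work with tf.expand_dims(y, axis=-1) in place of the NumPy indexing.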

Related

Using tf extract_image_patches for input to a CNN?

I want to extract patches from my original images to use them as input for a CNN.
After a little research I found a way to extract patches with
tensorflow.compat.v1.extract_image_patches.
Since these need to be reshaped to "image format" I implemented a method reshape_image_patches to reshape them and store the reshaped patches in an array.
image_patches2 = []
def reshape_image_patches(image_patches, sess, ksize_rows, ksize_cols):
    a = sess.run(tf.shape(image_patches))
    nr, nc = a[1], a[2]
    for i in range(nr):
        for j in range(nc):
            patch = tf.reshape(image_patches[0,i,j,], [ksize_rows, ksize_cols, 3])
            image_patches2.append(patch)
    return image_patches2
How can I use this in combination with Keras generators to make these patches the input of my CNN?
Edit 1:
I have tried the approach in Load tensorflow images and create patches
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
dataset = tf.keras.preprocessing.image_dataset_from_directory(
    <directory>,
    label_mode=None,
    seed=1,
    subset='training',
    validation_split=0.1,
    image_size=(900, 900))
get_patches = lambda x: (tf.reshape(
    tf.image.extract_patches(
        x,
        sizes=[1, 16, 16, 1],
        strides=[1, 8, 8, 1],
        rates=[1, 1, 1, 1],
        padding='VALID'), (111*111, 16, 16, 3)))
dataset = dataset.map(get_patches)
fig = plt.figure()
plt.subplots_adjust(wspace=.1, hspace=.2)
images = next(iter(dataset))
for index, image in enumerate(images):
    ax = plt.subplot(2, 2, index + 1)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.imshow(image)
plt.show()
In line: images = next(iter(dataset)) I get the error: InvalidArgumentError: Input to reshape is a tensor with 302800896 values, but the requested shape has 9462528
[[{{node Reshape}}]]
Does somebody know how to fix this?
tf.reshape does not change the order or the total number of elements in the tensor. As the error states, you are trying to reduce the total number of elements from 302800896 to 9462528. The tf.reshape call in your lambda function is what raises it.
In the example below, I have recreated your scenario by passing 2 as the shape argument to tf.reshape. That shape cannot accommodate all the elements of the original tensor, so it throws the error -
Code -
%tensorflow_version 2.x
import tensorflow as tf
t1 = tf.Variable([1,2,2,4,5,6])
t2 = tf.reshape(t1, 2)
Output -
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-3-0ff1d701ff22> in <module>()
3 t1 = tf.Variable([1,2,2,4,5,6])
4
----> 5 t2 = tf.reshape(t1, 2)
3 frames
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)
InvalidArgumentError: Input to reshape is a tensor with 6 values, but the requested shape has 2 [Op:Reshape]
tf.reshape can change the arrangement of the elements, but the total number of elements must remain the same. So the fix is to change the shape to [2,3] -
Code -
%tensorflow_version 2.x
import tensorflow as tf
t1 = tf.Variable([1,2,2,4,5,6])
t2 = tf.reshape(t1, [2,3])
print(t2)
Output -
tf.Tensor(
[[1 2 2]
[4 5 6]], shape=(2, 3), dtype=int32)
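To solve your problem, either extract patches (tf.image.extract_patches) whose total size matches the shape you pass to tf.reshape, or change the shape passed to tf.reshape to match the extracted patches. Note that 302800896 = 32 × 9462528, and 32 is the default batch_size of image_dataset_from_directory, so the requested shape (111*111, 16, 16, 3) accounts for a single image rather than a whole batch. Using -1 as the leading dimension lets tf.reshape infer it.
I also suggest looking into other tf.image functionality such as tf.image.central_crop and tf.image.crop_and_resize.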
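The element-count rule can be tried out on a small array. The sketch below uses NumPy's reshape as a stand-in for tf.reshape, since both require the total number of elements to stay the same; the array sizes are made up for illustration.

```python
import numpy as np

# Hypothetical small batch: 2 images, a 3x3 grid of patches each,
# every patch flattened to 16 * 16 * 3 = 768 values.
batch_of_patches = np.zeros((2, 3, 3, 16 * 16 * 3), dtype=np.float32)

# A fixed leading dimension that only accounts for one image fails ...
try:
    batch_of_patches.reshape(3 * 3, 16, 16, 3)
except ValueError as err:
    print("reshape failed:", err)

# ... while -1 lets reshape infer the leading dimension (2 * 3 * 3 = 18).
patches = batch_of_patches.reshape(-1, 16, 16, 3)
print(patches.shape)  # (18, 16, 16, 3)
```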

Understanding INDArray dimension reshaping for Tensorflow Object detection models

I am trying to load a TensorFlow-trained model into Deeplearning4J and get the following error:
IllegalStateException: Invalid array shape: cannot associate an array with shape [38880] with a placeholder of shape [-1, -1, -1, 3]:shape is wrong rank or does not match on one or more dimensions
var arr: INDArray = Nd4j.create(data) //.reshape(1, -1, -1, 3);
arr = Nd4j.pile(arr, arr)
sd.associateArrayWithVariable(arr, sd.variables.get(0))
In Python, the model input was prepared like this:
# Load image using OpenCV and
# expand image dimensions to have shape: [1, None, None, 3]
# i.e. a single-column array, where each item in the column has the pixel RGB value
image = cv2.imread(PATH_TO_IMAGE)
image_expanded = np.expand_dims(image, axis=0)
Please answer any of these questions if you can:
1) What does [1, None, None, 3] mean in terms of Python arrays?
2) What does np.expand_dims(image, axis=0) do in Python?
3) Is reshape(1, -1, -1, 3) the right call in Deeplearning4J?
You're mixing two different concepts here: TF placeholders and imperative numpy-like reshape.
In your case, the model expects a 4D input tensor with shape [-1, -1, -1, 3]. For a human, this reads as [Any, Any, Any, 3]. But you're trying to feed it a tensor with shape [38880], which has rank 1.
Now to your questions.
1) See above. -1 is treated as "Any".
2) This function adds a dimension of size 1, i.e. if you have [38880], expand_dims at axis=0 will make it [1, 38880].
3) Nope, that's wrong. You should not use that as your shape. You have an image there, so you should specify the actual dimensions of your image, e.g. [1, 800, 600, 3].
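To make point 2 concrete, here is a small NumPy sketch. The 600x800 image size is a made-up example, standing in for whatever cv2.imread returns for a colour image.

```python
import numpy as np

# Hypothetical 600x800 colour image with layout (height, width, channels).
image = np.zeros((600, 800, 3), dtype=np.uint8)

# Insert a new axis of length 1 at position 0: the batch dimension.
image_expanded = np.expand_dims(image, axis=0)
print(image_expanded.shape)  # (1, 600, 800, 3)

# A flat array only gains the leading 1; it does not become 4-D.
flat = np.zeros(38880)
flat_expanded = np.expand_dims(flat, axis=0)
print(flat_expanded.shape)  # (1, 38880)
```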

How to apply dropout in tensorflow to multidimensional tensors?

I have a 3D tensor called X, of shape say [2,20,300] and I would like to apply dropout to only the third dimension. However, I want the dropped elements to be the same for the 20 instances (second dimension) but not necessarily for first dimension.
What is the behaviour of the following:
tf.nn.dropout(X[0], keep_prob=p)
Would it only act on the dimension that I want? If so, then for multiple first dimensions, I could loop over them and apply the above line.
See the documentation of tf.nn.dropout:
By default, each element is kept or dropped independently. If
noise_shape is specified, it must be broadcastable to the shape of x,
and only dimensions with noise_shape[i] == shape(x)[i] will make
independent decisions
So it is as simple as:
import tensorflow as tf
import numpy as np
data = np.arange(300).reshape((1, 1, 300))
data = np.tile(data, (2, 20, 1))
data_op = tf.convert_to_tensor(data.astype(np.float32))
data_op = tf.nn.dropout(data_op, 0.5, noise_shape=[2, 1, 300])
with tf.Session() as sess:
    data = sess.run(data_op)
for b in range(2):
    for c in range(20):
        assert np.allclose(data[0, 0, :], data[0, c, :])
        assert np.allclose(data[1, 0, :], data[1, c, :])
print((data[0, 0, :] - data[1, 0, :]).sum())
# output something != 0 with high probability
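What noise_shape does can be imitated in NumPy: draw one keep/drop decision per position of the noise shape and let broadcasting copy it along the size-1 axis. This is a sketch of the idea under that assumption, not TensorFlow's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.5

# Input of shape (2, 20, 300) with non-zero values, so dropped entries show up as 0.
x = np.tile(np.arange(1, 301, dtype=np.float32), (2, 20, 1))

# One decision per (batch, feature); the size-1 middle axis is broadcast
# over the 20 instances, so they all share the same mask.
mask = rng.random((2, 1, 300)) < keep_prob
dropped = x * mask / keep_prob  # rescale the kept values, as dropout does

print(dropped.shape)  # (2, 20, 300)
```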

How to explain the result of tf.map_fn?

Look at the code:
import tensorflow as tf
import numpy as np
elems = tf.ones([1,2,3],dtype=tf.int64)
alternates = tf.map_fn(lambda x: (x, x, x), elems, dtype=(tf.int64, tf.int64, tf.int64))
with tf.Session() as sess:
    print(sess.run(alternates))
The output is:
(array([[[1, 1, 1],
[1, 1, 1]]], dtype=int64), array([[[1, 1, 1],
[1, 1, 1]]], dtype=int64), array([[[1, 1, 1],
[1, 1, 1]]], dtype=int64))
I can't understand the output. Can anyone explain it?
update
elems is a tensor, so it should be unpacked along axis 0, giving [[1,1,1],[1,1,1]]. map_fn then passes [[1,1,1],[1,1,1]] to lambda x: (x,x,x), which means x=[[1,1,1],[1,1,1]], so I think the output of map_fn should be
[[[1,1,1],[1,1,1]],
[[1,1,1],[1,1,1]],
[[1,1,1],[1,1,1]]]
The shape of the output would be [3,2,3], or a list of tensors of shape (2,3).
But in fact the output is a list of tensors, each of shape [1,2,3].
Or in other words:
import tensorflow as tf
import numpy as np
elems = tf.constant([1,2,3],dtype=tf.int64)
alternates = tf.map_fn(lambda x: (x, 2*x, -x), elems, dtype=(tf.int64, tf.int64, tf.int64))
with tf.Session() as sess:
    print(sess.run(alternates))
Why is the output
(array([1, 2, 3], dtype=int64),
array([2, 4, 6], dtype=int64),
array([-1, -2, -3], dtype=int64))
rather than
(array([1, 2, -1], dtype=int64),
array([2, 4, -2], dtype=int64),
array([3, 6, -3], dtype=int64))
The two questions are the same.
Update2
import tensorflow as tf
import numpy as np
elems = [tf.constant([1,2,3],dtype=tf.int64)]
alternates = tf.map_fn(lambda x: x, elems, dtype=tf.int64)
with tf.Session() as sess:
    print(sess.run(alternates))
elems is a list of tensors, so according to the API, tf.constant([1,2,3],dtype=tf.int64) should be unpacked along axis 0 and map_fn should work like [x for x in [1,2,3]], but in fact it raises an error:
ValueError: The two structures don't have the same nested structure. First structure: <dtype: 'int64'>, second structure: [<tf.Tensor 'map/while/TensorArrayReadV3:0' shape=() dtype=int64>].
What's wrong?
update3
import tensorflow as tf
import numpy as np
elems = (tf.constant([1,2,3],dtype=tf.int64),tf.constant([1,2,3],dtype=tf.int64))
alternates = tf.map_fn(lambda x: x, elems, dtype=(tf.int64, tf.int64))
with tf.Session() as sess:
    print(sess.run(alternates))
The output is
(array([1, 2, 3], dtype=int64), array([1, 2, 3], dtype=int64))
It seems that elems isn't unpacked. Why?
import tensorflow as tf
import numpy as np
elems = (tf.constant([1,2,3],dtype=tf.int64),tf.constant([1,2,3],dtype=tf.int64))
alternates = tf.map_fn(lambda x: [x], elems, dtype=(tf.int64, tf.int64))
with tf.Session() as sess:
    print(sess.run(alternates))
It raises an error:
TypeError: The two structures don't have the same sequence type. First structure has type <class 'tuple'>, while second structure has type <class 'list'>.
Who can tell me how tf.map_fn works?
First,
elems = tf.ones([1,2,3],dtype=tf.int64)
elems is a 3-dimensional tensor with shape 1x2x3 full of ones, that is:
[[[1, 1, 1],
[1, 1, 1]]]
Then,
alternates = tf.map_fn(lambda x: (x, x, x), elems, dtype=(tf.int64, tf.int64, tf.int64))
alternates is a tuple of three tensors with the same shape as elems, each of which is built according to the given function. Since the function simply returns a tuple repeating its input three times, that means that the three tensors will be the same as elems. If the function were lambda x: (x, 2 * x, -x) then the first output tensor would be the same as elems, the second would be the double of elems and the third one the opposite.
In all these cases it is preferable to use regular operations instead of tf.map_fn; however, there may be cases where you have a function accepting tensors with N dimensions and you have a tensor with N + 1 that you want to have it applied to.
UPDATE:
I think you are thinking of tf.map_fn "the other way around", so to say. There is not a one-to-one correspondence between the number of elements or rows in the tensor and the number of outputs in the function; in fact, you could pass a function returning a tuple with as many elements as you want.
Taking your last example:
elems = tf.constant([1,2,3],dtype=tf.int64)
alternates = tf.map_fn(lambda x: (x, 2*x, -x), elems, dtype=(tf.int64, tf.int64, tf.int64))
tf.map_fn first splits elems along the first axis, that is, into 1, 2 and 3, and applies the function to each of them, getting:
(1, 2, -1)
(2, 4, -2)
(3, 6, -3)
Note that, as I said, each of these tuples could have as many elements as you wanted. Now, the final output is produced concatenating the results in the same position; so you get:
[1, 2, 3]
[2, 4, 6]
[-1, -2, -3]
Again, if the function produced tuples with more elements you would get more output tensors.
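The two steps described above (apply the function per slice, then regroup the results by position) can be imitated with NumPy. This is only a sketch of the behaviour; map_fn_sketch is a made-up helper, not part of TensorFlow.

```python
import numpy as np


def map_fn_sketch(fn, elems):
    """Apply fn to each slice of elems along axis 0, then stack results by position."""
    per_slice = [fn(x) for x in elems]   # e.g. [(1, 2, -1), (2, 4, -2), (3, 6, -3)]
    by_position = zip(*per_slice)        # regroup: all first outputs, all second, ...
    return tuple(np.stack(group) for group in by_position)


elems = np.array([1, 2, 3], dtype=np.int64)
outputs = map_fn_sketch(lambda x: (x, 2 * x, -x), elems)
for out in outputs:
    print(out)
# [1 2 3]
# [2 4 6]
# [-1 -2 -3]
```

With a (1, 2, 3) input and lambda x: (x, x, x), the same helper returns three arrays of shape (1, 2, 3), matching the output in the question.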
UPDATE 2:
About your new example:
import tensorflow as tf
import numpy as np
elems = (tf.constant([1,2,3],dtype=tf.int64),tf.constant([1,2,3],dtype=tf.int64))
alternates = tf.map_fn(lambda x: x, elems, dtype=(tf.int64, tf.int64))
with tf.Session() as sess:
    print(sess.run(alternates))
The documentation says:
This method also allows multi-arity elems and output of fn. If elems is a (possibly nested) list or tuple of tensors, then each of these tensors must have a matching first (unpack) dimension. The signature of fn may match the structure of elems. That is, if elems is (t1, [t2, t3, [t4, t5]]), then an appropriate signature for fn is: fn = lambda (t1, [t2, t3, [t4, t5]]):.
Here elems is a tuple of two tensors with the same size in the first dimension, as needed. tf.map_fn takes one element of each input tensor at a time (so a tuple of two elements) and applies the given function to it. The function should return the same structure that you passed in dtype (a tuple of two elements, too); if you don't give a dtype, then the expected output structure is the same as the input (again, a tuple of two elements, so in your case dtype is optional). Anyway, it goes like this:
f((1, 1)) -> (1, 1)
f((2, 2)) -> (2, 2)
f((3, 3)) -> (3, 3)
These results are combined, concatenating all the corresponding elements in the structure; in this case, all the numbers in the first position produce the first output and all the numbers in the second positions produce the second output. The result is, finally, the requested structure (the two-element tuple) filled with these concatenations:
([1, 2, 3], [1, 2, 3])
Your input elems has shape (1,2,3) and looks like this:
[[[1, 1, 1],
  [1, 1, 1]]]
It's not a matrix containing the values 1,2,3, because you create it with tf.ones(), which makes a tensor filled with ones with the shape you pass as parameter.
Replying to the Update:
map_fn is applied to the slices of elems along its first dimension.
According to tf.map_fn's documentation:
elems: A tensor or (possibly nested) sequence of tensors, each of which will be unpacked along their first dimension. The nested sequence of the resulting slices will be applied to fn.
From what I understand there, the function expects a tensor or a list of tensors, slices it along the first dimension and applies the function to each slice. Here the first dimension has size 1, so the lambda is called once with x of shape (2,3), and the results are stacked back along the first axis, which is why each output has shape (1,2,3).
The function returns a tuple with 3 copies of x, so the result is a tuple of three (1,2,3) arrays (each one is an array(...) in your output).
Restructuring the output line and adding indent to make it more clear, the output looks as follows:
(
array( # first copy of `x`
[
[
[1, 1, 1],
[1, 1, 1]
]
], dtype=int64
),
array( # second copy of `x`
[
[
[1, 1, 1],
[1, 1, 1]
]
], dtype=int64
),
array( # third copy of `x`
[
[
[1, 1, 1],
[1, 1, 1]
]
], dtype=int64
),
) # end of the tuple
Update 2:
My suspicion is that you ran into a bug. If you define elems as a list, you have the error, but if you define it as a tuple with elems = (tf.constant([1,2,3],dtype=tf.int64)), the code works as expected. Different handling of tuples and lists is very suspicious... which is why I believe it's a bug.
As @mrry pointed out, in my example with the tuple I missed a comma (and thus elems was the tensor itself and not a tuple containing the tensor).
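The missing-comma pitfall is plain Python behaviour and can be checked without TensorFlow. In the sketch below a list stands in for the tensor.

```python
value = [1, 2, 3]  # stand-in for a tensor

not_a_tuple = (value)   # parentheses only group; this is just `value`
one_tuple = (value,)    # the trailing comma makes a one-element tuple

print(type(not_a_tuple).__name__)  # list
print(type(one_tuple).__name__)    # tuple
```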

Flatten batch in tensorflow

I have an input to tensorflow of shape [None, 9, 2] (where the None is batch).
To perform further actions (e.g. matmul) on it, I need to transform it to shape [None, 18]. How can I do that?
You can do it easily with tf.reshape() without knowing the batch size.
import numpy
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 9,2])
shape = x.get_shape().as_list() # a list: [None, 9, 2]
dim = numpy.prod(shape[1:]) # dim = prod(9,2) = 18
x2 = tf.reshape(x, [-1, dim]) # -1 means "all"
The -1 in the last line tells tf.reshape to infer that dimension from the total number of elements, so it works no matter what the batch size is at runtime. You can see this in the documentation of tf.reshape().
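The effect of -1 can be checked quickly with NumPy, whose reshape treats -1 the same way. The batch size of 4 below is an arbitrary choice for illustration.

```python
import numpy as np

batch = np.zeros((4, 9, 2), dtype=np.float32)  # hypothetical batch of 4 samples

dim = int(np.prod(batch.shape[1:]))  # 9 * 2 = 18
flat = batch.reshape(-1, dim)        # -1 is inferred as the batch size
print(flat.shape)  # (4, 18)
```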
Update: shape = [None, 3, None]
Thanks @kbrose. For cases where more than one dimension is undefined, we can use tf.shape() with tf.reduce_prod() instead.
x = tf.placeholder(tf.float32, shape=[None, 3, None])
dim = tf.reduce_prod(tf.shape(x)[1:])
x2 = tf.reshape(x, [-1, dim])
tf.shape() returns a shape Tensor which is evaluated at runtime. The difference between x.get_shape() and tf.shape() can be seen in the doc.
I also tried tf.contrib.layers.flatten(). It is the simplest option for the first case, but it can't handle the second.
flat_inputs = tf.layers.flatten(inputs)
You can use dynamic reshaping: get the value of the batch dimension through tf.shape at runtime, compute the whole set of new dimensions, and pass them to tf.reshape. Here's an example of reshaping a flat list into a square matrix without knowing the list length.
tf.reset_default_graph()
sess = tf.InteractiveSession("")
a = tf.placeholder(dtype=tf.int32)
# get [9]
ashape = tf.shape(a)
# slice the list from 0th to 1st position
ashape0 = tf.slice(ashape, [0], [1])
# reshape list to scalar, ie from [9] to 9
ashape0_flat = tf.reshape(ashape0, ())
# tf.sqrt doesn't support int, so cast to float
ashape0_flat_float = tf.to_float(ashape0_flat)
newshape0 = tf.sqrt(ashape0_flat_float)
# convert the [3, 3] Python list into a [3, 3] Tensor (tf.pack was renamed to tf.stack in TensorFlow 1.0)
newshape = tf.stack([newshape0, newshape0])
# tf.reshape doesn't accept float, so convert back to int
newshape_int = tf.to_int32(newshape)
a_reshaped = tf.reshape(a, newshape_int)
sess.run(a_reshaped, feed_dict={a: np.ones((9))})
You should see
array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]], dtype=int32)
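For comparison, the same "flat list to square matrix" computation is much shorter when the length is known eagerly. A NumPy sketch, assuming the input length is a perfect square:

```python
import math

import numpy as np

flat = np.ones(9, dtype=np.int32)

side = math.isqrt(flat.size)  # integer square root of the length
assert side * side == flat.size, "length must be a perfect square"
square = flat.reshape(side, side)
print(square.shape)  # (3, 3)
```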