I have an image data tensor with shape B*H*W*C and a position tensor with shape B*H*W*2. The values in the position tensor are pixel coordinates, and I want to sample pixels from the image data tensor at those coordinates. I have tried reshaping the tensors to one dimension, but that is really inconvenient. Is there a more convenient approach, something like a matrix mapping (e.g. remap in OpenCV)?
I would first ask whether you are sure the position matrix isn't redundant. If its entries simply correspond to the pixel locations in the image array, then wherever you would index the position matrix, you could index the image data directly instead.
Perhaps as a starting point, running
sess = tf.Session()
np_img, np_pos = sess.run([tf_img, tf_pos], feed_dict={...})
will convert tensors to numpy arrays, which may make your operations easier.
Otherwise, a 1D tensor isn't that bad, and TensorFlow has functions that make reshaping easy.
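If you want to stay on the graph side, tf.gather_nd can do remap-style sampling directly. A minimal sketch, assuming the position tensor holds integer (row, col) coordinates; the placeholder names and shapes here are illustrative:

images = tf.placeholder(tf.float32, [None, None, None, 3])   # B x H x W x C
positions = tf.placeholder(tf.int32, [None, None, None, 2])  # B x H x W x 2
shape = tf.shape(images)
# Build a B x H x W x 1 tensor of batch indices to pair with each (row, col).
batch_idx = tf.reshape(tf.range(shape[0]), [-1, 1, 1, 1])
batch_idx = tf.tile(batch_idx, tf.stack([1, shape[1], shape[2], 1]))
# Concatenate to B x H x W x 3 indices of the form (b, row, col), then gather.
indices = tf.concat([batch_idx, positions], axis=-1)
sampled = tf.gather_nd(images, indices)                      # B x H x W x C

Note that OpenCV's remap also does sub-pixel interpolation; tf.gather_nd only handles integer coordinates, so for fractional coordinates you would need to gather the four neighbouring pixels and blend them yourself.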
This seems so basic, but for some reason, I can't find any clear documentation on it.
So let's say I know my ONNX model wants an input of shape [245, 245, 3]. The second argument of the constructor Ort::Value::CreateTensor wants a linear array of the data to fill the tensor. What is the order of the linear array?
For example, are the first three values in the linear array the BGR values for the 0-th pixel in the image, or are the first three values in the linear array the B-channel value of the first three pixels in the image? And as for ordering of pixels in the image: row-major?
The short answer is: ONNX only supports NCHW.
As a reference, please check the section "My converted TensorFlow model is slow - why?" on onnxruntime.ai. This is the only "official" material talking about the data format that I have found so far.
It's row-major. The format of inputs in ONNX is NCHW, where C is the number of channels; in this case C = 3. The ordering of C (BGR or RGB) depends on the model. For example, the YOLO model takes an image as 3 (RGB) x 416 px x 416 px.
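Concretely, in NCHW the flat buffer holds one complete channel plane after another, not interleaved per-pixel triples. A quick NumPy illustration (the array names here are just for the example):

import numpy as np

img_hwc = np.random.rand(245, 245, 3).astype(np.float32)  # H x W x C
# Move channels first, then flatten in row-major (C) order.
flat = np.transpose(img_hwc, (2, 0, 1)).ravel(order="C")
plane = 245 * 245
# flat[0:plane] is the entire first channel, flat[plane:2*plane] the second,
# and within each plane the pixels run row by row.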
I have a 28x28 pixel image as a NumPy array, and its shape is (28, 28) according to np.array.shape. I want the shape to be 784x1. In other words, given an NxN matrix, how do you convert it to N^2 x 1? Using the flatten function I get almost what I'm looking for; the shape from flatten is (784,).
Another possible way is to use np.atleast_2d. Note that it prepends the new axis, producing shape (1, 784), so transpose the result to get (784, 1):
np.atleast_2d(arr.flatten()).T
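For what it's worth, a plain reshape gets there directly, letting NumPy infer the row count:

import numpy as np

arr = np.zeros((28, 28))
col = arr.reshape(-1, 1)   # shape (784, 1); -1 infers 784 from the array size
# Equivalent: arr.flatten()[:, np.newaxis]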
I have an input image tensor with shape [?, 448, 448, 3], and my network predicts a bounding box with shape [?, 4]. I want to slice my image tensor with the bounding box tensor and resize the resulting tensor into a fixed-size image for further processing.
Is this possible with TensorFlow (or, even better, natively in Keras)? I have read the relevant questions, e.g. this and this, but they do not apply when both the indexing tensor and the original tensor have an unknown first dimension.
Any help in the right direction is much appreciated!
The best option for you is probably tf.image.crop_and_resize. From the documentation:
Extracts crops from the input image tensor and bilinearly resizes them
(possibly aspect ratio change) to a common output size specified by
crop_size. This is more general than the crop_to_bounding_box op which
extracts a fixed size slice from the input image and does not allow
resizing or aspect ratio change.
Returns a tensor with crops from the input image at positions defined
at the bounding box locations in boxes. The cropped boxes are all
resized (with bilinear interpolation) to a fixed size = [crop_height,
crop_width]. The result is a 4-D tensor [num_boxes, crop_height,
crop_width, depth].
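A minimal sketch of wiring this up when the batch size is unknown, assuming the predicted boxes are normalized [y1, x1, y2, x2] coordinates (the names and crop size are illustrative):

images = tf.placeholder(tf.float32, [None, 448, 448, 3])
boxes = tf.placeholder(tf.float32, [None, 4])   # one box per image
# box_ind maps box i back to image i; tf.range handles the unknown batch size.
box_ind = tf.range(tf.shape(images)[0])
crops = tf.image.crop_and_resize(images, boxes, box_ind, crop_size=[128, 128])
# crops has static shape [None, 128, 128, 3]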
I am a newbie to Keras (and somewhat to TF), but I have found the shape definition for the input layer very confusing.
So in the examples, when we have a 1D vector of length 20 as input, the shape gets defined as
...Input(shape=(20,)...)
And when a 2D tensor for greyscale images needs to be defined for MNIST, it is defined as:
...Input(shape=(28, 28, 1)...)
So my question is: why is the tensor not defined as (20) and (28, 28)? Why, in the first case, is a second dimension added and left empty? And in the second case, why do the channels have to be defined?
I understand that it depends on the layer, so Conv1D, Dense, or Conv2D take different shapes, but it seems the first parameter is implicit?
According to the docs, Dense needs (batch_size, ..., input_dim), but how is this related to the example:
Dense(32, input_shape=(784,))
Thanks
Tuples vs numbers
input_shape must be a tuple, so only (20,) can satisfy it; the number 20 by itself is not a tuple. There is also the parameter input_dim, to make your life easier if you have only one dimension: it can take 20. (But really, I find it just confusing; I always work with input_shape and use tuples, to keep a consistent understanding.)
Dense(32, input_shape=(784,)) is the same as Dense(32, input_dim=784).
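For instance, these two made-up single-layer models accept exactly the same input:

from keras.models import Sequential
from keras.layers import Dense

model_a = Sequential([Dense(32, input_shape=(784,))])
model_b = Sequential([Dense(32, input_dim=784)])
# Both expect batches of shape (batch_size, 784).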
Images
Images don't have only pixels, they also have channels (red, green, blue).
A black/white image has only one channel.
So, (28 pixels, 28 pixels, 1 channel).
But notice that there isn't any obligation to follow this shape for images everywhere. You can shape them the way you like. But some kinds of layers do demand a certain shape, otherwise they couldn't work.
Some layers demand specific shapes
It's the case of the 2D convolutional layers, which need (size1, size2, channels). They need this shape because they must apply the convolutional filters accordingly.
It's also the case of recurrent layers, which need (timeSteps, featuresPerStep) to perform their recurrent calculations.
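For example (the layer sizes below are arbitrary):

from keras.layers import Input, Conv2D, LSTM

image_in = Input(shape=(28, 28, 1))   # (size1, size2, channels)
seq_in = Input(shape=(10, 8))         # (timeSteps, featuresPerStep)
conv_out = Conv2D(16, (3, 3))(image_in)
rnn_out = LSTM(32)(seq_in)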
MNIST models
Again, there isn't any obligation to shape your image in a specific way. You must do it according to which first layer you choose and what you intend to achieve. It's a free thing.
Many examples simply don't care about an image being a 2D structured thing, and they just use models that take 784 pixels. That's enough. They probably start with Dense layers, which demand shapes like (size,).
Other examples may care and use a shape (28, 28), but then these models will have to reshape the input to fit the needs of the next layer.
2D convolutional layers will demand (28, 28, 1).
The main idea is: input arrays must match input_shape or input_dim.
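As a sketch, both of these made-up MNIST-style models are valid; they just expect the data in different shapes:

from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten

# Dense-first model: expects images already flattened to 784 values.
dense_model = Sequential([
    Dense(128, activation="relu", input_shape=(784,)),
    Dense(10, activation="softmax"),
])

# Conv-first model: expects images kept as 28 x 28 x 1.
conv_model = Sequential([
    Conv2D(16, (3, 3), input_shape=(28, 28, 1)),
    Flatten(),
    Dense(10, activation="softmax"),
])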
Tensor shapes
Be careful, though, when reading Keras error messages or working with custom / lambda layers.
All these shapes we defined before omit an important dimension: the batch size or the number of samples.
Internally all tensors will have this additional dimension as the first dimension. Keras will report it as None (a dimension that will adapt to any batch size you have).
So, input_shape=(784,) will be reported as (None, 784).
And input_shape=(28, 28, 1) will be reported as (None, 28, 28, 1).
And your actual input data must have a shape that matches that reported shape.
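You can check the reported shape yourself through the Keras backend:

from keras import backend as K
from keras.layers import Input

x = Input(shape=(28, 28, 1))
print(K.int_shape(x))   # (None, 28, 28, 1): the batch dimension is None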
I'm doing a matrix factorization in TensorFlow, and I want to use coo_matrix from scipy.sparse because it uses less memory and makes it easy to put all my training data into my matrix.
Is it possible to use coo_matrix to initialize a variable in tensorflow?
Or do I have to create a session and feed the data into TensorFlow using sess.run() with feed_dict?
I hope you understand my question and my problem; otherwise, comment and I will try to fix it.
The closest thing TensorFlow has to scipy.sparse.coo_matrix is tf.SparseTensor, which is the sparse equivalent of tf.Tensor. It will probably be easiest to feed a coo_matrix into your program.
A tf.SparseTensor is a slight generalization of COO matrices, where the tensor is represented as three dense tf.Tensor objects:
indices: An N x D matrix of tf.int64 values in which each row represents the coordinates of a non-zero value. N is the number of non-zeroes, and D is the rank of the equivalent dense tensor (2 in the case of a matrix).
values: A length-N vector of values, where element i is the value of the element whose coordinates are given on row i of indices.
dense_shape: A length-D vector of tf.int64, representing the shape of the equivalent dense tensor.
For example, you could use the following code, which uses tf.sparse_placeholder() to define a tf.SparseTensor that you can feed, and a tf.SparseTensorValue that represents the actual value being fed:
import numpy as np
import scipy.sparse
import tensorflow as tf

sparse_input = tf.sparse_placeholder(dtype=tf.float32, shape=[100, 100])
# ...
train_op = ...
coo_matrix = scipy.sparse.coo_matrix(...)
# Wrap `coo_matrix` in the `tf.SparseTensorValue` form that TensorFlow expects.
# SciPy stores the row and column coordinates as separate vectors, so we must
# stack and transpose them to make an indices matrix of the appropriate shape.
tf_coo_matrix = tf.SparseTensorValue(
    indices=np.array([coo_matrix.row, coo_matrix.col]).T,
    values=coo_matrix.data,
    dense_shape=coo_matrix.shape)
Once you have converted your coo_matrix to a tf.SparseTensorValue, you can feed sparse_input with the tf.SparseTensorValue directly:
sess.run(train_op, feed_dict={sparse_input: tf_coo_matrix})
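For context, here is a sketch of one way the elided train_op above might be built for a factorization; the factor shapes and learning rate are made up, not part of the original answer:

U = tf.Variable(tf.random_normal([100, 8]))   # row factors
V = tf.Variable(tf.random_normal([8, 100]))   # column factors
reconstruction = tf.matmul(U, V)              # dense 100 x 100 approximation
# Compare the reconstruction against only the observed (non-zero) entries.
observed_pred = tf.gather_nd(reconstruction, sparse_input.indices)
loss = tf.reduce_sum(tf.square(sparse_input.values - observed_pred))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)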