Reshaping numpy 3D array

I have a dataset with dimensions: (32, 32, 73257) where 32x32 are pixels of a single image.
How do I reshape it to (73257, 1024) so that every image is unrolled in a row?
So far, I did:
self.train_data = self.train_data.reshape(n_training_examples, number_of_pixels*number_of_pixels)
and it looks like I got garbage instead of normal pictures. I am assuming that the reshape was performed across the wrong dimension?

As suggested in the comments, first get every image in a column, then transpose:
self.train_data = self.train_data.reshape(-1, n_training_examples).T
The memory layout of your array will not be changed by either of these operations, so two contiguous pixels of any single image will lie 73257 bytes apart (assuming a uint8 image). That may not be ideal if you want to process your data one image at a time. You will need to time and validate this, but creating a contiguous copy of the array may prove advantageous performance-wise:
self.train_data = self.train_data.reshape(-1, n_training_examples).T.copy()
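A minimal sketch of the idea on a toy stack (4x4 pixels, 5 images; the sizes are placeholders, not the question's data):

import numpy as np

# toy stand-in for the (32, 32, 73257) dataset: 4x4 pixels, 5 images
data = np.random.randint(0, 256, size=(4, 4, 5), dtype=np.uint8)

# put every image in a column, then transpose so each image becomes a row
rows = data.reshape(-1, data.shape[-1]).T        # shape (5, 16)

# sanity check: row i really is image i, unrolled in C order
assert np.array_equal(rows[2], data[:, :, 2].ravel())

# contiguous copy so each row sits contiguously in memory
rows = rows.copy()
print(rows.flags['C_CONTIGUOUS'])                # True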

Related

Trying to understand what this code is doing

I have a 2D numpy array called the_array with shape (5, 10).
I would like to understand what this piece of code is doing:
h, w = the_array.shape
mask = np.ones((h, w))
mask[:int(h*0.35), :] = 0   # ?? what ??
the_array = the_array * mask
I see that mask is an array of the same dimensions made entirely of 1s, but what happens after that? (If it is any help, these arrays are going to be used as images later.)
mask[:int(h*0.35), :] = 0 is simply an assignment that turns the first 35% of the rows into zeros. So your mask will be 35% zeros and the rest ones. Multiplying it with your image (i.e. the_array * mask) will make the top part of the image completely black, like a naive image filter.
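A quick sketch on a small array to see the effect (the 5x10 shape matches the question; the values are arbitrary):

import numpy as np

the_array = np.arange(50, dtype=float).reshape(5, 10)   # shape (5, 10), as in the question

h, w = the_array.shape
mask = np.ones((h, w))
mask[:int(h * 0.35), :] = 0          # int(5 * 0.35) == 1, so only the first row is zeroed

masked = the_array * mask
print(masked[0])                     # all zeros: the "black" top band
print(masked[1])                     # unchanged values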

Implement CVAE for a single image

I have a multi-dimensional, hyper-spectral image (channels, width, height = 15, 2500, 2500). I want to compress its 15 channels into 5 channels, so the output would be (channels, width, height = 5, 2500, 2500). One simple way to do this is to apply PCA; however, the performance is not very good. Thus, I want to use a Variational AutoEncoder (VAE).
When I looked at the available solutions in the TensorFlow and Keras libraries, they show examples that work on whole images using a Convolutional Variational AutoEncoder (CVAE):
https://www.tensorflow.org/tutorials/generative/cvae
https://keras.io/examples/generative/vae/
However, I have a single image. What is the best practice to implement a CVAE in this case? Is it to generate sample images with a moving-window approach?
One way of doing it would be to have a CVAE that takes as input (and output) the values of all the spectral features at each spatial coordinate, i.e. the per-pixel spectral stacks. So, in the case of your image, you would have 2500*2500 = 6250000 input samples, each of which is a vector of length 15, and the middle (latent) layer would be a vector of length 5. Instead of the 2D convolutions normally used along the spatial domain of images, in this case it would make sense to use 1D convolutions over the spectral domain (since the values of neighbouring wavelengths are also correlated), but using only fully-connected layers would also make sense.
As a disclaimer, I haven't seen CVAEs used in this way before, but like this you would also get many data samples, which is needed for the learning to generalise well.
Another option would indeed be what you suggested -- to generate the samples (patches) using a moving window (perhaps with a stride of half the patch size). Even though you wouldn't necessarily get enough data samples for the CVAE to generalise really well to all HSI images, I guess it doesn't matter (if it overfits), since you want to use it on that same image.
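A minimal sketch of the data preparation for both options, assuming the cube is a NumPy array; the name hsi and the small 256x256 spatial size are placeholders standing in for the real (15, 2500, 2500) cube:

import numpy as np

# small stand-in for the real (15, 2500, 2500) cube so the sketch stays light
channels, height, width = 15, 256, 256
hsi = np.random.rand(channels, height, width).astype(np.float32)

# Option 1: one training sample per pixel, each sample being the length-15 spectrum
spectra = hsi.reshape(channels, -1).T            # shape (65536, 15)

# Option 2: patches from a moving window, stride = half the patch size
patch, stride = 64, 32
patches = [hsi[:, y:y + patch, x:x + patch]
           for y in range(0, height - patch + 1, stride)
           for x in range(0, width - patch + 1, stride)]
patches = np.stack(patches)                      # shape (49, 15, 64, 64)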

RGB to gray filter doesn't preserve the shape

I have 209 cat/non-cat images and I am looking to augment my dataset. To do so, I am using the following code to apply a grey filter to each NumPy array of RGB values. The problem is that I need the dimensions to stay the same for my neural network to work, but they end up with different dimensions. The code:
def rgb2gray(rgb):
    return np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])
Normal image dimensions: (64, 64, 3)
After applying the filter: (64, 64)
I know that the missing 3 is probably the RGB value or something, but I cannot find a way to add a "dummy" third dimension that would not affect the actual image. Can someone provide an alternative to the rgb2gray function that maintains the dimensions?
The whole point of applying that greyscale filter is to reduce the number of channels from 3 (i.e. R,G and B) down to 1 (i.e. grey).
If you really, really want to get a 3-channel image that looks just the same but takes 3x as much memory, just make all 3 channels equal:
grey = np.dstack((grey, grey, grey))
Or do the whole thing in one dot product, with each column of the weight matrix holding the same grey weights:
def rgb2gray(rgb):
    return np.dot(rgb[..., :3], [[0.2989, 0.2989, 0.2989],
                                 [0.5870, 0.5870, 0.5870],
                                 [0.1140, 0.1140, 0.1140]])
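A quick check on a random image (the 64x64x3 shape matches the question's example; the data itself is arbitrary):

import numpy as np

rgb = np.random.rand(64, 64, 3)                          # stand-in for one 64x64 RGB image

grey = np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])    # shape (64, 64)
grey3 = np.dstack((grey, grey, grey))                    # shape (64, 64, 3)

# the column-replicated weight matrix gives the same 3-channel result in one step
grey3_dot = np.dot(rgb[..., :3], [[0.2989, 0.2989, 0.2989],
                                  [0.5870, 0.5870, 0.5870],
                                  [0.1140, 0.1140, 0.1140]])
print(grey3_dot.shape)                   # (64, 64, 3)
print(np.allclose(grey3, grey3_dot))     # True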

Why do we take the transpose after flattening an image

I am currently trying to learn deep learning and numpy. In a given example, after reshaping a test set of 60 128x128 images of carrots using
`carrots_test.reshape(carrots_test.shape[0], -1)`
the example then went on to add .T to the end. I understand that this means a transpose, but why would you transpose this newly flattened array?
I understand what it means to flatten an image and why, but I can't intuitively see why we need to transpose it (swap the rows and columns).
There is no universal reason to do it; your particular application simply expects the shape to be (elements, images) rather than (images, elements). A reshape only adjusts the shape of the same underlying buffer, while transpose swaps the strides of the dimensions and rearranges the shape to match.
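A short demonstration of the two steps on a toy set (3 images of 2x2 pixels, standing in for the 60 images of 128x128):

import numpy as np

carrots_test = np.arange(12).reshape(3, 2, 2)                  # 3 images, 2x2 pixels each

flat = carrots_test.reshape(carrots_test.shape[0], -1)         # shape (3, 4): one image per row
flat_T = flat.T                                                # shape (4, 3): one image per column

print(flat.shape, flat_T.shape)
print(np.array_equal(flat_T[:, 1], carrots_test[1].ravel()))   # True: column 1 is image 1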

Changing numpy array using dpi value

I have a numpy array which I save to an image using savefig(). When I read it back in my code, the image is several times bigger than my original array, because the dpi while saving is 100.
Is it possible to use dpi to make the image size larger and get it in a numpy array without saving and loading it again?
Sounds like you want to take an array of size (a, b) and scale it by an arbitrary factor s so that the resulting array has shape (a*s, b*s)?
There are several ways of doing this as far as I am aware, but perhaps the best resource is the cookbook page on rebinning: http://www.scipy.org/Cookbook/Rebinning
HTH
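If the goal is simply to upscale the array by a factor without the save/load round trip, here is a minimal sketch (the factor 3 is arbitrary; scipy.ndimage.zoom and np.kron are general-purpose tools, not tied to savefig's dpi handling):

import numpy as np
from scipy.ndimage import zoom

arr = np.arange(12, dtype=float).reshape(3, 4)

# nearest-neighbour upscaling by an integer factor with plain numpy
big_nn = np.kron(arr, np.ones((3, 3)))        # shape (9, 12)

# interpolated upscaling with scipy (order=1 -> bilinear)
big_interp = zoom(arr, 3, order=1)            # shape (9, 12)

print(big_nn.shape, big_interp.shape)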