I am using a custom dataset with images of different sizes in the Lab format (Lightness, a, b), which are fed into a CNN. The input layer has 3 in-channels, so my idea was to split the 3 channels (L, a, b) and feed them into the network. Next I was wondering whether each tensor needs to be transposed. My concern is that the tensor would lose its dimensions, which vary from image to image, and I would not be able to reconstruct the image at the end. Any thoughts or ideas on how I should normalize the images?
You can normalise without needing to transpose the image or split it into its channels:
torchvision.transforms.Normalize(mean=[l_channel_mean, a_channel_mean, b_channel_mean], std=[l_channel_std, a_channel_std, b_channel_std])
The only required transform is the one that converts the images to tensors:
torchvision.transforms.ToTensor()
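For example, a minimal sketch of the transform pipeline (the per-channel statistics below are placeholders; compute them from your own Lab data):

import torchvision.transforms as T

# placeholder statistics -- replace with values computed over your dataset
l_mean, a_mean, b_mean = 0.5, 0.5, 0.5
l_std, a_std, b_std = 0.2, 0.2, 0.2

transform = T.Compose([
    T.ToTensor(),                                   # H x W x C array -> C x H x W float tensor
    T.Normalize(mean=[l_mean, a_mean, b_mean],
                std=[l_std, a_std, b_std]),
])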
I have a matrix where each row represents a point with coordinates (x, y, z). I want to extract features for each point using 3 shared MLP layers (64, 128, 1024), i.e. Conv1D with kernel size 1, and at the end aggregate the features using MaxPooling1D.
My question is: how do I define the input as the whole matrix? (I mean that I want each layer to apply to all rows of the matrix, not just to one row.)
I wrote some code, but I'm sure it's wrong:
Model = Sequential([
    Conv1D(64, 1, input_dim=(1, 3), activation='relu'),
    BatchNormalization(axis=-1),
    Conv1D(128, 1, activation='relu'),
    BatchNormalization(axis=-1),
    Conv1D(1021, 1, activation='relu'),
    BatchNormalization(axis=-1),
    MaxPooling1D(1)
])
Thanks in advance.
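For reference, a runnable sketch of the shared-MLP idea described above (the layer sizes come from the prose; the (None, 3) input shape and GlobalMaxPooling1D are assumptions about the intent, not a confirmed fix):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, BatchNormalization, GlobalMaxPooling1D

# every point (x, y, z) goes through the same 64-128-1024 MLP (Conv1D with kernel size 1),
# and a global max-pool then aggregates the features over all rows of the matrix
model = Sequential([
    Conv1D(64, 1, activation='relu', input_shape=(None, 3)),  # None = any number of points
    BatchNormalization(axis=-1),
    Conv1D(128, 1, activation='relu'),
    BatchNormalization(axis=-1),
    Conv1D(1024, 1, activation='relu'),
    BatchNormalization(axis=-1),
    GlobalMaxPooling1D(),   # one 1024-dim feature vector per point cloud
])
model.summary()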
I have 209 cat/non-cat images and I am looking to augment my dataset. To do so, I am using the following code to convert each NumPy array of RGB values to greyscale. The problem is that my neural network needs all inputs to have the same dimensions, but after the conversion the dimensions no longer match. The code:
def rgb2gray(rgb):
return np.dot(rgb[...,:3], [0.2989, 0.5870, 0.1140])
Normal Image Dimension: (64, 64, 3)
After Applying the Filter: (64, 64)
I know that the missing 3 is probably the RGB channel dimension, but I cannot find a way to add a "dummy" third dimension that would not affect the actual image. Can someone provide an alternative to the rgb2gray function that maintains the dimensions?
The whole point of applying that greyscale filter is to reduce the number of channels from 3 (i.e. R, G and B) down to 1 (i.e. grey).
If you really, really want to get a 3-channel image that looks just the same but takes 3x as much memory, just make all 3 channels equal:
grey = np.dstack((grey, grey, grey))
Alternatively, change rgb2gray so that it produces the three identical channels directly:
def rgb2gray(rgb):
    # every output channel gets the same 0.2989*R + 0.5870*G + 0.1140*B luminance
    return np.dot(rgb[..., :3], [[0.2989] * 3, [0.5870] * 3, [0.1140] * 3])
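A quick check of the shapes (the random array just stands in for one of the images):

import numpy as np

img = np.random.rand(64, 64, 3)                    # stand-in for a single RGB image
grey3 = rgb2gray(img)                              # uses the weight matrix above
print(grey3.shape)                                 # (64, 64, 3)
print(np.allclose(grey3[..., 0], grey3[..., 1]))   # True -- all three channels are equal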
I am currently trying to learn deep learning and NumPy. In an example, after reshaping a test set of 60 128x128 images of carrots using
`carrots_test.reshape(carrots_test.shape[0], -1)`
the example then appended .T to the result. I understand that this means a transpose, but why would you transpose the newly flattened images?
I understand what it means to flatten an image and why, but I can't intuitively see why we need to transpose it (swap the rows and columns).
There is no universal reason to do it; your application simply expects the shape to be (elements, images) rather than (images, elements). A reshape only adjusts the shape of the buffer, while transpose adjusts the strides of the dimensions and compensates by rearranging the shape.
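A small demonstration of the shapes involved (assuming 3-channel 128x128 images; the numbers are illustrative):

import numpy as np

carrots_test = np.zeros((60, 128, 128, 3))               # 60 test images

flat = carrots_test.reshape(carrots_test.shape[0], -1)   # (60, 49152): one row per image
flat_T = flat.T                                          # (49152, 60): one column per image
print(flat.shape, flat_T.shape)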
I have DNA methylation data of lower grade glioma samples obtained from the GDC data portal. The values in my data range from 0 to 1. I want to classify these samples into two classes, IDH WT and IDH mutant.
I want to use a CNN for the classification, so I am doing image embedding here to provide images as input. I am new to deep learning methods and need help with data preparation for the CNN. I am referring to this paper (reference: doi: http://dx.doi.org/10.1101/364323). My question is:
I have a data frame of 9203*513 (rows*columns), and I have saved each column into a separate Excel file.
Then I have reshaped each file into 767*12 (rows*columns) by inserting one more row in which a zero is added.
Then I have done image embedding of each file and submitted all these files to the CNN as input (training set: 80% of the images, test set: 20% of the images).
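For concreteness, a sketch of the reshaping step described above (the file name is hypothetical; only the zero padding and the 767*12 reshape come from the description):

import numpy as np
import pandas as pd

# hypothetical file holding one of the 513 columns (9203 methylation values)
column = pd.read_excel("column_001.xlsx").to_numpy().ravel()

padded = np.append(column, 0.0)      # 9203 values + 1 zero = 9204 = 767 * 12
image = padded.reshape(767, 12)      # 2-D array used for the image embedding
print(image.shape)                   # (767, 12)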
So is this approach a correct way to prepare input for a CNN classification problem?
I'm open to all suggestions. Thank you for your time and consideration.
I have model A (autoencoder) which takes as input a batch of images A_in (original images), and outputs a batch of images A_out (reconstructed images). Then I have model B (binary classifier) which takes as input a batch of images B_in, which is a mixture of A_in and A_out.
I want B to distinguish between A_in and A_out, to see if A is doing a good job reconstructing images. B_out is a probability that a given image is A_in.
B trains in parallel with A to classify the two kinds of images. B_loss = (B_out - label). Labels are 0 or 1 (reconstructed or original). When we optimize B_loss, we only update B's parameters.
I want to train model A so that it optimizes a combined loss function: Combined_Loss = reconstruction error (A_out - A_in) - classification error (B_out - label), so that it tries to reconstruct the images and fool B at the same time. Here I want to update only A's parameters (we don't want to help B here).
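For reference, one way these losses might be written in TF 1.x; the squared-error form and the "autoencoder"/"classifier" variable-scope names are assumptions, not the question's exact formulation:

recon_loss = tf.reduce_mean(tf.square(A_out - A_in))    # reconstruction error (A_out - A_in)
class_loss = tf.reduce_mean(tf.square(B_out - labels))  # classification error (B_out - label)
combined_loss = recon_loss - class_loss                 # reconstruct the images and fool B

# each optimizer only touches its own model's variables
a_vars = tf.trainable_variables(scope="autoencoder")
b_vars = tf.trainable_variables(scope="classifier")
autoencoder_train_op = tf.train.AdamOptimizer().minimize(combined_loss, var_list=a_vars)
classifier_train_op = tf.train.AdamOptimizer().minimize(class_loss, var_list=b_vars)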
Now, my question is about constructing that mixture of A_in and A_out, and feeding it to B so that the graphs A and B are connected.
Right now it's like this:
A_out = autoencoder(A_in: orig_images)
B_out = classifier(B_in: numpy(mix(A_in, A_out)))
How do I define it like this:
A_out = autoencoder(A_in: orig_images)
B_out = classifier(mix(A_out, A_in))
So that when I train A and B at the same time:
sess.run([autoencoder_train_op, classifier_train_op],
         feed_dict={A_in: orig_images, B_in: classifier_images, labels: classifier_labels})
I wouldn't need the B_in placeholder (the graphs would be connected)?
Here's my NumPy code that constructs classifier_images (mix(A_in, A_out)):
reconstr_images = sess.run(A_out, feed_dict={A_in: orig_images})
# first half reconstructed, second half original
half_and_half_images = np.concatenate((reconstr_images[:batch_size // 2],
                                        orig_images[batch_size // 2:]))
half_and_half_labels = np.zeros(labels.shape)
half_and_half_labels[batch_size // 2:] = 1          # 1 = original image
# shuffle images and labels together
random_indices = np.random.permutation(batch_size)
classifier_images = half_and_half_images[random_indices]
classifier_labels = half_and_half_labels[random_indices]
How do I convert it into TensorFlow node?
You can connect your models directly; in other words, don't use a placeholder for B's inputs, but feed it the mixture of A_in and A_out built inside the graph. If you just want to run B on its own, you can still feed values into the tensors that come from A: feeding placeholders is the common case, but TensorFlow supports feeding a value into any tensor. If it makes it easier to think about, you can pass A's outputs through tf.identity so that you have something that behaves like a placeholder.
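For example, a minimal sketch of that idea (TF 1.x, reusing the names from the question; external_images is a hypothetical NumPy batch):

half = batch_size // 2
mixed = tf.concat([A_out[:half], A_in[half:]], axis=0)   # in-graph mix of A_out and A_in
B_in = tf.identity(mixed)     # behaves like a placeholder with a default value
B_out = classifier(B_in)      # whatever builds model B

# to run B alone, feed a value straight into B_in even though it is not a placeholder
probs = sess.run(B_out, feed_dict={B_in: external_images})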
Another approach is what is usually done in GANs (where the generator output is fed into the discriminator): create two "towers" of operations that share variables. One tower is just B, and you can feed your inputs into B's placeholders to run B alone. The other tower is B on top of A, which you use to run/train A and B together. The Bs in the two towers have the same structure and share variables, but have separate ops. This approach is likely the cleanest and most flexible.
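A rough sketch of the two-tower layout; the build_classifier body and the image shape are placeholders, not the actual model from the question:

import tensorflow as tf  # TF 1.x, as in the question

def build_classifier(images, reuse=False):
    # placeholder architecture for B -- substitute the real classifier here
    with tf.variable_scope("B", reuse=reuse):
        h = tf.layers.dense(tf.layers.flatten(images), 128, activation=tf.nn.relu)
        return tf.layers.dense(h, 1, activation=tf.nn.sigmoid)

# tower 1: B alone, fed through its own placeholder (used to run/train just B)
B_in = tf.placeholder(tf.float32, [None, 64, 64, 3])   # image shape is an assumption
B_out = build_classifier(B_in)

# tower 2: B on top of A, sharing the same variables (used to train A against B)
mixed = tf.concat([A_out[:batch_size // 2], A_in[batch_size // 2:]], axis=0)
B_out_on_A = build_classifier(mixed, reuse=True)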