About the Input in a NN - tensorflow

So I am new to NNs and I'm trying to go deeper and apply them to my subject. I would like to ask: can the input of the NN be 2 or more values, for example the measurement of a value, a distance, and a time? Thanks in advance!

Yes, you can have more than one value as your input. In my experience you typically pass these values in as an array of values. Here is some example code from TensorFlow: https://www.tensorflow.org/datasets/keras_example
In this example you see 784 inputs; each input is one of the pixels in the 28x28 greyscale image.
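As a minimal sketch (the feature names, layer sizes, and dummy data below are assumptions for illustration, not from the question), a Keras model taking three values per sample, e.g. measurement, distance and time, could look like this:

import numpy as np
import tensorflow as tf

# Each sample is an array of 3 values: [measurement, distance, time]
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(3,)),
    tf.keras.layers.Dense(1),  # e.g. a single regression output
])
model.compile(optimizer='adam', loss='mse')

# Dummy batch of 4 samples, 3 input values each
x = np.array([[0.5, 12.0, 3.1],
              [0.7, 10.2, 2.9],
              [0.2, 15.5, 4.0],
              [0.9,  8.3, 2.5]])
y = np.array([1.0, 0.8, 1.4, 0.6])
model.fit(x, y, epochs=1, verbose=0)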

Related

How to apply a Conv1D layer to the whole matrix

I have a matrix where each row represents a point with coordinates (x, y, z). I want to extract features for each point using 3 shared MLP layers (64, 128, 1024), implemented as Conv1D with kernel size 1, and at the end I want to aggregate the features using MaxPooling1D.
My question is: how do I define that my input is the whole matrix? What I mean is that I want each layer to apply to all the rows of the matrix, not just to one row.
I made code, but I'm sure it's wrong:
Model = Sequential([
    Conv1D(64, 1, input_dim=(1, 3), activation='relu'),
    BatchNormalization(axis=-1),
    Conv1D(128, 1, activation='relu'),
    BatchNormalization(axis=-1),
    Conv1D(1024, 1, activation='relu'),
    BatchNormalization(axis=-1),
    MaxPooling1D(1),
])
Thanks in advance!
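A minimal sketch of one way to set this up (my assumptions, not an accepted answer: input_shape=(None, 3) in place of input_dim=(1, 3) so the model sees the whole matrix of points, and GlobalMaxPooling1D swapped in for MaxPooling1D(1), whose pool size of 1 would not aggregate anything):

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, BatchNormalization, GlobalMaxPooling1D

# Input: (batch, num_points, 3); None lets the number of points vary
model = Sequential([
    Conv1D(64, 1, activation='relu', input_shape=(None, 3)),
    BatchNormalization(axis=-1),
    Conv1D(128, 1, activation='relu'),
    BatchNormalization(axis=-1),
    Conv1D(1024, 1, activation='relu'),
    BatchNormalization(axis=-1),
    GlobalMaxPooling1D(),  # max over all points -> one 1024-dim feature vector
])

With kernel size 1, each Conv1D acts as a dense layer shared across every row of the matrix, which matches the "shared MLP" idea in the question.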

RGB to gray filter doesn't preserve the shape

I have 209 cat/non-cat images and I am looking to augment my dataset. This is the code I am using to apply a grey filter to each NumPy array of RGB values. The problem is that I need their dimensions to stay the same for my neural network to work, but they end up with different dimensions. The code:
import numpy as np

def rgb2gray(rgb):
    # Weighted sum of the R, G and B channels -> shape (64, 64)
    return np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])
Normal image dimensions: (64, 64, 3)
After applying the filter: (64, 64)
I know that the missing 3 is probably the RGB value or something, but I cannot find a way to add a "dummy" third dimension that would not affect the actual image. Can someone provide an alternative to the rgb2gray function that maintains the dimensions?
The whole point of applying that greyscale filter is to reduce the number of channels from 3 (i.e. R, G and B) down to 1 (i.e. grey).
If you really, really want to get a 3-channel image that looks just the same but takes 3x as much memory, just make all 3 channels equal:
grey = np.dstack((grey, grey, grey))
or build the (3, 3) weight matrix directly into the conversion. Note that each column of the matrix must hold the full weight vector (one column per output channel), so the weights go down the columns rather than across the rows; otherwise each output channel would just be a scaled sum of R+G+B:
def rgb2gray(rgb):
    # Each column repeats the grey weights, so all 3 output channels come out equal
    return np.dot(rgb[..., :3], [[0.2989, 0.2989, 0.2989],
                                 [0.5870, 0.5870, 0.5870],
                                 [0.1140, 0.1140, 0.1140]])
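If the network just needs a channel axis rather than three identical copies, a lighter option (my suggestion, not part of the answer above) is to add a dummy axis so the grey image becomes a 1-channel image:

import numpy as np

rgb = np.random.rand(64, 64, 3)                        # stand-in for one image
grey = np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])  # shape (64, 64)
grey_1ch = grey[..., np.newaxis]                       # shape (64, 64, 1)
grey_3ch = np.dstack((grey, grey, grey))               # shape (64, 64, 3)
print(grey_1ch.shape, grey_3ch.shape)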

Should I transpose a Tensor when feeding it into a CNN

I am using a custom dataset with images of different sizes in the Lab format (Lightness, a, b), which are fed into a CNN. The input layer has 3 in-channels, so my idea was to split all 3 channels (L, a, b) and feed those into the network. Next I was wondering whether each tensor needs to be transposed? My worry is that it would lose its dimensions, which vary from image to image, and I would not be able to reconstruct the image in the end. Any thoughts or ideas on how I should normalize the images?
You can normalise without needing to transpose the image or split it by channel:
torchvision.transforms.Normalize(mean=[l_channel_mean, a_channel_mean, b_channel_mean], std=[l_channel_std, a_channel_std, b_channel_std])
The only other required transform is the one that converts the images to tensors:
torchvision.transforms.ToTensor()
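Put together, the pipeline might look like the sketch below; the per-channel statistics are placeholders you would compute from your own Lab data. Note that ToTensor also reorders the image from HWC to CHW, which is why no manual transpose is needed:

import torchvision.transforms as T

# Placeholder per-channel statistics; compute these from your own dataset
l_mean, a_mean, b_mean = 0.5, 0.5, 0.5
l_std, a_std, b_std = 0.25, 0.25, 0.25

transform = T.Compose([
    T.ToTensor(),  # HWC image -> CHW float tensor in [0, 1]
    T.Normalize(mean=[l_mean, a_mean, b_mean],
                std=[l_std, a_std, b_std]),
])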

Neural network non-linear input

I have a question about the choice of input for my neural network. I have a geographical area that is split into 40 smaller parts which I wish to give as input to my network. I have labeled those from 0-40 and passed them as ints to the network together with some other parameters to find a relation. However, the desired results for these area inputs are completely unrelated, so input areas 1 and 2 are just as different as areas 1 and 25.
Often when I read examples, the input value is quite logical: 0 or 1 if the input is a simple true/false alternative, or, if the image is a 32*32 grayscale picture, 1024 input neurons accepting values from 0-255.
In my case, when the 'area' parameter is not linear, what is the proper method to pass it to my network? Or is the whole setup faulty?
I would recommend 40 input variables, where every input variable corresponds to exactly one of your 40 areas. You would set only the input variable corresponding to the correct area to "1", and all others to "0". This is known as one-hot encoding, as sketched below.
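A minimal sketch of that one-hot encoding (the variable names and extra features are placeholders): row i of a 40x40 identity matrix is exactly the input vector for area i, and the other parameters can be concatenated onto it:

import numpy as np

NUM_AREAS = 40

def one_hot_area(area_index):
    # Row `area_index` of the identity matrix: 1 at that position, 0 elsewhere
    return np.eye(NUM_AREAS)[area_index]

other_params = np.array([0.3, 1.7])                    # placeholder extra features
x = np.concatenate([one_hot_area(12), other_params])   # full network input vector
print(x.shape)  # (42,)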

Simply switch output usage?

I have a game with only 10x2 pixels as input, and after one hour of training it learns to play by itself. Now I want to use one float-valued output from the model instead of three classifier outputs. The three classifier outputs were stop, 1 step right, and 1 step left. Now I want to produce one output value which tells me e.g. -4 => 4 steps left, +2 => 2 steps right, and so on.
But after training for 1-2 hours, it only produces numbers around 0.001, when it should produce numbers between -10.0 and +10.0?
Do I need to do it in a completely different way, or can I use a classifier model to output a real value without changing much code?
Thanks for the help.
game code link
Training a classifier is much simpler than coming up with a good loss function that will give you scalar values that make sense. Much (!) simpler.
Make it a classifier with 21 classes (0 = 10 left, 1 = 9 left, 2 = 8 left, ..., 10 = stay, 11 = 1 right, ..., 20 = 10 right).
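A rough sketch of that idea (the layer sizes, the flattened 10x2 input, and the helper below are my assumptions, not from the answer): a 21-way softmax head plus a small conversion from class index back to signed steps:

import numpy as np
import tensorflow as tf

# 10x2-pixel input flattened to 20 values; 21 classes = steps from -10 to +10
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(20,)),
    tf.keras.layers.Dense(21, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

def class_to_steps(class_index):
    # Class 0 -> -10 (10 left), class 10 -> 0 (stay), class 20 -> +10 (10 right)
    return class_index - 10

probs = model.predict(np.random.rand(1, 20), verbose=0)
print(class_to_steps(int(np.argmax(probs, axis=-1)[0])))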