I have a dataset that I need to use with a 1D CNN; however, I am not sure how to structure the data dimensions so that it can be used with the 1D CNN.
The data has 5 output classes; it is the input data where I'm unsure how to proceed. Each output has an associated matrix of data that is 16 x 1800. In other words, for every output I have, there is an associated matrix of numbers that must be fed together to reach that output. I have multiple of these 16 x 1800 matrices for different samples, from which I am trying to make a prediction.
I was wondering how I can create a data structure for this that can be passed into a 1D CNN.
More specifically, what should the input_shape parameter be set to in the model?
Right now, I am thinking that my output will be of shape (# samples, 5). My input would be along the lines of (# samples, 16, 1800), but I don't know how this would be implemented in Keras.
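For concreteness, here is a minimal sketch of what I am imagining (the layer sizes are placeholders, and I'm assuming Keras's Conv1D wants channels-last input, i.e. (timesteps, channels), so each 16 x 1800 matrix would be transposed to 1800 x 16):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, GlobalMaxPooling1D, Dense

# each 16 x 1800 matrix becomes 1800 timesteps with 16 channels,
# so X has shape (num_samples, 1800, 16) and y has shape (num_samples, 5)
X = np.random.rand(10, 1800, 16)            # placeholder data
y = np.eye(5)[np.random.randint(0, 5, 10)]  # placeholder one-hot targets

model = Sequential([
    Conv1D(32, kernel_size=5, activation='relu', input_shape=(1800, 16)),
    GlobalMaxPooling1D(),
    Dense(5, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(X, y, epochs=1)

Is input_shape=(1800, 16) the right way to set this up?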
Any help would be appreciated.
Since I am not very experienced, I am struggling with a Siamese twin network.
I have 2 images which run through the same CNN and each generate a distinct feature vector. I would like to train a further network to interpret these two image vectors (each with 32 elements). In an intermediate step, I would like to use these vectors as input for a function NCC (normalized cross-correlation), which sits as a layer between the CNN and the NN and is defined in the following snippet (i.e. its output should be used for the next NN):
import tensorflow as tf
from tensorflow.keras.layers import Flatten

def NCC(a, b):
    # a, b: feature vectors, shape (1, 32) for batch_size=1
    l = a.shape[1]
    # zero-mean both vectors
    a = a - tf.math.reduce_mean(a)
    b = b - tf.math.reduce_mean(b)
    # normalize both vectors to unit length
    a = a / tf.math.sqrt(tf.math.reduce_sum(a * a))
    b = b / tf.math.sqrt(tf.math.reduce_sum(b * b))
    # tile each vector into an l x l matrix and multiply elementwise,
    # which yields all pairwise products (an outer product), then flatten
    A = tf.reshape(tf.repeat(a, axis=0, repeats=l), (l, l))
    B = tf.reshape(tf.repeat(b, axis=0, repeats=l), (l, l))
    ncc = Flatten()(A * tf.transpose(B))
    return ncc
The output vector (for batch_size=1) should have 32x32 = 1024 elements. It seems to work for a batch size of 1, but if I increase the batch size I run into trouble, because the input vectors are now tensors with shape=(batch_size, 32). I think this is a very stupid question, but how can I circumvent this issue? (It should be noted that I also want an output tensor with shape=(batch_size, 1024).)
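For reference, my current guess at a batch-aware rewrite (untested, so I am not sure it is correct) reduces along axis 1 and builds the outer product per sample:

def NCC_batched(a, b):
    # a, b: tensors of shape (batch_size, 32); reduce per sample, not globally
    a = a - tf.math.reduce_mean(a, axis=1, keepdims=True)
    b = b - tf.math.reduce_mean(b, axis=1, keepdims=True)
    a = a / tf.math.sqrt(tf.math.reduce_sum(a * a, axis=1, keepdims=True))
    b = b / tf.math.sqrt(tf.math.reduce_sum(b * b, axis=1, keepdims=True))
    # per-sample outer product matching A * tf.transpose(B) above:
    # element [i, j] is b[i] * a[j], giving shape (batch_size, 32, 32)
    outer = tf.einsum('bi,bj->bij', b, a)
    # flatten to (batch_size, 1024)
    return tf.reshape(outer, (tf.shape(a)[0], -1))

Would something like this be the right direction?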
Thanks in advance
Mike
I am trying to understand how values/losses are calculated in the following scenario while training an LSTM network:
The input shape for the LSTM is (64, 5, 66, 150, 3): 5 RGB images of size 66 x 150, with a batch size of 64.
The corresponding output shape is (64, 5): each input (5 images) has 5 output values, i.e., one integer value per image.
Question:
Training runs successfully for the following last-layer scenarios:
1) Dense(5) and 2) Dense(1),
but should ideally fail with a shape-mismatch error for any other Dense(n) layer.
How does Dense(1) work if I have 5 fixed reference values per input? How are the losses and the final value calculated? What is this final value in this scenario? Is it just the average of all 5 (or n) values?
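To make the question concrete, I checked (this is my own experiment, not from the training code) that Keras's MSE loss happily broadcasts a (batch, 1) prediction against a (batch, 5) target:

import tensorflow as tf

y_true = tf.constant([[1., 2., 3., 4., 5.]])  # 5 reference values, one sample
y_pred = tf.constant([[3.]])                  # a single Dense(1) output

mse = tf.keras.losses.MeanSquaredError()
# (y_true - y_pred) broadcasts to shape (1, 5): mean([4, 1, 0, 1, 4]) = 2.0
print(mse(y_true, y_pred).numpy())

So it looks like Dense(1) is effectively pulled towards the mean of the 5 reference values, but I would like to confirm that this is really what happens.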
Kindly help me better understand this.
My situation is that I have an Excel file with 747 nodes (as input), each with a float value (imagine 747 columns of floats), and an output of 741 values/columns, again floats. These are basically the inputs and outputs of a geological simulation. So one row has 747 (input) + 741 (output) = 1488 floats, which is one dataset (from one simulation). I have 4 such datasets (rows) to train a neural network, such that when I test it on 3 test datasets (747 columns each) I get the 741 output columns. This is just a simple run to get the skeleton of the neural network going before further modifications.
I have come across the Multi-Target Regression example of NYCTaxi (https://github.com/zeahmed/DeepLearningWithMLdotNet/tree/master/NYCTaxiMultiOutputRegression) but I can't seem to wrap my head around it.
This is the training set (Input till and including column 'ABS', rest is output):
https://docs.google.com/spreadsheets/d/12TKVbGExt9KcK5RQKTexrToVo8qA5YfeItSaa7E2QdU/edit?usp=sharing
This is the test set:
https://docs.google.com/spreadsheets/d/1-RjyZsdguucCSOr9QTdTp2ehJBqWCr5yz1-aRjQ_4zo/edit?usp=sharing
This is the test Output (To validate) : https://docs.google.com/spreadsheets/d/10O_6711CEpJ4DN1w-kCmW01NikjFVZTDmNRuqO3U_6A/edit?usp=sharing
Any guidance/tips would be well appreciated. TIA!
We can use an AutoEncoder for this task. An AutoEncoder takes in the data and compresses it into a latent representation; this representation vector is then used to construct the output variables.
So you can feed the 747-dimensional input vector to the model and have it generate the corresponding 741-dimensional output vector. After proper training, the model will be able to generate the target variables for a given set of inputs.
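A minimal Keras sketch of the idea (the layer widths here are arbitrary assumptions; the ML.NET example from the question would have the same overall shape):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

model = Sequential([
    Input(shape=(747,)),             # 747 input columns
    Dense(64, activation='relu'),    # encoder
    Dense(16, activation='relu'),    # latent representation
    Dense(64, activation='relu'),    # decoder
    Dense(741, activation='linear'), # 741 regression outputs
])
model.compile(optimizer='adam', loss='mse')
# model.fit(X_train, y_train, epochs=200)  # X_train: (4, 747), y_train: (4, 741)

Be warned that with only 4 training rows the model will memorize them rather than generalize, so treat this purely as the skeleton you asked for.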
I'm working on a problem using Keras that has been presenting me with issues:
My X data is all of shape (num_samples, 8192, 8), but my Y data is of shape (num_samples, 4), where each length-4 row is a one-hot encoded vector.
Both X and Y data will be run through LSTM layers, but the layers are rejecting the Y data because it doesn't match the shape of the X data.
Is padding the Y data with 0s so that it matches the dimensions of the X data unreasonable? What kind of effects would that have? Is there a better solution?
Edited for clarification:
As requested, here is more information:
My Y data represents the expected output of passing the X data through my model. This is my first time working with LSTMs, so I don't have an architecture in mind, but I'd like to use an architecture that works well with classifying long (8192-length) sequences of words into one of several categories. Additionally, the dataset that I have is of an immense size when fed through an LSTM, so I'm currently using batch-training.
Technologies being used:
Keras (Tensorflow Backend)
TL;DR Is padding one tensor with zeroes in all dimensions to match another tensor's shape a bad idea? What could be a better approach?
First of all, let's make sure your representation is actually what you think it is; the input to an LSTM (or any recurrent layer, for that matter) must be of dimensionality (timesteps, features), i.e. if you have 1000 training samples, each consisting of 100 timesteps with 10 values per timestep, your input shape will be (100, 10,). Therefore I assume from your question that each input sample in your X set has 8192 steps and 8 values per step. Great; a single LSTM layer can iterate over these and produce 4-dimensional representations with absolutely no problem, just like so:
from keras.layers import Input, LSTM

myLongInput = Input(shape=(8192, 8,))  # (timesteps, features); batch is implicit
myRecurrentFunction = LSTM(4)
myShortOutput = myRecurrentFunction(myLongInput)

myShortOutput.shape
TensorShape([Dimension(None), Dimension(4)])
I assume your problem stems from trying to apply yet another LSTM on top of the first one; the next LSTM expects a tensor that has a time dimension, but your output has none. If that is the case, you'll need to let your first LSTM also output the intermediate representations at each time step, like so:
myNewRecurrentFunction = LSTM(4, return_sequences=True)
myLongOutput = myNewRecurrentFunction(myLongInput)
myLongOutput.shape
TensorShape([Dimension(None), Dimension(None), Dimension(4)])
As you can see, the new output is now a third-order tensor, with the second dimension being the (yet unassigned) timesteps. You can repeat this process until your final output, where you usually don't need the intermediate representations but rather only the last one. (Side note: make sure to set the activation of your last layer to softmax if your output is in one-hot format; a complete sketch follows below.)
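Putting the pieces together, one possible complete stack might look like this (the layer sizes are only illustrative):

from keras.models import Model
from keras.layers import Input, LSTM, Dense

myLongInput = Input(shape=(8192, 8,))
x = LSTM(16, return_sequences=True)(myLongInput)  # intermediate representations
x = LSTM(16)(x)                                   # keep only the last one
myPrediction = Dense(4, activation='softmax')(x)  # one-hot targets of length 4

model = Model(myLongInput, myPrediction)
model.compile(optimizer='adam', loss='categorical_crossentropy')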
On to your original question, zero-padding has very little negative impact on your network. The network will strain itself a bit in the beginning trying to figure out the concept of the additional values you have just thrown at it, but will very soon be able to learn they're meaningless. This comes at a cost of a larger parameter space (therefore more time and memory complexity), but doesn't really affect predictive power most of the time.
I hope that was helpful.
I'm following the Udacity MNIST tutorial, where the MNIST data is originally a 28*28 matrix. However, right before feeding the data in, they flatten it into a 1D array with 784 columns (784 = 28 * 28).
For example,
the original training set shape was (200000, 28, 28): 200000 rows, where each row is a 28*28 matrix.
They converted this into a training set whose shape is (200000, 784).
Can someone explain why they flatten the data out before feeding it to TensorFlow?
Because when you're adding a fully connected layer, you always want your data to be a (1 or) 2 dimensional matrix, where each row is the vector representing your data. That way, the fully connected layer is just a matrix multiplication between your input (of size (batch_size, n_features)) and the weights (of shape (n_features, n_outputs)) (plus the bias and the activation function), and you get an output of shape (batch_size, n_outputs). Plus, you really don't need the original shape information in a fully connected layer, so it's OK to lose it.
It would be more complicated and less efficient to get the same result without reshaping first, which is why we always do it before a fully connected layer. For a convolutional layer, by contrast, you'll want to keep the data in its original (width, height) format.
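In Keras terms, this is exactly what a Flatten layer does right before the fully connected part; a minimal sketch (the 10-class output is just an example):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

model = Sequential([
    Flatten(input_shape=(28, 28)),    # (batch_size, 28, 28) -> (batch_size, 784)
    Dense(10, activation='softmax'),  # matrix multiply: (784,) -> (10,)
])
model.summary()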
That is a convention with fully connected layers. Fully connected layers connect every node in the previous layer with every node in the successive layer so locality is not an issue for this type of layer.
Additionally, by defining the layer like this we can efficiently calculate the next step with the formula f(Wx + b) = y. This would not be as easy with multidimensional input, and reshaping the input is low-cost and easy to accomplish.
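A small NumPy sketch of that formula with the MNIST shapes (the weights here are random placeholders):

import numpy as np

batch_size, n_features, n_outputs = 64, 784, 10
x = np.random.rand(batch_size, n_features)  # flattened 28*28 images
W = np.random.rand(n_features, n_outputs)   # weight matrix
b = np.random.rand(n_outputs)               # bias vector

y = np.maximum(x @ W + b, 0)  # f(Wx + b), with ReLU as f
print(y.shape)                # (64, 10)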