When you build a convolutional network with hidden layers, how do you decide on the parameters, like filter size, stride, and even the number of convolution layers? I know what each parameter means, but if I have to design a network from scratch, how do I choose them?
Please refer to the links below for a better understanding of CNNs and how to make use of them.
http://cs231n.github.io/convolutional-networks/
https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148
https://www.analyticsvidhya.com/blog/2018/12/guide-convolutional-neural-network-cnn/
https://towardsdatascience.com/deciding-optimal-filter-size-for-cnns-d6f7b56f9363
I have two feature vectors. One is a deep feature vector extracted by a CNN, and the other is a handcrafted feature vector extracted with uniform local binary patterns. I want to find the best common features after concatenating the two vectors, and I would like to use a final pooling layer for this. Is that possible?
After you have concatenated the two feature vectors, a final pooling layer would only help in reducing their dimensionality.
Could you say more about what you aim to do, and which pooling layer you want to use?
I'm not sure I understand exactly what you mean by "final pooling layer".
But in my opinion, adding ONLY a pooling layer after the concatenation layer and before the output layer (e.g., Dense-softmax) may not help much in this case, because pooling layers have no learnable parameters; they simply operate over each activation map independently to reduce its size.
One simple feature-fusion method I would suggest is to apply another subnet (a set of layers such as convolution, pooling, and dense) to the concatenated tensor. That way, the model can keep learning to enhance the good features, as in the sketch below.
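As a minimal sketch of that idea with the Keras functional API: the feature lengths (4096 for the CNN features, 59 for a uniform-LBP histogram) and the 10-class softmax are placeholder assumptions, not values from the question.

import tensorflow as tf
from tensorflow.keras import layers, Model

deep_in = layers.Input(shape=(4096,), name="cnn_features")  # deep features
lbp_in = layers.Input(shape=(59,), name="lbp_features")     # handcrafted LBP features

merged = layers.Concatenate()([deep_in, lbp_in])            # feature fusion
x = layers.Dense(512, activation="relu")(merged)            # learnable "subnet"
x = layers.Dropout(0.5)(x)
x = layers.Dense(128, activation="relu")(x)
out = layers.Dense(10, activation="softmax")(x)             # 10 classes assumed

model = Model([deep_in, lbp_in], out)
model.compile(optimizer="adam", loss="categorical_crossentropy")

Because the dense layers after the concatenation have learnable weights, the network can learn which combinations of the two feature types are useful, which a parameter-free pooling layer cannot do.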
I have read many articles about convolutional neural networks and found some good architectures I can refer to, for example AlexNet, VGG, and GoogLeNet.
However, if I want to design a CNN architecture myself, how should I arrange/order the different layers, e.g. convolution, dropout, max pooling? Is there any standard, or do I just keep trying different combinations until I get a good result?
In my view there isn't a standard per se, but there are common combinations:
1- If you want to create a deeper network, you can use residual blocks to avoid the vanishing-gradient problem.
2- The 3x3 convolution is standard because it reduces computational cost: three stacked 3x3 convolutions cover the same 7x7 receptive field as a single 7x7 convolution at a smaller cost (see the worked example after this list).
3- The main reason for dropout is to introduce regularization, which, as the batch-normalization authors claim, can also be achieved by batch normalization.
4- Before deciding what to improve and how to improve it, one must understand the problem one is trying to solve.
You can go through the case study taught at Stanford:
Stanford case study
The video can help you understand many of these combinations, how they lead to model improvements, and how to build your own network.
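To make point 2 concrete, here is the arithmetic as a tiny Python snippet (the channel count C = 64 is just an illustrative choice):

# Weight count (ignoring biases) for C input channels and C output channels.
C = 64
stacked_3x3 = 3 * (3 * 3 * C * C)  # three stacked 3x3 convs: 27 * C^2 weights
single_7x7 = 7 * 7 * C * C         # one 7x7 conv: 49 * C^2 weights
print(stacked_3x3, single_7x7)     # 110592 vs 200704, i.e. ~45% fewer weights
# Both arrangements see the same 7x7 receptive field, and the stacked
# version adds two extra non-linearities in between.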
You generally want to put a pooling layer after a convolutional layer. Also, you can think of dropout as a parameter that is applied to a layer, and not a separate layer altogether -- whichever is easier for you to envision.
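As a hedged sketch of that ordering in Keras (the input shape, filter counts, and 10-class output are illustrative assumptions, not a recommendation):

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),  # pooling typically follows convolution
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),          # dropout applied to the dense layer's outputs
    layers.Dense(10, activation="softmax"),
])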
TensorFlow is extremely useful for creating neural networks built from classic perceptron-style neurons. However, if one wanted to use a new type of neuron instead, is that possible by modifying TensorFlow's code? I can't seem to find an answer. I understand this would change forward propagation and other mathematical calculations, and I am willing to change all the necessary areas.
I am also aware that I could code the layers and neurons I have in mind from scratch, but TensorFlow has GPU integration, so it seems more sensible to modify its code than to write my own framework from scratch.
Has anyone experimented with this? My goal is to create neural network structures that use a different type of neuron than the classic perceptron.
If someone knows where in TensorFlow I could look to see where the perceptron neurons are initialized, I would very much appreciate it!
Edit:
To be more specific, is it possible to alter the code in TensorFlow to use a different neuron type rather than the perceptron when invoking the TensorFlow module tf.layers, for example? Or tf.nn (conv2d, batch norm, max pool, etc.)? I can figure out the details; I just need to know where (I'm sure there are a few locations) I would go about changing the code for this.
However, if one wanted to use a new type of neuron instead of the classic perceptron neuron, is this possible through augmenting tensorflow code?
Yes. TensorFlow lets you define a computational graph and then automatically calculates the gradients for it; there is no need to derive them yourself. This is the reason you define the graph symbolically. You might want to read the whitepaper or start with a tutorial.
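To illustrate (using today's tf.keras API rather than the graph-building code the answer refers to): as long as your "neuron" is written with differentiable TensorFlow ops, the gradient comes for free. The quadratic form below is an arbitrary, made-up example of a non-perceptron unit, not anything from the TensorFlow source:

import tensorflow as tf

class QuadraticNeuron(tf.keras.layers.Layer):
    """A custom 'neuron' layer; autodiff handles the backward pass."""

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, x):
        # Replace the usual w.x + b with any differentiable expression.
        return tf.tanh(tf.matmul(tf.square(x), self.w) + self.b)

Such a layer can be dropped into a model like any built-in layer; no changes to TensorFlow's own code are needed.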
I have seen two ways of visualizing transposed convolutions from credible sources, and as far as I can see they conflict.
My question boils down to, for each application of the kernel, do we go from many (e.g. 3x3) elements with input padding to one, or do we go from one element to many (e.g. 3x3)?
Related question: Which version does tf.nn.conv2d_transpose implement?
The sources of my confusion are:
A guide to convolution arithmetic for deep learning has probably the most famous visualization out there, but it isn't peer reviewed (arXiv).
The second is from Deconvolution and Checkerboard Artifacts, which technically isn't peer reviewed either (Distill), but it is from a much more reputable source.
(The term deconvolution is used in the article, but it is stated that this is the same as transposed conv.)
Due to the nature of this question it is hard to search for answers online; e.g. this SO post takes the first position, but I am not sure to what extent I can trust it.
I want to stress a little more what Littleone also mentioned in his last paragraph:
A transposed convolution will reverse the spatial transformation of a regular convolution with the same parameters.
If you perform a regular convolution followed by a transposed convolution and both have the same settings (kernel size, padding, stride), then the input and output will have the same shape. This makes it super easy to build encoder-decoder networks with them. I wrote an article about different types of convolutions in Deep Learning here, where this is also covered.
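As a quick shape check of that claim (a minimal sketch with Keras layers; the sizes are arbitrary):

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal([1, 32, 32, 3])
# Downsample with a strided convolution, then reverse it with a
# transposed convolution using the same kernel size, stride and padding.
down = layers.Conv2D(8, kernel_size=3, strides=2, padding="same")(x)
up = layers.Conv2DTranspose(3, kernel_size=3, strides=2, padding="same")(down)
print(down.shape)  # (1, 16, 16, 8)
print(up.shape)    # (1, 32, 32, 3) -- the original spatial shape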
PS: Please don't call it a deconvolution
Fractionally strided convolutions, deconvolutions, and transposed convolutions all refer to the same operation. Both papers are correct and you don't need to be doubtful, as both of them are cited a lot. But the Distill image takes a different perspective, as it is trying to show the artifacts problem.
The first visualisation shows a transposed convolution with stride 2 and padding 1. If it were stride 1, there wouldn't be any padding in between the inputs. The padding on the borders depends on the dimension of the output.
With a deconvolution, we generally go from a smaller dimension to a larger one, and the input is generally padded to achieve the desired output dimensions. I believe the confusion arises from the padding patterns. Take a look at this formula:
output = (input - 1) * stride + kernel_size - 2 * padding
It's a rearrangement of the general convolution output formula; output here refers to the output of the deconvolution operation. To best understand deconvolution, I suggest thinking in terms of this equation, i.e., flipping what a convolution does: it asks, how do I reverse what a convolution operation does?
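A quick sanity check of the formula with arbitrary numbers:

# Transposed convolution with input 4, stride 2, kernel_size 3, padding 1.
input_size, stride, kernel_size, padding = 4, 2, 3, 1
output = (input_size - 1) * stride + kernel_size - 2 * padding
print(output)  # 7
# And the reverse direction: a regular convolution on an input of 7 with
# the same settings gives (7 + 2*1 - 3) // 2 + 1 = 4, so the two
# operations are shape inverses of each other.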
Hope that helps.
Good explanation from Justin Johnson (part of the Stanford cs231n MOOC):
https://youtu.be/ByjaPdWXKJ4?t=1221 (starts at 20:21)
He reviews strided convolutions and then explains transposed convolutions.
I have managed to train a word2vec model with TensorFlow, and I want to feed the resulting embeddings into an RNN with LSTM cells for sequence labeling.
1) It's not really clear how to use a trained word2vec model for an RNN. (How do I feed in the result?)
2) I can't find much documentation on how to implement a sequence-labeling LSTM. (How do I bring in my labels?)
Could someone point me in the right direction on how to start with this task?
I suggest you start by reading the RNN tutorial and sequence-to-sequence tutorial. They explain how to build LSTMs in TensorFlow. Once you're comfortable with that, you'll have to find the right embedding Variable and assign it using your pre-trained word2vec model.
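The tutorials this answer refers to are for the old graph API; as a hedged sketch of the same idea with the Keras API, where w2v_matrix stands in for your trained word2vec vectors:

import numpy as np
import tensorflow as tf

vocab_size, embedding_dim = 10000, 300  # placeholder sizes
# Stand-in for your trained [vocab_size, embedding_dim] word2vec matrix,
# with rows indexed by your word ids.
w2v_matrix = np.random.rand(vocab_size, embedding_dim).astype("float32")

embedding = tf.keras.layers.Embedding(
    vocab_size, embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(w2v_matrix),
    trainable=False,  # set True if you want to fine-tune the vectors
)
word_ids = tf.constant([[4, 17, 256]])  # a batch of token-id sequences
vectors = embedding(word_ids)           # shape (1, 3, 300), ready for an LSTM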
I realize this was posted a while ago, but I found this Gist about sequence labeling and this Gist about variable sequence labeling really helpful for figuring out sequence labeling. The basic outline (the gist of the Gist):
Use dynamic_rnn to handle unrolling your network for training and prediction. This method has moved around some in the API, so you may have to find it for your version, but just Google it.
Arrange your data into batches of size [batch_size, sequence_length, num_features], and your labels into batches of size [batch_size, sequence_length, num_classes]. Note that you want a label for every time step in your sequence.
For variable-length sequences, pass a value to the sequence_length argument of the dynamic_rnn wrapper for each sequence in your batch.
Training the RNN is very similar to training any other neural network once you have the network structure defined: feed it training data and target labels and watch it learn!
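The Gists above use the older dynamic_rnn API; here is a rough, hedged equivalent of the outline in Keras (the feature count, unit count, and class count are illustrative):

import tensorflow as tf
from tensorflow.keras import layers, models

num_features, num_classes = 300, 5
model = models.Sequential([
    # mask_value marks padded time steps in variable-length sequences
    layers.Masking(mask_value=0.0, input_shape=(None, num_features)),
    layers.LSTM(128, return_sequences=True),          # one output per time step
    layers.Dense(num_classes, activation="softmax"),  # a label per time step
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# x: [batch_size, sequence_length, num_features]
# y: [batch_size, sequence_length] integer labels, one per time step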
And some caveats:
With variable-length sequences, you will need to build masks when calculating your error metrics. It's all in the second link above, but don't forget this when you write your own error metrics! I ran into this a couple of times, and it made my networks look like they were doing much worse on variable-length sequences than they actually were.
You might want to add a regularization term to your loss function. I had some convergence issues without this.
I recommend using tf.train.AdamOptimizer with the default settings at first. Depending on your data, this may not converge and you will need to adjust the settings. This article does a good job of explaining what the different knobs do. Start reading from the beginning; some of the knobs are explained before the Adam section.
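Putting those caveats together, a hedged sketch of a masked, L2-regularized loss (logits, labels, seq_lengths, and model_weights are assumed to come from your own model, and the l2 coefficient is an arbitrary placeholder):

import tensorflow as tf

def masked_loss(labels, logits, seq_lengths, model_weights, l2=1e-4):
    # Per-step cross-entropy: shape [batch_size, max_time]
    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
    # 1.0 on real time steps, 0.0 on padding
    mask = tf.sequence_mask(seq_lengths, maxlen=tf.shape(ce)[1],
                            dtype=ce.dtype)
    # Average only over the unpadded steps
    data_loss = tf.reduce_sum(ce * mask) / tf.reduce_sum(mask)
    # L2 regularization term over the trainable weights
    reg_loss = l2 * tf.add_n([tf.nn.l2_loss(w) for w in model_weights])
    return data_loss + reg_loss

optimizer = tf.keras.optimizers.Adam()  # tf.train.AdamOptimizer in TF1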
Hopefully these links are helpful to others in the future!