I'm trying to implement a local response normalization layer in TensorFlow to be used in a Keras model.
Here is an image of the operation I am trying to implement:
Here is the paper link; please refer to Section 3.3 for the description of this layer.
I have a working NumPy implementation, but it uses for loops and Python's built-in min and max to compute the summation. These plain Python operations cause errors when defining a custom Keras layer, so I can't use this implementation.
The issue is that I need to iterate over all the elements in the feature map and generate a normalized value for each of them. Additionally, the upper and lower bounds of the summation change depending on which value I am currently normalizing. I can't think of a way to handle this without nested for loops, but those will not work in a custom Keras layer because they are not native TensorFlow operations.
Could anyone point me towards TensorFlow/Keras backend functions that could help me implement this layer?
EDIT: I know that this layer is implemented as a keras layer, but I want to build intuition about custom layers, so I want to implement this layer using tensor ops.
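One way to avoid the loops, sketched here in NumPy: zero-pad the squared activations along the channel axis and add up shifted slices, so the changing summation bounds are handled by the padding rather than by min and max. This assumes the across-channel formula b_i = a_i / (k + alpha * sum_j a_j^2)^beta with j running from max(0, i - n/2) to min(N - 1, i + n/2), since the image and paper link are not shown here. The function name, the channels-last layout and the default hyperparameters are also assumptions. Each step (np.square, np.pad, slicing) has a direct TensorFlow counterpart (tf.square, tf.pad, tensor slicing), so the same structure should carry over to a custom layer.

```python
import numpy as np

def local_response_norm(x, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Across-channel LRN for x of shape (batch, height, width, channels).

    Hypothetical helper: assumes the window for channel i runs from
    max(0, i - n//2) to min(N - 1, i + n//2).
    """
    half = n // 2
    channels = x.shape[-1]
    squared = np.square(x)
    # Zero padding on the channel axis stands in for the min/max clipping:
    # out-of-range neighbours contribute 0 to the sum.
    padded = np.pad(squared, [(0, 0), (0, 0), (0, 0), (half, half)], mode="constant")
    # Sum of 2*half + 1 shifted views; no Python loop over feature-map elements.
    window_sum = sum(padded[..., s:s + channels] for s in range(2 * half + 1))
    return x / (k + alpha * window_sum) ** beta

x = np.random.default_rng(0).normal(size=(2, 4, 4, 8))
y = local_response_norm(x)
```

The only remaining Python loop runs over the window size n, which is a small constant, not over the elements of the feature map.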
Related
I am using the C++ API of TensorFlow (v2.3.1) to serve a model (in SavedModel format) that contains a stateful GRU layer. Periodically, I need to reset the hidden states of the model. If I were working in Python and Keras, I could achieve this using tf.keras.Model.reset_states(), but I need to use the C++ API.
My model is loaded using tensorflow::LoadSavedModel function which provides me with a tensorflow::SavedModelBundle object. The idea I am pursuing right now is to first access the model graph using bundle.meta_graph_def.mutable_graph_def(). I plan to then find the VarHandleOp op in the graph corresponding to the hidden state of the GRU and manually fill that tensor with 0s. So far I have not been able to identify the op, and I have not found a way to write values manually to the VarHandleOp object. Am I on the right track? Is there another way to reset the states?
TensorFlow is extremely useful for creating neural networks built from perceptron neurons. However, if one wanted to use a new type of neuron instead of the classic perceptron neuron, is this possible by modifying the TensorFlow code? I can't seem to find an answer. I understand this would change the forward propagation and other mathematical calculations, and I am willing to change all the necessary areas.
I am also aware that I could code the layers and neurons I have in mind from scratch, but TensorFlow already has GPU integration, so it seems better to adapt its code than to write my own from scratch.
Has anyone experimented with this? My goal is to create neural network structures that use a different type of neuron than the classic perceptron.
If someone knows where in TensorFlow the perceptron neurons are initialized, I would very much appreciate a pointer!
Edit:
To be more specific, is it possible to alter the code in TensorFlow so that modules such as tf.layers or tf.nn (conv2D, batch-norm, max-pool, etc.) use a different neuron type rather than the perceptron? I can figure out the details. I just need to know where (I'm sure there are a few locations) I would go about changing the code for this.
However, if one wanted to use a new type of neuron instead of the classic perceptron neuron, is this possible through augmenting tensorflow code?
Yes. TensorFlow lets you define a computational graph and can then calculate the gradients for it automatically, so there is no need to derive them yourself. This is the reason why you define the graph symbolically. You might want to read the whitepaper or start with a tutorial.
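As a rough illustration of that point, here is a minimal sketch of a custom layer whose units are not plain perceptrons, written against the TF 2.x Keras API. The "quadratic neuron" (the product of two linear projections plus a bias) and the class name QuadraticNeuronLayer are made up for this example and are not something TensorFlow provides. No TensorFlow source needs to be edited, and the gradients come from tf.GradientTape without any hand-written backward pass.

```python
import tensorflow as tf

class QuadraticNeuronLayer(tf.keras.layers.Layer):
    """Hypothetical non-perceptron unit: y = (x @ W1) * (x @ W2) + b."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        in_dim = int(input_shape[-1])
        self.w1 = self.add_weight(name="w1", shape=(in_dim, self.units),
                                  initializer="glorot_uniform", trainable=True)
        self.w2 = self.add_weight(name="w2", shape=(in_dim, self.units),
                                  initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(name="b", shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        # Only differentiable tensor ops, so autodiff can trace through them.
        return tf.matmul(inputs, self.w1) * tf.matmul(inputs, self.w2) + self.b

layer = QuadraticNeuronLayer(3)
x = tf.random.normal((8, 4))
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(layer(x)))
# One gradient tensor per trainable weight, computed automatically.
grads = tape.gradient(loss, layer.trainable_variables)
```

A gradient of None for some weight would mean that weight is not connected to the loss through differentiable ops.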
I've been working on a prototype and I am having issues with backpropagation. I am currently using the latest Keras and TensorFlow builds, with TensorFlow as the backend. I have also looked into CNTK, MXNet, and Chainer; so far only Chainer would allow me to do this, but the training time is quite slow.
My current layer is similar to a convolutional layer with more operations than a simple multiplication.
I know that tensorflow should use automatic differentiation if all the operations support it to calculate the gradient and perform gradient descent.
Currently my layer uses the following operators: reduce_sum, sum, subtraction, multiplication and division.
It also relies on the following methods: extract_image_patches, reshape, transpose.
I doubt any of these would cause an issue with automatic differentiation. I built two layers as tests: one inherits from the base layer in Keras, while the other inherits directly from _Conv. In both cases, whenever I use that layer anywhere in a model, no weights are updated during the training process.
How could I solve this problem and fix backpropagation?
Edit:
(Here is the layer implementation: https://github.com/roya0045/cvar2/blob/master/tfvar.py;
for the test itself, see https://github.com/roya0045/cvar2/blob/master/test2.py)
I have a PyTorch model and a TensorFlow model, and I want to train them together on one GPU, following the process below: input --> PyTorch model --> output_pytorch --> TensorFlow model --> output_tensorflow --> PyTorch model.
Is it possible to do this? If the answer is yes, are there any problems I will encounter?
Thanks in advance.
I haven't done this, but it is possible, although implementing it can be a little tricky.
You can consider each network as a function. In some sense, you want to compose these functions to form your network. To do this, you can compute the final function by passing the result of one network to the other, and then use the chain rule to compute the derivatives (using symbolic differentiation from both packages).
I think a good way to implement this might be to wrap the TF models as a PyTorch Function and use tf.gradients to compute the backward pass.
Applying gradient updates can get hard, because some variables live in TF's computation graph. You could turn the TF variables into PyTorch Variables, replace them with placeholders in the TF computation graph, feed them through feed_dict, and update them using PyTorch mechanisms, but I think that would be really hard to do. Instead, if you do the updates inside the backward method of the Function, you might be able to get the job done (it is really ugly, but it might work).
I want to implement a convolution-deconvolution network for an image segmentation project. In the deconvolution part, I am planning to upsample the feature map by a factor of 2. For example, the original feature map has dimensions 64*64*4 and I want to upsample it to 128*128*4. Does anyone know a tensor operation that does this? Thanks!
You could use tf.image.resize_images(). It takes batches of images or single images and supports the most common methods such as bilinear and nearest_neighbor.
Here's the link to the TensorFlow API reference: resizing
You can also take a look at how the upsampling operation is implemented in a higher-level API such as tflearn. You can find upsample_2d and upscore_layer in their GitHub repo: conv.py
Note: the output might be cast to tf.float32 in older TF versions.
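To make the 64*64*4 to 128*128*4 case concrete, here is a small NumPy sketch of what nearest-neighbour upsampling by 2 computes: each pixel is duplicated into a 2x2 block, and the channels are left alone. The helper name upsample_nearest_2x is made up for illustration. A nearest-neighbour resize in TensorFlow should give the same output shape, though its exact pixel-alignment conventions may depend on the version.

```python
import numpy as np

def upsample_nearest_2x(feature_map):
    """Nearest-neighbour upsampling of an (H, W, C) array to (2H, 2W, C)."""
    # Repeat every row, then every column; the channel axis is untouched.
    return np.repeat(np.repeat(feature_map, 2, axis=0), 2, axis=1)

feature_map = np.arange(64 * 64 * 4, dtype=np.float32).reshape(64, 64, 4)
upsampled = upsample_nearest_2x(feature_map)
```

Bilinear resizing would interpolate between neighbouring pixels instead of copying them, which usually gives smoother feature maps.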