How can I implement a convolutional LSTM cell in TensorFlow?

Is the correct general approach to simply copy all of the code of class BasicLSTMCell(RNNCell) and replace all the matrix multiplication with conv2d operations? What are things that I have to keep in mind when implementing it this way?

That is the basic idea. I got an implementation of it working in tensorflow here. It can generate videos that look like this. They seem to work surprisingly well. I edited the rnn_cell.py file to get it working.
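To make the "replace matmuls with convolutions" idea concrete, here is a minimal single-channel NumPy sketch of one ConvLSTM step (my own illustration of the gate equations, not the actual rnn_cell.py code): the four LSTM gates are exactly those of BasicLSTMCell, but each W·x and U·h product becomes a 'same'-padded 2-D convolution, so the hidden and cell states stay spatial maps.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2-D convolution (single channel, for clarity)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    H, W = x.shape
    for r in range(H):
        for s in range(W):
            out[r, s] = np.sum(xp[r:r + kh, s:s + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_lstm_step(x, h, c, kernels, biases):
    """One ConvLSTM step: same gates as BasicLSTMCell, but every matrix
    multiplication is replaced by a convolution. `kernels` maps each gate
    name to an (input-kernel, hidden-kernel) pair."""
    pre = {}
    for name in ('i', 'f', 'o', 'g'):
        kx, kh = kernels[name]
        pre[name] = conv2d_same(x, kx) + conv2d_same(h, kh) + biases[name]
    i = sigmoid(pre['i'])        # input gate
    f = sigmoid(pre['f'])        # forget gate
    o = sigmoid(pre['o'])        # output gate
    g = np.tanh(pre['g'])        # candidate cell state
    c_new = f * c + i * g        # elementwise, exactly as in the plain LSTM
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

The main things to keep in mind: the state is now height × width × filters instead of a flat vector, the convolutions need 'same' padding so the gate maps line up with the state, and in a real multi-channel implementation each gate convolution maps (input channels + hidden channels) to the number of filters.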

Related

How to remove the last layer from Hub Module in Tensorflow

I want to remove the last layer(s) from MobileBERT from Hub. I know there is a solution for Keras Model in TensorFlow, but this case is different from that one.
I was thinking of something like this, but it doesn't seem user-friendly.
What is the common way of doing this?
There are no first-class APIs to do this. A solution along the lines of what you have mentioned is the way to go.
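For a plain Keras model the usual trick is to build a new model on an intermediate output, e.g. `tf.keras.Model(model.input, model.layers[-2].output)`. Since a Hub module does not expose that API, the idea can at least be sketched framework-agnostically: treat the model as a stack of layer functions and recompose everything except the head. The toy layers below are hypothetical stand-ins, not real Hub or MobileBERT code.

```python
import numpy as np

def dense(w, b):
    """Stand-in for a hidden layer: ReLU(x @ w + b)."""
    return lambda x: np.maximum(x @ w + b, 0.0)

def softmax_head(w, b):
    """Stand-in for the classification head we want to remove."""
    def f(x):
        z = x @ w + b
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    return f

rng = np.random.default_rng(0)
layers = [
    dense(rng.standard_normal((4, 8)), np.zeros(8)),
    dense(rng.standard_normal((8, 8)), np.zeros(8)),
    softmax_head(rng.standard_normal((8, 3)), np.zeros(3)),  # layer to drop
]

def compose(fs):
    """Chain a list of layer functions into one model function."""
    def run(x):
        for f in fs:
            x = f(x)
        return x
    return run

full_model = compose(layers)       # class probabilities
headless = compose(layers[:-1])    # embedding, with the last layer removed
```

With a Hub module the equivalent is re-running the published architecture up to the layer you want and loading the module's weights into it, which is essentially the non-user-friendly path the question describes.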

How to draw samples from a categorical distribution in TensorFlow.js

Issue in short
In Python version of Tensorflow there is a tf.random.categorical() method that draws samples from a categorical distribution. But I can't find a similar method in TensorFlow.js API. So, what is the proper way to draw samples from a categorical distribution in TensorFlow.js?
Issue in details
In Text generation with an RNN tutorial the tf.random.categorical() method is being used in generate_text() function to decide what character should be passed next to the RNN input to generate a sequence.
predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()
I'm experimenting with TensorFlow.js and trying to generate "random" Shakespeare-like writing, but in the browser. All parts of the tutorial seem to work well together except the step that uses the tf.random.categorical() method.
I guess writing an alternative to the tf.random.categorical() function manually should not be that hard, and there are also a couple of 3rd-party JavaScript libraries that implement this functionality already, but it seems logical to have it as part of the TensorFlow.js API.
I think you can use tf.multinomial instead.
I peeked at the source code: with the name and seed parameters left as None, tf.random.categorical is essentially the same as tf.multinomial, with some seeding logic on top.
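If you do want to roll it by hand, the whole operation is just softmax over the logits followed by inverse-CDF sampling. Here is a hedged sketch in Python/NumPy for clarity (the same few lines translate directly to JS or to tensor ops); `categorical` is my own helper name, not a library API.

```python
import numpy as np

def categorical(logits, num_samples, rng=None):
    """Draw integer samples from a categorical distribution given
    unnormalized log-probabilities, like tf.random.categorical."""
    if rng is None:
        rng = np.random.default_rng()
    logits = np.asarray(logits, dtype=float)
    # Softmax with max-subtraction for numerical stability.
    p = np.exp(logits - logits.max())
    p /= p.sum()
    cdf = np.cumsum(p)
    cdf[-1] = 1.0                    # guard against floating-point undershoot
    u = rng.random(num_samples)      # uniform draws in [0, 1)
    # Inverse-CDF sampling: first bucket whose cumulative mass exceeds u.
    return np.searchsorted(cdf, u)
```

For example, `categorical([1.0, 1.0, 4.0], num_samples=1)[0]` returns index 2 most of the time, since that logit carries most of the probability mass.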

Is there any way to implement mathematical deconvolution (which exactly reverses the convolution) using TensorFlow? Please let me know if there is

I'm trying to make a software in which I need to reverse the convolution process. I haven't found anything useful.
Yes, it is called Transposed Convolution in TensorFlow and also in PyTorch. Here is the link for TF 1.14, and here is the one for TF 2.0.
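One caveat worth spelling out: a transposed convolution is literally the transpose of the convolution's matrix form, so it restores the input's *shape* but not its *values*; exact mathematical deconvolution would require inverting that matrix, which is generally ill-posed. A small 1-D NumPy sketch of my own (not TF API code) makes the relationship explicit:

```python
import numpy as np

def conv_matrix(kernel, n):
    """Matrix C such that C @ x equals the 'valid' 1-D sliding-window
    product of x (length n) with the kernel."""
    k = len(kernel)
    rows = n - k + 1
    C = np.zeros((rows, n))
    for i in range(rows):
        C[i, i:i + k] = kernel
    return C

kernel = np.array([1.0, 2.0, 1.0])
x = np.arange(6, dtype=float)

C = conv_matrix(kernel, len(x))
y = C @ x        # forward convolution: length 6 -> 4
x_up = C.T @ y   # transposed convolution: length 4 -> 6 (shape restored)
```

`x_up` has the original length but differs from `x`; that is exactly what tf.nn.conv2d_transpose gives you in 2-D, which is why it is the right tool for upsampling but not a true inverse.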

Tensorflow: How to create new neuron (Not perceptron neuron)

So TensorFlow is extremely useful for creating neural networks built from perceptron neurons. However, if one wanted to use a new type of neuron instead of the classic perceptron, is this possible by modifying the TensorFlow code? I can't seem to find an answer. I understand this would change forward propagation and other mathematical calculations, and I am willing to change all the necessary areas.
I am also aware that I could code the layers and neurons I have in mind from scratch, but TensorFlow has GPU integration, so modifying its code seems preferable to writing my own from scratch.
Has anyone experimented with this? My goal is to create neural network structures that use a different type of neuron than the classic perceptron.
If someone knows where in TensorFlow the perceptron neurons are initialized, I would very much appreciate a pointer!
Edit:
To be more specific, is it possible to alter TensorFlow's code to use a different neuron type than the perceptron when invoking TensorFlow modules such as tf.layers or tf.nn (conv2d, batch-norm, max-pool, etc.)? I can figure out the details. I just need to know where (I'm sure there are a few locations) I would go about changing the code.
However, if one wanted to use a new type of neuron instead of the classic perceptron neuron, is this possible through augmenting tensorflow code?
Yes. TensorFlow gives you the ability to define a computational graph, and it can then automatically calculate the gradients for that graph; there is no need to derive them yourself. This is the reason you define it symbolically. You might want to read the whitepaper or start with a tutorial.
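To see why the automatic gradients matter, take a made-up non-perceptron neuron, say one that squares its pre-activation. In TensorFlow you would just write it inside the graph (or under tf.GradientTape in TF 2) and get its gradient for free. The NumPy check below is my own illustration: it compares the analytic gradient (what autodiff would produce) against finite differences.

```python
import numpy as np

def quadratic_neuron(w, x):
    """A hypothetical non-perceptron neuron: y = (w . x)^2."""
    return (w @ x) ** 2

def analytic_grad(w, x):
    # d/dw (w.x)^2 = 2 (w.x) x -- what autodiff would compute for us.
    return 2.0 * (w @ x) * x

def numeric_grad(f, w, x, eps=1e-6):
    """Central finite differences, one weight at a time."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        g[i] = (f(wp, x) - f(wm, x)) / (2 * eps)
    return g
```

The point of defining the neuron inside TensorFlow's graph is that the `analytic_grad` step is generated for you from the forward definition alone, which is exactly what makes swapping in a new neuron type feasible.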

regarding caffe to tensorflow

Currently, there are a lot of deep learning models developed in Caffe rather than TensorFlow. If I want to re-write these models in TensorFlow, how should I start? I am not familiar with Caffe's structure. It seems to me that there are some files storing only the model architecture. My guess is that I only need to understand that architecture design and transfer it into TensorFlow; the input/output/training code will be rewritten anyway. Does this approach make sense?
I see that some Caffe implementations also need to hack into the original Caffe framework down to the C++ level and make modifications. I am not sure in what kind of scenario a Caffe model developer needs to go that deep. If I just want to re-implement their models in TensorFlow, do I need to check their C++ modifications, which are sometimes not documented at all?
I know there are some Caffe-to-TensorFlow conversion tools, but they always have some constraints, and I think rewriting the model directly may be more straightforward.
Any thoughts, suggestions, and links to tutorials are highly appreciated.
I have already asked a similar question.
To synthesize the possible answers:
You can either use pre-existing tools like ethereon's caffe-tensorflow (which is really simple to use), but its simplicity comes at a cost: it is not easy to debug.
Or, as @Yaroslav Bulatov answered, start from scratch and try to make each layer match. In this regard I would advise you to look at ry's GitHub, which is a remarkable example: it basically consists of small helper functions that show how to reshape the weights appropriately from Caffe to TensorFlow, which is the only real thing you have to do to make simple models match, and it also provides layer-by-layer activation checks.
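The core of those reshape helpers is the weight-layout difference between the two frameworks: Caffe stores convolution weights as (out_channels, in_channels, height, width), while TensorFlow's conv2d expects (height, width, in_channels, out_channels), and fully-connected weights are transposed as well. A hedged NumPy sketch (my own helper names, not from any converter):

```python
import numpy as np

def caffe_conv_to_tf(w):
    """Caffe conv weights are (out_c, in_c, kh, kw); TensorFlow's conv2d
    expects (kh, kw, in_c, out_c)."""
    return np.transpose(w, (2, 3, 1, 0))

def caffe_fc_to_tf(w):
    """Caffe stores fully-connected weights as (out, in); TensorFlow's
    matmul convention is (in, out)."""
    return w.T
```

One extra wrinkle worth knowing: the first fully-connected layer after a convolution also needs its input dimension reordered, because Caffe flattens NCHW activations while TensorFlow flattens NHWC; that per-model fix-up is why such converters ship model-specific helpers.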