Word Embedding for Convolution Neural Network - tensorflow

I am trying to use word2vec embeddings in a convolutional neural network. I am new to TensorFlow. Here is my code for the pre-trained embedding layer.
W = tf.Variable(tf.constant(0.0, shape=[vocabulary_size, embedding_size]),
                trainable=False, name="W")
embedding_placeholder = tf.placeholder(tf.float32, [vocabulary_size, embedding_size])
embedding_init = W.assign(embedding_placeholder)
sess = tf.Session()
sess.run(embedding_init, feed_dict={embedding_placeholder: final_embeddings})
I think I should use embedding_lookup, but I am not sure how to use it. I would really appreciate it if someone could give some advice.
Thanks

Tensorflow has an example using word2vec-cnn for text classification: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/skflow/text_classification_cnn.py

You are on the right track. Because embedding_lookup assumes that words are represented as integer ids, you need to transform your input vectors to comply with that. You also need to make sure that the transformed words are correctly indexed into the embedding matrix. I used the index-to-word mapping generated by the embedding model (I trained my embeddings with gensim) to create a word-to-index lookup table, which I then used to transform my input vectors.
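The lookup-table step described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the answerer's actual code: the tiny vocabulary, `index_to_word`, `sentence_to_ids`, and the choice of id 0 for both padding and unknown words are made up for the example. Indexing the matrix with integer ids is the operation that `tf.nn.embedding_lookup(W, ids)` performs for a single embedding matrix.

```python
import numpy as np

# Hypothetical stand-in for the index-to-word list exported by the embedding
# model; row i of the embedding matrix belongs to index_to_word[i].
index_to_word = ["<pad>", "the", "movie", "was", "great"]
word_to_index = {word: i for i, word in enumerate(index_to_word)}

# Toy embedding matrix of shape (vocabulary_size, embedding_size).
embedding_size = 3
final_embeddings = np.arange(len(index_to_word) * embedding_size, dtype=np.float32)
final_embeddings = final_embeddings.reshape(len(index_to_word), embedding_size)

def sentence_to_ids(sentence, max_len):
    """Map words to integer ids, then truncate or pad with 0 up to max_len."""
    # Assumption: unknown words fall back to id 0, the same id used for padding.
    ids = [word_to_index.get(word, 0) for word in sentence.lower().split()]
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))

ids = np.array([sentence_to_ids("The movie was great", max_len=6)])
# Integer indexing selects one embedding row per id.
embedded = final_embeddings[ids]
print(ids.shape)       # (1, 6)
print(embedded.shape)  # (1, 6, 3)
```

In the graph from the question, the same integer ids would be fed to `tf.nn.embedding_lookup(W, ids)`, and the resulting tensor would become the input to the convolution layers.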

I am doing something similar. I stumbled upon this blog post, which implements the paper "Convolutional Neural Networks for Sentence Classification" and is a good walkthrough: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/

Related

Purpose of pooling layer after text embedding layer

I'm following the tutorial on the TensorFlow site (https://www.tensorflow.org/tutorials/text/word_embeddings#create_a_simple_model) to learn word embeddings, and I am confused about the purpose of having a GlobalAveragePooling1D layer right after the embedding layer, as follows:
model = keras.Sequential([
    layers.Embedding(encoder.vocab_size, embedding_dim),
    layers.GlobalAveragePooling1D(),
    layers.Dense(16, activation='relu'),
    layers.Dense(1)
])
I understand what pooling means and how it's done. If someone can explain why we need a pooling layer, and what would change if we didn't use it, I'd appreciate it.
The purpose of this tutorial is to get you to understand word-embeddings through a simple toy task: binary sentiment analysis.
To start with, they make you code a simple model: take the average of all embeddings in a sentence and add a feed-forward neural net to classify this aggregated input. GlobalAveragePooling1D does this averaging.
Obviously, in the real world you'd want to use more complex models such as RNNs, LSTMs, bidirectional models, atrous-convolution-based models or Transformers, but that's not the point of this tutorial.
The "simple model" they mention is a feed-forward neural net, which expects a fixed input dimension. When you have sequential data of variable length, you need to address this somehow: averaging, padding, cropping, etc. Here they average with the GlobalAveragePooling1D layer.
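To make the averaging concrete, here is a minimal NumPy sketch of what global average pooling does to the embedding output. The toy values and the padding mask are assumptions made for illustration. Without the pooling step, the Dense layers would be applied to each time step separately, and the model would output one value per token instead of one per sentence.

```python
import numpy as np

# Toy batch: 2 sentences, 4 time steps, embedding_dim = 3.
embedded = np.array([
    [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0], [6.0, 6.0, 6.0]],
])

# Unmasked pooling: average over the time axis, (batch, steps, dim) -> (batch, dim).
pooled = embedded.mean(axis=1)

# Masked variant: ignore padded steps so short sentences are not diluted.
# Assumption: the first sentence has 2 real tokens followed by 2 padding steps.
mask = np.array([[1.0, 1.0, 0.0, 0.0],
                 [1.0, 1.0, 1.0, 1.0]])
masked_pooled = (embedded * mask[:, :, None]).sum(axis=1) / mask.sum(axis=1, keepdims=True)

print(pooled.shape)      # (2, 3)
print(masked_pooled[0])  # [2. 2. 2.]
```

Whatever the sentence length, the pooled output has one fixed-size vector per sentence, which is what the feed-forward part of the model requires.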

How to do fine-tuning in tensorflow with notop layers and define my own input image size

There are many examples of how to do fine-tuning with TensorFlow. Almost all of them resize the images to the size that the existing model expects; for example, 224×224 is the input size that VGG19 expects. However, in Keras we can change the input size by setting include_top to False:
base_model = VGG19(include_top=False, weights="imagenet", input_shape=(input_size, input_size, input_channels))
Then we do not have to fix the image size at 224×224 anymore. Can we do this kind of fine-tuning with the official pre-trained models in TensorFlow? I have not been able to find a solution so far. Can anyone help?
Yes, it is possible to do this kind of fine-tuning. You would just have to ensure that you also fine-tune some of the first few layers (to account for changed input) of the original network in addition to the last few layers (to account for changed output).
I work with TensorFlow using Keras. If you are open to that, then there is a code snippet that shows the general fine-tuning flow here:
https://keras.io/applications/
Specifically, I had to write the following code to make it work for my case:
from keras.applications.vgg19 import VGG19
from keras.layers import Input
from keras.models import Model

#Keras expects (height, width, channels) for channels-last inputs; 3 is the number of channels
input_tensor = Input(shape=(img_height, img_width, 3))
base_model = VGG19(include_top=False, weights='imagenet', input_tensor=input_tensor)
#instantiate whatever other layers you need on top of base_model.output
#predictions is the new logistic layer added to account for new classes
model = Model(inputs=base_model.inputs, outputs=predictions)
Hope this helps.

Tensorflow embeddings

I know what embeddings are and how they are trained. While reading the TensorFlow documentation, I came across two different articles, and I would like to know exactly what the difference between them is.
link 1: Tensorflow | Vector Representations of words
In the first tutorial, they have explicitly trained embeddings on a specific dataset. There is a distinct session run to train those embeddings. I can then later save the learnt embeddings as a NumPy object and use the tf.nn.embedding_lookup() function while training an LSTM network.
link 2: Tensorflow | Embeddings
In this second article however, I couldn't understand what is happening.
word_embeddings = tf.get_variable("word_embeddings",
    [vocabulary_size, embedding_size])
embedded_word_ids = tf.gather(word_embeddings, word_ids)
This is given under the training embeddings sections. My doubt is: does the gather function train the embeddings automatically? I am not sure since this op ran very fast on my pc.
More generally: what is the right way to convert words into vectors (link 1 or link 2) in TensorFlow for training a seq2seq model? Also, how should I train the embeddings for a seq2seq dataset, given that the data for my task consists of separate sequences rather than one continuous sequence of words (as in the link 1 dataset)?
I have found the answer to this question and am posting it so that others might benefit from it.
The first link is more of a tutorial that steps you through exactly how the embeddings are learnt.
In practical cases, such as training seq2seq models or any other encoder-decoder models, we use the second approach, where the embedding matrix gets tuned while the model is trained. tf.gather itself does no training; it only looks up rows of the word_embeddings variable, and that variable is updated by the optimizer along with the rest of the model's weights.
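As a rough illustration of why the lookup runs fast and yet the embeddings still get trained, here is a minimal NumPy sketch. It uses a made-up squared-error loss and a hand-written SGD step as stand-ins for a real model and optimizer; `loss_fn`, `targets`, and the learning rate are assumptions for the example. It shows that the gather is plain row selection, and that a training step only changes the rows that were looked up.

```python
import numpy as np

rng = np.random.default_rng(0)
vocabulary_size, embedding_size = 5, 2
word_embeddings = rng.normal(size=(vocabulary_size, embedding_size))
before = word_embeddings.copy()

word_ids = np.array([1, 3, 3])            # ids appearing in one toy batch
targets = np.zeros((3, embedding_size))   # hypothetical regression targets

def loss_fn(matrix):
    # The lookup itself is plain row selection; it has no training logic.
    gathered = matrix[word_ids]
    return 0.5 * np.sum((gathered - targets) ** 2)

loss_before = loss_fn(word_embeddings)

# Gradient of the loss w.r.t. the matrix: nonzero only on the gathered rows.
# np.add.at accumulates contributions for ids that occur more than once.
grad = np.zeros_like(word_embeddings)
np.add.at(grad, word_ids, word_embeddings[word_ids] - targets)

# One SGD step stands in for what the optimizer does on each training step.
word_embeddings -= 0.1 * grad
loss_after = loss_fn(word_embeddings)

print(loss_after < loss_before)                        # True
print(np.array_equal(before[0], word_embeddings[0]))   # True: row 0 was never looked up
```

In TensorFlow the gradient and the update are produced automatically once the gathered embeddings feed into a loss that an optimizer minimizes.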

DeepLearning Anomaly Detection for images

I am still relatively new to the world of Deep Learning. I wanted to create a Deep Learning model (preferably using Tensorflow/Keras) for image anomaly detection. By anomaly detection I mean, essentially a OneClassSVM.
I have already tried sklearn's OneClassSVM using HOG features from the image. I was wondering if there is some example of how I can do this in deep learning. I looked up but couldn't find one single code piece that handles this case.
One way of doing this in Keras is with the KerasRegressor wrapper (it wraps scikit-learn's regressor interface). Useful information can also be found in the source code of that module. Basically, you first have to define your network model, for example:
from keras.layers import Input, Dense
from keras.models import Model

def simple_model():
    #Input layer
    data_in = Input(shape=(13,))
    #First layer, fully connected, ReLU activation
    layer_1 = Dense(13, activation='relu', kernel_initializer='normal')(data_in)
    #Second layer...etc
    layer_2 = Dense(6, activation='relu', kernel_initializer='normal')(layer_1)
    #Output, single node without activation
    data_out = Dense(1, kernel_initializer='normal')(layer_2)
    #Build and compile the model
    model = Model(inputs=data_in, outputs=data_out)
    #you may choose any loss or optimizer, but be careful which you choose
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
Then, pass it to the KerasRegressor builder and fit with your data:
from keras.wrappers.scikit_learn import KerasRegressor
#choose your epochs and batch size
regressor = KerasRegressor(build_fn=simple_model, epochs=100, batch_size=64)
#fit with your data
regressor.fit(data, labels)
You can now make predictions or obtain a score:
p = regressor.predict(data_test) #obtain predicted value
score = regressor.score(data_test, labels_test) #obtain test score
In your case, since you need to tell anomalous images apart from the ones that are ok, one approach is to train your regressor on anomalous images labeled 1 and ok images labeled 0.
This will make your model return a value closer to 1 when the input is an anomalous image, which lets you threshold the output to get the desired result. You can loosely think of this output as a score for how closely the input matches the "anomalous" class you trained as 1 (perfect match).
Autoencoders are another way to do anomaly detection. For this I suggest you take a look at the Keras Blog post Building Autoencoders in Keras, which explains in detail how to implement them with the Keras library.
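To show the idea behind autoencoder-based anomaly detection without a full Keras model, here is a minimal NumPy sketch of scoring by reconstruction error. It is not a neural autoencoder: a linear projection fitted with SVD (PCA) is swapped in as a stand-in for the trained encoder and decoder, the data is synthetic, and the 99th-percentile threshold is an assumption. The point is only the workflow: fit on normal data, then flag inputs that reconstruct poorly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened image features: normal samples lie close to
# a 2-dimensional subspace of a 10-dimensional space.
basis = rng.normal(size=(2, 10))
normal_train = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 10))
normal_test = rng.normal(size=(20, 2)) @ basis + 0.01 * rng.normal(size=(20, 10))
anomalies = rng.normal(size=(20, 10)) * 3.0   # do not follow the subspace

# "Train" the linear encoder/decoder on normal data only (PCA via SVD).
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = vt[:2]   # encoder weights; the decoder is the transpose

def reconstruction_error(x):
    code = (x - mean) @ components.T    # encode
    recon = code @ components + mean    # decode
    return np.mean((x - recon) ** 2, axis=1)

# Threshold chosen from the training errors; the 99th percentile is an assumption.
threshold = np.percentile(reconstruction_error(normal_train), 99)

flagged_anomalies = (reconstruction_error(anomalies) > threshold).mean()
flagged_normal = (reconstruction_error(normal_test) > threshold).mean()
print(flagged_anomalies, flagged_normal)   # fraction flagged in each set
```

With a Keras autoencoder the same scoring would apply: train it on ok images only, compute the per-image reconstruction error, and threshold it.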
It is worth noting that, in this setup, single-class classification is effectively treated as regression.
Classification tries to find a probability distribution over the N possible classes, and you usually pick the most probable class as the output (that is why classification networks usually end in a sigmoid or softmax activation, whose outputs lie in the range [0, 1]). Its output is discrete/categorical.
Regression, by contrast, tries to find the model that best represents your data by minimizing an error or optimizing some other metric (like the well-known R^2 metric, or coefficient of determination). Its output is a real number/continuous (which is why most regression networks don't use an activation on their output). I hope this helps, good luck with your coding.

Tensorflow: jointly training CNN + LSTM

There are quite a few examples on how to use LSTMs alone in TF, but I couldn't find any good examples on how to train CNN + LSTM jointly.
From what I see, it is not quite straightforward how to do such training, and I can think of a few options here:
First, I believe the simplest solution (or the most primitive one) would be to train CNN independently to learn features and then to train LSTM on CNN features without updating the CNN part, since one would probably have to extract and save these features in numpy and then feed them to LSTM in TF. But in that scenario, one would probably have to use a differently labeled dataset for pretraining of CNN, which eliminates the advantage of end to end training, i.e. learning of features for final objective targeted by LSTM (besides the fact that one has to have these additional labels in the first place).
The second option would be to concatenate all time slices along the batch dimension (giving a 4-D tensor), feed it to the CNN, then somehow repack the features into the 5-D tensor needed for training the LSTM, and then apply a cost function. My main concern is whether it is possible to do such a thing. Handling variable-length sequences also becomes a little tricky; for example, in a prediction scenario you would feed only a single frame at a time. I would be really happy to see some examples if that is the right way of doing joint training. Besides that, this solution looks more like a hack, so if there is a better way to do it, it would be great if someone could share it.
Thank you in advance !
For joint training, you can consider using tf.map_fn as described in the documentation https://www.tensorflow.org/api_docs/python/tf/map_fn.
Let's assume that the CNN is built along similar lines as described here https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10.py.
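import tensorflow as tf

#inference() is assumed to be the CNN-building function from the cifar10 example linked above
def joint_inference(sequence):
    #run the CNN on every element of the sequence
    inference_fn = lambda image: inference(image)
    logit_sequence = tf.map_fn(inference_fn, sequence, dtype=tf.float32, swap_memory=True)
    #feed the CNN outputs to the LSTM; dynamic_rnn needs a dtype when no initial_state is given
    lstm_cell = tf.contrib.rnn.LSTMCell(128)
    output_state, intermediate_state = tf.nn.dynamic_rnn(cell=lstm_cell, inputs=logit_sequence, dtype=tf.float32)
    #project the LSTM outputs to num_classes values
    projection_function = lambda state: tf.contrib.layers.linear(state, num_outputs=num_classes, activation_fn=tf.nn.sigmoid)
    projection_logits = tf.map_fn(projection_function, output_state)
    return projection_logits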
Warning: You might have to look into device placement as described here https://www.tensorflow.org/tutorials/using_gpu if your model is larger than the memory the GPU can allocate.
An alternative would be to flatten the video batch into an image batch, do a forward pass through the CNN, and reshape the features for the LSTM.
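The flatten-and-reshape alternative can be sketched with plain array reshapes. This is a minimal NumPy illustration under stated assumptions: `toy_cnn` is a hypothetical stand-in for the real CNN (it just averages over the spatial axes), and the tensor sizes are made up. It shows that folding time into the batch axis and unfolding it afterwards keeps every frame's features in the right sequence position.

```python
import numpy as np

batch, steps, height, width, channels = 2, 5, 8, 8, 3
video = np.random.default_rng(0).normal(size=(batch, steps, height, width, channels))

def toy_cnn(images):
    """Hypothetical stand-in for a CNN: maps (N, H, W, C) to (N, feature_dim)."""
    return images.mean(axis=(1, 2))   # global average over space -> (N, C)

# 1) Fold time into the batch axis: (B, T, H, W, C) -> (B*T, H, W, C).
frames = video.reshape(batch * steps, height, width, channels)
# 2) One forward pass over all frames at once.
features = toy_cnn(frames)
# 3) Unfold back into sequences for the LSTM: (B*T, F) -> (B, T, F).
sequence_features = features.reshape(batch, steps, -1)

print(sequence_features.shape)  # (2, 5, 3)
```

In TensorFlow the same two reshapes would be done with tf.reshape around the CNN. Both are differentiable, so gradients from the LSTM loss flow back into the CNN and the two parts train jointly.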