General usefulness of Dense layers for different identification tasks - tensorflow

I'd like to ask: is it practical to use embeddings and similarity metrics for any kind of identification task? If I had a neural network trained to find different objects in a photo, would extracting the activations of the fully-connected/Dense layers and clustering them be useful?
I recently found TensorFlow's embedding projector tool, which is very cool and useful. I know there has been work on word embeddings and how similar words cluster together, and the same holds for faces.
Having said that, I want to apply the same methods to analyzing geological sites: can I train a model to create embeddings of a site's features and use clustering methods to classify them?

Yes, you can do that. You can extract embeddings for images and visualize them in TensorBoard.
You can adapt the Fashion-MNIST embedding example found here to your use case.
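For concreteness, here is a minimal sketch of the idea in Keras. The small stand-in network, the layer name "dense_features", and the choice of k are all hypothetical; substitute your own trained model and data:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers
    from sklearn.cluster import KMeans

    # Stand-in for your trained object-recognition network; in practice,
    # load your own model. The layer name "dense_features" is hypothetical.
    model = tf.keras.Sequential([
        layers.Input(shape=(64, 64, 3)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(32, activation="relu", name="dense_features"),
        layers.Dense(10, activation="softmax"),
    ])

    # Cut the network at the penultimate Dense layer: its activations
    # serve as the embedding of each input image.
    embedding_model = tf.keras.Model(
        inputs=model.input,
        outputs=model.get_layer("dense_features").output)

    images = np.random.rand(100, 64, 64, 3).astype("float32")  # stand-in data
    embeddings = embedding_model.predict(images)  # shape (100, 32)

    # Cluster the images in embedding space; k=5 is an arbitrary choice.
    clusters = KMeans(n_clusters=5).fit_predict(embeddings)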

Related

How to use google inception model to classify DNA or protein sequences data sets?

I'm trying to classify proteins into their families using their sequences. Can I use deep convolutional models for this purpose even though they take the 3-channel RGB matrices of an image as input? Is there a specific way to convert a non-image dataset so it can be classified with these models? I'm new to artificial neural networks; your suggestions are highly appreciated.
First, you need to understand that the models you have in mind are tasked with a very difficult problem, object recognition in colored images, which is why those models are very big.
You should also know that the purpose of using CNNs is to extract as many features as possible from colored images in order to perform detection.
With the above in mind, I think classifying proteins by their sequences is achievable with a much smaller convolutional model; you may need at most 10 convolutional layers. In short, you should not need a CNN as complex as the Google Inception model.
About your data: there is no rule that says CNNs can only be used on RGB pictures. These pictures are just arrays. If you have any kind of numeric data that can be used in algorithmic operations, you can definitely use CNNs for feature extraction, as sketched below. I recommend you take a look at this example.
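As a rough illustration of that point, here is a minimal sketch of a small 1D convolutional classifier over protein sequences. It assumes the sequences are one-hot encoded over the 20 standard amino acids and padded to a fixed length; the sequence length and family count are hypothetical:

    import tensorflow as tf
    from tensorflow.keras import layers

    # Hypothetical sizes: padded sequence length, amino-acid alphabet,
    # and number of protein families.
    seq_len, n_amino_acids, n_families = 500, 20, 10

    model = tf.keras.Sequential([
        layers.Input(shape=(seq_len, n_amino_acids)),  # one-hot sequences
        layers.Conv1D(64, kernel_size=7, activation="relu"),
        layers.MaxPooling1D(3),
        layers.Conv1D(128, kernel_size=7, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(n_families, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])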
I also recommend you take a look at the following libraries: scikit-learn, Keras, and PyTorch. They are very beginner-friendly and have excellent documentation.
Best of luck.

Classification of a sequence of images (fixed number)

I successfully trained a CNN for a single image classification, using pre-trained resnet50 from tensorflow_hub.
Now my goal is to give as input to my network a chronological sequence of images (not a video), to classify the behavior of the subject.
Each sequence consists of 20 images taken every 100ms.
What is the best kind of NN? Where can I find documentation/examples for problems similar to mine?
Any time you have sequential data, some type of recurrent neural network is a great candidate (usually in the form of an LSTM).
Your model may look like a CNN-LSTM combination, because your pictures have a sequential relationship.
Here is a link to some examples and tutorials. The author sets up a CNN in his example, but you could probably rig the architecture to use the ResNet you have already trained. Though you are not dealing with a video, your problem shares the same domain.
Here is a paper that uses a NN architecture like the one described above, which you might find useful.
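As a sketch of what such a CNN-LSTM could look like in Keras (the frame size, class count, and frozen ResNet50 backbone are assumptions; adapt them to your own setup):

    import tensorflow as tf
    from tensorflow.keras import layers

    n_frames, h, w, n_classes = 20, 224, 224, 5  # 20 frames per sequence

    # Pre-trained per-frame feature extractor, frozen for simplicity.
    cnn = tf.keras.applications.ResNet50(
        include_top=False, pooling="avg", input_shape=(h, w, 3))
    cnn.trainable = False

    model = tf.keras.Sequential([
        layers.Input(shape=(n_frames, h, w, 3)),
        layers.TimeDistributed(cnn),   # per-frame features: (n_frames, 2048)
        layers.LSTM(128),              # aggregate the sequence over time
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])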

How to know what Tensorflow actually "sees"?

I'm using a CNN built with Keras (TensorFlow) to do visual recognition.
I wonder if there is a way to know what my own TensorFlow model actually "sees".
Google published a story showing the cat face that emerged inside an AI "brain":
https://www.smithsonianmag.com/innovation/one-step-closer-to-a-brain-79159265/
Can anybody tell me how to extract such images from my own CNN?
For example, how does my own CNN model recognize a car?
We have to distinguish between what TensorFlow actually sees:
As we go deeper into the network, the feature maps look less like the original image and more like an abstract representation of it. As you can see in block3_conv1 the cat is somewhat visible, but after that it becomes unrecognizable. The reason is that deeper feature maps encode high-level concepts like "cat nose" or "dog ear" while lower-level feature maps detect simple edges and shapes. That's why deeper feature maps contain less information about the image and more about the class of the image. They still encode useful features, but they are less visually interpretable by us.
and what we can reconstruct from it via some kind of reverse "deconvolution" process (which is not a true mathematical deconvolution).
To answer your actual question, there are plenty of good example solutions out there; one you can study with success is: Visualizing output of convolutional layer in tensorflow.
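As a minimal sketch of that approach, you can cut a Keras model at an intermediate layer and plot the resulting feature maps. Here VGG16 and its layer block3_conv1 (the layer mentioned in the quote above) stand in for your own network, and a random array stands in for a real image:

    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt

    model = tf.keras.applications.VGG16(weights="imagenet")
    # Sub-model that outputs the feature maps of an intermediate layer.
    feature_extractor = tf.keras.Model(
        inputs=model.input,
        outputs=model.get_layer("block3_conv1").output)

    image = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in input
    feature_maps = feature_extractor(image)  # shape (1, 56, 56, 256)

    # Each channel is one feature map; plot the first few.
    for i in range(4):
        plt.subplot(1, 4, i + 1)
        plt.imshow(feature_maps[0, :, :, i], cmap="viridis")
        plt.axis("off")
    plt.show()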
When you build a model to perform visual recognition, you give it labelled data (pictures, in this case) to recognize, so that it can adjust its weights according to the training data. If you wish to build a model that can recognize a car, you have to train it on a large training set of labelled pictures. This type of recognition is basically categorical recognition.
You can experiment with the MNIST dataset, which provides pictures of digits for image recognition.
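For example, a minimal categorical recognition experiment on MNIST might look like this (the architecture and single epoch are illustrative only):

    import tensorflow as tf
    from tensorflow.keras import layers

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None] / 255.0  # add channel dim, scale to [0, 1]
    x_test = x_test[..., None] / 255.0

    model = tf.keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))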

How to perform multi-label classification with tensorflow for the purpose of auto-tagging?

I'm new to tensorflow and would like to know if there is any tutorial or example of a multi-label classification with multiple network outputs.
I'm asking this because I have a collection of articles in which each article can have several tags.
Out of the box, tensorflow supports binary multi-label classification via tf.nn.sigmoid_cross_entropy_with_logits loss function or the like (see the complete list in this question). If your tags are binary, in other words there's a predefined set of possible tags and each one can either be present or not, you can safely go with that. A single model to classify all labels at once. There are a lot of examples of such networks, e.g. one from this question.
Unfortunately, multinomial multi-label classification is not supported out of the box in tensorflow. If this is your case, you'd have to build a separate classifier for each label, each using tf.nn.softmax_cross_entropy_with_logits or a similar loss.
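As a minimal sketch of the binary multi-label case described above, in Keras (the feature and tag counts are hypothetical; binary_crossentropy on a sigmoid output is the Keras counterpart of tf.nn.sigmoid_cross_entropy_with_logits):

    import tensorflow as tf
    from tensorflow.keras import layers

    n_features, n_tags = 10000, 50  # hypothetical sizes

    model = tf.keras.Sequential([
        layers.Input(shape=(n_features,)),  # e.g. bag-of-words article vectors
        layers.Dense(256, activation="relu"),
        # Sigmoid gives an independent probability per tag.
        layers.Dense(n_tags, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # At prediction time, threshold each tag probability independently:
    # predicted_tags = model.predict(x) > 0.5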

What is meant by visualizing an embedding space (neural network)?

I was reading an activity recognition paper, https://arxiv.org/pdf/1705.07750.pdf. Here, they use 3D convolutions on Inception-v1 to perform activity recognition. I was listening to a talk that mentioned visualizing the embedding space of the features extracted from the video.
1) What does it mean to visualize an embedding space? Are you looking at the filters it has learnt, or looking for clusterings of similar activities?
2) Do you just visualize the weight matrix to see the features it is capturing? If yes, which weight matrix?
3) Does tf.summary.image() help in visualizing the weight matrix?
The embedding space is the space of the features produced by some learning algorithm. In the specific case of a (convolutional) neural network, this usually means one of the output feature maps (flattened) at some predefined layer or the output of one of the fully connected layers.
What one would visualize is not the weight matrix, but the values of the produced features for some input test data. For example one takes the full test set and passes it through the network and computes the features for each image at a specific layer, and then visualizes those values.
TensorBoard has functionality to automatically visualize embeddings and other feature spaces, you should take a look at it.
Note that in some application contexts like NLP an embedding has a slightly different definition but the use is the same.
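As a minimal sketch of exporting such features to TensorBoard's Embedding Projector (the log directory and the random stand-in features are placeholders; in practice, use the activations you computed from your test set):

    import os
    import numpy as np
    import tensorflow as tf
    from tensorboard.plugins import projector

    log_dir = "logs/embeddings"
    os.makedirs(log_dir, exist_ok=True)

    # Stand-in features: replace with your network's activations for
    # the test set at the chosen layer.
    features = np.random.rand(1000, 128).astype("float32")

    embedding_var = tf.Variable(features, name="embedding")
    checkpoint = tf.train.Checkpoint(embedding=embedding_var)
    checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

    config = projector.ProjectorConfig()
    emb = config.embeddings.add()
    emb.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
    projector.visualize_embeddings(log_dir, config)
    # Then run: tensorboard --logdir logs/embeddings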