Can I use the TensorFlow Object Detection API for gender recognition?
I want to train SSD MobileNet for gender detection and recognition. I changed the label map to:
item {
id: 1
name: 'man'
}
item {
id: 2
name: 'woman'
}
and set num_classes=2.
Training reached a loss of about 8, but when I feed an image to the network to test it, the results are awful.
What should I do? Can somebody help me?
For this kind of task you will need a large dataset and a very long training time unless you have access to powerful hardware. Joking aside, this is a difficult problem that needs careful analysis: to a computer, men and women have almost the same features, even though a human can tell them apart at a glance, much as a model struggles to tell a male dog from a female one. You should still try it. It is a nice idea with many applications if you can get it to work well. Good luck, and let me know how it goes.
You can. The method to follow is:
Use SSD to find the location of the object of interest (here, the face).
Get the relevant region of the conv5 feature map (assuming you use VGG). For example, if the object is found at (100, 100, 100, 100) in XYWH format within a (300, 300) input image, cut the conv5 features at (12, 12, 12, 12) in XYWH format. The math is (100 / 300) * 38, which is about 12.67 and is rounded down to 12, where 38 is the assumed spatial size of the feature map.
You now have the activations cut from conv5 (12 x 12 x 512), which cover only the face whose gender you want to predict.
Flatten these activations and apply a DNN classifier to them (e.g. the classifier used in VGG).
Get binary output stating either male or female.
Train your network by adding gender loss to global loss function.
Voila. You have the gender estimation network.
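The coordinate mapping and crop from the steps above can be sketched as follows. This is a minimal NumPy illustration, not the Object Detection API's own code: the function name crop_box_features is hypothetical, and the 38 x 38 x 512 feature map is the size assumed in the example above. Since detected boxes vary in size, the crop would normally be resized to a fixed shape before it is flattened for a dense classifier.

```python
import numpy as np

def crop_box_features(feature_map, box_xywh, image_size):
    """Cut the region of a feature map that corresponds to an image-space box.

    feature_map: array of shape (H, W, C), e.g. (38, 38, 512) (assumed size)
    box_xywh:    (x, y, w, h) of the detected box in input-image pixels
    image_size:  (width, height) of the input image, e.g. (300, 300)
    """
    feat_h, feat_w = feature_map.shape[:2]
    img_w, img_h = image_size
    x, y, w, h = box_xywh
    # Scale image coordinates to feature-map coordinates and round down.
    fx = int(x / img_w * feat_w)
    fy = int(y / img_h * feat_h)
    # Keep at least one cell so that tiny boxes still yield features.
    fw = max(1, int(w / img_w * feat_w))
    fh = max(1, int(h / img_h * feat_h))
    return feature_map[fy:fy + fh, fx:fx + fw, :]

# Example from the answer: box (100, 100, 100, 100) in a 300 x 300 image.
features = np.zeros((38, 38, 512), dtype=np.float32)
face_features = crop_box_features(features, (100, 100, 100, 100), (300, 300))
flat = face_features.reshape(-1)  # input for the gender classifier
```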
Related
I am following the https://codelabs.developers.google.com/codelabs/tfjs-training-regression/index.html#7 tutorial and trying to understand how to use tensorflow.js - I have a pretty simple problem I'm trying to solve, but no idea how to get there. Rather than mpg and horsepower in the tutorial, I have data with
{rating: x, value: y} - Rating being a scale of 1-5, and value being a float. What I'm trying to do is make the model more likely to predict closer to a value that has a rating of 5, rather than randomly. When I evaluated the predictions, I noticed that wasn't happening, because the model doesn't know that I want it to be more biased based on the rating. Can anyone explain how to do this? (And if there is a machine learning term that describes what I'm trying to do?)
I am new to ML. I have a Person table with the following columns:
-----------------------------------
User
-----------------------------------
UserId | UserName | UserPicturePath
1 | MyName | MyName.jpeg
I have tens of millions of people in my database. I want to train a model that predicts the UserId from an image (PNG/JPEG/TIFF) supplied as bytes. So the input is an image and the output I am looking for is a UserId. Right now I am looking for a solution in ML.NET, but I am open to switching to TensorFlow.
This is a mapping problem, specifically a face-to-id mapping problem, and neural nets are well suited to it.
As you may have gathered, you can do this with TensorFlow, PyTorch or any similar library.
If you want to use TensorFlow, there is a link to ready-made code at the end. The easiest way to do this is transfer learning: load a pretrained model, freeze all but the last layer, and train the network to produce a latent one-dimensional vector (an embedding) for a given face image. You can then save this vector in a database and map it to an id.
Then, whenever there is a new image and you want to predict its id, run the image through the network, get its vector and compute the cosine similarity with the vectors in your database. If the highest similarity is above some threshold, you have found your id.
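The lookup step could look roughly like this. It is a brute-force NumPy sketch under stated assumptions: find_user_id is a hypothetical helper, the embeddings are assumed to be already computed by the network, and the threshold of 0.8 is an arbitrary placeholder that would have to be tuned on real data. With tens of millions of rows, a full scan like this would probably need to be replaced by an approximate nearest-neighbour index.

```python
import numpy as np

def find_user_id(query, db_vectors, db_ids, threshold=0.8):
    """Return (user_id, similarity) of the best match, or (None, similarity)
    if the best cosine similarity is below the threshold.

    query:      embedding of the new image, shape (D,)
    db_vectors: stored embeddings, shape (N, D)
    db_ids:     the N user ids, in the same order as db_vectors
    threshold:  hypothetical cut-off, to be tuned on validation data
    """
    # Normalise so that a dot product equals the cosine similarity.
    q = query / np.linalg.norm(query)
    db = db_vectors / np.linalg.norm(db_vectors, axis=1, keepdims=True)
    sims = db @ q
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return db_ids[best], float(sims[best])
    return None, float(sims[best])

# Toy database with two users and 2-dimensional embeddings.
db_vectors = np.array([[1.0, 0.0], [0.0, 1.0]])
db_ids = [10, 20]
match_id, match_sim = find_user_id(np.array([0.9, 0.1]), db_vectors, db_ids)
unknown_id, unknown_sim = find_user_id(np.array([1.0, 1.0]), db_vectors, db_ids)
```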
There are many ways to go about this. You will of course have to preprocess and augment your data, but if you want some ready-made code to play with, have a look at the well-known Happy House tutorial from Andrew Ng and his team:
https://github.com/gemaatienza/Deep-Learning-Coursera/blob/master/4.%20Convolutional%20Neural%20Networks/Keras%20-%20Tutorial%20-%20Happy%20House%20v2.ipynb
This should be enough for your needs.
Hope it helps!
I use the cleverhans implementation of CW to produce adversarial examples on ImageNet. The target model is InceptionV3 (from Keras) and I want to use CW for a targeted attack. But when I save the adversarial images, they have changed a lot from the original images. I think I may be using the wrong parameters:
cw_params = {'binary_search_steps': 10,
             'y_target': None,  # I specify y_target later
             'max_iterations': 20000,
             'learning_rate': .0002,
             'batch_size': 1,
             'initial_const': 10}
I have tried many parameter settings, but I still cannot reproduce the results in Carlini's paper. With these parameters the running time is also very long, and I don't know how long it should take.
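# Just the key parts of the code. item_in_seed (image path), wrap (the wrapped
# Keras model) and sess (the TensorFlow session) are defined elsewhere.
# The imports below are assumed; they are not shown in the original snippet.
import numpy as np
from keras.preprocessing import image
from cleverhans.attacks import CarliniWagnerL2

# Load one image at InceptionV3's input size and add a batch dimension.
temp_seeds = np.array(image.load_img(item_in_seed, target_size=(299, 299)))
temp_seeds = np.expand_dims(temp_seeds, axis=0)
cw = CarliniWagnerL2(wrap, sess=sess)
cw_params = {'binary_search_steps': 10,
             'y_target': None,  # I specify y_target later
             'max_iterations': 20000,
             'learning_rate': .0002,
             'batch_size': 1,
             'initial_const': 10}
adv = cw.generate_np(temp_seeds, **cw_params)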
The successful targeted adversarial examples differ a lot from the original ImageNet images. How can I get perturbations as small and results as good as those in the CW paper?
It is difficult to pin down the specific problem you are facing from this description but here are two suggestions:
Make sure the input domain is easy to optimize over (the CW paper has a change of variables to ensure that box constraints are respected).
Make sure that you are passing the right values from the model to the attack when it comes to computing the adversary's loss. It is often the case that numerical instabilities will prevent attacks from properly functioning.
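To illustrate the first suggestion, here is a small NumPy sketch of the tanh change of variables described in the CW paper. The attack optimizes an unconstrained variable w, and the image is recovered as a scaled tanh(w), so it always stays inside the valid pixel range. The function names and the default range of [0, 1] are assumptions for this example and are not the cleverhans API. Whatever range is used has to match the range of the images actually fed to the model.

```python
import numpy as np

def to_tanh_space(x, clip_min=0.0, clip_max=1.0, eps=1e-6):
    """Map an image with values in [clip_min, clip_max] to unconstrained w."""
    scaled = (x - clip_min) / (clip_max - clip_min)      # now in [0, 1]
    scaled = np.clip(scaled * 2.0 - 1.0, -1 + eps, 1 - eps)  # open (-1, 1)
    return np.arctanh(scaled)

def from_tanh_space(w, clip_min=0.0, clip_max=1.0):
    """Map any real-valued w back into [clip_min, clip_max]."""
    return (np.tanh(w) + 1.0) / 2.0 * (clip_max - clip_min) + clip_min

x = np.array([0.0, 0.25, 1.0])
w = to_tanh_space(x)
x_back = from_tanh_space(w)

# Any update to w, however large, still yields a valid image.
x_perturbed = from_tanh_space(w + np.array([50.0, -50.0, 3.0]))
```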
Hope this helps!
My task here is to find a way to get a suggested value of the most important feature or features. By changing into the suggested values of the features, I want the classification result to change as well.
Snapshot of dataset
These are the steps I have tried so far:
Import dataset (shape: 1162 by 22)
Build a simple neural network (2 hidden layers)
Since the dependent variable is simply either 0 or 1 (classification problem), I onehot-encoded the variable. So it's either [0, 1] or [1,0]
After splitting into train & test data, I train my NN model and got accuracy of 77.8%
To know which feature (out of 21) is the most important one in the determination of either 0 or 1, I trained the data using Random Forest classifier (scikit-learn) and also got 77.8% accuracy and then used the 'feature_importances_' offered by the random forest classifier.
As a result, I found out that a feature named 'a_L4' ranks the highest in terms of relative feature importance.
The feature 'a_L4' can take values from 0 to 360 since it is an angle. In the original dataset, 'a_L4' takes only 12 values: [5, 50, 95, 120, 140, 160, 185, 230, 235, 275, 320, 345].
I augmented the original dataset by adding all 12 possible values for each case, giving a new dataset of shape (1162x12 by 22).
I imported the augmented dataset and tested it on the previously trained NN model. The result was a FAILURE. There was hardly any change in the classification, meaning almost no '1's switched to '0's.
My conclusion was that changing the values of 'a_L4' was not enough to bring a change in the classification. So I additionally did the same procedure again for the 2nd most important feature which in this case was 'b_L7_p1'.
So writing all the possible values that the two most important features can have, now the new dataset becomes the shape of (1162x12x6 by 22). 'b_L7_p1' is allowed to have 6 different values only, thus the multiplication by 6.
Again the result was a FAILURE.
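The manual sweep in the steps above can be sketched in a few lines instead of building an augmented dataset by hand. This is only an illustration: sweep_feature is a hypothetical helper, predict stands for any function that returns class labels for a batch of rows (for example, the argmax of the trained network's output), and the toy rule at the end stands in for the real model.

```python
import numpy as np

def sweep_feature(predict, X, col, candidates):
    """For every row, try each candidate value in column `col` and collect
    the values that change the predicted class.

    Returns a dict {row_index: [values that flip the prediction]}.
    """
    base = predict(X)
    flips = {}
    for i, row in enumerate(X):
        # One copy of the row per candidate value of the swept feature.
        trials = np.tile(row, (len(candidates), 1))
        trials[:, col] = candidates
        changed = [v for v, p in zip(candidates, predict(trials)) if p != base[i]]
        if changed:
            flips[i] = changed
    return flips

# Toy stand-in for the trained model: class 1 if feature 0 exceeds 100.
toy_predict = lambda X: (X[:, 0] > 100).astype(int)
X_toy = np.array([[50.0, 1.0], [150.0, 2.0]])
flips = sweep_feature(toy_predict, X_toy, col=0, candidates=[5, 185])
```

Each entry in the result lists the feature values that would change the class of that row, which is the "suggested value" being asked about.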
So, my question is: what might I have done wrong in the procedure described above? Do I need to keep looking for more important features and augment the data with all the values they can take? This is a tedious task with many manual steps and it leads to a huge dataset, so I wish there were a way to construct an inference-based NN model that directly outputs suggested values for one or more features.
I am relatively new to this field of research, so could anyone please tell me some keywords that I should search for? I cannot find any work or papers on this issue on Google.
Thanks in advance.
In this case I would approach the problem in the following way:
Normalize the whole dataset. As you can see from the dataset, your features have different scales. It is very important that all features have the same scale. Have a look at: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
The second thing I would do is train and evaluate a model (it can be whatever you want) to get a so-called baseline model.
Then, I would try PCA to see whether all features are needed. Maybe you are feeding redundant features to the model. See: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
For example, if you set n_components in PCA to 0.99, you reduce the number of features while retaining 99% of the explained variance.
Then I would train the model again to see whether there is any improvement. Please note that adding the normalization alone should already give an improvement.
If I wanted to see from the dataset itself which features are important, I would use: https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectKBest.html This selects a specified number of features based on a statistical test, for example: https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html (note that chi2 requires non-negative feature values, so it cannot be applied to standardized data).
Train a model and evaluate it again to see whether there is some improvement.
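To make the normalization and the n_components=0.99 setting concrete, here is a NumPy-only sketch of roughly what those two steps compute. It is not scikit-learn's implementation: the function names and the synthetic data are made up for illustration, and in practice StandardScaler and PCA would do this for you.

```python
import numpy as np

def standardize(X):
    """Scale every column to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged
    return (X - mean) / std

def pca_keep_variance(X, target=0.99):
    """Project X onto the fewest principal components whose cumulative
    explained variance ratio exceeds `target`."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = S ** 2 / np.sum(S ** 2)
    k = int(np.searchsorted(np.cumsum(ratio), target, side="right")) + 1
    k = min(k, len(S))
    return Xc @ Vt[:k].T, ratio[:k]

# Synthetic data: 4 independent features plus an exact copy of the first,
# on very different scales.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 4))
X = np.hstack([base, base[:, :1]]) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])

X_std = standardize(X)
X_reduced, kept_ratio = pca_keep_variance(X_std, target=0.99)
```

Here the duplicated column carries no new information, so one component can be dropped without losing any of the variance.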
Also, you should be aware that the NNs can perform feature engineering by themselves, so computing feature importance is redundant in a way.
Let me know whether you will see any improvements.
I came across an experimental use of deep learning with TensorFlow: https://github.com/asrivat1/DeepLearningVideoGames. The author trained CNNs to play Pong. It all seems straightforward to me, except the visualization that illustrates the Q-values in the CNN layers. Here is the YouTube video: https://www.youtube.com/watch?v=W9jGIzkVCsM. Can anyone explain how the heat-map-like graphs are plotted?
Thx.
I dug into the code and found this file in a previous commit, but it is no longer present in the master version (weird).
Inside you will find the visualization code. The important lines are:
self.l1.imshow(np.reshape(np.rollaxis(c1, 2, 1),(20,20*32)),aspect = 6)
self.l2.imshow(np.reshape(np.rollaxis(c2, 2, 1),(5,5*64)),aspect = 12)
self.l3.imshow(np.reshape(np.rollaxis(c3, 2, 1),(3,3*64)),aspect = 12)
Here they take the activation map of size (20, 20, 32) and plot all the activations. They reshape to (20, 20*32) to plot all the feature maps (32 in total) side by side. To make it fit into the screen, they use an aspect ratio of 6, which compresses the image horizontally.
To sum it up, they plot all the feature maps side by side, and compress it to fit into the screen.
I would advise you to avoid changing the aspect ratio, and instead use a small block for each feature map (32 blocks in total) and arrange the blocks in an 8x4 layout, for instance.
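One possible way to build such a grid with NumPy is sketched below. The helper tile_feature_maps is hypothetical and not part of the original repository. It arranges the channels row by row, here as 4 rows of 8 blocks, and the result can be passed to imshow without changing the aspect ratio.

```python
import numpy as np

def tile_feature_maps(act, rows, cols):
    """Arrange an activation volume of shape (H, W, C) as a 2-D grid of
    rows x cols blocks, one block per channel (requires C == rows * cols)."""
    h, w, c = act.shape
    assert c == rows * cols, "rows * cols must equal the number of channels"
    grid = act.transpose(2, 0, 1)           # (C, H, W)
    grid = grid.reshape(rows, cols, h, w)   # channel index = row * cols + col
    grid = grid.transpose(0, 2, 1, 3)       # (rows, H, cols, W)
    return grid.reshape(rows * h, cols * w)

# Dummy activations of the first layer's shape: channel k is filled with k.
c1 = np.zeros((20, 20, 32)) + np.arange(32)
grid = tile_feature_maps(c1, rows=4, cols=8)
# e.g. plt.imshow(grid) with matplotlib, leaving the aspect ratio alone
```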