I have read many recommendation system implementations in TensorFlow, for example https://keras.io/examples/structured_data/collaborative_filtering_movielens/, but most of them can only predict ratings for known users, meaning users that were present when the model was trained.
For example, if the model was trained with 5 users, it can only predict ratings for those 5 users.
For new users who have some rating data, the model cannot make predictions without retraining the whole model.
Is it possible to predict ratings for new users who have some rating data without retraining the model?
If we look at the list of available models in Keras as shown here we see that almost all of them are instantiated with weights='imagenet'. For instance:
model = VGG16(weights='imagenet', include_top=False)
Why always imagenet? Is it because it is the baseline? If not, what other options are available?
Thank you
ImageNet is a de facto standard for image classification. A yearly contest (ILSVRC) was run on it, with over a million training images in 1000 categories. The models entered in the ImageNet classification competitions were measured against each other, so the dataset provides a "standard" measure of how good a model is at image classification. That is why so many commonly used transfer learning models ship with ImageNet weights. If you are using transfer learning, you can customize the model for your application by adding additional layers to it. You do not have to use the ImageNet weights, but it is generally beneficial because it helps the model converge in fewer epochs. I use them, but I also set all layers to be trainable, which helps adapt the weights of the model to the application.
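To make the options concrete, here is a minimal sketch (my own illustration, not part of the answer above). As far as I know, the weights argument of the Keras application models accepts None (random initialisation), 'imagenet', or a path to a weights file. The sketch passes weights=None only so that it runs without downloading anything; for real transfer learning you would pass 'imagenet'. The 32x32 input size and num_classes = 10 are made-up values.

```python
from tensorflow import keras
from tensorflow.keras.applications import VGG16

# weights=None -> random initialisation, nothing is downloaded.
# weights="imagenet" would load the pre-trained ImageNet weights instead.
base = VGG16(weights=None, include_top=False, input_shape=(32, 32, 3))

# Keep every layer of the base trainable so its weights can adapt
# to the new task (this is the Keras default, stated here explicitly).
base.trainable = True

num_classes = 10  # hypothetical number of classes in your own dataset

# Add a small custom head on top of the convolutional base.
inputs = keras.Input(shape=(32, 32, 3))
x = base(inputs)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
model = keras.Model(inputs, outputs)

model.summary()
```

With pre-trained weights and all layers trainable, a small learning rate is usually a sensible starting point so the first updates do not wipe out what the base already learned.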
I am using Google's Dopamine framework to train a specific reinforcement learning use case. I am using an autoencoder to pre-train the convolutional layers of the Deep Q Network and then transfer those pre-trained weights to the final network.
To that end, I have created a separate model (in this case an autoencoder), which I train, and then I save the resulting model and weights.
The DQN model is created using Keras's model subclassing method, and the model used to save the trained convolutional layer weights was built using the Sequential API. My issue arises when I try to load the pre-trained weights into my final DQN model. Depending on whether I use the load_model() or the load_weights() functionality from TensorFlow's API, I get two different overall behaviors of my network, and I would like to understand why. Specifically, I have the following two scenarios:
1. Loading the weights into the final model with the load_weights() method. The weights are those of the encoder plus one additional layer (added just before saving the weights) to fit the architecture of the final network implemented in Dopamine, where they are loaded.
2. First loading the saved model with load_model(), and then, when defining the new model in the __init__() method, extracting the relevant layers from the loaded model and using them in the final model.
Overall, I would expect the two approaches to yield similar results with regard to the average reward achieved per episode when I use the same pre-trained weights. However, the two approaches differ (1. yields a higher average reward than 2., despite using the same pre-trained weights) and I don't understand why.
Furthermore, to validate this behavior I tried loading random weights with the two approaches to see whether the behavior changes. In both cases, the resulting behavior is very similar to the corresponding case with the trained weights; it depends only on which of the two loading methods I use. It seems that the pre-trained weights have no effect on the resulting training behavior in either case. This might be irrelevant to the issue I am investigating, though, since it is also possible that the pre-trained weights simply don't offer any benefit overall.
Any thoughts and ideas on this would be much appreciated.
I have a trained Keras model and I am using this model to generate data. I want to use that data to retrain my model. After training, the model seems to know how to predict the new data, but it has somehow lost its knowledge of the previous data. I do not compile the model again before training. Are there any special steps needed to perform retraining in Keras?
I was given the pre-trained weight files of a model for audio classification. How can I extract model information from those weights (for example, the number of layers used to build the architecture)? This question was asked in an interview. Is it possible to extract model information from a model's weights?
Check these two threads
How to read keras model weights without a model
https://github.com/keras-team/keras/issues/91
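As a rough sketch of the idea, assuming the weights are stored in an HDF5 (.h5) file, you can walk the file with h5py and list every stored array with its shape. The helper name list_weight_arrays and the demo file layout (dense_1/kernel, dense_1/bias) are made up for illustration; a real Keras weights file will have its own group names. Note that this only reveals layers that own weights, plus their names and shapes. Activations and weightless layers (pooling, dropout, and so on) cannot be recovered this way.

```python
import os
import tempfile

import h5py
import numpy as np


def list_weight_arrays(path):
    """Return {dataset_path: shape} for every array stored in an HDF5 file."""
    shapes = {}

    def visit(name, obj):
        # Groups are containers; only datasets hold actual weight arrays.
        if isinstance(obj, h5py.Dataset):
            shapes[name] = obj.shape

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return shapes


# Build a tiny stand-in weights file so the example is self-contained.
# (Hypothetical layout, NOT the exact layout Keras writes.)
demo_path = os.path.join(tempfile.mkdtemp(), "demo_weights.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("dense_1/kernel", data=np.zeros((4, 3), dtype="float32"))
    f.create_dataset("dense_1/bias", data=np.zeros((3,), dtype="float32"))

shapes = list_weight_arrays(demo_path)
for name, shape in sorted(shapes.items()):
    print(name, shape)
```

From the kernel shapes you can usually infer the layer type and size, for example a 2-D kernel of shape (4, 3) suggests a dense layer mapping 4 inputs to 3 units.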
SCENARIO
Suppose I intend to train on a dataset of medical images and I have chosen a COCO pre-trained model.
My Doubts
1 Since I have chosen medical images, there is no point in using a model trained on the COCO dataset, right? If so, what is a possible alternative?
2 Will adding more layers to a pre-trained model break the entire model, with 10 or more classes and tens of thousands of training images?
3 Without training from scratch, what are the possible solutions, such as fine-tuning the model?
PS - let's assume this scenario is based on deploying the model for business purposes.
Thanks-
Yes, it is a good idea to reuse pre-trained models (transfer learning) in real-world projects, as it saves computation time and the architectures are proven.
If your use case is to classify the medical images, that is, image classification, then:
"Since I have chosen medical images, there is no point in using a model trained on the COCO dataset, right? If so, what is a possible alternative?"
Right, a COCO pre-trained model is not a good choice for image classification, because COCO is geared toward object detection. You can reuse VGGNet, ResNet, Inception Net, or EfficientNet instead. For more information, refer to the TF Hub modules.
"Will adding more layers to a pre-trained model break the entire model, with 10 or more classes and tens of thousands of training images?"
No. We can remove the top layer of the pre-trained model and add our own custom layers without affecting the performance of the pre-trained model.
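A rough sketch of what that can look like in Keras, assuming ResNet50 as the base (one of the architectures mentioned above). The 64x64 input size and num_classes = 10 are placeholder values, and weights=None is used only so the snippet runs without a download; in practice you would pass weights="imagenet" to actually reuse the pre-trained features.

```python
from tensorflow import keras
from tensorflow.keras.applications import ResNet50

# include_top=False drops the original 1000-class classifier ("top layer").
base = ResNet50(weights=None, include_top=False, input_shape=(64, 64, 3))

# Freeze the base so training only updates the new custom layers
# and leaves the base weights untouched.
base.trainable = False

num_classes = 10  # hypothetical number of medical image classes

inputs = keras.Input(shape=(64, 64, 3))
# training=False keeps the BatchNormalization layers in inference mode.
x = base(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
model = keras.Model(inputs, outputs)

# Only the new Dense layer's kernel and bias are trainable.
print(len(model.trainable_weights))
```

Once the new head has converged, one option is to unfreeze some or all of the base and continue training with a low learning rate, which is what is usually meant by fine-tuning.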
"Without training from scratch, what are the possible solutions, such as fine-tuning the model?"
In addition to using pre-trained models, you can tune the hyperparameters of the model (the custom layers you added) using the HParams dashboard of TensorBoard.