I asked a question about optimizing the trained function of a deep neural network written in PyTorch (data-driven optimization) here, but it looks like there isn't a solution for it.
In my previous attempt, I trained a DL network in PyTorch, exported the trained function using torch.jit.trace, and tried to optimize that function with Pyomo, but it didn't work.
Now I want to ask: what alternative framework combination (DL framework + optimization framework) can I use to first train my network and then optimize the trained output function without any problems?
It should be noted that my training data are physical properties (such as temperature, pressure, etc.), so I am dealing with a regression problem.
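To make the goal concrete, here is a rough sketch of the kind of optimization I am after, staying entirely inside PyTorch and treating the traced network as a differentiable function of its inputs (the file name, input size, and objective below are just placeholders, not my actual setup):

import torch

# Load the traced model exported with torch.jit.trace (hypothetical file name).
model = torch.jit.load("trained_model.pt")
model.eval()

# Optimize the inputs of the trained network by gradient descent via autograd.
x = torch.zeros(1, 4, requires_grad=True)   # e.g. 4 physical features: T, P, ...
optimizer = torch.optim.Adam([x], lr=1e-2)

for _ in range(500):
    optimizer.zero_grad()
    objective = model(x).sum()              # minimize the network's output
    objective.backward()
    optimizer.step()

print(x.detach())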
I am doing machine learning research where I explore non-backpropagation options for training neural networks. I have implemented a genetic algorithm for training Keras models, but I would like to make some improvements.
Currently I'm training the model by extracting the parameters with model.get_weights(), modifying them in a genetic algorithm, and then setting the weights again with model.set_weights() for the loss calculation. This is done by passing the network to a training function that performs the training and returns a history, like so:
history = genetic_algorithm(model, *args)
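For concreteness, a rough sketch of what such a genetic_algorithm function does with get_weights()/set_weights() (the mutation scheme, selection strategy, and hyperparameters below are simplified placeholders, not my exact implementation):

import numpy as np

def genetic_algorithm(model, x, y, population_size=10, generations=20, sigma=0.05):
    # Evaluate a candidate weight set by loading it into the model.
    def fitness(weights):
        model.set_weights(weights)
        result = model.evaluate(x, y, verbose=0)
        return result if np.isscalar(result) else result[0]

    base = model.get_weights()
    # Start from random perturbations of the current weights.
    population = [[w + sigma * np.random.randn(*w.shape) for w in base]
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        losses = [fitness(candidate) for candidate in population]
        order = np.argsort(losses)
        history.append(losses[order[0]])
        # Keep the best half and refill with mutated copies of the survivors.
        survivors = [population[i] for i in order[:population_size // 2]]
        children = [[w + sigma * np.random.randn(*w.shape) for w in parent]
                    for parent in survivors]
        population = survivors + children
    # Leave the model holding the best weight set found.
    model.set_weights(population[int(np.argmin([fitness(c) for c in population]))])
    return history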
I would like to stay within the Keras interface if possible, so that I could somehow compile the model with my training function and then simply train with model.fit(). Is this possible? I saw this post regarding creating custom optimizers, but it seems to be too constrained to allow a genetic algorithm.
This question is related to another question I have asked. I split them up to make each question more concise.
I have been looking for a while for the best pipeline to do some classification using AutoML. But I want to know if it is possible to select the model manually and then just optimize its hyperparameters. For example, I want to optimize only the SVM's hyperparameters and don't care about other models.
You can optimize only the selected model in MLJAR AutoML. It is an open-source AutoML with code available on GitHub: https://github.com/mljar/mljar-supervised
The example code will look like:
from supervised.automl import AutoML  # pip install mljar-supervised

automl = AutoML(algorithms=["Xgboost"], mode="Compete")
automl.fit(X, y)
The above code will tune only the Xgboost algorithm. The Compete mode is needed because MLJAR AutoML can work in three modes: Explain, Perform, and Compete. The algorithms available in MLJAR AutoML are: Baseline, Linear, Random Forest, Extra Trees, Decision Tree, Neural Networks, Nearest Neighbors, Xgboost, LightGBM, and CatBoost.
I'm the author of MLJAR AutoML; I'll be happy to help you set it up and run it.
I followed all the steps mentioned in the article:
https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/
Then I compared the results with linear regression and found that its error (68) is lower than that of the TensorFlow model (84).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)
pred = lin_reg.predict(X_test)
# Report RMSE on the test set.
print(np.sqrt(mean_squared_error(y_test, pred)))
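For reference, a comparable TensorFlow/Keras regression model along the lines of the linked article might look roughly like this (the layer sizes and epoch count are placeholders, and it assumes the same X_train/y_train/X_test/y_test as above):

from tensorflow import keras

# Small feed-forward regression network; the architecture is illustrative only.
nn_model = keras.Sequential([
    keras.Input(shape=(X_train.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
nn_model.compile(optimizer="adam", loss="mse")
nn_model.fit(X_train, y_train, epochs=100, validation_split=0.2, verbose=0)

nn_pred = nn_model.predict(X_test)
print(np.sqrt(mean_squared_error(y_test, nn_pred)))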
Does this mean that if I have a large dataset, I will get better results with TensorFlow than with linear regression?
What is the best situation for it, i.e., when should I be using TensorFlow?
Answering your first question: neural networks are notorious for overfitting on smaller datasets, and here you are comparing the test-set performance of a simple linear regression model against a neural network with two hidden layers. So it is not very surprising to see the MLP model fall behind the linear regression model (assuming you are working with a relatively small dataset). A larger dataset will definitely help the neural network learn more accurate parameters and generalize better.
Coming to your second question: TensorFlow is fundamentally a library for building deep learning models. Whenever you work on a deep learning problem such as image recognition or natural language processing, you need massive computational power and will be processing a ton of data to train your models. This is where TensorFlow comes in handy: it offers GPU support, which significantly speeds up training that would otherwise be practically impossible. Moreover, if you are building a product that has to be deployed in a production environment to be consumed, you can make use of TensorFlow Serving, which helps you bring your models much closer to your customers.
I want to change the loss function of an object detection model (such as SSD).
Q1: Where do I modify the loss function for SSD?
Q2: Is it possible to fine-tune ssd_mobilenet on my dataset with my own loss function? Is that a good approach, or must ssd_mobilenet be trained from scratch with my loss function?
Q1:
If you are using the Object Detection API, then a config file is used to define the network and the loss, such as these:
https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs
Looking at a basic ssd_mobilenet config, you should see the losses it uses, including a classification loss and a localization loss. You can look at other configs to see other loss options, check the source code for the full list of options, or even modify the source code to add your own loss.
Q2:
It is certainly possible, but you will need to dig into the internals of how the Object Detection API works, modify it to add your loss function, and train on your dataset. It will be more work than you might expect. Knowing nothing about your dataset or metric, I expect your fine-tuned result will converge more quickly than a from-scratch result and give comparable results.
You can change the loss function in the configuration file, e.g., around line 198 in https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config. When you do this, performance will initially drop drastically; if you retrain the network, performance may improve.
If you can elaborate on your goal more clearly, it will be easier to suggest a solution.
I've implemented a neural network using TensorFlow. During implementation and training, I found several not-so-trivial bugs.
Example: during training I got the same mini-batch loss for different steps/epochs, but different accuracy.
Now the neural network seems to be ready and working properly. I haven't managed to train it well yet, but I am working on it.
Anyway, I would like to somehow check that I haven't made any computational errors there. I am thinking about generating some artificial data for a "fake" classification problem with, let's say, 4 features. The classification should have a very clear, human-understandable dependency between the classification output and the 4 features. The idea is to train the NN on it and see how it performs, as in the sketch below.
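For illustration, a minimal sketch of the kind of artificial data I have in mind (the rule, feature count, and split below are just placeholders):

import numpy as np

rng = np.random.default_rng(0)

# 4 features; the label depends on them in an obvious, human-readable way:
# class 1 if the sum of the first two features exceeds the sum of the last two.
X = rng.normal(size=(10_000, 4)).astype(np.float32)
y = (X[:, 0] + X[:, 1] > X[:, 2] + X[:, 3]).astype(np.int64)

# Train/test split; a correctly implemented network should reach
# near-perfect accuracy on this task very quickly.
X_train, X_test = X[:8000], X[8000:]
y_train, y_test = y[:8000], y[8000:]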
What do you think?
Stanford's CS231n has a couple of general tips for this, like gradient checking.
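As a rough illustration of that idea (not taken from the course materials), a numerical gradient check compares your analytical gradients against centered finite differences on a tiny example; the loss and linear model below are placeholders:

import numpy as np

def loss(w, x, y):
    # Simple squared-error loss for a linear model: L = 0.5 * ||x @ w - y||^2
    return 0.5 * np.sum((x @ w - y) ** 2)

def grad(w, x, y):
    # Analytical gradient of the loss above.
    return x.T @ (x @ w - y)

def numerical_grad(f, w, eps=1e-5):
    # Centered finite differences, one coordinate at a time.
    g = np.zeros_like(w)
    for i in range(w.size):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus.flat[i] += eps
        w_minus.flat[i] -= eps
        g.flat[i] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
x, y, w = rng.normal(size=(8, 3)), rng.normal(size=8), rng.normal(size=3)
analytic = grad(w, x, y)
numeric = numerical_grad(lambda w_: loss(w_, x, y), w)
# The relative error should be tiny (e.g. < 1e-6) if the analytical gradient is right.
print(np.max(np.abs(analytic - numeric) / (np.abs(analytic) + np.abs(numeric) + 1e-12)))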
If you're just learning neural networks, why don't you try to run your implementation on some well-known data? Many courses provide error and loss curves from models with specified hyperparameters, so you can check whether your implementation's behavior differs significantly from a correct implementation.