How do I integrate a training process that is not backpropagation-based into Keras? - tensorflow

I am doing machine learning research in which I explore non-backpropagation options for training neural networks. I have implemented a genetic algorithm for training Keras models, but I would like to make some improvements.
Currently I train the model by extracting the parameters with model.get_weights(), modifying them in a genetic algorithm, and then setting the weights again with model.set_weights() for the loss calculation. This is done by passing the network to a training function that performs the training and returns a history, like this:
history = genetic_algorithm(model, *args)
I would like to stay within the Keras interface if possible, so that I could compile the model with my training function somehow and then simply train with model.fit(). Is this possible? I saw this post about creating custom optimizers, but that approach seems too constrained to allow a genetic algorithm.
This question is related to another question I have asked. I split them up to make each question more concise.
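For readers who want to see the get_weights()/set_weights() loop described above in concrete form, here is a minimal sketch. It is not the asker's implementation: TinyLinearModel is a hypothetical NumPy stand-in that only mimics the two Keras methods named in the question, mse_loss is an assumed loss helper, and the search is a simplified mutation-only, elitist variant (no crossover) rather than a full genetic algorithm.

```python
import numpy as np


class TinyLinearModel:
    """Hypothetical stand-in that exposes get_weights/set_weights like a Keras model."""

    def __init__(self, n_features):
        self.w = np.zeros((n_features, 1))
        self.b = np.zeros(1)

    def get_weights(self):
        return [self.w.copy(), self.b.copy()]

    def set_weights(self, weights):
        self.w, self.b = weights[0].copy(), weights[1].copy()

    def predict(self, x):
        return x @ self.w + self.b


def mse_loss(model, x, y):
    # Assumed loss helper: mean squared error of the model's predictions.
    return float(np.mean((model.predict(x) - y) ** 2))


def genetic_algorithm(model, x, y, generations=50, pop_size=20, sigma=0.1, seed=0):
    """Mutation-only, elitist search over the model's list of weight arrays."""
    rng = np.random.default_rng(seed)
    best = model.get_weights()
    best_loss = mse_loss(model, x, y)
    history = {"loss": [best_loss]}
    for _ in range(generations):
        # Create offspring by adding Gaussian noise to the current best weights.
        population = [
            [w + sigma * rng.standard_normal(w.shape) for w in best]
            for _ in range(pop_size)
        ]
        # Evaluate each candidate by loading it into the model.
        losses = []
        for candidate in population:
            model.set_weights(candidate)
            losses.append(mse_loss(model, x, y))
        # Keep the best offspring only if it improves on the parent.
        i = int(np.argmin(losses))
        if losses[i] < best_loss:
            best, best_loss = population[i], losses[i]
        history["loss"].append(best_loss)
    model.set_weights(best)
    return history
```

Because the function touches the model only through get_weights(), set_weights(), and a loss callable, the same loop should carry over to a real Keras model if mse_loss is swapped for something like model.evaluate; that substitution is an assumption and is not tested here.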

Related

Optimizing DL trained output function with optimization packages/frameworks

I asked a question here about optimizing the trained function of a deep neural network written in PyTorch (data-driven optimization), but it looks like there isn't any solution for it.
In my previous attempt, I trained a DL network in PyTorch, exported the output function using torch.jit.trace, and tried to optimize the trained output function with Pyomo, but it didn't work.
Now I want to ask which alternative framework combination (DL framework + optimization framework) I can use to first train my network and then optimize the trained output function without running into this problem.
Note that my training data are physical properties (such as temperature, pressure, etc.), so I am facing a regression problem.

Best case to use tensorflow

I followed all the steps mentioned in the article:
https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/
Then I compared the results with linear regression and found that its error (68) is lower than that of the TensorFlow model (84).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Fit an ordinary least-squares baseline and report RMSE on the test set
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)
pred = lin_reg.predict(X_test)
print(np.sqrt(mean_squared_error(y_test, pred)))
Does this mean that with a larger dataset the TensorFlow model would give better results than linear regression?
What is the best situation for using TensorFlow, that is, when should I be using it?
To answer your first question: neural networks are known for overfitting on smaller datasets. Here you are comparing a simple linear regression model with a neural network that has two hidden layers, evaluated on the test set, so it is not very surprising that the MLP falls behind the linear regression model (assuming you are working with a relatively small dataset). Larger datasets generally help neural networks learn more accurate parameters and generalize better.
As for your second question: TensorFlow is basically a library for building deep learning models. Deep learning problems such as image recognition or natural language processing need massive computational power and a large amount of training data, and this is where TensorFlow comes in handy. It offers GPU support, which can significantly speed up training that would otherwise be impractical. Moreover, if you are building a product that has to be deployed to production, you can use TensorFlow Serving to serve your models to customers.

Quantization aware training examples?

I want to do quantization-aware training with a basic convolutional neural network that I define directly in TensorFlow (I don't want to use other APIs such as Keras). The only resource that I am aware of is the README here:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize
However, it's not clear exactly where the different quantization commands should go in the overall process of training and then freezing the graph for actual inference.
Therefore I am wondering if there is any code example out there that shows how to define, train, and freeze a simple convolutional neural network with quantization aware training in tensorflow?
It seems that others have had the same question as well, see for instance here.
Thanks!

How to implement minibatch gradient descent in tensorflow without using feed_dict?

From what I understand, using feed_dict is a computationally expensive process and should be avoided according to this article. Tensorflow's input pipelines are supposedly better.
All mini-batch gradient descent tutorials that I've found are implemented with feed_dict. Is there a way to use input pipelines and minibatch gradient descent?
If you are just making a small model, you will do fine with feed_dict. Many large models have been trained with the feed_dict method in the past. If you are scaling to a very deep convnet with a large dataset or something, you may want to use tf.data and the dataset pipeline, probably with the dataset serialized to a .tfrecord file so that the data can be pre-fetched to the GPU to reduce idle time. These optimizations are worthwhile with large models, and if you would really like to learn the API, visit the quickstart guide and this helpful Medium article on getting started with the API.
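As a rough illustration of the tf.data route mentioned above, the sketch below runs minibatch gradient descent on a linear model without feed_dict. It assumes TensorFlow 2.x with eager execution (the question itself dates from the TF 1.x era), and the synthetic data, variable names, and learning rate are made up for the example.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 3)).astype("float32")
y = X @ np.array([[1.0], [-2.0], [0.5]], dtype="float32")

# The input pipeline takes the place of feed_dict:
# shuffle the samples, cut them into minibatches, and prefetch the next batch.
dataset = (
    tf.data.Dataset.from_tensor_slices((X, y))
    .shuffle(buffer_size=256)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)

w = tf.Variable(tf.zeros((3, 1)))
b = tf.Variable(tf.zeros((1,)))
learning_rate = 0.1


def full_data_loss():
    # Mean squared error over the whole dataset, used only for monitoring.
    return float(tf.reduce_mean((tf.matmul(X, w) + b - y) ** 2))


initial_loss = full_data_loss()
for epoch in range(5):
    for x_batch, y_batch in dataset:  # one gradient step per minibatch
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean((tf.matmul(x_batch, w) + b - y_batch) ** 2)
        grad_w, grad_b = tape.gradient(loss, [w, b])
        w.assign_sub(learning_rate * grad_w)
        b.assign_sub(learning_rate * grad_b)
final_loss = full_data_loss()
print(initial_loss, final_loss)
```

For data that does not fit in memory, from_tensor_slices would be replaced by a file-based source such as tf.data.TFRecordDataset, in line with the .tfrecord suggestion above.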

Neural Network - how to test that it is implemented properly?

I've implemented a neural network using TensorFlow. During the implementation and training, I found several not-so-trivial bugs.
Example: during training I got the same mini-batch loss for different steps/epochs, but different accuracy.
Now the neural network seems to be ready and working properly. I haven't managed to train it well yet, but I am working on it.
Anyway, I would like to check somehow that I haven't made any computational errors. I am thinking about generating some artificial data for a "fake" classification problem with, let's say, 4 features. The classification should have a very clear, human-understandable dependency between the output and the 4 features. The idea is to train the NN on it and see how it performs.
What do you think?
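One possible way to build such a "fake" problem is sketched below. The labeling rule (class 1 when the first two features sum to more than the last two) and the function name make_fake_classification are assumptions chosen for illustration, not something from the original post.

```python
import numpy as np


def make_fake_classification(n_samples=1000, seed=0):
    """Generate 4-feature data with an obvious, linearly separable rule."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 4)).astype("float32")
    # Label is 1 when feature 0 + feature 1 exceeds feature 2 + feature 3.
    y = (X[:, 0] + X[:, 1] > X[:, 2] + X[:, 3]).astype("int64")
    return X, y


X, y = make_fake_classification()
print(X.shape, y.shape)
```

Since the rule is linearly separable, a network that trains correctly would be expected to reach close to perfect accuracy on it; a result far below that is a hint that something in the implementation deserves a closer look.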
Stanford's CS231n has a couple of general tips for this, like gradient checking.
If you're just learning neural networks, why not run your implementation on some known data? Many courses provide error and loss curves from models with specified hyperparameters, so you can check whether your implementation's behavior differs significantly from a correct implementation.
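To make the gradient-checking tip concrete, here is a minimal NumPy sketch that compares an analytic gradient with a central-difference estimate. The logistic-regression loss is only a toy example chosen for brevity, and all function names are hypothetical; the same comparison would be applied to the gradients of your own network.

```python
import numpy as np


def loss(theta, x, y):
    # Toy example: mean binary cross-entropy of a logistic model.
    p = 1.0 / (1.0 + np.exp(-(x @ theta)))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def analytic_gradient(theta, x, y):
    # Closed-form gradient of the loss above with respect to theta.
    p = 1.0 / (1.0 + np.exp(-(x @ theta)))
    return x.T @ (p - y) / len(y)


def numerical_gradient(f, theta, eps=1e-5):
    # Central differences: perturb one parameter at a time.
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step.flat[i] = eps
        grad.flat[i] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad


def relative_error(a, b):
    # Norm-based relative difference between two gradient vectors.
    return np.linalg.norm(a - b) / max(np.linalg.norm(a) + np.linalg.norm(b), 1e-12)


rng = np.random.default_rng(0)
x = rng.standard_normal((50, 4))
y = (rng.uniform(size=50) > 0.5).astype("float64")
theta = 0.1 * rng.standard_normal(4)

num = numerical_gradient(lambda t: loss(t, x, y), theta)
ana = analytic_gradient(theta, x, y)
print(relative_error(num, ana))
```

A very small relative error suggests the analytic gradient matches the loss; a large one usually points to a bug in either the loss or the gradient code.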