I have a deep fully connected network.
I want to be able to change the structure of middle layers of the network dynamically.
What is the best way of doing that?
What I have done so far is create an output placeholder for my network, thinking I could build the network dynamically through feed_dict. However, when I run it, I get:
`ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ... `
TensorFlow won't make this easy for you. Once you define the graph and open a session, it's fixed. I believe you need to define a new graph, copy over your variables, and move on from there every time you want to alter the architecture. Kinda annoying for experimenting with this kind of stuff.
I have a friend/fellow researcher who's been experimenting with dynamic neural network architectures and is tackling this in PyTorch, which has specific support for dynamically altering network architectures.
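The "define a new graph, copy over your variables" step can be sketched without any framework. This is only an illustration in plain Python, not TensorFlow or PyTorch code: `build_network`, `copy_matching_weights`, and the `layerN` names are hypothetical, and weights are stored as plain lists of rows.

```python
import random

def build_network(layer_sizes, seed=0):
    """Create fresh weight matrices for a fully connected network.

    Returns a dict mapping a (hypothetical) layer name to a matrix stored
    as a list of rows, with shape (inputs, outputs).
    """
    rng = random.Random(seed)
    weights = {}
    for i, (n_in, n_out) in enumerate(zip(layer_sizes, layer_sizes[1:])):
        weights[f"layer{i}"] = [
            [rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)
        ]
    return weights

def shape(matrix):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(matrix), len(matrix[0]))

def copy_matching_weights(old, new):
    """Copy weights from `old` into `new` wherever name and shape agree.

    Layers whose shape changed keep their fresh initialization.
    Returns the names of the layers that were copied.
    """
    copied = []
    for name, matrix in old.items():
        if name in new and shape(new[name]) == shape(matrix):
            new[name] = [row[:] for row in matrix]
            copied.append(name)
    return copied

# Widen the second hidden layer from 8 to 16 units.
old_net = build_network([4, 8, 8, 2], seed=1)
new_net = build_network([4, 8, 16, 2], seed=2)
copied = copy_matching_weights(old_net, new_net)
print(copied)
```

Only the first layer survives the change here, because widening a hidden layer alters the shape of the weight matrices on both sides of it. Those layers start from a fresh initialization and need retraining.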
I'm training a neural network using Keras, but I'm not sure how to feed the training data into the model the way I want.
My training data set is effectively infinite: I have some code that generates training examples as needed, so I just want to pipe a continuous stream of novel data into the network. Keras seems to want me to specify my entire dataset in advance by creating a NumPy array with everything in it, but this obviously won't work with my approach.
I've experimented with creating a generator class based on `keras.utils.Sequence`, which seems like a better fit, but it still requires me to specify a length via the `__len__` method, which makes me think it will only create that many examples before recycling them. Can someone suggest a better approach?
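One possible approach is a plain Python generator, which has no length and never recycles examples. The sketch below uses only the standard library; `make_example` is a hypothetical stand-in for the example-generating code mentioned above, and the batch format (a list of inputs and a list of targets) is an assumption.

```python
import itertools
import random

def make_example(rng):
    """Hypothetical stand-in for the on-demand example generator."""
    x = [rng.random() for _ in range(3)]
    y = sum(x)
    return x, y

def batch_stream(batch_size, seed=0):
    """Yield (inputs, targets) batches forever; every batch is newly generated."""
    rng = random.Random(seed)
    while True:
        examples = [make_example(rng) for _ in range(batch_size)]
        inputs = [x for x, _ in examples]
        targets = [y for _, y in examples]
        yield inputs, targets

stream = batch_stream(batch_size=4)
# Take a few batches to show that the stream keeps producing new data.
first_batches = list(itertools.islice(stream, 3))
print(len(first_batches), len(first_batches[0][0]))
```

With Keras you would convert each batch to NumPy arrays and pass the generator to `model.fit` together with `steps_per_epoch`, which only defines how many batches count as one epoch (older Keras versions used `fit_generator` for this). The generator itself keeps producing fresh examples across epochs.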
Data augmentation can easily be achieved using ad hoc modules in e.g. TensorFlow. This works perfectly for classification problems, however when the objective of the network is the prediction of a geometrical feature, e.g. a landmark, a problem arises. As the image is modified, e.g. flipped, or distorted, the corresponding labels also need to be adapted.
1 - Is there any tool to do this? I am sure that this is a common problem.
2 - Would it be useful to create a data augmentation script for neural networks that predict geometrical features?
I want to understand if I need to code all of this by myself or if I am missing something that already exists. If I need to do it and it could be useful I would just create an open source thing.
You can use the imgaug library: https://github.com/aleju/imgaug
You can find an example of keypoint augmentation with imgaug here: https://github.com/aleju/imgaug#example-augment-images-and-keypoints
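To make the underlying idea concrete, here is a minimal sketch of a horizontal flip applied to both an image and its keypoints. It is plain Python and deliberately does not use imgaug's API; `flip_horizontal` is a hypothetical helper, and the image is assumed to be a list of pixel rows with keypoints given as (x, y) pixel coordinates.

```python
def flip_horizontal(image, keypoints):
    """Mirror an image left-to-right and move its keypoints accordingly.

    `image` is a list of rows; `keypoints` is a list of (x, y) pixel
    coordinates, where x is the column index and y is the row index.
    """
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # A pixel in column x ends up in column (width - 1 - x); rows are unchanged.
    flipped_keypoints = [(width - 1 - x, y) for x, y in keypoints]
    return flipped_image, flipped_keypoints

image = [
    [0, 0, 0, 9],
    [0, 0, 0, 0],
    [5, 0, 0, 0],
]
keypoints = [(3, 0), (0, 2)]  # mark the pixels with values 9 and 5

flipped_image, flipped_keypoints = flip_horizontal(image, keypoints)
print(flipped_keypoints)
```

After the flip, each keypoint still sits on the same pixel value as before, which is the property any label-aware augmentation has to preserve. For landmarks with a left/right meaning (for example left and right eye), the labels themselves may also need to be swapped.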
So TensorFlow is extremely useful for creating neural networks built from perceptron neurons. However, if one wanted to use a new type of neuron instead of the classic perceptron neuron, is this possible through augmenting TensorFlow code? I can't seem to find an answer. I understand this would change the forward propagation and other mathematical calculations, and I am willing to change all the necessary areas.
I am also aware that I could just code the layers and neurons I have in mind from scratch, but TensorFlow has GPU integration, so it seems preferable to adapt its code rather than write my own from scratch.
Has anyone experimented with this? My goal is to create neural network structures that use a different type of neuron than the classic perceptron.
If someone knows where in TensorFlow I could look to see where the perceptron neurons are initialized, I would very much appreciate it!
Edit:
To be more specific, is it possible to alter code in TensorFlow so that modules such as tf.layers or tf.nn (conv2D, batch-norm, max-pool, etc.) use a different neuron type rather than the perceptron? I can figure out the details. I just need to know where (I'm sure there are a few locations) I would go about changing code for this.
However, if one wanted to use a new type of neuron instead of the classic perceptron neuron, is this possible through augmenting tensorflow code?
Yes. TensorFlow lets you define a computational graph and then automatically calculates the gradients for it, so there is no need to derive them yourself. This is the reason why you define the graph symbolically. You might want to read the whitepaper or start with a tutorial.
I'm actually new to machine learning, but this topic is very interesting to me, so I'm using TensorFlow to classify some images from the MNIST dataset. I run this code on a Compute Engine VM on Google Cloud, because my computer is too weak for this. The code actually runs well, but the problem is that every time I log in to my VM and run the same code, I have to wait while my CNN model trains, and only afterwards can I run tests or experiment with my data, e.g. plot results or import external images to improve my accuracy.
Is there some way to save my trained model just once, somewhere, so that when I log in to the same VM tomorrow, for example, I don't have to wait for the model to train again? Is it possible to do this?
Or is there maybe another way to do something similar?
You can save a trained model in TensorFlow and load it again later; that way you only have to train your model once and can use it as many times as you want. To do that, you can follow the TensorFlow documentation on that topic, where you can find information on how to save and load the model. In short, you will have to use the SavedModelBuilder class to define the type and location of your saved model, and then add the MetaGraphs and variables you want to save. Loading the saved model for later use is even easier, as you only have to run a command pointing to the location of the file to which the model was exported.
On the other hand, I would strongly recommend changing your working environment to one that is more cost-effective for you. In Google Cloud you have the Cloud ML Engine service, which might be a good fit for the type of work you are doing. It allows you to train your models and perform predictions without needing an instance running all the required software. I happen to have worked a little bit with TensorFlow recently, and at first I was also working with a virtualized instance, but after following some tutorials I was able to save some money by migrating my work to ML Engine, as you are only charged for usage. If you are using your VM only for that purpose, take a look at it.
You can of course consult all the available documentation, but as a first quickstart, if you are interested in ML Engine, I recommend having a look at how to train your models and how to get your predictions.
I want to use Ryan's VGG model converted to TensorFlow:
https://github.com/ry/tensorflow-vgg16
Now I want to adjust the layers: add another layer or change the fully connected layers. But I don't know how to get the individual layers/weights out of the GraphDef or how to adjust the graph.
Short answer: you can't adjust a graph, but there are probably ways to get what you want accomplished.
Long answer: TensorFlow Graph objects are structurally immutable. You can modify some aspects of them (e.g., the shape of a tensor flowing into a node), but you can't remove a node or add a node between two existing nodes. However, there are a couple ways to get the same effect:
1. If your changes are limited to additions only, then there's no problem with doing this. For instance, if you wanted to add a layer on the end of a network, go for it. Likewise, you can "replace" the last layer by simply adding a new layer which takes the second-to-last layer as input and just ignoring the existing last layer. When you run the graph, if you never ask for the output of the original last layer, TensorFlow will never compute it.
2. If you need to do modifications, one way is to slowly build up a copy of the graph node by node. So read in the original graph definition, then build your own new graph by iterating over the original and adding similar nodes to your new copy. This is somewhat tedious and can be error-prone. Moreover...
3. ...You might not need to "adjust" the graph at all. If you want something similar to that VGG-16 implementation, you can just work off the python code directly. Don't like the width of fc6? Just edit the code that generates it.
This brings us to the real issue, though. If your goal is to modify the network and be able to re-use the weights, then 2. and 3. aren't going to work. Realistically, this isn't possible in a lot of cases. For instance, if I wanted to add or remove a layer in the middle of VGG-16 (say, adding another convolutional layer), the pre-trained weights are no longer valid. You might be able to salvage any pre-trained weights which are upstream of your changes, but everything downstream will basically be wrong. You'll need to retrain the network anyways. (Maybe you can use the pre-trained networks as initialization, but you'll still need to retrain.) Even if you're just adding to the network (as in 1.), you'll still need to train the network.
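The "replace the last layer by adding a new one" idea from the additions-only option can be illustrated with a toy graph evaluator. This is a sketch in plain Python, not TensorFlow's implementation: `Node` and `run` are hypothetical, and the layer functions are arbitrary placeholders. It only shows why a node that nobody asks for is never computed.

```python
class Node:
    """A toy graph node: a function plus the nodes it takes as input."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = list(inputs)

def run(node, feed, log):
    """Evaluate `node`, computing only the nodes it depends on.

    `feed` maps input node names to values; `log` records which nodes
    were actually computed.
    """
    if node.name in feed:
        return feed[node.name]
    args = [run(inp, feed, log) for inp in node.inputs]
    log.append(node.name)
    return node.fn(*args)

x = Node("x", None)
hidden = Node("hidden", lambda v: [2 * a for a in v], [x])
old_last = Node("old_last", lambda v: sum(v), [hidden])  # original final layer
new_last = Node("new_last", lambda v: max(v), [hidden])  # added alongside it

log = []
result = run(new_last, {"x": [1, 2, 3]}, log)
print(result, log)
```

Requesting only `new_last` evaluates `hidden` and `new_last`; `old_last` stays in the graph but is never touched. The new layer's weights would still need training, as noted above.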
Thanks! I recreated the graph and then loaded every single weight by getting its value from the graph definition.
This was done with `graph.get_tensor_by_name('import/...')`, where ... is the name of the weight.
https://www.tensorflow.org/versions/r0.9/how_tos/tool_developers/index.html