Initializing the weights of a layer from the output of another layer in tensorflow/keras [closed] - tensorflow

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm trying to implement the paper "learning to segment everything" and I need to set the weights of a layer in the segmentation network using the output of a weight transfer function.
The output of the last layer in the weight transfer fetched using layer.output in Keras is of type 'tensorflow.python.framework.ops.Tensor' while the weights should be initialized as a numpy array. Any idea how I can set the weights?

From what I understood of the paper, the weights should be connected to the output of this transfer layer; let's call it X. So you don't want to create "weights" and then initialize them with X using tf.assign or any other method, as this will not be differentiable. What you want is to connect the output X directly so that it acts as the weights in this other graph.
You can't do this through Keras layers or even tf.layers, because this high-level API doesn't give you that control: as soon as you create a layer in tf.layers or Keras, it creates its own weights, and you don't want that. You want to use the output X as the weights, not create new ones. What you can do is re-implement whatever layer you need yourself and use X directly as its weights; this allows the gradient to flow back through X.
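A minimal sketch of that idea, assuming TensorFlow 2 with eager execution. The names transfer, det_weights and features are hypothetical stand-ins (they are not from the paper or the question), and a plain matmul stands in for whatever layer you re-implement:

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical weight-transfer function with its own trainable parameters.
transfer = tf.keras.layers.Dense(8 * 4)
det_weights = tf.random.normal((1, 16))   # stand-in input to the transfer function
features = tf.random.normal((5, 8))       # stand-in segmentation features

with tf.GradientTape() as tape:
    x = transfer(det_weights)             # X: a Tensor, not a Variable
    kernel = tf.reshape(x, (8, 4))        # use X directly as the kernel
    out = tf.matmul(features, kernel)     # hand-written "dense layer" using X
    loss = tf.reduce_mean(tf.square(out))

# Gradients reach the transfer function's parameters through X.
grads = tape.gradient(loss, transfer.trainable_variables)
```

Because the kernel is computed from X inside the tape rather than copied into a separate variable, the loss stays differentiable with respect to the transfer function.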

Weights are typically stored in Variables. The tf.assign operation can be used to assign values (represented as Tensors) to variables. You can see some basic examples of using tf.assign in the session tests, where it appears under the name state_ops.assign().
Just be aware that, like other TensorFlow operations, it does not update the value of the variable immediately (unless you are using eager execution). It returns a tensor that, when evaluated (e.g. via session.run()), will update the variable.
From your question, I suspect that you might not be 100% clear about TensorFlow's computation model. The Tensor type is a symbolic representation of some value that will be produced only when the computation is actually run (via session.run()). You can't really talk about "converting a Tensor to a numpy array" because you can't convert the "result of operation foo" to concrete floats. You have to run the computation that produces the "result of operation foo" to know the concrete numbers. tf.assign works in this symbolic space. When using it, you are saying, "whatever the value of this tensor (the output of some layer) will be when I run the computation, assign it to this variable".
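For comparison, here is a small sketch of the same assignment in TensorFlow 2 with eager execution, which is an assumption on my part since the answer above describes graph mode. There the method form Variable.assign takes a tensor directly and the update happens immediately; layer_output is just a stand-in value:

```python
import tensorflow as tf

# Stand-in for the output of some layer (a Tensor, not a numpy array).
layer_output = tf.constant([[1.0, 2.0], [3.0, 4.0]])
weights = tf.Variable(tf.zeros((2, 2)))

# In eager mode the assignment runs right away; no numpy conversion is needed.
weights.assign(layer_output * 2.0)
values = weights.numpy()
```

Note that an assignment like this copies values only; it does not let gradients flow back to whatever produced layer_output.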

Related

Keras - Regularization & custom loss [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
I have built a custom Keras model which consists of various layers. Since I wanted to add L2 regularization to these layers, I've passed an instance of keras.regularizers.l2 as the argument for the kernel_regularizer parameter of those layers (as an example, see the constructor of keras.layers.Conv2D). Now, if I were to train this model using, say, Keras's implementation of the binary cross-entropy loss (keras.losses.BinaryCrossentropy), I would be sure that the L2 regularization that I've specified would be taken into consideration when computing the loss.
In my case, however, I have a custom loss function that requires several other parameters aside from y_true and y_pred, meaning that there's no way I can pass this function as the argument for the loss parameter of model.compile(...) (in fact, I don't even call model.compile(...)). As a result, I also had to write a custom training loop. In other words, instead of simply running model.fit(...), I had to:
1. Perform forward propagation by calling model(x)
2. Compute the loss
3. Compute the gradients of the loss with respect to the model's weights (that is, model.trainable_variables) with tf.GradientTape
4. Apply the gradients
5. Repeat
My question is: in which phase is regularization accounted for?
During forward propagation?
During the computation/application of the gradients?
Keep in mind that my custom loss function does NOT account for regularization, so if it's not accounted for in any of the two phases I've mentioned above, then I'm actually training a model with no regularization whatsoever (even though I've provided a value for the kernel_regularizer argument in each layer that my network is made of). In that case, would I be forced to compute the regularization term by hand and add it to the loss?
Regularization losses are computed on the forward pass of the model, and their gradients are applied on the backward pass. I don't think that your training step is applying any weight regularization, and consequently your model isn't regularized. One way to check this would be to actually look at the weights of a trained model - if they're sparse, it means you've regularized the weights in some way. L1 regularization will actually push some weights to 0. L2 regularization does a similar thing, but often results in less sparse weights.
This post outlines writing a training loop from scratch in Keras and has a section on model regularization. The author adds the loss from regularization layers in his training step with the following command:
loss += sum(model.losses)
I think this may be what you need. If you are still unsure, I would train a model with the line above in the training loop, and another model without that line. Inspecting the weights of the trained models will give you some input on whether or not the weight regularization is working as expected.
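Here is a rough sketch of one such training step in TensorFlow 2. The model, the data and custom_loss (with its extra scale parameter) are made-up placeholders; the only part that matters is adding model.losses to the loss inside the tape:

```python
import tensorflow as tf

tf.random.set_seed(0)

l2 = tf.keras.regularizers.l2(0.01)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu", kernel_regularizer=l2),
    tf.keras.layers.Dense(1, kernel_regularizer=l2),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

x = tf.random.normal((16, 4))   # toy inputs
y = tf.random.normal((16, 1))   # toy targets

def custom_loss(y_true, y_pred, scale):
    # Placeholder for a loss that needs parameters beyond y_true and y_pred.
    return scale * tf.reduce_mean(tf.square(y_true - y_pred))

with tf.GradientTape() as tape:
    y_pred = model(x, training=True)
    data_loss = custom_loss(y, y_pred, scale=0.5)
    # One regularization term per layer that has a kernel_regularizer.
    reg_loss = tf.add_n(model.losses)
    loss = data_loss + reg_loss

grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
```

If the reg_loss term is left out, the regularizers attached to the layers have no effect on the gradients in a loop like this.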

Training Tensorflow only one object

Following the corresponding TensorFlow documentation, I trained a model on 3 objects and got a working result (it can recognize these objects). When I show it other objects (not one of the 3), it doesn't work correctly.
I want to train on only one object (for example, a cup) and recognize only this object. Is that possible with TensorFlow?
Your question doesn't provide enough details, but my guess is that you trained the network with a softmax activation and a categorical or sparse categorical cross-entropy loss. If my guess is right, such a network always assigns its prediction to one of the three classes, regardless of the actual data, i.e. there is no option for "none of them".
To train a network to recognize only one class of objects, give it a single output with a single channel and a sigmoid activation. Use the BinaryCrossentropy loss to train your model for the specific object. Provide a dataset that includes examples both with this object and without it.
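As an illustration only, a single-output classifier could be set up like this in Keras. The architecture and the 32x32 input size are arbitrary assumptions, and this sketch covers image classification (object present or not), not localization:

```python
import tensorflow as tf

# Tiny classifier with one sigmoid output: P(object is present in the image).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss=tf.keras.losses.BinaryCrossentropy())

# Random images as a stand-in for real data; labels would be 1 (cup) or 0 (no cup).
images = tf.random.uniform((4, 32, 32, 3))
probs = model(images)   # one probability per image
```

A prediction below some threshold (0.5 is a common default) then means "not this object", which is the option a softmax over three classes does not have.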

word embeddings in tensorflow (no pre_trained) [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am new to tensorflow and trying to look at different examples of tensorflow to understand it better.
Now I have seen this line being used in many tensorflow examples without mentioning of any specific embedding algorithm being used for getting the words embeddings.
embeddings = tf.Variable(tf.random_uniform((vocab_size, embed_dim), -1, 1))
embed = tf.nn.embedding_lookup(embeddings, input_data)
Here are some examples:
https://github.com/Decalogue/dlnd_tv_script_generation/blob/master/dlnd_tv_script_generation.py
https://github.com/ajmaradiaga/cervantes-text-generation/blob/master/cervants_nn.py
I understand that the first line initializes the word embeddings from a random distribution. But will the embedding vectors be trained further in the model to give a more accurate representation of the words (changing the initial random values to more accurate numbers)? And if yes, what is the actual method being used when the code makes no mention of any obvious embedding method such as word2vec or GloVe (or of feeding in the pre-trained vectors of these methods instead of random numbers at the beginning)?
Yes, those embeddings are trained further just like weights and biases; otherwise representing words with random values wouldn't make any sense. The embeddings are updated during training the same way you would update a weight matrix, that is, by an optimization method such as gradient descent or the Adam optimizer.
When we use pre-trained embeddings like word2vec, they have already been trained on very large datasets and are already quite accurate representations of words, so they don't need any further training. If you are asking how those are trained, there are two main training algorithms that can be used to learn the embeddings from text: Continuous Bag of Words (CBOW) and Skip-gram. Explaining them completely is not possible here, but I would suggest searching for them on Google. This article might get you started.
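To make the "trained like any other weight matrix" point concrete, here is a toy sketch in TensorFlow 2 (the question's snippet uses the TF1 name tf.random_uniform). The target and the learning rate are arbitrary assumptions; the point is that one gradient step changes only the rows that were looked up:

```python
import tensorflow as tf

tf.random.set_seed(0)
vocab_size, embed_dim = 10, 4

embeddings = tf.Variable(tf.random.uniform((vocab_size, embed_dim), -1.0, 1.0))
before = embeddings.numpy().copy()

input_data = tf.constant([1, 3])        # token ids appearing in this batch
target = tf.zeros((2, embed_dim))       # toy target, just to get a loss

with tf.GradientTape() as tape:
    embed = tf.nn.embedding_lookup(embeddings, input_data)
    loss = tf.reduce_mean(tf.square(embed - target))

# The gradient is sparse (IndexedSlices): nonzero only for rows 1 and 3.
grad = tape.gradient(loss, embeddings)
embeddings.assign_sub(0.5 * tf.convert_to_tensor(grad))   # one plain SGD step
after = embeddings.numpy()
```

In a real model the loss comes from the downstream task (for example next-word prediction), and the optimizer applies this kind of update at every step.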

How to perform a multi label classification with tensorflow? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'm new to tensorflow and would like to know if there is any tutorial or example of a multilabel classification with multiple network outputs.
I'm asking this because I have a collection of images, in which, each image can belong to several classes and my output needs to have a score of each class.
I also do not know if the tensorflow follows some file pattern of the images and the classes, so if someone has some example it would facilitate a lot.
Thank you.
You can also try transforming your problem from multi-label to multi-class classification using a Label Powerset approach. The Label Powerset transformation treats every label combination attested in the training set as a different class and constructs one instance of a multi-class classifier; after prediction, it converts the assigned classes back to the multi-label case. It is provided in scikit-multilearn. To use it with TensorFlow, you need a scikit-learn-compatible wrapper around your model, for example a wrapper over a TensorFlow Estimator (fed via an input_fn) or skflow. Then just plug it into an instance of LabelPowerset.
The code could go as follows:
from skmultilearn.problem_transform import LabelPowerset
import tensorflow.contrib.learn as skflow
# assume data is loaded using
# and is available in X_train/X_test, y_train/y_test
# initialize LabelPowerset multi-label classifier
# with tensor flow DNN base classifier
classifier = LabelPowerset(skflow.TensorFlowDNNClassifier(OPTIONS))
# train
classifier.fit(X_train, y_train)
# predict
predictions = classifier.predict(X_test)
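To show what the transformation itself does, independent of any library, here is a small plain-Python sketch. The helper name fit_label_powerset and the toy labels are made up for illustration; this is not scikit-multilearn's implementation:

```python
def fit_label_powerset(y):
    """Map each distinct label combination seen in y to a class id."""
    combos = sorted({tuple(row) for row in y})
    to_class = {combo: i for i, combo in enumerate(combos)}
    return to_class, combos

# Toy multi-hot labels: 4 samples, 3 possible labels.
y_train = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0]]

to_class, combos = fit_label_powerset(y_train)

# Multi-label -> multi-class: one class id per sample.
class_ids = [to_class[tuple(row)] for row in y_train]

# Multi-class -> multi-label: map predicted class ids back to label vectors.
decoded = [list(combos[c]) for c in class_ids]
```

One consequence of this design is that the classifier can only ever predict label combinations that appeared in the training set.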
The most naive (and reasonable) approach would be to train a classification network, and remove the softmax layer and replace it with a vector of sigmoids. This way you can have multiple units with an activation of 1.
You can look at the TF-slim examples for classification networks. Under the datasets path you will find examples of how to prepare the TFExample "file pattern" for images and classes.
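A short sketch of the "vector of sigmoids" idea in TensorFlow 2, with made-up logits standing in for the output of the network's last layer. Each class gets an independent sigmoid, so several classes can be active at once:

```python
import tensorflow as tf

# Multi-hot labels for one image with 5 possible classes.
labels = tf.constant([[0.0, 1.0, 0.0, 1.0, 0.0]])
# Stand-in for the raw outputs of the last layer (no softmax applied).
logits = tf.constant([[-2.0, 3.0, -1.0, 0.5, -4.0]])

# Independent binary cross-entropy per class, averaged into one loss value.
per_class = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(per_class)

# At prediction time every class has its own score in (0, 1).
probs = tf.sigmoid(logits)
predicted = tf.cast(probs > 0.5, tf.float32)
```

The 0.5 threshold is an assumption; in practice it is often tuned per class.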
Most solutions refer to the sigmoid loss, and sigmoid does solve multi-label classification well in my case, using tf.nn.sigmoid_cross_entropy_with_logits(labels, logits) in TensorFlow.
However, when I had to handle a class imbalance problem, where negative cases far outnumber positive cases, I found that my modified softsign loss worked much better than sigmoid. An adjustment coefficient gamma is derived from the label to lower the negative class's gradient by 3/4.
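import tensorflow as tf
def unbalance_softsign_loss(labels, logits):
    # gamma is 1 for positive labels and -0.25 for negative labels
    gamma = 1.25 * labels - 0.25
    # logits / (1 + |logits|) is the softsign, which lies in (-1, 1)
    res = 1 - tf.math.log1p(gamma * logits / (1 + tf.abs(logits)))
    return res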
where labels is multi-hot encoding vectors like [0, 1, 0, 1, 0], logits ~ (-inf, inf)

How do I track validation loss in TensorBoard? [duplicate]

This question already has an answer here:
When using tensorboard, how to summarize a loss that is computed over several minibatches?
(1 answer)
Closed 6 years ago.
I am training a model in TensorFlow. Periodically during training, I evaluate the model on a validation set. I'd like to write a summary of the training procedure so that TensorBoard displays a plot of the validation set loss so that I can see it go down with more training iterations. (Or jump back up if I start to overfit.)
I already have a global iteration variable as part of my summary. I'm thinking of creating a scalar summary validation_loss variable in the model graph that isn't connected to anything, but to which I periodically assign a value from my training loop.
Is this a good strategy? Is there a more idiomatic way to do this in TensorFlow?
(The specific project I'm working on is the TensorFlow RNN Language Model, which is a generalization of the RNN tutorial in the TensorFlow documentation.)
As I understand it, the idiomatic solution is to merge all summaries (in case the loss is not your only summary) and to create a separate tf.train.SummaryWriter for each of your training and validation sets. Then call add_summary on the validation SummaryWriter at each (periodic) evaluation step.
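The same two-writer pattern can be sketched with the TensorFlow 2 summary API (tf.summary.create_file_writer), which is an assumption here since the answer refers to the older tf.train.SummaryWriter. The loss values are placeholders and the log directory is a temporary folder:

```python
import glob
import os
import tempfile

import tensorflow as tf

log_dir = tempfile.mkdtemp()
# One writer per run, so TensorBoard overlays both curves on the same "loss" chart.
train_writer = tf.summary.create_file_writer(os.path.join(log_dir, "train"))
val_writer = tf.summary.create_file_writer(os.path.join(log_dir, "validation"))

for step in range(100):
    train_loss = 1.0 / (step + 1)            # placeholder for the real training loss
    with train_writer.as_default():
        tf.summary.scalar("loss", train_loss, step=step)

    if step % 10 == 0:                        # periodic evaluation
        val_loss = 1.2 / (step + 1)           # placeholder for the real validation loss
        with val_writer.as_default():
            tf.summary.scalar("loss", val_loss, step=step)

train_writer.flush()
val_writer.flush()
event_files = glob.glob(os.path.join(log_dir, "*", "events.out.tfevents.*"))
```

Pointing TensorBoard at log_dir then shows "train" and "validation" as two runs sharing the same tag.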