Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
I have created a simple Keras model that fits the points I created, as shown in the image below. My problem is that when I run the code, I don't always get this red line: in maybe 1 run out of 5 I get the desired result, and in the other 4 runs I get a straight line. What should I do to make my model fit every time?
Is this related to randomness?
from tensorflow.keras import layers, models, optimizers

# X and y hold the sample points and are defined elsewhere
model = models.Sequential()
model.add(layers.Dense(2, activation='linear'))
model.add(layers.Dense(2, activation='relu'))
model.add(layers.Dense(1, activation='linear'))
# learning_rate replaces the deprecated lr argument
model.compile(loss='mse', optimizer=optimizers.Adam(learning_rate=0.001))
history = model.fit(X, y, epochs=1500, verbose=False)
y_hat = model.predict(X)
pyplot image
Yes, the initial weights of your network are randomly initialized. To get reproducible results, you must set a seed for the random number generators. The random initialization of the weights is derived from this seed, so using the same seed always results in the same initialization.
Taken from https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
that is:
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed  # TensorFlow 1.x; TensorFlow 2.x uses tf.random.set_seed instead
set_random_seed(2)
The seed values can be any numbers, of course, but they must be fixed.
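As a framework-free illustration of the point above (this is not how Keras initializes layers internally), the sketch below draws "initial weights" from a seeded generator. The helper name init_weights is hypothetical.

```python
import random

def init_weights(n, seed):
    """Draw n pseudo-random 'initial weights' from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, so the global state is untouched
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

same_a = init_weights(4, seed=1)
same_b = init_weights(4, seed=1)
other = init_weights(4, seed=2)

print(same_a == same_b)  # same seed: identical starting weights
print(same_a == other)   # different seed: a different starting point
```

Note that a fixed seed only makes runs repeatable. If a given seed happens to produce the straight-line result, it will produce it on every run, so it may be worth trying a few seeds.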
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 months ago.
I am struggling with training a CNN model to identify dog breeds. I intend to train on the Stanford Dogs Dataset using a ResNet architecture. I downloaded the dataset from http://vision.stanford.edu/aditya86/ImageNetDogs/ into a Google Colab notebook and extracted the images. I get a folder structure like this: folder_structure. I know I need a folder structure with train and test subfolders, each containing further subfolders with the images of the corresponding breed. How do I go about doing that?
You don't strictly need to create separate folders for train and test. You can use TensorFlow's tf.keras.utils.image_dataset_from_directory (also exposed as tf.keras.preprocessing.image_dataset_from_directory, the name used in the code below). It loads a dataset kept in a single folder and performs the split at load time. This is how:
import tensorflow as tf

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "/images/",            # path to your data folder (one subfolder per class)
    validation_split=0.2,  # fraction of the data reserved for validation (the test set here)
    subset="training",     # this dataset is the training part
    seed=1024              # must be the same for both calls so the split is consistent
)
test_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "/images/",
    validation_split=0.2,
    subset="validation",   # this dataset is the held-out part
    seed=1024
)
Both calls return a tf.data.Dataset object. The validation_split argument specifies the fraction of the data to reserve for validation (test in your case). In the example above I chose 80% train and 20% validation.
The seed argument must be the same for both train_ds and test_ds, because it ensures that the images are shuffled in the same order before the split, so the same image cannot end up in both the train and test subsets.
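To see why the seed has to match, here is a minimal sketch of a seeded shuffle followed by a split. It is an illustration of the idea only, not the TensorFlow implementation, and the helper split_indices and its "last items are validation" convention are assumptions made for this example.

```python
import random

def split_indices(n_items, validation_split, subset, seed):
    """Return the indices of one subset after a seeded shuffle (illustrative helper)."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # same seed -> same permutation
    n_val = int(n_items * validation_split)
    # Convention chosen for this sketch: the last n_val shuffled items are validation.
    if subset == "validation":
        return indices[n_items - n_val:]
    return indices[:n_items - n_val]

train = split_indices(10, 0.2, "training", seed=1024)
val = split_indices(10, 0.2, "validation", seed=1024)

# With a shared seed both calls cut the same permutation, so the parts never overlap.
print(sorted(train), sorted(val))
print(set(train).isdisjoint(val))
```

If the two calls used different seeds, each would cut a different permutation, and an item could land in both subsets or in neither.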
Closed. This question is opinion-based. It is not currently accepting answers.
Closed last year.
I am somewhat new to the MAE and RMSE metrics. I know that using these metrics instead of accuracy is recommended, since I am doing regression rather than classification. I am wondering how to measure the true accuracy of my model: the labels are either -1 or 1 depending on the specified inputs, and my model outputs both negative and positive numbers from a linear output. Here are the graphs that were returned on training:
My model doesn't appear to be overfitted, judging by the training and testing lines. Also, what does it signify that the RMSE is 0.5 and cannot go any lower? Thank you.
Mean squared error (MSE) is the average of the squared differences between the predicted values and the true labels.
Root mean squared error (RMSE) is the square root of the MSE. Taking the square root brings the error back to the same units as the labels. For a single prediction, RMSE equals the absolute difference between the prediction and the true label; over many predictions it weights large errors more heavily than the mean absolute error (MAE) does.
For example, if your model predicts 1 but the true label is -1, then,
MSE = {1-(-1)}^2 = 4
RMSE = √MSE = √4 = 2
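The same arithmetic can be checked with a few lines of plain Python. This is a minimal sketch; the four-sample y_true and y_pred values are made up for illustration.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as the labels."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Single prediction from the example: predicted 1, true label -1.
print(mse([-1], [1]), rmse([-1], [1]))  # 4.0 2.0

# Made-up batch: one large error dominates RMSE more than MAE.
y_true = [-1, 1, 1, -1]
y_pred = [-0.5, 0.5, 1.0, 1.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

On the batch, the MSE is 1.125 and the MAE is 0.75, while the RMSE is a little above 1 because of the single prediction that is off by 2.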
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am new to tensorflow and trying to look at different examples of tensorflow to understand it better.
I have seen the following lines used in many tensorflow examples without any mention of a specific embedding algorithm being used to obtain the word embeddings.
embeddings = tf.Variable(tf.random_uniform((vocab_size, embed_dim), -1, 1))
embed = tf.nn.embedding_lookup(embeddings, input_data)
Here are some examples:
https://github.com/Decalogue/dlnd_tv_script_generation/blob/master/dlnd_tv_script_generation.py
https://github.com/ajmaradiaga/cervantes-text-generation/blob/master/cervants_nn.py
I understand that the first line initializes the word embeddings from a random distribution. But will the embedding vectors be trained further in the model to give a more accurate representation of the words (changing the initial random values to more accurate numbers)? If so, what method is actually being used when the code mentions no explicit embedding method such as word2vec or GloVe (and does not feed in pre-trained vectors from these methods instead of random numbers at the start)?
Yes, those embeddings are trained further, just like weights and biases; otherwise representing words with random values wouldn't make any sense. The embeddings are updated during training the same way you would update a weight matrix, that is, by an optimization method such as gradient descent or Adam.
Pre-trained embeddings such as word2vec have already been trained on very large datasets and are already quite accurate representations of words, so they are often used without further training. If you are asking how those are trained, two main algorithms can be used to learn an embedding from text: Continuous Bag of Words (CBOW) and skip-gram. Explaining them completely is not possible here, but a web search will turn up good introductions. This article might get you started.
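The "trained like a weight matrix" point can be sketched without TensorFlow: a lookup is row selection, and a gradient step changes the rows that were looked up. The sizes, the learning rate, and the placeholder gradients below are assumptions for illustration; in a real model the gradients come from backpropagating the loss.

```python
import random

vocab_size, embed_dim = 5, 3
rng = random.Random(0)
# Random initial table, one row per word id (mirrors the random-uniform init in the question).
embeddings = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(vocab_size)]

def embedding_lookup(table, ids):
    """Row selection: return the vectors stored at the given word ids."""
    return [table[i] for i in ids]

def sgd_step(table, ids, grads, lr=0.1):
    """Apply one gradient-descent update to the rows that were looked up."""
    for i, g in zip(ids, grads):
        table[i] = [w - lr * gw for w, gw in zip(table[i], g)]

before = [row[:] for row in embeddings]
ids = [1, 3]
vectors = embedding_lookup(embeddings, ids)
grads = [[1.0] * embed_dim for _ in ids]  # placeholder gradients, one per looked-up row
sgd_step(embeddings, ids, grads)

changed = [i for i in range(vocab_size) if embeddings[i] != before[i]]
print(changed)  # [1, 3]
```

Only the rows for the words that appeared in the batch move, which is how the random table gradually becomes a task-specific representation.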
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm trying to implement the paper "learning to segment everything" and I need to set the weights of a layer in the segmentation network using the output of a weight transfer function.
The output of the last layer in the weight transfer fetched using layer.output in Keras is of type 'tensorflow.python.framework.ops.Tensor' while the weights should be initialized as a numpy array. Any idea how I can set the weights?
From what I understood of the paper, the weights should be connected to the output of this transform layer; let's call that output X. So you don't want to create "weights" and then initialize them with X using tf.assign or any other method, because that will not be differentiable. What you want is to connect the output X directly so that it acts as the weights in this other graph.
The catch is that you can't do this through Keras layers or even tf.layers, because this high-level API doesn't give you that control: as soon as you create a layer in tf.layers or Keras, it creates its own weights. You don't want that; you want to use the output X as the weights rather than creating new ones. What you can do is re-implement the layer you need yourself and use X directly as its weights, which allows the gradient to flow back through X.
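A minimal, framework-free sketch of that idea follows. The transfer function here is a hypothetical elementwise scaling, not the one from the paper, and a finite difference stands in for backpropagation; the point is only that the loss depends on the transfer parameter because the layer consumes X as an input instead of owning a copy.

```python
def transfer(theta, w_det):
    """Hypothetical weight transfer function: here just an elementwise scaling."""
    return [theta * w for w in w_det]

def mask_layer(x, weights):
    """Hand-written linear layer that takes its weights as an input."""
    return sum(xi * wi for xi, wi in zip(x, weights))

def loss(theta, w_det, x, target):
    w_seg = transfer(theta, w_det)  # X in the text: output of the transfer function
    return (mask_layer(x, w_seg) - target) ** 2

w_det, x, target = [0.5, 1.0], [2.0, 1.0], 3.0
theta, eps = 0.7, 1e-6

# Central finite difference as a stand-in for backpropagation.
grad_theta = (loss(theta + eps, w_det, x, target)
              - loss(theta - eps, w_det, x, target)) / (2 * eps)
print(grad_theta)  # non-zero: the loss reaches theta through the layer's weights
```

In TensorFlow the same pattern means calling a functional op such as tf.matmul or tf.nn.conv2d with X passed as the weight or filter argument.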
Weights are typically stored in Variables. The tf.assign operation can be used to assign values (represented as Tensors) to variables. You can see some basic examples of using tf.assign in the session tests, where it appears as state_ops.assign().
Just be aware that, like other tensorflow operations, it does not update the value of the variable immediately (unless you are using eager execution). It returns a tensor, that when evaluated (e.g. via session.run()), will update the variable.
From your question, I suspect that you might not be 100% clear about TensorFlow's computation model. The Tensor type is a symbolic representation of some value that will be produced only when the computation is actually run (via session.run()). You can't really talk about "converting a Tensor to a numpy array", because you can't convert the "result of operation foo" to concrete floats. You have to run the computation that produces the "result of operation foo" to know the concrete numbers. tf.assign works in this symbolic space. When using it, you are saying, "whatever the value of this tensor (output of some layer) will be when I run the computation, assign it to this variable".
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'm new to tensorflow and would like to know if there is any tutorial or example of a multilabel classification with multiple network outputs.
I'm asking this because I have a collection of images, in which, each image can belong to several classes and my output needs to have a score of each class.
I also do not know if the tensorflow follows some file pattern of the images and the classes, so if someone has some example it would facilitate a lot.
Thank you.
You can also try transforming your problem from multi-label to multi-class classification using a Label Powerset approach. The Label Powerset transformation treats every label combination that occurs in the training set as a separate class and builds a single multi-class classifier; after prediction, it converts the assigned classes back to the multi-label form. It is provided in scikit-multilearn. To use it with TensorFlow you need a scikit-learn-compatible wrapper around the TensorFlow estimator, for example skflow. Then just plug the wrapped classifier into an instance of LabelPowerset.
The code could go as follows:
from skmultilearn.problem_transform import LabelPowerset
import tensorflow.contrib.learn as skflow
# assume data is loaded using
# and is available in X_train/X_test, y_train/y_test
# initialize LabelPowerset multi-label classifier
# with tensor flow DNN base classifier
classifier = LabelPowerset(skflow.TensorFlowDNNClassifier(OPTIONS))
# train
classifier.fit(X_train, y_train)
# predict
predictions = classifier.predict(X_test)
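The transformation itself is small enough to sketch without any library. The helper names below are hypothetical and this is not the scikit-multilearn implementation; it only shows the mapping from label combinations to class ids and back.

```python
def fit_label_powerset(y):
    """Map each distinct label combination in y to a class id."""
    combos = sorted({tuple(row) for row in y})
    return {combo: idx for idx, combo in enumerate(combos)}

def to_classes(y, mapping):
    """Multi-label rows -> multi-class ids."""
    return [mapping[tuple(row)] for row in y]

def to_multilabel(classes, mapping):
    """Multi-class ids -> multi-label rows."""
    inverse = {idx: list(combo) for combo, idx in mapping.items()}
    return [inverse[c] for c in classes]

y_train = [[0, 1, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
mapping = fit_label_powerset(y_train)
classes = to_classes(y_train, mapping)
print(classes)                          # [1, 2, 1, 0]
print(to_multilabel(classes, mapping))  # back to the original rows
```

One consequence of this design is that the classifier can only ever predict label combinations that occurred in the training data.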
The most naive (and reasonable) approach would be to train a classification network, and remove the softmax layer and replace it with a vector of sigmoids. This way you can have multiple units with an activation of 1.
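A small numeric sketch shows the difference; the logits are made-up values. Softmax outputs compete for a total of 1, while each sigmoid is computed independently, so several classes can score close to 1 at once.

```python
import math

def softmax(logits):
    """Normalized exponentials: the outputs sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoids(logits):
    """Independent per-class probabilities."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

logits = [4.0, 4.0, -4.0]  # two classes strongly present, one absent
print(softmax(logits))     # the two present classes split the mass, each below 0.5
print(sigmoids(logits))    # both present classes are above 0.9
```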
You can look at the TF-slim examples for classification networks. Under the datasets path you will find examples of how to prepare the TFExample "file pattern" for images and classes.
Most solutions refer to sigmoid loss, and in my case sigmoid does solve multi-label classification well, using tf.nn.sigmoid_cross_entropy_with_logits(labels, logits) in tensorflow.
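For reference, the per-element loss can be written in plain Python using the numerically stable form given in the TensorFlow documentation for that function, max(x, 0) - x * z + log(1 + exp(-|x|)). The helper name sigmoid_xent is hypothetical, and the naive version is included only for comparison.

```python
import math

def sigmoid_xent(label, logit):
    """Stable sigmoid cross-entropy for one label/logit pair."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def naive_xent(label, logit):
    """Textbook form; fine for moderate logits, unstable for very large ones."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

labels = [0, 1, 0, 1, 0]
logits = [-2.0, 3.0, 0.5, -1.0, -4.0]
losses = [sigmoid_xent(z, x) for z, x in zip(labels, logits)]
print(losses)  # one independent loss term per class
```

Each class contributes its own term, which is what lets several labels be active for the same image.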
However, when I handled a class imbalance problem, where negative cases far outnumber positive cases, I found that my modified softsign loss worked much better than sigmoid. An adjustment coefficient gamma, derived from the label, lowers the negative class's gradient by 3/4.
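import tensorflow as tf

def unbalance_softsign_loss(labels, logits):
    # gamma is 1 for positive labels and -0.25 for negative labels
    gamma = 1.25 * labels - 0.25
    # tf.math.log1p replaces the older tf.log1p alias
    res = 1 - tf.math.log1p(gamma * logits / (1 + tf.abs(logits)))
    return res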
where labels are multi-hot encoded vectors such as [0, 1, 0, 1, 0], and logits range over (-inf, inf).
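To get a feel for the values this loss produces, here is a scalar re-implementation in plain Python. It is a sketch for inspection only; the function name and the sample logits are assumptions.

```python
import math

def unbalance_softsign_loss_scalar(label, logit):
    """Scalar version of the loss above for a single label/logit pair."""
    gamma = 1.25 * label - 0.25            # 1 for positives, -0.25 for negatives
    softsign = logit / (1.0 + abs(logit))  # squashes the logit into (-1, 1)
    return 1.0 - math.log1p(gamma * softsign)

for label in (1, 0):
    for logit in (-3.0, 0.0, 3.0):
        print(label, logit, unbalance_softsign_loss_scalar(label, logit))
```

For a positive label the loss falls as the logit grows, and for a negative label it falls as the logit becomes more negative, but over a much narrower range because of the 0.25 factor.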