The problem is related to:
Tensorflow self-adjoint eigen decomposition not successful, input might not be valid
Is there a workaround to the use of self_adjoint_eig call in tensorflow?
I resolved the issue. The problem was the propagation of NaNs from the second loss function used in my stacked model.
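For reference, a minimal way to confirm that NaNs from an upstream loss are reaching the eigendecomposition is to wrap its input in tf.debugging.check_numerics; the matrix-building helper below is hypothetical, not from the original code:

import tensorflow as tf

# Hypothetical symmetric input to the eigendecomposition.
matrix = build_symmetric_matrix()  # stand-in for however the input is produced

# Fails fast with a clear message if a NaN/Inf from an upstream loss reaches this op.
checked = tf.debugging.check_numerics(matrix, "NaN/Inf reached self_adjoint_eig input")
eigvals, eigvecs = tf.linalg.eigh(checked)  # tf.linalg.eigh is the same op as self_adjoint_eig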
I am trying to quantize MobileFaceNet (code from sirius-ai) according to the suggestion, and I think I have hit the same issue as this one.
When I add tf.contrib.quantize.create_training_graph() to the training graph
(train_nets.py ln.187, before train_op = train(...), or in train() in utils/common.py ln.38, before the gradients),
it does not add quantize-aware ops into the graph to collect the dynamic-range max/min.
I assume I should see some additional nodes in TensorBoard, but I do not, so I think the quantize-aware ops were not successfully added to the training graph.
I also tried tracing TensorFlow and found that _FindLayersToQuantize() returns nothing.
However, when I add tf.contrib.quantize.create_eval_graph() to rewrite the graph, I can see some quantize-aware ops such as act_quant...
Since the ops were not successfully added to the training graph, there are no corresponding weights to load in the eval graph.
Thus I get error messages such as:
Key MobileFaceNet/Logits/LinearConv1x1/act_quant/max not found in checkpoint
or
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value MobileFaceNet/Logits/LinearConv1x1/act_quant/max
Does anyone know how to fix this error, or how to get a quantized MobileFaceNet with good accuracy?
Thanks!
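For context, here is a minimal sketch of the usual tf.contrib.quantize rewrite order in TF 1.x (the model/loss builder and hyperparameters below are illustrative, not taken from the sirius-ai code); the training rewrite has to run after the loss is defined but before the optimizer/train_op is created:

import tensorflow as tf

# --- training graph (illustrative) ---
loss = build_model_and_loss()                       # hypothetical model/loss builder
tf.contrib.quantize.create_training_graph(
    input_graph=tf.get_default_graph(),             # rewrite after the loss exists
    quant_delay=20000)                              # steps of float training before quantizing
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)  # create the optimizer afterwards

# --- separate eval graph (illustrative) ---
# with eval_graph.as_default():
#     build_model_for_inference()
#     tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
#     # then restore the checkpoint trained with the rewritten training graph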
Hi,
Unfortunately, the contrib/quantize tool is now deprecated. It won't be able to support newer models, and we are not working on it anymore.
If you are interested in QAT, I would recommend trying the new TF/Keras QAT API. We are actively developing that and providing support for it.
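As a pointer, a minimal sketch of the newer Keras QAT API from the TensorFlow Model Optimization Toolkit looks roughly like this (the model builder is a placeholder for your own float Keras model):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = build_keras_model()  # hypothetical: your float Keras model

# Wrap the model with quantize-aware training ops.
q_aware_model = tfmot.quantization.keras.quantize_model(model)

q_aware_model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
# q_aware_model.fit(train_data, epochs=...)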
I am pretty new to tensorflow and I am struggling to get tensorboard to display some of my custom metrics. The model I am working with is a tf.estimator.Estimator, with an associated EstimatorSpec. The first new metric I am trying to log is from my loss function, which is composed of two components: a loss for an age prediction (tf.float32) and a loss for a class prediction (one-hot/multiclass), which I add together to determine a total loss (my model is predicting both a class and an age). The total loss is output just fine during training and shows up on tensorboard, but I would like to track the individual age and the class prediction loss components as well.
I think a solution that is supposed to work is to add an eval_metric_ops argument to the EstimatorSpec, as described here (Custom eval_metric_ops in Estimator in Tensorflow). I have not been able to make this approach work, however. I defined a custom metric function that looks like this:
def age_loss_function(labels, ages_pred, ages_true):
    per_sample_age_loss = get_age_loss_per_sample(ages_pred, ages_true)  # works fine

    # The error happens on this line:
    mean_abs_age_diff, age_loss_update_fn = tf.metrics.Mean(per_sample_age_loss)

    return mean_abs_age_diff, age_loss_update_fn

eval_metric_ops = {"age_loss": age_loss_function}  # Want to use this in EstimatorSpec
The instructions seem to say that I need both the error metric and the update function, which should both be returned from the tf.metrics call, as in examples like the one I linked. But this command fails for me with the error message:
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I am probably just misusing the APIs. If someone can guide me on the proper usage I would really appreciate it. Thanks!
It looks like the problem was from a version change. I had updated to tensorflow 2.0 while the instructions I was following were from 1.X. Using tf.compat.v1.metrics.mean() instead gets past this problem.
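For completeness, a minimal sketch of the version-compatible metric (assuming the same hypothetical get_age_loss_per_sample helper as above) could look like this:

import tensorflow as tf

def age_loss_metric(labels, ages_pred, ages_true):
    per_sample_age_loss = get_age_loss_per_sample(ages_pred, ages_true)  # hypothetical helper
    # tf.compat.v1.metrics.mean returns the (value_tensor, update_op) pair
    # that eval_metric_ops expects.
    return tf.compat.v1.metrics.mean(per_sample_age_loss)

# In model_fn, eval_metric_ops maps names to (value, update_op) tuples, e.g.:
# eval_metric_ops = {"age_loss": age_loss_metric(labels, ages_pred, ages_true)}
# return tf.estimator.EstimatorSpec(mode, loss=total_loss, eval_metric_ops=eval_metric_ops)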
I am trying to model a Bayesian network in Python using the Pomegranate package. The network should be learned from data, so I am using the .from_samples method. However, I am having trouble with the .predict_proba() method; it gives me an error.
This is how I build the model:
model = BayesianNetwork.from_samples(X_train, algorithm='chow-liu')
and this is how I do prediction:
model.predict_proba(X_train)
and this is the error I get:
ValueError: Sample does not have the same number of dimensions as the model
Your help would be highly appreciated.
I got the answer: you should define your state_names when calling the from_samples method.
Another question is how do we do classification using this model?
You should use the predict() method to predict the state of the unobserved nodes.
Check the documentation for more details.
Also, in the repository you can find some interesting tutorials that will help you.
Please add [] around the sample you are passing, so that predict_proba receives a 2-D input.
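Putting the answers together, a minimal sketch (with made-up column names and values) might look like this:

from pomegranate import BayesianNetwork

state_names = ['feature_a', 'feature_b', 'label']           # made-up names
model = BayesianNetwork.from_samples(X_train,
                                     algorithm='chow-liu',
                                     state_names=state_names)

# Classification: leave the target column as None and let predict() fill it in.
print(model.predict([['a1', 'b2', None]]))

# Note the outer [] -- predict/predict_proba expect a 2-D collection of samples,
# even when you only have one sample.
print(model.predict_proba([['a1', 'b2', None]]))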
I would like to speed up my LSTM network, but since I am using it for OCR (where sequences have variable length), I cannot use a plain LSTM implementation. That is why I use "tf.nn.dynamic_rnn".
Based on the RNN benchmark in TensorFlow (https://github.com/tensorflow/tensorflow/blob/754048a0453a04a761e112ae5d99c149eb9910dd/tensorflow/contrib/cudnn_rnn/python/kernel_tests/cudnn_rnn_ops_benchmark.py#L77), the cuDNN implementation creates the whole model at once (it does not use the "tf.nn.rnn" structure like the others). I assume it may be impossible to use cuDNN with variable-length sequences, but has anybody succeeded with it?
The second point is using "tf.nn.bidirectional_dynamic_rnn", as I would like to use a Bi-LSTM for OCR. But this should be resolved once the first part is solved.
Edit: It looks like "tf.contrib.cudnn_rnn.CudnnLSTM" has a "bidirectional" implementation inside. So the only unknown is whether cuDNN can be used with variable-length input sequences.
Or maybe any working example which uses 'CudnnLSTM' would be helpful.
Just found this:
tf.contrib.cudnn_rnn.CudnnLSTM currently does not support batches with sequences of different length, thus this is normally not an option to use.
Source: http://returnn.readthedocs.io/en/latest/tf_lstm_benchmark.html
TensorFlow will soon finally have support for variable sequence lengths: https://github.com/tensorflow/tensorflow/blob/2f672ee9562a452f8dbfa259a8ccec56367e9b17/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py#L389
It looks like it landed too late for 1.13, so it'll probably only be available on TensorFlow 1.14.
You can try it out today by installing the tf-nightly-gpu package and passing sequence_lengths=lengths, where lengths is a tf.int32 Tensor with shape [batch_size], containing the lengths of each sequence in your batch.
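A rough sketch of how that call might look (shapes, sizes, and variable names below are illustrative; CudnnLSTM input is time-major by default):

import tensorflow as tf

# Illustrative shapes: [max_time, batch_size, input_size] (time-major by default).
inputs = tf.placeholder(tf.float32, [None, None, 64])
lengths = tf.placeholder(tf.int32, [None])           # one length per batch element

lstm = tf.contrib.cudnn_rnn.CudnnLSTM(num_layers=2,
                                      num_units=256,
                                      direction='bidirectional')

# sequence_lengths is the newly added argument (tf-nightly-gpu / TF 1.14+).
outputs, output_states = lstm(inputs, sequence_lengths=lengths, training=True)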
I've written the following segmentor and I can't get the accuracy to work. In fact, I'm always getting an accuracy of 0.0, whatever the size of my sample.
I think the problem is the sigmoid layer at the end of the U() function, where a tensor of continuous values between 0 and 1 (conv10) is compared directly to a binary tensor, so there is essentially no chance of exact equality between the two.
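(For what it's worth, a common way to make such an accuracy comparison meaningful is to threshold the sigmoid output first; conv10 and y below refer to the tensors described above. This is only a sketch of that idea, not the fix the author eventually applied.)

import tensorflow as tf

# conv10: sigmoid output of U(), values in (0, 1); y: binary ground-truth tensor.
predictions = tf.cast(conv10 > 0.5, tf.float32)    # threshold probabilities to {0, 1}
correct = tf.equal(predictions, y)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))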
UPDATE: The code can be found as git here
I've resolved the issue. The problem was the conversion of the numpy arrays to placeholders at the feed level. The newly updated code can be found on git at: https://github.com/JulienBelanger/TensorFlow-Image-Segmentation