Thanks for your help, TensorFlow community!
I have a question regarding understanding and visualizing the output of the estimator's evaluate function.
I have a DNNClassifier and have trained it on data with 10 possible output classes that predictions can fall into.
After training and running
accuracy = classifier.evaluate(input_fn = test_input_fn)['accuracy']
I see my accuracy is 33.8%, which is hard to judge on its own (probably not good).
How can I see the output of each of the comparisons?
As the test data is run, I would like to see what the estimate is and what the actual value is: basically a side-by-side of y and y'.
something like: [0 0 0 0 0 0 0 0 0 1] vs [0 0 0 0 0 0 0 0 1 0] 'false'
Rather than just seeing the aggregated overall accuracy.
Thanks!
So in the event that someone reads the question above, and understands what I was trying to do (view the output of predictions), I have a solution.
The solution is to utilize the .predict() method.
A good example is here:
https://www.tensorflow.org/get_started/estimator#classify_new_samples
My code ended up looking like:
import numpy as np
import tensorflow as tf

predict_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(predict_set.data)},
    num_epochs=1,
    shuffle=False)

predictions = list(classifier.predict(input_fn=predict_input_fn))

print("\n Predictions:")
print(len(predictions))
for p in predictions:
    print(int(p['classes'][0]))
which outputs the predictions in a column that I can copy/paste into a spreadsheet program to examine my data.
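Continuing the snippet above, here is a rough sketch of the side-by-side y vs y' view I was after; it assumes a hypothetical predict_set.target array holding the true class for each row of predict_set.data (adjust the attribute to wherever your labels actually live):

for p, y_true in zip(predictions, predict_set.target):
    y_pred = int(p['classes'][0])
    # print predicted class, actual class, and whether they match
    print(y_pred, int(y_true), 'true' if y_pred == int(y_true) else 'false')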
Related
I ran into a problem when using tf.gradients to compute a gradient.
My x is a tf.constant() holding a vector v of shape (4, 1),
and my y is the sigmoid of v, also of shape (4, 1), so the gradient of y with respect to x should be a diagonal matrix of shape (4, 4).
My code:
c = tf.constant(sigmoid(x_0 @ w_0))
d = tf.constant(x_0 @ w_0)
Omega = tf.gradients(c, d)
_Omega = sess.run(Omega)
the error is
Fetch argument None has invalid type .
In addition, I think using tf.gradients might be the wrong approach; there may be some other function that can compute this.
My question:
Point out where I am wrong and how to fix it using tf.gradients, or using another function.
Edit:
I want to compute the derivative as described in the vector-by-vector section of https://en.wikipedia.org/wiki/Matrix_calculus#Vector-by-vector,
and the result Omega would look like the following:
[[s1(1-s1)     0         0         0    ]
 [    0     s2(1-s2)     0         0    ]
 [    0         0     s3(1-s3)     0    ]
 [    0         0         0     s4(1-s4)]]
where s_i = sigmoid(x_0i @ w_0), and x_0i is the ith row of x_0.
In general, the derivative of a vector with respect to another vector should be a matrix.
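Written out, since the sigmoid is applied elementwise, the Jacobian I expect is:

\frac{\partial y_i}{\partial x_j} = \sigma(x_i)\,\bigl(1 - \sigma(x_i)\bigr)\,\delta_{ij}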
First of all, you can't calculate gradients with respect to constants; tf.gradients returns None for them, which is the reason for your error (in your snippet, c and d are also two separate constants, so there is no path between them in the graph). One way to calculate the gradients is to build a graph with a tf.Variable (see the code below); another is to use tf.GradientTape in eager execution mode:
import tensorflow as tf
import numpy as np
arr = np.random.rand(4, 1)
ip = tf.Variable(initial_value=arr)
sess = tf.Session()
c_var = tf.math.sigmoid(ip)
Omega = tf.gradients(c_var, ip)
sess.run(tf.global_variables_initializer())
_Omega = sess.run(Omega)
print(_Omega)
Now you can pass a vector of any size. Still, I am not sure how you will get the (4, 4) diagonal matrix this way, since tf.gradients sums over the outputs and returns a result with the same shape as the input.
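For completeness, here is a minimal sketch (assuming TF 2.x with eager execution) of how the full (4, 4) Jacobian could be obtained with tf.GradientTape.jacobian:

import numpy as np
import tensorflow as tf

x = tf.Variable(np.random.rand(4, 1))
with tf.GradientTape() as tape:
    y = tf.math.sigmoid(x)

# tape.jacobian returns shape (4, 1, 4, 1); the reshape drops the singleton dims
jac = tf.reshape(tape.jacobian(y, x), (4, 4))
print(jac.numpy())  # diagonal entries are s_i * (1 - s_i), off-diagonals are 0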
I'm currently learning TensorFlow, but I'm confused by the code snippet below:
dataset = dataset.shuffle(buffer_size = 10 * batch_size)
dataset = dataset.repeat(num_epochs).batch(batch_size)
return dataset.make_one_shot_iterator().get_next()
I know that the dataset will first hold all the data, but what do shuffle(), repeat(), and batch() do to the dataset?
Please help me with an example and explanation.
Update: Here is a small Colab notebook demonstrating this answer.
Imagine, you have a dataset: [1, 2, 3, 4, 5, 6], then:
How ds.shuffle() works
dataset.shuffle(buffer_size=3) will allocate a buffer of size 3 for picking random entries. This buffer will be connected to the source dataset.
We could imagine it like this:

  Random buffer
     |
     |   Source dataset where all other elements live
     |   |
     ↓   ↓
  [1,2,3] <= [4,5,6]
Let's assume that entry 2 was taken from the random buffer. The free space is filled by the next element from the source dataset, that is, 4:
2 <= [1,3,4] <= [5,6]
We continue reading till nothing is left:
1 <= [3,4,5] <= [6]
5 <= [3,4,6] <= []
3 <= [4,6] <= []
6 <= [4] <= []
4 <= [] <= []
How ds.repeat() works
As soon as all the entries are read from the dataset and you try to read the next element, the dataset will throw an error.
That's where ds.repeat() comes into play. It will re-initialize the dataset, making it again like this:
[1,2,3] <= [4,5,6]
What will ds.batch() produce
The ds.batch() will take the first batch_size entries and make a batch out of them. So, a batch size of 3 for our example dataset will produce two batch records:
[2,1,5]
[3,6,4]
As we have ds.repeat() before the batch, the generation of data will continue. But the order of the elements will be different, due to ds.shuffle(). What should be taken into account is that 6 will never be present in the first batch, due to the size of the random buffer.
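Here is a minimal runnable sketch of the walkthrough above (TF 2.x eager style is assumed; the exact element order will differ from run to run):

import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])
dataset = dataset.shuffle(buffer_size=3).repeat(2).batch(3)

for batch in dataset:
    print(batch.numpy())  # e.g. [2 1 5], [3 6 4], ... (four batches in total)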
The following methods of tf.data.Dataset:
repeat(count=None): repeats the dataset count times (indefinitely if count is None).
shuffle(buffer_size, seed=None, reshuffle_each_iteration=None): shuffles the samples in the dataset; buffer_size is the number of elements from which the next element is randomly drawn.
batch(batch_size, drop_remainder=False): combines consecutive elements of the dataset into batches of length batch_size.
Here is an example that shows looping over epochs. Upon running this script, notice the difference in:
dataset_gen1 - shuffle operation produces more random outputs (this may be more useful while running machine learning experiments)
dataset_gen2 - lack of shuffle operation produces elements in sequence
Other additions in this script
tf.data.experimental.sample_from_datasets - used to combine two datasets. Note that the shuffle operation in this case will create a buffer that samples equally from both datasets.
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"          # to avoid all those prints
os.environ["TF_GPU_THREAD_MODE"] = "gpu_private"  # to avoid large "Kernel Launch Time"

import tensorflow as tf
if len(tf.config.list_physical_devices('GPU')):
    tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True)

class Augmentations:

    def __init__(self):
        pass

    @tf.function
    def filter_even(self, x):
        if x % 2 == 0:
            return False
        else:
            return True

class Dataset:

    def __init__(self, aug, range_min=0, range_max=100):
        self.range_min = range_min
        self.range_max = range_max
        self.aug = aug

    def generator(self):
        dataset = tf.data.Dataset.from_generator(self._generator
                        , output_types=(tf.float32), args=())
        dataset = dataset.filter(self.aug.filter_even)
        return dataset

    def _generator(self):
        for item in range(self.range_min, self.range_max):
            yield(item)

# Can be used when you have multiple datasets that you wish to combine
class ZipDataset:

    def __init__(self, datasets):
        self.datasets = datasets
        self.datasets_generators = []

    def generator(self):
        for dataset in self.datasets:
            self.datasets_generators.append(dataset.generator())
        return tf.data.experimental.sample_from_datasets(self.datasets_generators)

if __name__ == "__main__":
    aug = Augmentations()
    dataset1 = Dataset(aug, 0, 100)
    dataset2 = Dataset(aug, 100, 200)
    dataset = ZipDataset([dataset1, dataset2])

    epochs = 2
    shuffle_buffer = 10
    batch_size = 4
    prefetch_buffer = 5

    dataset_gen1 = dataset.generator().shuffle(shuffle_buffer).batch(batch_size).prefetch(prefetch_buffer)
    # dataset_gen2 = dataset.generator().batch(batch_size).prefetch(prefetch_buffer) # this will output odd elements in sequence

    for epoch in range(epochs):
        print('\n ------------------ Epoch: {} ------------------'.format(epoch))
        for X in dataset_gen1.repeat(1):  # adding .repeat() in the loop allows you to easily control the end of the loop
            print(X)
        # Do some stuff at end of loop
I need to write a custom loss for my Keras model. As I need to write the function using Keras/TensorFlow ops so that backpropagation works automatically, I am not sure how to implement this, since it might require some looping operations:
Target[1*300] - [...0 0 0 1 0 0 0 0 0 1 0 0 0...]
Output[1*300] - [...0 0 1 0 0 0 0 0 0 0 1 0 0...]
What I need is that, while calculating the loss, an exact match is not required: even if my output has a discrepancy of +/- three positions, I want it to be considered a correct prediction.
For example, both of these should be considered right predictions:
Output[1*300] - [...0 0 1 0 0 0 0 0 0 0 1 0 0...]
Output[1*300] - [...0 1 0 0 0 0 0 0 0 0 0 1 0...]
The code I have written so far:
import numpy as np
import tensorflow as tf

tar = tf.placeholder(tf.float32, shape=(1, 10))
tar_unpacked = tf.unstack(tar)

pred = tf.placeholder(tf.float32, shape=(1, 10))
pred_unpacked = tf.unstack(pred)

for t in tar_unpacked:
    result_tensor = tf.equal(t, 1)
    tar_ind = tf.where(result_tensor)

with tf.Session() as sess:
    print(sess.run([tar_ind], feed_dict={tar: np.asarray([[0, 0, 1, 0, 0, 0, 1, 0, 0, 0]]),
                                         pred: np.asarray([[0, 0, 1, 0, 0, 0, 1, 0, 0, 0]])}))
Now what I want to do next is generate valid indexes by adding each offset from
[-3, -2, -1, 0, 1, 2, 3]
to the elements of tar_ind, and then compare those indexes with pred_unpacked.
My naive loss would be 1 - (NUM_MATCHED/TOTAL)
But the problem is that tar_ind is a variable-sized tensor, and I cannot loop over it for the next operation.
Update-1.
As suggested by @user36624, I tried the alternative approach of using tf.py_func, which gives the updated y_pred, and then I used the updated values for the binary cross-entropy.
As I have implemented the function using py_func, it gives me the error: ValueError: An operation has `None` for the gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
Also, he suggested that I need to manually stop gradients, which I don't know how to do.
def specificity_loss_wrapper():
    def specificity_loss(y_true, y_pred):
        y_pred = tf.py_func(best_match, [y_true, y_pred], (tf.float32))
        y_pred = tf.stop_gradient(y_pred)
        y_pred.set_shape(y_true.get_shape())
        return K.binary_crossentropy(y_true, y_pred)
    return specificity_loss

spec_loss = specificity_loss_wrapper()
and
...
model.compile(loss=spec_loss, optimizer='adam', metrics=['accuracy'])
...
In my understanding, binary_crossentropy should be differentiable.
Thanks
What you are suggesting is to compute
1. offsets = compute_index_offsets( y_true, y_pred )
2. loss = 1 - num(offsets <= 3)/total
I suggest to solve it in an alternative way.
1. y_true_mod = DP_best_match( y_true, y_pred )
2. loss = 1 - num(y_true_mod==y_pred)/total
The advantage of modifying y_true is that it is equivalent to providing a new target value, so it is not part of the model graph being optimized and no gradients need to flow through it.
What DP_best_match( y_true, y_pred ) should do is to modify y_true according to y_pred,
e.g. given
y_true[1*300] - [...0 0 0 1 0 0 0 0 0 1 0 0 0...]
y_pred[1*300] - [...0 0 1 0 0 0 0 0 0 0 1 0 0...]
then DP_best_match( y_true, y_pred ) should give the new target
y_true_mod[1*300] - [...0 0 1 0 0 0 0 0 0 0 1 0 0...]
Note that DP_best_match( y_true, y_pred ) aims to modify y_true to best match y_pred, so it is deterministic and there is nothing in it to optimize. Thus, no backpropagation through it is needed. This means you need to manually stop gradients if you implement DP_best_match( y_true, y_pred ) in tf. Otherwise, you can implement it in numpy and wrap the function via tf.py_func, which might be easier.
Final remark, you should make sure the proposed loss function makes sense. For me, it makes more sense to use binary_crossentropy or mse after finding the best y_true_mod.
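To make the idea concrete, here is a rough numpy sketch; the helper below is only illustrative (it assumes 0/1 target vectors and the +/- 3 tolerance from the question) and is wrapped with tf.py_func as suggested, so only the target is modified and gradients still flow through y_pred:

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

def DP_best_match(y_true, y_pred):
    # For every 1 in y_true, look for a predicted 1 within +/- 3 positions and,
    # if one is found, move the target 1 to that position.
    y_true_mod = y_true.copy()
    for row in range(y_true.shape[0]):
        for idx in np.where(y_true[row] == 1)[0]:
            lo, hi = max(0, idx - 3), min(y_true.shape[1], idx + 4)
            hits = np.where(y_pred[row, lo:hi] >= 0.5)[0]
            if hits.size:
                y_true_mod[row, idx] = 0
                y_true_mod[row, lo + hits[0]] = 1
    return y_true_mod.astype(np.float32)

def tolerant_loss(y_true, y_pred):
    # Only the target is modified, so no stop_gradient is needed on y_pred
    y_true_mod = tf.py_func(DP_best_match, [y_true, y_pred], tf.float32)
    y_true_mod.set_shape(y_true.get_shape())
    return K.binary_crossentropy(y_true_mod, y_pred)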
I have a problem that I will try to explain with an example for easier understanding.
I want to classify oranges (O) and apples (A). For technical/legacy reasons (a component in the network), each batch should have either only O or only A examples. So traditional shuffling at the example level is not possible/adequate, since I cannot afford to have a batch that includes a mixture of O and A examples. However, some kind of shuffling is desirable, as it is common practice when training deep networks.
These are the steps that I take:
I first need to convert raw data/examples into TFRecords.
I shuffle the order of the raw examples, and then I create separate TFRecords that contain either only the shuffled O examples or only the shuffled A examples. Let's call this "example-level" shuffling. This takes place offline and only once.
At this point I have "clean batches": O-batches that contain only O examples, and A-batches that contain only A examples.
I do not want to first feed the network with all the O-batches and then with all the A-batches sequentially. This would probably not help much in convergence.
Can I shuffle these batches at the "batch level", i.e. without affecting their contents?
If you use the Dataset API, it's fairly straightforward: just zip the O and A batches, then apply a random selection function with Dataset.map():
ds0 = tf.data.Dataset.from_tensor_slices([0])
ds0 = ds0.repeat()
ds0 = ds0.batch(5)

ds1 = tf.data.Dataset.from_tensor_slices([1])
ds1 = ds1.repeat()
ds1 = ds1.batch(5)

def rand_select(ds0, ds1):
    rval = tf.random_uniform([])
    return tf.cond(rval < 0.5, lambda: ds0, lambda: ds1)

dataset = tf.data.Dataset.zip((ds0, ds1)).map(lambda ds0, ds1: rand_select(ds0, ds1))
iterator = dataset.make_one_shot_iterator()
ds = iterator.get_next()

with tf.Session() as sess:
    for _ in range(5):
        print(sess.run(ds))
> [0 0 0 0 0]
[1 1 1 1 1]
[1 1 1 1 1]
[0 0 0 0 0]
[0 0 0 0 0]
I am trying to build a multi-layer loss.
I am using AlexNet as my base network, and I have 4 classes with 3 possible labels each, so I tried to build it as follows:
output_gt = tf.placeholder(tf.int32, [None,4,3], name='output')
This is not my real output, but it is its size, meaning the output layer of the AlexNet is of size [4, 3].
I want to be able to view only the output that is relevant to the class I pass as an input, so in the end, for each image, I will get a [1, 3] output that comes from the relevant part of the original output.
For example, with
batch_size = 2
labels = [0, 2]
output = [*batch_size_dim*][[0 0 0], [1 1 1], [2 2 2], [3 3 3]]
I will get
new_output = [[0 0 0], [2 2 2]]
How can I use the labels and the output to get new_output?
I tried to use a mask and failed.
Can you help me?
Please try the following:
tf.gather(output, input)
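For the example in the question, here is a minimal sketch of how this could look (assuming TF 2.x; batch_dims=1 gathers, for every example in the batch, the row selected by its label):

import tensorflow as tf

output = tf.constant([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                      [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]])  # [batch, 4, 3]
labels = tf.constant([0, 2])                                          # [batch]

new_output = tf.gather(output, labels, batch_dims=1)
print(new_output.numpy())  # [[0 0 0], [2 2 2]]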