What related GitHub issues or Stack Overflow threads have you found by searching the web for your problem?
I searched #1269 #504
Environment info
Mac OS for build and Android version 5 to run .apk demo.
If possible, provide a minimal reproducible example (We usually don't have time to read hundreds of lines of your code)
I followed the steps mentioned in #1269 and could able to run the example successfully, but the accuracy of the result is very low and often wrong. I have trained my systems on 25 different daily used products like soap, soup, noodles, etc.
Where as when i run the same example using following script it give me very high accuracy (approx. 90-95%)
import sys
import tensorflow as tf
// change this as you see fit
image_path = sys.argv[1]
// Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
// Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/tf_files/retrained_labels.txt")]
// Unpersists graph from file
with tf.gfile.FastGFile("/tf_files/retrained_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
// Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor, \
{'DecodeJpeg/contents:0': image_data})
// Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
The only difference I see here is that the model file used in the Android demo is stripped because it does not support DecodeJpeg, whereas in the above code its the actually generated unstripped model. Is there any specific reason or somewhere I am wrong here?
I also tried using optimize_for_inference
but unfortunately, it fails with following error:
[milinddeore#P028: ~/tf/tensorflow ] bazel-bin/tensorflow/python/tools/optimize_for_inference --input=/Users/milinddeore/tf_files_nm/retrained_graph.pb --output=/Users/milinddeore/tf/tensorflow/tensorflow/examples/android/assets/tf_ul_stripped_graph.pb --input_names=DecodeJpeg/content —-output_names=final_result
Traceback (most recent call last):
File "/Users/milinddeore/tf/tensorflow/bazel-bin/tensorflow/python/tools/optimize_for_inference.runfiles/org_tensorflow/tensorflow/python/tools/optimize_for_inference.py", line 141, in <module>
app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "/Users/milinddeore/tf/tensorflow/bazel-bin/tensorflow/python/tools/optimize_for_inference.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/Users/milinddeore/tf/tensorflow/bazel-bin/tensorflow/python/tools/optimize_for_inference.runfiles/org_tensorflow/tensorflow/python/tools/optimize_for_inference.py", line 90, in main
FLAGS.output_names.split(","), FLAGS.placeholder_type_enum)
File "/Users/milinddeore/tf/tensorflow/bazel-bin/tensorflow/python/tools/optimize_for_inference.runfiles/org_tensorflow/tensorflow/python/tools/optimize_for_inference_lib.py", line 91, in optimize_for_inference
placeholder_type_enum)
File "/Users/milinddeore/tf/tensorflow/bazel-bin/tensorflow/python/tools/optimize_for_inference.runfiles/org_tensorflow/tensorflow/python/tools/strip_unused_lib.py", line 71, in strip_unused
output_node_names)
File "/Users/milinddeore/tf/tensorflow/bazel-bin/tensorflow/python/tools/optimize_for_inference.runfiles/org_tensorflow/tensorflow/python/framework/graph_util_impl.py", line 141, in extract_sub_graph
assert d in name_to_node_map, "%s is not in graph" % d
AssertionError: is not in graph
I suspect that this problem is due to the android not being parse DecodeJpeg, but please correct me if i am wrong.
What other attempted solutions have you tried?
Yes, I the above script and it gives me quite high accuracy result.
Well, the reason for bad accuracy is following:
I ran this example code on Lenovo Vibe K5 mobile (This has SanpDragon 415), this wasn't compiled for hexagon DSP, even through the DSP on 415 is very old as compared to 835 (Hexagon DSP 682), in fact i am not quite sure if the Hexagon SDK will work with 415 or not, i haven't tried this though. This means the example was running on CPU to first detect motion and later classify them and hence the poor performance.
Slow FPS, will capture the images very slowly and hence moving objects will be really difficult.
So if you have bad image, there are very strong chances that the prediction will also be bad.
Camera capture and classification is taking long time, due to latency its not quite real-time.
Related
I am currently writing a script to augment a dataset for me using tf.keras (code given below). I'm pretty new to tf and data augmentation so I've been following a tutorial (https://blog.devgenius.io/data-augmentation-programming-e9a4703198be) pretty religiously. Despite this, I've been running into a lot of errors when I try to actually apply the ImageDataGenerator object to the image I'm loading. Specifically, I keep getting this error:
Exception has occurred: FileNotFoundError
[Errno 2] No such file or directory: '/home/kai/SURF22/yolov5/data/sc_google_aug/aug_0_3413.png'
File "/home/kai/SURF22/yolov5/data_augmentation", line 45, in <module>
for batch in idg.flow(aug_array,
It seems like tf can't find the image I want it to augment but I have no idea why because I load the image and input it as an array like the tutorial does. I tried inputting the absolute file path to the image instead one time but then I got a "string to float" error. Basically, I have no idea what is wrong and no one else seems to be getting this error when applying a for loop to .flow(). If anyone has advice on what could be going wrong I'd really appreciate it!
# images folder directory
folder_dir = "/home/kai/SURF22/yolov5/data/"
# initialize count
i = 0
for image in os.listdir(folder_dir + "prelim_data/sc_google_trans"):
# open the image
img = Image.open(folder_dir + "prelim_data/sc_google_trans/" + image)
# make copy of image to augment
# want to preserve original image
aug_img = img.copy()
# define an ImageDataGenerator object
idg = ImageDataGenerator(horizontal_flip=True,
vertical_flip=True,
rotation_range=360,
brightness_range=[0.2, 1.0],
shear_range=45)
# aug_img = load_img(folder_dir + "prelim_data/sc_google_trans/0.png")
# reshape image to a 4D array to be used with keras flow function
aug_array = img_to_array(aug_img)
aug_array = aug_array.reshape((1,) + aug_array.shape)
# augment image
for batch in idg.flow(aug_array,
batch_size=1,
save_to_dir='/home/kai/SURF22/yolov5/data/sc_google_aug',
save_prefix='aug',
save_format='png'):
i += 1
if i > 3:
break
I am trying to download data from Fashion MNIST, but it produces an error. Originally, it was downloading and working properly, but I had to terminate it because I had to turn off my computer. Once I opened the file up again, it gives me an error. I'm not sure what the problem is, but is it because I already downloaded some parts of the data once, and keras doesn't recognize that? I am using Jupyter notebook in a conda environment
Here is the link to the image:
https://i.stack.imgur.com/wLGDm.png
You have missed adding tf. to the line
fashion_mnist = keras.datasets.fashion_mnist
The below code works perfectly for me. Importing the fashion_mnist dataset has been outlined in tensorflow documention here.
Change your code to:
import tensorflow as tf
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
or, use the better way to do it below. This avoids creating an extra variable fashion_mnist:
import tensorflow as tf
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()
I am using tensorflow 1.9.0, keras 2.2.2 and python 3.6.6 on Windows 10 x64 OS.
I know my pc well, I can't download anything larger than 2.7 MB (in terminal), due to WinError 8.
So I manually downloaded all packs from storage.google (since some packs are 25 MB).
Check the packs:
then I paste all packs to \datasets\fashion-mnist
The next time u run your code, it should be fixed.
Note : If u have VScode then just CTRL and click the link, then you can download it easily.
I had an error regarding the cURL connection, and by looking into the error message I was able to track the file where the URL was declared. In my case it was:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow_core/python/keras/datasets/fashion_mnist.py
At line 44 I have commented out the line:
# base = 'https://storage.googleapis.com/tensorflow/tf-keras-datasets/'
And declared a different base URL, which I had found looking into the documentation of the original dataset:
base = 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/'
The download started immediately and gave no errors. Hope this helps.
This is because for some reason you have an incomplete download for the MNIST dataset.
You will have to manually delete the downloaded folder which usually resides in ~/.keras/datasets or any path specified by you relative to this path, in your case MNIST_data.
Go to : C:\Users\Username.keras\datasets
and then Delete the Dataset that you want to redownload or has the error
You should be good to go!
You can also manually add print for the path from which it is taking dataset ..
Ex: print(paths) in file fashion_mnist.py
with gzip.open(paths[3], 'rb') as imgpath:
print(paths) #debug print in fashion_mnist.py
x_test = np.frombuffer(
imgpath.read(), np.uint8, offset=16).reshape(len(y_test), 28, 28)
& from this path, remove the files & this will start to download fresh data ..
Change The base address with 'http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/' as described previously. It works for me.
I was getting error of Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Traceback (most recent call last):
File "C:\Users\AsadA\AppData\Local\Programs\Python\Python38\lib\site-packages\numpy\lib\npyio.py", line 448, in load
return pickle.load(fid, **pickle_kwargs)
EOFError: Ran out of input
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\AsadA\AppData\Local\Programs\Python\Python38\lib\site-packages\numpy\lib\npyio.py", line 450, in load
raise IOError(
OSError: Failed to interpret file 'C:\\Users\\AsadA\\.keras\\datasets\\mnist.npz' as a pickle"**
GO TO FILE C:\Users\AsadA\AppData\Local\Programs\Python\Python38\Lib\site-packages\tensorflow\python\keras\datasets (In my Case) and follow the instructions:
I've been trying to adapt the reddit_tft example from the cloud-ml github samples repo to my needs.
I've been able to get it running as per the tutorial readme.
However what i want to use it for is a binary classification problem and also output keys in batch prediction.
So i have made copy of the tutorial code here and have changed it in a few places to be able to have a model type of deep_classifier that would use a DNNClasifier instead of a DNNRegressor.
I've changed the score variable to be
if(score>0,1,0) as score
It's training fine, deploys to cloud ml but i'm not sure how to now get keys back from my predictions. `
I've updated the sql pulling from BigQuery to include id as example_id here
It seems the code from the tutorial had some sort of placeholder for example_id so i'm trying to leverage that.
It all seems to work but when i get batch predictions all i get is json like this:
{"classes": ["0", "1"], "scores": [0.20427155494689941, 0.7957285046577454]}
{"classes": ["0", "1"], "scores": [0.14911963045597076, 0.8508803248405457]}
...
So example_id does not seem to be making it into the serving functions like i need.
I've tried to follow the approach here which is based on adapting the census example for keys.
I just cant figure out how to finish adapting this reddit example to also output keys in the predictions as they look a bit different to me in terms of design and functions being used.
Update 1
My latest attempt is here Trying to use the approach outlined here.
However this is giving errors:
NotFoundError (see above for traceback): /tmp/tmp2jllvb/model.ckpt-1_temp_9530d2c5823d4462be53fa5415e429fd; No such file or directory
[[Node: save/SaveV2 = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64], _device="/job:ps/replica:0/task:0/device:CPU:0"](save/ShardedFilename, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, dnn/hiddenlayer_0/kernel/part_2/read, dnn/dnn/hiddenlayer_0/kernel/part_2/Adagrad/read, dnn/hiddenlayer_1/kernel/part_2/read, dnn/dnn/hiddenlayer_1/kernel/part_2/Adagrad/read, dnn/input_from_feature_columns/input_layer/subreddit_id_embedding/weights/part_0/read, dnn/dnn/input_from_feature_columns/input_layer/subreddit_id_embedding/weights/part_0/Adagrad/read, dnn/logits/bias/part_0/read, dnn/dnn/logits/bias/part_0/Adagrad/read, global_step)]]
Update 2
My latest attempt and details are here.
I'm now getting a error from tensorflow-fransform (run_preprocess.sh works fine in tft 0.1)
File "/usr/local/lib/python2.7/dist-packages/tensorflow_transform/tf_metadata/dataset_schema.py", line 282, in __setstate__
self._dtype = tf.as_dtype(state['dtype'])
TypeError: string indices must be integers, not str
Update 3
I have changed things to just use beam + csv and avoid tft. Also i'm now using the approach as outlined here for extending the canned estimator to get the key back with the predictions.
However when following this post to try get the comments in as features i'm now running into a new error.
The replica worker 3 exited with a non-zero status of 1. Termination reason: Error. Traceback (most recent call last): [...] File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/estimator/python/estimator/extenders.py", line 87, in new_model_fn spec = estimator.model_fn(features, labels, mode, config) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 203, in public_model_fn return self._call_model_fn(features, labels, mode, config) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn model_fn_results = self._model_fn(features=features, **kwargs) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/canned/dnn_linear_combined.py", line 520, in _model_fn config=config) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/canned/dnn_linear_combined.py", line 158, in _dnn_linear_combined_model_fn dnn_logits = dnn_logit_fn(features=features, mode=mode) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 89, in dnn_logit_fn features=features, feature_columns=feature_columns) File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column.py", line 226, in input_layer with variable_scope.variable_scope(None, default_name=column.name): File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 1826, in __enter__ current_name_scope_name = self._current_name_scope.__enter__() File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 4932, in __enter__ return self._name_scope.__enter__() File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__ return self.gen.next() File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3514, in name_scope raise ValueError("'%s' is not a valid scope name" % name) ValueError: 'Tensor("Slice:0", shape=(?, 20), dtype=int64)_embedding' is not a valid scope name
My repo for this attempt/approach is here. This all runs fine if i just use subreddit as a feature, it's adding in the comment feature that seems to be causing the problems. Lines 103 to 111 is where i have followed this approach.
Not sure what's triggering the error in my code from reading the trace. Anyone any ideas?
Or can anyone point me towards another approach to go from text to bow to embedding feature in TF?
See:
https://medium.com/#lakshmanok/how-to-extend-a-canned-tensorflow-estimator-to-add-more-evaluation-metrics-and-to-pass-through-ddf66cd3047d
Here's what the code looks like to pass through keys:
def forward_key_to_export(estimator):
estimator = tf.contrib.estimator.forward_features(estimator, KEY_COLUMN)
## This shouldn't be necessary (I've filed CL/187793590 to update extenders.py with this code)
config = estimator.config
def model_fn2(features, labels, mode):
estimatorSpec = estimator._call_model_fn(features, labels, mode, config=config)
if estimatorSpec.export_outputs:
for ekey in ['predict', 'serving_default']:
estimatorSpec.export_outputs[ekey] = \
tf.estimator.export.PredictOutput(estimatorSpec.predictions)
return estimatorSpec
return tf.estimator.Estimator(model_fn=model_fn2, config=config)
##
# Create estimator to train and evaluate
def train_and_evaluate(output_dir):
estimator = tf.estimator.DNNLinearCombinedRegressor(...)
estimator = forward_key_to_export(estimator)
...
tf.estimator.train_and_evaluate(estimator, ...)
We have plans, but haven't moved the changes into Census yet for the output keys. In the mean time can you please see if this gist helps https://gist.github.com/andrewm4894/ebd3ac3c87e2ab4af8a10740e85073bb#file-with_keys_model-py
Please feel free to send a PR if you get to it sooner and we will merge your contribution.
I was training a neural network and had run over all the training data for several epochs successfully.
However, the tfrecord corrputed error suddenly came out as follows:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/lib/io/tf_record.py", line 77, in tf_record_iterator
reader.GetNext(status)
File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 106241330
I checked the data file again and it was indeed corrupted at that line. But the data was intact before I ran the training code and I simply just read the data by following code:
batch_data = []
record_iterator = tf.python_io.tf_record_iterator(path=file, options=options)
for string_record in record_iterator:
example = tf.train.Example()
example.ParseFromString(string_record)
data = generate_data_from_record(example) # record parsing code
batch_data.append(data)
if len(batch_data) == batch_size:
yield batch_data
batch_data = []
I am wondering why the data file was corrupted and how can I remain the integrity of the data file.
You should make a clean copy of your tfrecord files. Whenever your working copy get corrupted, replace from the clean copy. The dataLoss error seems to be as a result of several reading of the same record, and its also dependent on the disk.
If someone is facing this problem, the above answer by #nwoye-cid worked for me plus the link below to install everything properly.
Also, restart your kernel from scratch if nothing works then only go for other solutions.
Link!
I'm trying to run https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/text_classification_character_cnn.py for learning, but I get an error message:
File "C:\Users\natlun\AppData\Local\Continuum\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\base.py", line 72, in load_csv_without_header
data = np.array(data)
MemoryError
I use CPU installation of TensorFlow and Python 3.5. Any ideas how to solve the problem?? Other scripts using a csv-file for input work fine.
I was having the same issue. And after many hours of reading and googling (and seeing your unanswered question), and just comparing the example with other examples that do run, I noticed that
dbpedia = tf.contrib.learn.datasets.load_dataset(
'dbpedia', test_with_fake_data=FLAGS.test_with_fake_data, size='large')
should just be
dbpedia = tf.contrib.learn.datasets.load_dataset(
'dbpedia', test_with_fake_data=FLAGS.test_with_fake_data)
Based off of what I've read about numpy, I'd bet the "size='large'" parameter causes an over allocation to a numpy array (which throws the memory error).
Or, when you don't set that parameter perhaps the input data is truncated.
Or some other thing. Anyway, I hope this helps others attempting to run this useful example!
--- Update ---
Without "size='large'" the load_dataset functions appears to create smaller training and test data sets (like 1/1000 the size).
After playing around with the example I realized I could manually load and use the whole data set without getting the memory error (assume it is saving the whole data set as it appears).
# Prepare training and testing data
##This was the provided method for setting up the data.
# dbpedia = tf.contrib.learn.datasets.load_dataset(
# 'dbpedia', test_with_fake_data=FLAGS.test_with_fake_data)
# x_trainz = pandas.DataFrame(dbpedia.train.data)[1]
# y_trainz = pandas.Series(dbpedia.train.target)
# x_testz = pandas.DataFrame(dbpedia.test.data)[1]
# y_testz = pandas.Series(dbpedia.test.target)
##And this is my replacement.
x_train = []
y_train = []
x_test = []
y_test = []
with open("dbpedia_data/dbpedia_csv/train.csv", encoding='utf-8') as filex:
reader = csv.reader(filex)
for row in reader:
x_train.append(row[2])
y_train.append(int(row[0]))
with open("dbpedia_data/dbpedia_csv/test.csv", encoding='utf-8') as filex:
reader = csv.reader(filex)
for row in reader:
x_test.append(row[2])
y_test.append(int(row[0]))
x_train = pandas.Series(x_train)
y_train = pandas.Series(y_train)
x_test = pandas.Series(x_test)
y_test = pandas.Series(y_test)
The example seems to now be evaluating the whole training data set. But, the original code will probably need to be run once to get/put the data in the correct sub-folders. Also, even while evaluating the whole data set little memory is used (just a few hundred MB). Which, makes me think that the load_dataset function is broken in some way.