parse_single_sequence_example says number of values != expected - tensorflow

I'm having a problem deserializing a SequenceExample, using tensorflow 2.4.1. I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Name: <unknown>, Key: Names, Index: 0. Number of values != expected. values size: 2 but output shape: [] [Op:ParseSequenceExampleV2]
When I run this code:
import tensorflow as tf
feature_list = tf.train.FeatureList(feature=[
tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'foo', b'bar']))
])
feature_lists = tf.train.FeatureLists(feature_list={ 'Names': feature_list })
sequence_example = tf.train.SequenceExample(context=None, feature_lists=feature_lists)
sequence_features = {
'Names': tf.io.FixedLenSequenceFeature([], dtype=tf.string),
}
_, sequence_data = tf.io.parse_single_sequence_example(
serialized=sequence_example.SerializeToString(),
context_features=None,
sequence_features=sequence_features)
What am I doing wrong?

Related

Output probability of prediction in tensorflow.js

I have a model.json generated from tensorflow via tensorflow.js coverter
In the original implementation of model in tensorflow in python, it is built like this:
model = models.Sequential([
base_model,
layers.Dropout(0.2),
layers.Flatten(),
layers.Dense(128, activation='relu'),
layers.Dense(num_classes)
])
In tensorflow, the probability can be generated by score = tf.nn.softmax(predictions[0]), according to the tutorial on official website.
How do I get this probability in tensorflow.js?
I have copied the codes template as below:
$("#predict-button").click(async function () {
if (!modelLoaded) { alert("The model must be loaded first"); return; }
if (!imageLoaded) { alert("Please select an image first"); return; }
let image = $('#selected-image').get(0);
// Pre-process the image
console.log( "Loading image..." );
let tensor = tf.browser.fromPixels(image, 3)
.resizeNearestNeighbor([224, 224]) // change the image size
.expandDims()
.toFloat()
// RGB -> BGR
let predictions = await model.predict(tensor).data();
console.log(predictions);
let top5 = Array.from(predictions)
.map(function (p, i) { // this is Array.map
return {
probability: p,
className: TARGET_CLASSES[i] // we are selecting the value from the obj
};
}).sort(function (a, b) {
return b.probability - a.probability;
}).slice(0, 2);
console.log(top5);
$("#prediction-list").empty();
top5.forEach(function (p) {
$("#prediction-list").append(`<li>${p.className}: ${p.probability.toFixed(6)}</li>`);
});
How should I modify the above code?
The output is just the same as the value of variable 'predictions':
Float32Array(5)
0: -2.5525975227355957
1: 7.398464679718018
2: -3.252196788787842
3: 4.710395812988281
4: -4.636396408081055
buffer: (...)
byteLength: (...)
byteOffset: (...)
length: (...)
Symbol(Symbol.toStringTag): (...)
__proto__: TypedArray
0: {probability: 7.398464679718018, className: "Sunflower"}
1: {probability: 4.710395812988281, className: "Rose"}
length: 2
__proto__: Array(0)
Please help!!!
Thanks!
In order to extract the probabilities from the logits of the model using a softmax function you can do the following:
This is the array of logits that are also the predictions you get from the model
const logits = [-2.5525975227355957, 7.398464679718018, -3.252196788787842, 4.710395812988281, -4.636396408081055]
You can call tf.softmax() on the array of values
const probabilities = tf.softmax(logits)
Result:
[0.0000446, 0.9362511, 0.0000222, 0.0636765, 0.0000056]
Then if you wanted to get the index with the highest probability you can make use of tf.argMax():
const results = tf.argMax(probabilities).dataSync()[0]
Result:
1
Edit
I am not too familiar with jQuery so this might not be correct. But here is how I would get the probabilities of the outputs in descending order:
let probabilities = tf.softmax(predictions).dataSync();
$("#prediction-list").empty();
probabilities.forEach(function(p, i) {
$("#prediction-list").append(
`<li>${TARGET_CLASSES[i]}: ${p.toFixed(6)}</li>`
);
});

How can i use 38 classes instead of 1000 in model.predict decode predictions

i am founding an error in plant disease detection using resnet50 deep learning model every time it raises an error message in decode_predictions
error
expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 38)"
enter code here
model = ResNet50(weights='imagenet',include_top=False,classes=38)
try:
model = load_model('/content/drive/My
Drive/color/checkpoints/ResNet50_model_weights.h5')
print("model loaded")
except:
print("model not loaded")
img_path = '/content/drive/My Drive/color/test/0/appleblackrot188.jpg'
img = image.load_img(img_path, target_size=(300, 300))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted:', decode_predictions(preds,top=3)[0])
You can try using the preprocesing function:
import tensorflow as tf
# Using the keras wrapper on tensorflow (it must be the same using just keras).
IMAGE = [] # From image source, i did it from the camera.
toPred = tf.keras.applications.resnet50.preprocess_input(np.expand_dims(tf.keras.preprocessing.image.img_to_array(IMAGE), axis=0))
Maybe that can help :)
decode_predictions works only for ImageNet (no. of classes = 1000). For these 38 classes of plants, you have to write your own decode predictions based on the ground truth label you've assigned for each plant.
First, you need an index JSON file and create a new decode_predictions function.
For example
This HAM10000 that has 7 classes and you need to split to each folder like this
then make an index JSON file like this
{
"0" : [
"AKIEC",
"akiec"
],
"1" : [
"BCC",
"bcc"
],
"2" : [
"BKL",
"bkl"
],
"3" : [
"DF",
"df"
],
"4" : [
"MEL",
"mel"
],
"5" : [
"NV",
"nv"
],
"6" : [
"VASC",
"vasc"
]
}
Then try this code
def decode_predictions(preds, top=4, class_list_path='/content/SKIN_DATA/HAM10000_index.json'):
if len(preds.shape) != 2 or preds.shape[1] != 7: # your classes number
raise ValueError('`decode_predictions` expects '
'a batch of predictions '
'(i.e. a 2D array of shape (samples, 1000)). '
'Found array with shape: ' + str(preds.shape))
index_list = json.load(open(class_list_path))
results = []
for pred in preds:
top_indices = pred.argsort()[-top:][::-1]
result = [tuple(index_list[str(i)]) + (pred[i],) for i in top_indices]
result.sort(key=lambda x: x[2], reverse=True)
results.append(result)
return results

Error Processing Input for GCP ML Engine Prediction

I have a TensorFlow model on GCP ML Engine, however I have a problem with the JSON string below:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
import json
credentials = GoogleCredentials.get_application_default()
api = discovery.build('ml', 'v1', credentials=credentials,
discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json')
request_data = {'instances':
[{
'inputs':{
'clump_thickness': 2,
'size_uniformity': 1,
'shape_uniformity': 1,
'marginal_adhesion': 1,
'epithelial_size': 2,
'bland_chromatin': 1,
'bare_nucleoli': 2,
'normal_nucleoli': 1,
'mitoses': 1
}
}]
}
parent = 'projects/%s/models/%s/versions/%s' % (PROJECT,
'breastCancer_optimized_06152018_2_2_a', 'v1')
response = api.projects().predict(body=request_data, name=parent).execute()
print(response)
I get the following error:
{'error': "Prediction failed: Error processing input: Expected string, got {u'epithelial_size': 2, u'marginal_adhesion': 1, u'clump_thickness': 2, u'size_uniformity': 1, u'shape_uniformity': 1, u'normal_nucleoli': 1, u'mitoses': 1, u'bland_chromatin': 1, u'bare_nucleoli': 2} of type 'dict' instead."}
I can't seem to format request_data properly. Does anyone see what is wrong?
original serving function:
clump_thickness = tf.feature_column.numeric_column("clump_thickness");
size_uniformity = tf.feature_column.numeric_column("size_uniformity");
shape_uniformity = tf.feature_column.numeric_column("shape_uniformity");
marginal_adhesion = tf.feature_column.numeric_column("marginal_adhesion");
epithelial_size = tf.feature_column.numeric_column("epithelial_size");
bare_nucleoli = tf.feature_column.numeric_column("bare_nucleoli");
bland_chromatin = tf.feature_column.numeric_column("bland_chromatin");
normal_nucleoli = tf.feature_column.numeric_column("normal_nucleoli");
mitoses = tf.feature_column.numeric_column("mitoses");
feature_columns = [clump_thickness, size_uniformity, shape_uniformity, marginal_adhesion, epithelial_size,
bare_nucleoli, bland_chromatin, normal_nucleoli, mitoses];
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns);
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec);
estimator.export_savedmodel(output_dir, export_input_fn, as_text=False)
Then I tried:
def serving_input_fn():
feature_placeholders = {
'clump_thickness' : tf.placeholder(tf.float32, [None]),
'size_uniformity' : tf.placeholder(tf.float32, [None]),
'shape_uniformity' : tf.placeholder(tf.float32, [None]),
'marginal_adhesion' : tf.placeholder(tf.float32, [None]),
'epithelial_size' : tf.placeholder(tf.float32, [None]),
'bare_nucleoli' : tf.placeholder(tf.float32, [None]),
'bland_chromatin' : tf.placeholder(tf.float32, [None]),
'normal_nucleoli' : tf.placeholder(tf.float32, [None]),
'mitoses' : tf.placeholder(tf.float32, [None]),
}
features = feature_placeholders # no transformation needed
return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)
And in the train_and_eval function:
estimator.export_savedmodel(output_dir, serving_input_fn, as_text=False)
But now I get the following error:
{'error': "Prediction failed: Expected tensor name: inputs, got tensor name: [u'epithelial_size', u'marginal_adhesion', u'clump_thickness', u'size_uniformity', u'shape_uniformity', u'normal_nucleoli', u'mitoses', u'bland_chromatin', u'bare_nucleoli']."}
The estimator.export_savedmodel appears to create a model which requires a tensor input(in the request_data line).
When I use the model created with either serving function the following works fine:
predict_fn = tf.contrib.predictor.from_saved_model("gs://test-
203900/breastCancer_optimized_06182018/9/1529432417")
# Test inputs represented by Pandas DataFrame.
inputs = pd.DataFrame({
'clump_thickness': [2,5,4],
'size_uniformity': [1,10,8],
'shape_uniformity': [1,10,6],
'marginal_adhesion': [1,3,4],
'epithelial_size': [2,7,3],
'bland_chromatin': [1,3,4],
'bare_nucleoli': [2,8,10],
'normal_nucleoli': [1,10,6],
'mitoses': [1,2,1],
})
# Convert input data into serialized Example strings.
examples = []
for index, row in inputs.iterrows():
feature = {}
for col, value in row.iteritems():
feature[col] =
tf.train.Feature(float_list=tf.train.FloatList(value=[value]))
example = tf.train.Example(
features=tf.train.Features(
feature=feature
)
)
examples.append(example.SerializeToString())
# Make predictions.
predictions = predict_fn({'inputs': examples})
It depends on what your serving input function is. It appears from the error message that 'inputs' needs to be a string (maybe comma-separated?)
Try this:
saved_model_cli show --dir $MODEL_LOCATION --tag_set serve --signature_def serving_default
It will tell you what your serving input function is set to.
I suspect that what you want is for your serving input function to be:
def serving_input_fn():
feature_placeholders = {
'size_uniformity' : tf.placeholder(tf.float32, [None]),
'shape_uniformity' : tf.placeholder(tf.float32, [None])
}
features = feature_placeholders # no transformation needed
return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)
and for your input format to be:
request_data = {'instances':
[
{
'size_uniformity': 1,
'shape_uniformity': 1
}
]
}

AttributeError: 'list' object has no attribute 'model_dir'

I'm running a wide_deep.py script for linear regression in tensorflow.
I have cloned the models directory also as a part of process.
But i'm getting a error like AttributeError: 'list' object has no attribute 'model_dir'.
If I hard code this particular variable, I m getting other errors as AttributeError: 'list' object has no attribute 'data_dir' and so on .
Code:
"""Example code for TensorFlow Wide & Deep Tutorial using tf.estimator API."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import shutil
from absl import app as absl_app
from absl import flags
import tensorflow as tf # pylint: disable=g-bad-import-order
from official.utils.flags import core as flags_core
from official.utils.logs import hooks_helper
from official.utils.misc import model_helpers
_CSV_COLUMNS = [
'age', 'workclass', 'fnlwgt', 'education', 'education_num',
'marital_status', 'occupation', 'relationship', 'race', 'gender',
'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
'income_bracket'
]
_CSV_COLUMN_DEFAULTS = [[0], [''], [0], [''], [0], [''], [''], [''], [''], [''],
[0], [0], [0], [''], ['']]
_NUM_EXAMPLES = {
'train': 32561,
'validation': 16281,
}
LOSS_PREFIX = {'wide': 'linear/', 'deep': 'dnn/'}
def define_wide_deep_flags():
"""Add supervised learning flags, as well as wide-deep model type."""
flags_core.define_base()
flags.adopt_module_key_flags(flags_core)
flags.DEFINE_enum(
name="model_type", short_name="mt", default="wide_deep",
enum_values=['wide', 'deep', 'wide_deep'],
help="Select model topology.")
flags_core.set_defaults(data_dir='/tmp/census_data',
model_dir='/tmp/census_model',
train_epochs=40,
epochs_between_evals=2,
batch_size=40)
def build_model_columns():
"""Builds a set of wide and deep feature columns."""
# Continuous columns
age = tf.feature_column.numeric_column('age')
education_num = tf.feature_column.numeric_column('education_num')
capital_gain = tf.feature_column.numeric_column('capital_gain')
capital_loss = tf.feature_column.numeric_column('capital_loss')
hours_per_week = tf.feature_column.numeric_column('hours_per_week')
education = tf.feature_column.categorical_column_with_vocabulary_list(
'education', [
'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
'5th-6th', '10th', '1st-4th', 'Preschool', '12th'])
marital_status = tf.feature_column.categorical_column_with_vocabulary_list(
'marital_status', [
'Married-civ-spouse', 'Divorced', 'Married-spouse-absent',
'Never-married', 'Separated', 'Married-AF-spouse', 'Widowed'])
relationship = tf.feature_column.categorical_column_with_vocabulary_list(
'relationship', [
'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried',
'Other-relative'])
workclass = tf.feature_column.categorical_column_with_vocabulary_list(
'workclass', [
'Self-emp-not-inc', 'Private', 'State-gov', 'Federal-gov',
'Local-gov', '?', 'Self-emp-inc', 'Without-pay', 'Never-worked'])
# To show an example of hashing:
occupation = tf.feature_column.categorical_column_with_hash_bucket(
'occupation', hash_bucket_size=1000)
# Transformations.
age_buckets = tf.feature_column.bucketized_column(
age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
# Wide columns and deep columns.
base_columns = [
education, marital_status, relationship, workclass, occupation,
age_buckets,
]
crossed_columns = [
tf.feature_column.crossed_column(
['education', 'occupation'], hash_bucket_size=1000),
tf.feature_column.crossed_column(
[age_buckets, 'education', 'occupation'], hash_bucket_size=1000),
]
wide_columns = base_columns + crossed_columns
deep_columns = [
age,
education_num,
capital_gain,
capital_loss,
hours_per_week,
tf.feature_column.indicator_column(workclass),
tf.feature_column.indicator_column(education),
tf.feature_column.indicator_column(marital_status),
tf.feature_column.indicator_column(relationship),
# To show an example of embedding
tf.feature_column.embedding_column(occupation, dimension=8),
]
return wide_columns, deep_columns
def build_estimator(model_dir, model_type):
"""Build an estimator appropriate for the given model type."""
wide_columns, deep_columns = build_model_columns()
hidden_units = [100, 75, 50, 25]
# Create a tf.estimator.RunConfig to ensure the model is run on CPU, which
# trains faster than GPU for this model.
run_config = tf.estimator.RunConfig().replace(
session_config=tf.ConfigProto(device_count={'GPU': 0}))
if model_type == 'wide':
return tf.estimator.LinearClassifier(
model_dir=model_dir,
feature_columns=wide_columns,
config=run_config)
elif model_type == 'deep':
return tf.estimator.DNNClassifier(
model_dir=model_dir,
feature_columns=deep_columns,
hidden_units=hidden_units,
config=run_config)
else:
return tf.estimator.DNNLinearCombinedClassifier(
model_dir=model_dir,
linear_feature_columns=wide_columns,
dnn_feature_columns=deep_columns,
dnn_hidden_units=hidden_units,
config=run_config)
def input_fn(data_file, num_epochs, shuffle, batch_size):
"""Generate an input function for the Estimator."""
assert tf.gfile.Exists(data_file), (
'%s not found. Please make sure you have run data_download.py and '
'set the --data_dir argument to the correct path.' % data_file)
def parse_csv(value):
print('Parsing', data_file)
columns = tf.decode_csv(value, record_defaults=_CSV_COLUMN_DEFAULTS)
features = dict(zip(_CSV_COLUMNS, columns))
labels = features.pop('income_bracket')
return features, tf.equal(labels, '>50K')
# Extract lines from input files using the Dataset API.
dataset = tf.data.TextLineDataset(data_file)
if shuffle:
dataset = dataset.shuffle(buffer_size=_NUM_EXAMPLES['train'])
dataset = dataset.map(parse_csv, num_parallel_calls=5)
# We call repeat after shuffling, rather than before, to prevent separate
# epochs from blending together.
dataset = dataset.repeat(num_epochs)
dataset = dataset.batch(batch_size)
return dataset
def export_model(model, model_type, export_dir):
"""Export to SavedModel format.
Args:
model: Estimator object
model_type: string indicating model type. "wide", "deep" or "wide_deep"
export_dir: directory to export the model.
"""
wide_columns, deep_columns = build_model_columns()
if model_type == 'wide':
columns = wide_columns
elif model_type == 'deep':
columns = deep_columns
else:
columns = wide_columns + deep_columns
feature_spec = tf.feature_column.make_parse_example_spec(columns)
example_input_fn = (
tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec))
model.export_savedmodel(export_dir, example_input_fn)
def run_wide_deep(flags_obj):
"""Run Wide-Deep training and eval loop.
Args:
flags_obj: An object containing parsed flag values.
"""
# Clean up the model directory if present
shutil.rmtree(flags_obj.model_dir, ignore_errors=True)
model = build_estimator(flags_obj.model_dir, flags_obj.model_type)
train_file = os.path.join(flags_obj.data_dir, 'adult.data')
test_file = os.path.join(flags_obj.data_dir, 'adult.test')
# Train and evaluate the model every `flags.epochs_between_evals` epochs.
def train_input_fn():
return input_fn(
train_file, flags_obj.epochs_between_evals, True, flags_obj.batch_size)
def eval_input_fn():
return input_fn(test_file, 1, False, flags_obj.batch_size)
loss_prefix = LOSS_PREFIX.get(flags_obj.model_type, '')
train_hooks = hooks_helper.get_train_hooks(
flags_obj.hooks, batch_size=flags_obj.batch_size,
tensors_to_log={'average_loss': loss_prefix + 'head/truediv',
'loss': loss_prefix + 'head/weighted_loss/Sum'})
# Train and evaluate the model every `flags.epochs_between_evals` epochs.
for n in range(flags_obj.train_epochs // flags_obj.epochs_between_evals):
model.train(input_fn=train_input_fn, hooks=train_hooks)
results = model.evaluate(input_fn=eval_input_fn)
# Display evaluation metrics
print('Results at epoch', (n + 1) * flags_obj.epochs_between_evals)
print('-' * 60)
for key in sorted(results):
print('%s: %s' % (key, results[key]))
if model_helpers.past_stop_threshold(
flags_obj.stop_threshold, results['accuracy']):
break
# Export the model
if flags_obj.export_dir is not None:
export_model(model, flags_obj.model_type, flags_obj.export_dir)
def main(_):
run_wide_deep(flags.FLAGS)
if __name__ == '__main__':
tf.logging.set_verbosity(tf.logging.INFO)
define_wide_deep_flags()
absl_app.run(main)
Hunter, I tried to run without hardcoding but still faced issues with attributes , so I tried to hard code to avoid this .
But, The issue is resolved now.
I cloned the directory again and instead of copying the wide_deep.py to another directory and run from there(which I was doing before), I ran directly from the same directory and now it is working fine.

Tensorflow map_fn gives error: ValueError: No attr named '_XlaCompile'

I try to implement the 'batch hard' batches as described in https://arxiv.org/pdf/1703.07737.pdf to use with a triplet loss. So input is of shape [batch_size, 32] and output should be a list representing triplets, so [[batch_size, 32], [batch_size, 32], [batch_size, 32]] when each individual example is of size (32,).
I implemented this with the following function, so basically using tf.map_fn:
def batch_hard(inputs):
"""
Batch Hard triplets as described in https://arxiv.org/pdf/1703.07737.pdf.
For each sample in input the hardest positive and hardest negative
in the given batch will be selected. A triplet is returned.
"""
class_ids, f_anchor = inputs[0], inputs[1]
def body(x):
class_id, f = x[0], x[1]
same_class = tf.equal(class_ids, class_id)
positive = same_class
negative = tf.logical_not(same_class)
positive = tf.squeeze(positive)
negative = tf.squeeze(negative)
positive.set_shape([None])
negative.set_shape([None])
samples_pos = tf.boolean_mask(f_anchor, positive)
samples_neg = tf.boolean_mask(f_anchor, negative)
# Select hardest positive example
distances = euclidean_distance(samples_pos, f)
hardest_pos = samples_pos[tf.argmax(distances)]
# Select hardest negative example
distances = euclidean_distance(samples_neg, f)
hardest_neg = samples_neg[tf.argmin(distances)]
return [hardest_pos, hardest_neg]
[f_pos, f_neg] = tf.map_fn(body, inputs, dtype=[tf.float32, tf.float32])
return [f_anchor, f_pos, f_neg]
This works perfectly when I only perform a forward pass, with no train_op specified . However when I add this line train_op = optimizer.minimize(loss, global_step=global_step) the following error occurs:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py", line 348, in _MaybeCompile
xla_compile = op.get_attr("_XlaCompile")
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2003, in get_attr
raise ValueError("No attr named '" + name + "' in " + str(self._node_def))
ValueError: No attr named '_XlaCompile' in name: "map/while/strided_slice"
op: "StridedSlice"
input: "map/while/boolean_mask/Gather"
input: "map/while/strided_slice/stack"
input: "map/while/strided_slice/stack_1"
input: "map/while/strided_slice/Cast"
attr {
key: "Index"
value {
type: DT_INT64
}
}
attr {
key: "T"
value {
type: DT_FLOAT
}
}
attr {
key: "begin_mask"
value {
i: 0
}
}
attr {
key: "ellipsis_mask"
value {
i: 0
}
}
attr {
key: "end_mask"
value {
i: 0
}
}
attr {
key: "new_axis_mask"
value {
i: 0
}
}
attr {
key: "shrink_axis_mask"
value {
i: 1
}
}
Does anyone has an idea what goes wrong?
A full example of this issue is here https://gist.github.com/anonymous/0b5e9194ebf09be7ad2f0a740bf369b8
Edit: It seems the problems is in these lines
hardest_pos = samples_pos[tf.argmax(distances)]
replacing it with something like
hardest_pos = tf.zeros(32)
gives no errors, however how to solve this?