AttributeError: 'Tensor' object has no attribute 'numpy' while mapping a function through my dataset - numpy

I'm trying to map a function, process_image, over a dataset. This function calls another function, get_lab (shown below), which extracts the label from the image's file name.
A file path looks like this: C:\\Users\\sis\\Desktop\\test\\0002_c1s1_000451_03.jpg. Its label is the number 0002.
def get_lab(file_path):
    parts = tf.strings.split(file_path, os.path.sep)
    part = parts[-1].numpy().decode().split('_')[0]
    label = tf.strings.to_number(part)
    return label

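I solved it! I'm not sure exactly where the error was, but I think the previous code mixed eager mode and graph mode: a function passed to Dataset.map runs in graph mode, where tensors have no .numpy() method. I rewrote the get_lab function to use only tf.strings ops, and it worked!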
def get_lab(file_path):
    parts = tf.strings.split(file_path, os.path.sep)[-1]
    part = tf.strings.split(parts, sep='_')[0]
    print(part)
    label = tf.strings.to_number(part)
    return label
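As a quick cross-check outside the TensorFlow pipeline, the label that the tf.strings version should produce for a given path can be computed with the standard library alone. This is only a sketch: expected_label is a hypothetical helper, not part of the original code, and it assumes Windows-style paths like the one in the question.

```python
import ntpath


def expected_label(file_path):
    """Return the numeric prefix of the file name, e.g. 0002_c1s1_... -> 2.0."""
    # ntpath splits on backslashes regardless of the platform running the code.
    file_name = ntpath.basename(file_path)
    prefix = file_name.split("_")[0]
    return float(prefix)


label = expected_label("C:\\Users\\sis\\Desktop\\test\\0002_c1s1_000451_03.jpg")
print(label)
```

For the example path this gives 2.0, which is the value tf.strings.to_number should return for the string 0002 with its default float32 output type.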

You can also apply this to a folder name, which may be enough when the directory layout is created by external programs.
[ Sample ]:
import os
import tensorflow as tf
def get_lab(file_path):
    parts = tf.strings.split(file_path, os.path.sep)
    part = parts[-2].numpy().decode().split('.')[0]
    label = tf.strings.to_number(part)
    return label.numpy()
directory = "F:\\datasets\\downloads\\Actors\\train\\Candidt Kibt\\01.tif\\"
print('label as number: ' + str(get_lab(directory)))
directory = "F:\\datasets\\downloads\\Actors\\train\\"
print('classname: ' + str(tf.io.gfile.listdir(directory)))
[ Output ]:
label as number: 1.0
classname: ['Candidt Kibt', 'Pikaploy']

Related

AttributeError: 'Tensor' object has no attribute 'numpy' eager execution is enabled using version 2.4.1

I've been trying to convert a generator I built to a tf.data.dataset.
I've come far and now I have something simple like this
def parse_image(filename):
    file = tf.io.read_file(filename)  # this will work only with filename as tensor
    image = tf.image.decode_image(file)
    return image
def transform_img(img):
    img = parse_image(img).numpy()
    img = transforms_train(image=img)["image"]
    return img
transform_img works as expected when I call it directly on a filename, like:
plt.imshow(transform_img(array_of_filenames[0]))
but when I map it on a dataset
dataset = tf.data.Dataset.from_tensor_slices(array_of_filenames)
dataset = dataset.map(transform_img)
I get the error in the title.
I am doing something silly again aren't I?
Thanks for helping!
You cannot call .numpy() inside a function passed to the map method of a TensorFlow dataset, because map runs that function in graph mode. Instead, you need to wrap the function in tf.py_function or tf.numpy_function. It should look like the following:
dataset = dataset.map(lambda item: tf.py_function(transform_img, [item], [tf.float32]))
The first argument of py_function is the preprocessing function you want to run, the second is the list of inputs to pass to it, and the third is the dtype of the value it returns (the same applies to tf.numpy_function).
I don't remember reading this in documentation but in a tutorial, you can find it here.

How to read parameters of layers of .tflite model in python

I was trying to read tflite model and pull all the parameters of the layers out.
My steps:
I generated flatbuffers model representation by running (please build flatc before):
flatc -python tensorflow/tensorflow/lite/schema/schema.fbs
Result is tflite/ folder that contains layer description files (*.py) and some utilitarian files.
I successfully loaded model:
# in case of ImportError: set PYTHONPATH to point to the folder where tflite/ is
from tflite.Model import Model
def read_tflite_model(file):
    buf = open(file, "rb").read()
    buf = bytearray(buf)
    model = Model.GetRootAsModel(buf, 0)
    return model
I managed to pull out some of the model and node parameters, but got stuck iterating over the nodes:
Model part:
def print_model_info(model):
    version = model.Version()
    print("Model version:", version)
    description = model.Description().decode('utf-8')
    print("Description:", description)
    subgraph_len = model.SubgraphsLength()
    print("Subgraph length:", subgraph_len)
Nodes part:
def print_nodes_info(model):
    # what does this 0 mean? should it always be zero?
    subgraph = model.Subgraphs(0)
    operators_len = subgraph.OperatorsLength()
    print('Operators length:', operators_len)
    from collections import deque
    nodes = deque(subgraph.InputsAsNumpy())
    STEP_N = 0
    MAX_STEPS = operators_len
    print("Nodes info:")
    while len(nodes) != 0 and STEP_N <= MAX_STEPS:
        print("MAX_STEPS={} STEP_N={}".format(MAX_STEPS, STEP_N))
        print("-" * 60)
        node_id = nodes.pop()
        print("Node id:", node_id)
        tensor = subgraph.Tensors(node_id)
        print("Node name:", tensor.Name().decode('utf-8'))
        print("Node shape:", tensor.ShapeAsNumpy())
        # which type is it? what does it mean?
        type_of_tensor = tensor.Type()
        print("Tensor type:", type_of_tensor)
        quantization = tensor.Quantization()
        min = quantization.MinAsNumpy()
        max = quantization.MaxAsNumpy()
        scale = quantization.ScaleAsNumpy()
        zero_point = quantization.ZeroPointAsNumpy()
        print("Quantization: ({}, {}), s={}, z={}".format(min, max, scale, zero_point))
        # I do not understand it again. what is j, that I set to 0 here?
        operator = subgraph.Operators(0)
        for i in operator.OutputsAsNumpy():
            nodes.appendleft(i)
        STEP_N += 1
    print("-"*60)
Please point me to documentation or some example of using this API.
My problems are:
I cannot find any documentation for this API.
Iterating over Tensor objects does not seem possible, since they have no Inputs or Outputs methods. I also do not understand what j means in subgraph.Operators(j=0). Because of that, my loop visits only two nodes: the input (once) and then the next one over and over again.
Iterating over Operator objects is certainly possible.
Here we iterate over all of them, but I cannot work out how to map an Operator to its Tensors.
def print_in_out_info_of_all_operators(model):
    # what does this 0 mean? should it always be zero?
    subgraph = model.Subgraphs(0)
    for i in range(subgraph.OperatorsLength()):
        operator = subgraph.Operators(i)
        print('Outputs', operator.OutputsAsNumpy())
        print('Inputs', operator.InputsAsNumpy())
I do not understand how to pull the parameters out of an Operator object. The BuiltinOptions method gives me a Table object, and I do not know what to map it to.
subgraph = model.Subgraphs(0)
What does this 0 mean? Should it always be zero? Obviously not, but what is it? Is it the id of the subgraph? If so, I'm happy. If not, please try to explain it.
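A sketch of one way to relate operators to tensors. It assumes, as the TFLite schema describes, that the values returned by InputsAsNumpy() and OutputsAsNumpy() are indices into the same subgraph's tensor list, and that the argument j of Subgraphs(j), Operators(j) and Tensors(j) is simply a position in the corresponding list. The Fake* classes and the tensor names are hypothetical stand-ins that only mimic the accessor names used above, so that the loop can run without a model file; they are not the classes generated by flatc.

```python
class FakeTensor:
    """Stand-in for a generated Tensor: only Name() is mimicked."""

    def __init__(self, name):
        self._name = name

    def Name(self):
        return self._name.encode("utf-8")


class FakeOperator:
    """Stand-in for a generated Operator holding tensor indices."""

    def __init__(self, inputs, outputs):
        self._inputs = inputs
        self._outputs = outputs

    def InputsAsNumpy(self):
        return list(self._inputs)

    def OutputsAsNumpy(self):
        return list(self._outputs)


class FakeSubgraph:
    """Stand-in for a generated SubGraph with indexed accessors."""

    def __init__(self, tensors, operators):
        self._tensors = tensors
        self._operators = operators

    def Tensors(self, j):
        return self._tensors[j]

    def OperatorsLength(self):
        return len(self._operators)

    def Operators(self, j):
        return self._operators[j]


def tensor_names(subgraph, indices):
    """Resolve tensor indices to tensor names within one subgraph."""
    return [subgraph.Tensors(int(i)).Name().decode("utf-8") for i in indices]


def describe_operators(subgraph):
    """Return (operator index, input tensor names, output tensor names) rows."""
    rows = []
    for j in range(subgraph.OperatorsLength()):
        operator = subgraph.Operators(j)  # j is the operator's position in the list
        rows.append((
            j,
            tensor_names(subgraph, operator.InputsAsNumpy()),
            tensor_names(subgraph, operator.OutputsAsNumpy()),
        ))
    return rows


subgraph = FakeSubgraph(
    tensors=[FakeTensor("input"), FakeTensor("weights"),
             FakeTensor("conv_out"), FakeTensor("softmax_out")],
    operators=[FakeOperator([0, 1], [2]), FakeOperator([2], [3])],
)
rows = describe_operators(subgraph)
for row in rows:
    print(row)
```

With real generated classes, the same loop would call subgraph.Operators(j) for each j below OperatorsLength() instead of always passing 0, which is the likely reason the loop in the question keeps revisiting the same node.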

Tensorflow parse_single_example returns all dataset

I'm creating a basic LinearClassifier in Tensorflow, but it seems that my input function returns the whole dataset at the first iteration, instead of just one example & its label.
My TFRecord has the following structure (obtained with print( tf.train.Example.FromString(example.SerializeToString())) )
features {
  feature {
    key: "attackType"
    value {
      int64_list {
        value: 0
        value: 0
        ...
  feature {
    key: "dst_ip_addr"
    value {
      bytes_list {
        value: "OPENSTACK_NET"
        value: "EXT_SERVER"
        ...
It seems the TFRecord file is well formatted. However, when I try to parse it with the following snippet:
def input_fn_train(repeat=10, batch_size=32):
    """
    Reads dataset from tfrecord, apply parser with map
    """
    # Read serialized examples from the TFRecord file
    dataset = tf.data.TFRecordDataset([processed_bucket+processed_key])
    # Map the parser over dataset, and batch results by up to batch_size
    dataset = dataset.map(_decode)
    dataset = dataset.repeat(repeat)
    dataset = dataset.batch(batch_size)
    return dataset
def _decode(serialized_ex):
    features = {
        'src_ip_addr': tf.FixedLenFeature(src_ip_size, tf.string),
        'src_pt': tf.FixedLenFeature(src_pt_size, tf.int64),
        'dst_ip_addr': tf.FixedLenFeature(dst_ip_size, tf.string),
        'dst_pt': tf.FixedLenFeature(dst_pt_size, tf.int64),
        'proto': tf.FixedLenFeature(proto_size, tf.string),
        'packets': tf.FixedLenFeature(packets_size, tf.int64),
        'subnet': tf.FixedLenFeature(subnet_size, tf.int64),
        'attackType': tf.FixedLenFeature(attack_type_size, tf.int64)
    }
    parsed_features = tf.parse_single_example(serialized_ex, features)
    label = parsed_features.pop('attackType')
    return parsed_features, label
sess = tf.Session()
it = input_fn_train().make_one_shot_iterator()
print(sess.run(it.get_next()))
It shows that it.get_next() returns
({'dst_ip_addr': array([[b'OPENSTACK_NET', b'EXT_SERVER',...
This is not what I expected, since it yields an array of arrays! The result should be
array([b'OPENSTACK_NET',...
Any thoughts? I've been trying to change the shape parameter of FixedLenFeature, with no success.
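OK, it seems the dataset.batch call caused this behavior: batch stacks up to batch_size parsed examples along a new leading dimension, so each element returned by get_next() is an array of examples. I removed it, and it works fine now!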
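The effect of batching can be illustrated without TensorFlow. The sketch below is a plain-Python stand-in for the idea behind dataset.batch, not TensorFlow's implementation: the batch helper and the example values are hypothetical, and the point is only that grouping examples adds one level of nesting.

```python
from itertools import islice


def batch(examples, batch_size):
    """Group consecutive examples into lists of up to batch_size items."""
    iterator = iter(examples)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


# Each inner list plays the role of one parsed 'dst_ip_addr' feature.
examples = [
    [b"OPENSTACK_NET", b"EXT_SERVER"],
    [b"EXT_SERVER", b"OPENSTACK_NET"],
    [b"OPENSTACK_NET", b"OPENSTACK_NET"],
]

unbatched_first = next(iter(examples))
batched_first = next(batch(examples, batch_size=2))

print(unbatched_first)  # one example
print(batched_first)    # a list of two examples
```

Without batching, the first element is a single example; with batching, it is a list of examples, which corresponds to the array of arrays seen in the question.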

how to make R datafile to Python type

I want to convert R data to Python data types. Below is the whole code:
def convert_datafiles(datasets_folder):
    import rpy2.robjects
    rpy2.robjects.numpy2ri.activate()
    pandas2ri.activate()
    for root, dirs, files in os.walk(datasets_folder):
        for name in files:
            # sort out .RData files
            if name.endswith('.RData'):
                name_ = os.path.splitext(name)[0]
                name_path = os.path.join(datasets_folder, name_)
                # create sub-directory
                if not os.path.exists(name_path):
                    os.makedirs(name_path)
                file_path = os.path.join(root, name)
                robj = robjects.r.load(file_path)
                # check out subfiles in the data frame
                for var in robj:
                    ###### error happened right here
                    myRData = pandas2ri.ri2py_dataframe(var)
                    #### error happened right here
                    # convert to DataFrame
                    if not isinstance(myRData, pd.DataFrame):
                        myRData = pd.DataFrame(myRData)
                    var_path = os.path.join(datasets_folder, name_, var + '.csv')
                    myRData.to_csv(var_path)
                os.remove(os.path.join(datasets_folder, name))  # clean up
    print("=> Success!")
I want to convert the R data to Python types, but this error keeps popping up: AttributeError: 'str' object has no attribute 'dtype'
How can I resolve this error?
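The rpy2 documentation is somewhat incomplete when it comes to interaction with pandas, but the unit tests provide examples of conversion. Note that R's load() returns the names of the loaded objects, so each var in your loop is a string rather than a data frame, which matches the 'str' object error; the object has to be fetched by name before it can be converted. For example:
# imports assumed from the rpy2 2.x API (ri2py was renamed in rpy2 3.x)
import rpy2.robjects as robjects
from rpy2.robjects import default_converter
from rpy2.robjects import pandas2ri as rpyp
from rpy2.robjects.conversion import localconverter
rdataf = robjects.r('data.frame(a=1:2, '
                    '           b=I(c("a", "b")), '
                    '           c=c("a", "b"))')
with localconverter(default_converter + rpyp.converter) as cv:
    pandas_df = robjects.conversion.ri2py(rdataf)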

Changing label name when retraining Inception on Google Cloud ML

I currently follow the tutorial to retrain Inception for image classification:
https://cloud.google.com/blog/big-data/2016/12/how-to-train-and-classify-images-using-google-cloud-machine-learning-and-cloud-dataflow
However, when I make a prediction with the API, I get only the index of my class as the label. I would like the API to return a string with the actual class name instead. For example, instead of
predictions:
- key: '0'
  prediction: 4
  scores:
  - 8.11998e-09
  - 2.64907e-08
  - 1.10307e-06
I would like to get:
predictions:
- key: '0'
  prediction: ROSES
  scores:
  - 8.11998e-09
  - 2.64907e-08
  - 1.10307e-06
Looking at the reference for the Google API it should be possible:
https://cloud.google.com/ml-engine/reference/rest/v1/projects/predict
I already tried changing the following in model.py:
outputs = {
    'key': keys.name,
    'prediction': tensors.predictions[0].name,
    'scores': tensors.predictions[1].name
}
tf.add_to_collection('outputs', json.dumps(outputs))
to
if tensors.predictions[0].name == 0:
    pred_name = 'roses'
elif tensors.predictions[0].name == 1:
    pred_name = 'tulips'
outputs = {
    'key': keys.name,
    'prediction': pred_name,
    'scores': tensors.predictions[1].name
}
tf.add_to_collection('outputs', json.dumps(outputs))
but this doesn't work.
My next idea was to change this part of the preprocess.py file, so that it uses the string label instead of the index.
def process(self, row, all_labels):
    try:
        row = row.element
    except AttributeError:
        pass
    if not self.label_to_id_map:
        for i, label in enumerate(all_labels):
            label = label.strip()
            if label:
                self.label_to_id_map[label] = label #i
and
    label_ids = []
    for label in row[1:]:
        try:
            label_ids.append(label.strip())
            #label_ids.append(self.label_to_id_map[label.strip()])
        except KeyError:
            unknown_label.inc()
but this gives the error:
TypeError: 'roses' has type <type 'str'>, but expected one of: (<type 'int'>, <type 'long'>) [while running 'Embed and make TFExample']
Hence I thought that I should change something here in preprocess.py in order to allow strings:
example = tf.train.Example(features=tf.train.Features(feature={
    'image_uri': _bytes_feature([uri]),
    'embedding': _float_feature(embedding.ravel().tolist()),
}))
if label_ids:
    label_ids.sort()
    example.features.feature['label'].int64_list.value.extend(label_ids)
But I don't know how to change it appropriately, as I could not find something like str_list. Could anyone please help me out here?
Online prediction certainly allows this, but the model itself needs to be updated to do the conversion from int to string.
Keep in mind that the Python code is just building a graph which describes what computation to do in your model -- you're not sending the Python code to online prediction, you're sending the graph you build.
That distinction is important because the changes you have made are in Python -- you don't yet have any inputs or predictions, so you won't be able to inspect their values. What you need to do instead is add the equivalent lookups to the graph that you're exporting.
You could modify the code like so:
labels = tf.constant(['cars', 'trucks', 'suvs'])
predicted_indices = tf.argmax(softmax, 1)
prediction = tf.gather(labels, predicted_indices)
Then leave the inputs/outputs from the original code untouched.
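To make the lookup concrete, here is a plain-Python analogue of the argmax-then-gather steps. It is only an illustration of the logic: the scores are made up, predict_label is a hypothetical helper, and in the exported model the lookup still has to be expressed with graph ops as shown above.

```python
# Same illustrative labels as in the snippet above.
labels = ["cars", "trucks", "suvs"]


def predict_label(scores):
    """Return the label whose score is highest."""
    # Equivalent of argmax: the position of the largest score.
    predicted_index = max(range(len(scores)), key=lambda i: scores[i])
    # Equivalent of gather: look the label up by that position.
    return labels[predicted_index]


prediction = predict_label([0.1, 0.7, 0.2])
print(prediction)
```

The position of each entry in labels must match the class index the model was trained with, otherwise the returned names will be wrong.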