Multi-column input to ML.PREDICT for a TensorFlow model in BigQuery ML

We have trained a model in Google Cloud AutoML (a tool that we like a lot), successfully exported it to GCS, and then created the model in BigQuery using the command below:
create or replace model my_dataset.my_bq_ml_model
options(model_type='tensorflow',
model_path='my gcs path to exported tensorflow model')
However, when we use BigQuery ML to try to run some predictions with the model, we are unsure how to format the multiple features that our model uses into the single "inputs" string the exported TensorFlow model accepts in BigQuery.
select *
from ml.predict(model my_project.my_dataset.my_bq_ml_model,
(
select 'How do we format this?' as inputs
from my_rows_to_predict
))
Has anyone done this yet?
This is similar to this question, which remains open:
Multi-column input to ML.PREDICT for a TensorFlow model in BigQuery ML
Thank you all.

After you load the model into BigQuery ML, click on the model in the BigQuery UI and switch over to the "Schema" tab. This should tell you what columns the model wants.
Alternatively, run the program saved_model_cli on the model (a Python program that comes with TensorFlow) to see what the supported signature is:
saved_model_cli show --dir $export_path --all
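If you prefer to do this check in Python rather than from the command line, here is a minimal sketch (assuming TensorFlow 2.x and that export_path points at the exported SavedModel directory) that prints the same signature information:

import tensorflow as tf

# Load the exported SavedModel and grab its default serving signature.
loaded = tf.saved_model.load(export_path)
serving_fn = loaded.signatures["serving_default"]

# Input names/dtypes/shapes the model expects, and the outputs it returns.
print(serving_fn.structured_input_signature)
print(serving_fn.structured_outputs)

The input names printed here are the column names (or the single "inputs" column) that the query passed to ML.PREDICT needs to provide.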

Related

Multi-column input to ML.PREDICT for a TensorFlow model in BigQuery ML

I trained a TensorFlow classifier and created it as a model in BigQuery ML using CREATE MODEL. Now I would like to use ML.PREDICT to batch predict using this model. I get the error "Invalid table-valued function ml.predict Column inputs is not found in the input data to the PREDICT function."
Here's my query:
select * from ml.predict (
model test.digital_native_classifier_kf,
(select * from dataset_id.features_table_id)
)
In the BigQuery documentation, they give an example for a TensorFlow model with a single column aliased as input so the TensorFlow input_fn can accept it. However, this classifier accepts hundreds of features. How do I specify the query passed to ML.PREDICT so it uses all the columns in my features table?
After you load the model into BigQuery ML, click on the model in the BigQuery UI and switch over to the "Schema" tab. This should tell you what features (column names) the model wants.
It is possible that when you created the TensorFlow/Keras model you did not assign names to the input nodes. Then, the feature names might have been auto-assigned to something like int1 and float2.
Alternatively, run the program saved_model_cli on the model (a Python program that comes with TensorFlow) to see what the supported signature is:
saved_model_cli show --dir $export_path --all
After some research: AutoML encodes the input tensors as Prensors, a serialized string format packed into a single tensor.
This means that you can't import the AutoML model from GCS directly into BigQuery ML the way you would import a TensorFlow model that explicitly encodes its different inputs as a JSON struct.
So, in order to import an AutoML model into BigQuery ML, the BigQuery engineering team would need to add support for something like model_type='automl' in addition to model_type='tensorflow'.
At the moment multi-column is not possible; from the AutoML beginner's guide:
One column from your dataset, called the target, is what your model will learn to predict. Some number of the other data columns are inputs (called features) that the model will learn patterns from. You can use the same input features to build multiple kinds of models just by changing the target.
Also found this feature request: Multi-target AutoML Tables Request.

Using Estimator.evaluate() on a trained SageMaker TensorFlow model

After I've trained and deployed the model with AWS SageMaker, I want to evaluate it on several CSV files:
- category-1-eval.csv (~700000 records)
- category-2-eval.csv (~500000 records)
- category-3-eval.csv (~800000 records)
...
The right way to do this seems to be the Estimator.evaluate() method, as it is fast.
The problem is that I cannot find a way to restore a SageMaker model into a TensorFlow Estimator. Is that possible?
I've tried to restore a model like this:
tf.estimator.DNNClassifier(
feature_columns=...,
hidden_units=[...],
model_dir="s3://<bucket_name>/checkpoints",
)
In the AWS SageMaker documentation a different approach is described - testing the actual endpoint from the notebook - but it takes too much time and requires a lot of API calls to the endpoint.
If you used the built-in TensorFlow container, your model has been saved in TensorFlow Serving format, e.g.:
$ tar tfz model.tar.gz
model/
model/1/
model/1/saved_model.pb
model/1/variables/
model/1/variables/variables.index
model/1/variables/variables.data-00000-of-00001
You can easily load it with TensorFlow Serving on your local machine and send it samples to predict. More info at https://www.tensorflow.org/tfx/guide/serving
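If you'd rather not stand up a serving endpoint at all, a rough sketch of doing it directly in Python is below; the bucket, key, and input feature names are placeholders, and it assumes TensorFlow 2.x can read the exported SavedModel:

import tarfile
import boto3
import tensorflow as tf

# Download and unpack the SageMaker model artifact (placeholder bucket/key).
s3 = boto3.client("s3")
s3.download_file("<bucket_name>", "path/to/model.tar.gz", "model.tar.gz")
with tarfile.open("model.tar.gz") as tar:
    tar.extractall("export")

# "export/model/1" matches the layout listed above (model/1/saved_model.pb).
loaded = tf.saved_model.load("export/model/1")
predict = loaded.signatures["serving_default"]

# Feed batches read from the eval CSVs; the input key and dtype depend on
# the serving signature the model was trained with.
# batch = {"inputs": tf.constant([...])}
# predictions = predict(**batch)

Note that this only gives you predictions; you would still compute the evaluation metrics yourself, since the Estimator.evaluate() graph is not part of the serving export.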

How to generate .tf/.tflite files from Python

I am trying to generate a custom TensorFlow model (.tf/.tflite file) which I want to use in my mobile application.
I have gone through a few machine learning and TensorFlow blogs, and from there I started to generate a simple ML model.
https://www.datacamp.com/community/tutorials/tensorflow-tutorial
https://www.edureka.co/blog/tensorflow-object-detection-tutorial/
https://blog.metaflow.fr/tensorflow-how-to-freeze-a-model-and-serve-it-with-a-python-api-d4f3596b3adc
https://www.youtube.com/watch?v=ICY4Lvhyobk
All of these are really nice, and they guided me through the steps below:
i) Install all necessary tools (TensorFlow, Python, Jupyter, etc.).
ii) Load the training and testing data.
iii) Run the TensorFlow session to train and evaluate the results.
iv) Take steps to increase the accuracy.
But I am not able to generate the .tf/.tflite files.
I tried the following code, but that generates an empty file.
converter = tf.contrib.lite.TFLiteConverter.from_session(sess,[],[])
model = converter.convert()
file = open( 'model.tflite' , 'wb' )
file.write( model )
I have checked a few answers on Stack Overflow, and according to my understanding, in order to generate the .tf files we need to create the .pb files, freeze the .pb file, and then generate the .tf files.
But how can we achieve this?
TensorFlow provides the TFLite converter to convert a SavedModel to a TFLite model. For more details, see the TFLiteConverter documentation.
tf.lite.TFLiteConverter.from_saved_model() (recommended): Converts a SavedModel.
tf.lite.TFLiteConverter.from_keras_model(): Converts a Keras model.
tf.lite.TFLiteConverter.from_concrete_functions(): Converts concrete functions.
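For example, a minimal sketch assuming the trained model has already been exported as a SavedModel into a placeholder directory saved_model_dir:

import tensorflow as tf

# Convert the SavedModel (placeholder path) into a TFLite flatbuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
tflite_model = converter.convert()

# Write the converted model to disk for use in the mobile app.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)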

Using Model Optimizer for TensorFlow Slim models

I am aiming to run inference on a TensorFlow Slim model with the Intel OpenVINO Model Optimizer, using the OpenVINO docs and slides for inference and the TF-Slim docs for training the model.
It's a multi-class classification problem. I have trained a TF-Slim mobilenet_v2 model from scratch (using the script train_image_classifier.py). Evaluation of the trained model on the test set gives relatively good results to begin with (using the script eval_image_classifier.py):
eval/Accuracy[0.8017] eval/Recall_5[0.9993]
However, a single .ckpt file is not saved (even though at the end of the train_image_classifier.py run there is a message like "model.ckpt is saved to checkpoint_dir"); instead there are three files (.ckpt-180000.data-00000-of-00001, .ckpt-180000.index, .ckpt-180000.meta).
The OpenVINO Model Optimizer requires a single checkpoint file.
According to the docs, I call mo_tf.py with the following parameters:
python mo_tf.py --input_model D:/model/mobilenet_v2_224.pb --input_checkpoint D:/model/model.ckpt-180000 -b 1
It gives the error (the same if I pass --input_checkpoint D:/model/model.ckpt):
[ ERROR ] The value for command line parameter "input_checkpoint" must be existing file/directory, but "D:/model/model.ckpt-180000" does not exist.
The error message is clear: there are no such files on disk. But as far as I know, most TF utilities convert .ckpt-????.meta to .ckpt under the hood.
Trying to call:
python mo_tf.py --input_model D:/model/mobilenet_v2_224.pb --input_meta_graph D:/model/model.ckpt-180000.meta -b 1
Causes:
[ ERROR ] Unknown configuration of input model parameters
It doesn't matter to me how I transfer the graph to the OpenVINO intermediate representation; I just need to reach that result.
Thanks a lot.
EDIT
I managed to run the OpenVINO Model Optimizer on a frozen graph of the TF-Slim model. However, I still have no idea why my previous attempts (based on the docs) failed.
You can try converting the model to frozen format (.pb) and then converting it with OpenVINO.
.ckpt-meta contains the metagraph: the computation graph structure without variable values. This is the graph you can observe in TensorBoard.
.ckpt-data contains the variable values, without the graph structure. To restore a model we need both the meta and data files.
A .pb file saves the whole graph (meta + data).
As per the documentation of OpenVINO:
When a network is defined in Python* code, you have to create an inference graph file. Usually, graphs are built in a form that allows model training. That means that all trainable parameters are represented as variables in the graph. To use the graph with the Model Optimizer, it should be frozen.
https://software.intel.com/en-us/articles/OpenVINO-Using-TensorFlow
OpenVINO optimizes the model by converting the weighted graph passed in frozen form.
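For illustration, here is a minimal TF 1.x-style freezing sketch; the checkpoint paths match the ones above, but the output node name is an assumption (it depends on how the slim model was defined) and may need to be changed:

import tensorflow as tf
from tensorflow.python.framework import graph_util

# Rebuild the graph from the metagraph and restore the variable values.
saver = tf.train.import_meta_graph("D:/model/model.ckpt-180000.meta")
with tf.Session() as sess:
    saver.restore(sess, "D:/model/model.ckpt-180000")
    # Fold variables into constants; the output node name is a placeholder.
    frozen = graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["MobilenetV2/Predictions/Reshape_1"])

with tf.gfile.GFile("D:/model/frozen_mobilenet_v2.pb", "wb") as f:
    f.write(frozen.SerializeToString())

The frozen .pb can then be passed to mo_tf.py with --input_model alone, without --input_checkpoint or --input_meta_graph.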

How to convert a saved_model.pb to EvalSavedModel?

I was going through the tensorflow-model-analysis documentation on evaluating TensorFlow models. The getting started guide talks about a special SavedModel called the EvalSavedModel.
Quoting the getting started guide:
This EvalSavedModel contains additional information which allows TFMA to compute the same evaluation metrics defined in your model in a distributed manner over a large amount of data, and user-defined slices.
My question is how can I convert an already existing saved_model.pb to an EvalSavedModel?
EvalSavedModel is exported as a SavedModel message, so no such conversion is needed.
EvalSavedModel uses SavedModelBuilder under the hood. It populates the estimator graph with several placeholders and creates some additional metric collections. Later on, it performs the usual SavedModelBuilder procedure.
Source - https://github.com/tensorflow/model-analysis/blob/master/tensorflow_model_analysis/eval_saved_model/export.py#L228
P.S. I suppose you want to run model-analysis on your model exported by SavedModelBuilder. Since that SavedModel has neither the metric nodes nor the related collections that are created in an EvalSavedModel, it's useless to do so - model-analysis simply couldn't find any metric related to your estimator.
If I understand your question correctly, you have a saved_model.pb generated either by tf.saved_model.simple_save, by tf.saved_model.builder.SavedModelBuilder, or by estimator.export_savedmodel.
If my understanding is correct, then you are exporting the training and inference graphs to saved_model.pb.
The point you mentioned from the guide on the TensorFlow site states that, in addition to exporting the training graph, we need to export the evaluation graph as well. That is called the EvalSavedModel.
The evaluation graph comprises the metrics for that model, so that you can evaluate the model's performance using visualizations.
Before we export the EvalSavedModel, we should prepare an eval_input_receiver_fn, similar to serving_input_receiver_fn.
We can specify other functionality there as well, for example if we want the metrics to be defined in a distributed manner, or if we want to evaluate our model using slices of data rather than the entire dataset. Such options can be set in eval_input_receiver_fn.
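For illustration, a minimal sketch of such a function; the feature spec and label column below are placeholders, and it assumes a TF 1.x estimator together with TFMA's export helpers:

import tensorflow as tf
import tensorflow_model_analysis as tfma

def eval_input_receiver_fn():
    # Serialized tf.Examples are the input at evaluation time.
    serialized = tf.placeholder(dtype=tf.string, shape=[None], name="input_example")
    # Placeholder feature spec; replace with the model's real features and label.
    feature_spec = {
        "feature_a": tf.FixedLenFeature([], tf.float32),
        "label": tf.FixedLenFeature([], tf.int64),
    }
    features = tf.parse_example(serialized, feature_spec)
    return tfma.export.EvalInputReceiver(
        features=features,
        receiver_tensors={"examples": serialized},
        labels=features["label"])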
Then we can export the EvalSavedModel using the code below:
tfma.export.export_eval_savedmodel(
    estimator=estimator,
    export_dir_base=export_dir,
    eval_input_receiver_fn=eval_input_receiver_fn)