How to use the TensorFlow library with SageMaker preprocessing - tensorflow

I want to use TensorFlow for preprocessing in SageMaker Pipelines, but I haven't been able to find a way to do it.
Right now, I'm using this library for preprocessing:
from sagemaker.sklearn.processing import SKLearnProcessor

framework_version = "0.23-1"
sklearn_processor = SKLearnProcessor(
    framework_version=framework_version,
    instance_type=processing_instance_type,
    instance_count=processing_instance_count,
    base_job_name="abcd",
    role=role,
)
Now I need to use TensorFlow in preprocessing, but the Python script can't import TensorFlow.
Any help would be much appreciated. Thanks.

There is no TensorFlow Processor module; the list of available processing classes is in the following doc and can be seen at the bottom of the page.
You can use the Processor or ScriptProcessor class and pass in your TF preprocessing as a script, along with a requirements.txt for the necessary modules; a sketch follows below.
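Here is a minimal sketch of that approach, assuming a TensorFlow container image is used for the job; the image version, region, Python version, script name (preprocess.py), and S3 paths below are placeholders to adapt to your setup:

from sagemaker import image_uris
from sagemaker.processing import ScriptProcessor, ProcessingInput, ProcessingOutput

# Pull a container image that already has TensorFlow installed (any image with
# TensorFlow would work; the version/region/py_version here are assumptions).
tf_image_uri = image_uris.retrieve(
    framework="tensorflow",
    region="us-east-1",
    version="2.8",
    py_version="py39",
    image_scope="training",
    instance_type=processing_instance_type,
)

script_processor = ScriptProcessor(
    image_uri=tf_image_uri,
    command=["python3"],
    instance_type=processing_instance_type,
    instance_count=processing_instance_count,
    base_job_name="abcd",
    role=role,
)

# preprocess.py is a hypothetical script that imports tensorflow and does the work.
script_processor.run(
    code="preprocess.py",
    inputs=[ProcessingInput(source="s3://<bucket>/input",
                            destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(source="/opt/ml/processing/output")],
)

In a pipeline, the same processor can be wrapped in a ProcessingStep instead of calling run() directly.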
I work for AWS and my opinions are my own.

Related

Why do we use the Keras backend command in code?

import tensorflow as tf
from tensorflow import keras
from keras import backend as K
What is the reason behind using this command:
from keras import backend as K
What does it do? I would appreciate it if anyone could explain it in a simple way.
You can find more information on what Keras backend actually is here or here.
In simpler terms, here is what the Keras backend actually is:
Keras is a model-level library that provides high-level building blocks for developing deep learning models. Keras does not provide low-level operations such as tensor multiplication and convolution. Instead, it relies on a specialized, well-optimized tensor library that serves as Keras' "backend engine". Instead of choosing one single tensor library and tying your Keras implementation to that library, Keras handles the problem in a modular way, allowing you to seamlessly connect multiple different backend engines to Keras.
The Keras backend lets you write custom code, or in a particular case a new "Keras module" for your use case, that supports both Theano and TensorFlow. For example, instead of tf.placeholder() you could write keras.backend.placeholder(), which works across both of the libraries mentioned earlier.
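As a minimal sketch (assuming the classic multi-backend Keras), a custom loss written purely against the backend API runs unchanged whichever backend engine Keras is configured to use:

from keras import backend as K

# Backend-agnostic custom loss: K.mean and K.square dispatch to the configured
# backend engine (TensorFlow, Theano, or CNTK), so no tf.* calls are needed.
def custom_mse(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

# The same idea applies at a lower level, e.g. K.placeholder(shape=(None, 10))
# instead of tf.placeholder(tf.float32, (None, 10)).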

How to build tensorflow from source with a model that uses custom ops which are renamed versions of existing ops?

I have a .tflite model that uses custom operations called MaxPoolingWithArgmax2D, MaxUnpooling2D, and Convolution2DTransposeBias.
These ops are actually not custom ops, since they are already present in TensorFlow (MaxPoolWithArgmax, MaxUnpooling2D, conv2d_transpose).
After consulting this guide, I see that I'd have to write a kernel and an interface for these ops.
Is there a way to build TensorFlow from source without writing custom implementations for these ops, since they're already present in the library? The only problem is that the model I'm using has renamed them, which causes them to be recognized as custom ops. My goal is to perform inference with this model.
Edit: These ops are not select ops. They are built-in ops present inside the base library. However, the person who wrote this model renamed them, which makes them custom ops.
Edit 2: Photo for reference (screenshot not included here).
You can enable the existing TF ops through the Select TF Ops option in TFLite.
For example, during the conversion stage, you can enable them:
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS,    # enable TensorFlow ops.
]
For the inference stage, please make sure that the Select TF ops dependency is linked. When using the TF Python API, it is enabled automatically.
Please refer to this link.
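For example, a minimal inference sketch with the TF Python API (the model path is a placeholder; the tf.lite.Interpreter bundled with the TensorFlow pip package resolves Select TF ops automatically):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")  # placeholder path
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's expected shape and dtype.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])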
For some of the custom ops, custom operator implementations are distributed as a perception operator package.
from tensorflow.lite.kernels.perception import pywrap_perception_ops as perception_ops_registerer
from tensorflow.lite.python import interpreter as interpreter_wrapper

interpreter = interpreter_wrapper.InterpreterWithCustomOps(
    model_content=model,
    custom_op_registerers=[
        perception_ops_registerer.PerceptionOpsRegisterer
    ])
Please take a look at this link.

Does Tensorflow server serve/support non-tensorflow based libraries like scikit-learn?

We are creating a platform to put AI use cases into production. TFX is the first choice, but what if we want to use non-TensorFlow libraries like scikit-learn and include a Python script to create models? Will the output of such a model be served by TensorFlow Serving? How can I make sure that both TensorFlow-based models and non-TensorFlow libraries and models can run in one system design? Please suggest.
Below is the procedure to deploy and serve a scikit-learn model on Google Cloud Platform.
The first step is to save/export the scikit-learn model using the code below:
from sklearn.externals import joblib  # in newer scikit-learn versions, use `import joblib` instead

joblib.dump(clf, 'model.joblib')
Next step is to upload the model.joblib file to Google Cloud Storage.
After that, we need to create our model and version, specifying that we are loading up a scikit-learn model, and selecting the runtime version of Cloud ML Engine, as well as the version of Python that we used to export this model.
Next, we need to present the data to Cloud ML Engine as a simple array, encoded as a JSON file, as shown below. We can use the json library as well.
print(list(X_test.iloc[10:11].values))
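For instance, a hedged sketch writing one JSON instance per line into the file later passed as $INPUT_FILE (the file name input.json is a placeholder):

import json

instances = X_test.iloc[10:11].values.tolist()
with open('input.json', 'w') as f:
    # One JSON-encoded instance per line, as expected by --json-instances.
    for instance in instances:
        f.write(json.dumps(instance) + '\n')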
Next, we need to run the command below to perform the inference:
gcloud ml-engine predict --model $MODEL_NAME --version $VERSION_NAME --json-instances $INPUT_FILE
For more information, please refer this link.

Feature Importance for XGBoost in Sagemaker

I have built an XGBoost model using Amazon SageMaker, but I was unable to find anything that will help me interpret the model and validate whether it has learned the right dependencies.
Generally, we can see feature importance for XGBoost via the get_fscore() function in the Python API (https://xgboost.readthedocs.io/en/latest/python/python_api.html), but I see nothing of that sort in the SageMaker API (https://sagemaker.readthedocs.io/en/stable/estimators.html).
I know I can build my own model and then deploy that using sagemaker but I am curious if anyone has faced this problem and how they overcame it.
Thanks.
As of 2019-06-17, the SageMaker XGBoost model is stored on S3 as an archive named model.tar.gz. This archive consists of a single pickled model file named xgboost-model.
To load the model directly from S3 without downloading, you can use the following code:
import s3fs
import pickle
import tarfile
import xgboost

model_path = 's3://<bucket>/<path_to_model_dir>/xgboost-2019-06-16-09-56-39-854/output/model.tar.gz'

fs = s3fs.S3FileSystem()
with fs.open(model_path, 'rb') as f:
    with tarfile.open(fileobj=f, mode='r') as tar_f:
        with tar_f.extractfile('xgboost-model') as extracted_f:
            xgbooster = pickle.load(extracted_f)

xgbooster.get_fscore()
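As a follow-up, the same Booster object also works with XGBoost's plotting helper if you want a quick visualization (assumes matplotlib is installed):

import matplotlib.pyplot as plt
import xgboost

# Plot the per-feature importance scores as a bar chart.
xgboost.plot_importance(xgbooster)
plt.show()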
SageMaker XGBoost currently does not provide an interface to retrieve feature importance from the model. You can write some code to get the feature importance from the XGBoost model: get the Booster artifacts from the model in S3 and then use the following snippet.
import pickle as pkl
import xgboost

# model_file is the path to the 'xgboost-model' file extracted from model.tar.gz.
booster = pkl.load(open(model_file, 'rb'))
booster.get_score()
booster.get_fscore()
Refer XGBoost doc for methods to get feature importance from the Booster object such as get_score() or get_fscore().
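Using the Booster loaded above, the different importance metrics look like this (per the XGBoost Python API):

booster.get_score(importance_type="weight")  # how often a feature is used to split
booster.get_score(importance_type="gain")    # average gain of splits using the feature
booster.get_score(importance_type="cover")   # average coverage of splits using the feature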
Although you can write a custom script as rajesh and Lukas suggested and use XGBoost as a framework to run the script (see How to Use Amazon SageMaker XGBoost for how to use the "script mode"), SageMaker has recently launched SageMaker Debugger, which allows you to retrieve feature importance from XGBoost in real time.
This notebook demonstrates how to use SageMaker Debugger to retrieve feature importance.
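A hedged sketch of what that retrieval looks like with the smdebug library that backs SageMaker Debugger (the S3 path is a placeholder, typically estimator.latest_job_debugger_artifacts_path() for the training job):

from smdebug.trials import create_trial

trial = create_trial("s3://<bucket>/<debugger-output-path>")

# The XGBoost debugger hook saves feature-importance tensors in the
# "feature_importance" collection; print the value from the last saved step.
for name in trial.tensor_names(collection="feature_importance"):
    tensor = trial.tensor(name)
    print(name, tensor.value(tensor.steps()[-1]))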

Fix no module 'layer' in current Google Colab TF version

I use Google Colab (Python 3, GPU).
I want to run, for example, the code from this repo, but I get an error when running the demo ipynb at these lines:
import tensorflow as tf
from layers import (_causal_linear, _output_linear, conv1d, dilated_conv1d)
When I run these two lines, I get the error "no module layer".
I don't think this is a bug or something, because this repo has over 1000 stars.
I think this is rather a tf version problem.
Any ideas how to solve this?
"Layers" is a module in the package that you linked NOT in Tensorflow. See here.
BTW if you wanted to import from a submodule of tensor flow you would have to do from tensorflow.package import ....
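As a minimal sketch of getting that import to work in Colab (assuming layers.py lives at the repo root; the repository URL is a placeholder for the repo you linked):

import subprocess
import sys

# Clone the repo into the Colab runtime and put it on sys.path so `layers` resolves.
subprocess.run(["git", "clone", "https://github.com/<user>/<repo>.git", "repo"], check=True)
sys.path.insert(0, "repo")

import tensorflow as tf
from layers import (_causal_linear, _output_linear, conv1d, dilated_conv1d)

If layers.py sits in a subdirectory of the repo, point sys.path at that subdirectory instead.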