I know of two ways to run a TFX pipeline. First, using a Jupyter notebook with InteractiveContext in a browser:
from tfx import v1 as tfx
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext
context = InteractiveContext(pipeline_root=_pipeline_data_folder)
example_gen = tfx.components.ImportExampleGen(input_base=_dataset_folder)
context.run(example_gen, enable_cache=True)
statistics_gen = tfx.components.StatisticsGen(examples=example_gen.outputs['examples'])
context.run(statistics_gen, enable_cache=True)
context.show(statistics_gen.outputs['statistics'])
This way, I can see the statistics artifact in the browser:
The second way to run the same pipeline is with a Python script (no browser involved):
example_gen = tfx.components.ImportExampleGen(input_base=_dataset_folder)
statistics_gen = tfx.components.StatisticsGen(examples=example_gen.outputs['examples'])
components = [
    example_gen,
    statistics_gen,
]
pipeline = tfx.dsl.Pipeline(
    pipeline_name='sample_pipeline',
    pipeline_root=_pipeline_data_folder,
    metadata_connection_config=tfx.orchestration.metadata.sqlite_metadata_connection_config(
        f'{_pipeline_data_folder}/metadata.db'),
    components=components)
tfx.orchestration.LocalDagRunner().run(pipeline)
I understand that since there's no browser involved in the second approach, asking for a visualization there is pointless. But the same artifact that was created in the first approach was also created in the second one. So my question is: after the second pipeline has finished, how can I visualize the statistics artifact it created?
It took me a whole day to figure this out, and I had to read the TFX source code to do it (there was hardly any documentation). An older approach to the same problem can be found in the TFX documentation, but it's dated and does not work with the latest version of TFX. I'm sure even this solution will be short-lived and will stop working soon. But for the time being:
from tfx import types
from tfx import v1 as tfx
from tfx.orchestration.metadata import Metadata
from tfx.orchestration.experimental.interactive import visualizations
from tfx.orchestration.experimental.interactive import standard_visualizations
standard_visualizations.register_standard_visualizations()
sqlite_path = './pipeline_data/metadata.db'
pipeline_name = 'sample_pipeline'  # must match the pipeline_name the pipeline was run with
component_name = 'StatisticsGen'
type_name = 'ExampleStatistics'
metadata_connection_config = tfx.orchestration.metadata.sqlite_metadata_connection_config(sqlite_path)
with Metadata(metadata_connection_config) as metadata:
    context = metadata.store.get_context_by_type_and_name('node', f'{pipeline_name}.{component_name}')
    artifacts = metadata.store.get_artifacts_by_context(context.id)
    artifact_type = metadata.store.get_artifact_type(type_name)
    latest_artifact = max([a for a in artifacts if a.type_id == artifact_type.id], key=lambda a: a.last_update_time_since_epoch)
    artifact = types.Artifact(artifact_type)
    artifact.set_mlmd_artifact(latest_artifact)
    visualization = visualizations.get_registry().get_visualization(artifact.type_name)
    visualization.display(artifact)
Note that this code displays the latest artifact produced by the statistics component of a specific pipeline. Alternatively, you can point to the artifact by its folder path (uri):
from tfx import types
from tfx import v1 as tfx
from tfx.orchestration.metadata import Metadata
from tfx.orchestration.experimental.interactive import visualizations
from tfx.orchestration.experimental.interactive import standard_visualizations
standard_visualizations.register_standard_visualizations()
sqlite_path = './pipeline_data/metadata.db'
uri = './pipeline_data/StatisticsGen/statistics/16'
component_name = 'StatisticsGen'
type_name = 'ExampleStatistics'
metadata_connection_config = tfx.orchestration.metadata.sqlite_metadata_connection_config(sqlite_path)
with Metadata(metadata_connection_config) as metadata:
    artifacts = metadata.store.get_artifacts_by_uri(uri)
    artifact_type = metadata.store.get_artifact_type(type_name)
    latest_artifact = max([a for a in artifacts if a.type_id == artifact_type.id], key=lambda a: a.last_update_time_since_epoch)
    artifact = types.Artifact(artifact_type)
    artifact.set_mlmd_artifact(latest_artifact)
    visualization = visualizations.get_registry().get_visualization(type_name)
    visualization.display(artifact)
There may well be a better way to do this that I missed.
I'm running code to train a PPO policy on chess using PettingZoo:
import gym.vector.utils
import supersuit as ss
import stable_baselines3.ppo
import pettingzoo.classic
if __name__ == '__main__':
    env = original_env = pettingzoo.classic.chess_v5.env()
    env = pettingzoo.utils.turn_based_aec_to_parallel(env)
    env = ss.pettingzoo_env_to_vec_env_v1(env)
    env = ss.concat_vec_envs_v1(env, 8, num_cpus=4, base_class='stable_baselines3')
    model = stable_baselines3.PPO(stable_baselines3.ppo.MultiInputPolicy, env,
                                  tensorboard_log='my_logs')
    model.learn(total_timesteps=100)
In the next to last line, you can see I'm outputting logs to TensorBoard, where I hope to see a nice graph. However, all I see is this:
I've used TensorBoard before and it worked. Why isn't it showing any progress now? Or even lack of progress?
Turns out I just needed to use a lower value for n_steps.
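For context, here is a rough sketch of why that helps. It assumes (as I understand Stable-Baselines3) that PPO collects n_steps transitions per environment before each update and writes its training metrics once per such rollout. With PPO's default n_steps of 2048 and only 100 total timesteps, that leaves a single data point, which TensorBoard cannot draw as a curve. The helper name and the figure of 16 parallel environments are assumptions for illustration only.

```python
import math


def rollout_iterations(total_timesteps, n_steps, n_envs):
    """Estimate how many rollouts (and so log entries) a training run produces.

    Assumes one log entry per rollout of n_steps * n_envs timesteps.
    """
    return math.ceil(total_timesteps / (n_steps * n_envs))


# Default n_steps with a very short run: a single logged point.
print(rollout_iterations(total_timesteps=100, n_steps=2048, n_envs=16))  # 1

# A much smaller n_steps gives several points for the same run.
print(rollout_iterations(total_timesteps=100, n_steps=2, n_envs=16))  # 4
```

Raising total_timesteps instead of lowering n_steps should have the same effect on the number of logged points.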
I am following the official tutorial from Microsoft: https://learn.microsoft.com/en-us/azure/synapse-analytics/machine-learning/tutorial-score-model-predict-spark-pool
When I execute:
#Bind model within Spark session
model = pcontext.bind_model(
    return_types=RETURN_TYPES,
    runtime=RUNTIME,
    model_alias="Sales",  # This alias will be used in PREDICT call to refer this model
    model_uri=AML_MODEL_URI,  # In case of AML, it will be AML_MODEL_URI
    aml_workspace=ws  # This is only for AML. In case of ADLS, this parameter can be removed
).register()
I get this error: No module named 'azureml.automl'
My Notebook
I tried to reproduce this on my end: the code you shared works as expected, and I don't see the error message you are getting.
I also tested the same code on a newly created Apache Spark 3.1 runtime, and it works as expected there too.
Could you create a new cluster and check whether the code runs on it?
I solved it. In my case it works best like this:
Imports
#Import libraries
from pyspark.sql.functions import col, pandas_udf,udf,lit
from notebookutils.mssparkutils import azureML
from azureml.core import Workspace, Model
from azureml.core.authentication import ServicePrincipalAuthentication
from azureml.core.model import Model
import joblib
import pandas as pd
ws = azureML.getWorkspace("AzureMLService")
spark.conf.set("spark.synapse.ml.predict.enabled","true")
Predict function
def forecastModel():
    model_path = Model.get_model_path(model_name="modelName", _workspace=ws)
    modeljob = joblib.load(model_path + "/model.pkl")
    validation_data = spark.read.format("csv") \
        .option("header", True) \
        .option("inferSchema", True) \
        .option("sep", ";") \
        .load("abfss://....csv")
    validation_data_pd = validation_data.toPandas()
    predict = modeljob.forecast(validation_data_pd)
    return predict
There are examples that combine pysc2 (https://github.com/deepmind/pysc2) with TensorFlow 1.x and OpenAI Baselines (https://github.com/openai/baselines), such as the following:
https://github.com/chris-chris/pysc2-examples
https://github.com/llSourcell/A-Guide-to-DeepMinds-StarCraft-AI-Environment
The TF team has recently released an RL library (an alternative to OpenAI Baselines) called TF-Agents (https://github.com/tensorflow/agents).
Examples:
https://github.com/tensorflow/agents/blob/master/docs/tutorials/1_dqn_tutorial.ipynb
https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_12_05_apply_rl.ipynb
https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_12_04_atari.ipynb
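For TF-Agents, you do:
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import q_network
from tf_agents.utils import common

env_name = 'CartPole-v0'
train_py_env = suite_gym.load(env_name)
eval_py_env = suite_gym.load(env_name)
# Wrap the Python environments so the network and agent receive TensorFlow specs
train_env = tf_py_environment.TFPyEnvironment(train_py_env)
eval_env = tf_py_environment.TFPyEnvironment(eval_py_env)
# fc_layer_params, learning_rate and train_step_counter are defined elsewhere
q_net = q_network.QNetwork(
    train_env.observation_spec(),
    train_env.action_spec(),
    fc_layer_params=fc_layer_params)
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=learning_rate)
agent = dqn_agent.DqnAgent(
    train_env.time_step_spec(),
    train_env.action_spec(),
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)
agent.initialize()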
For pysc2:
from pysc2.env import environment
from pysc2.env import sc2_env
from pysc2.lib import actions
from pysc2.lib import actions as sc2_actions
from pysc2.lib import features
mineral_env = sc2_env.SC2Env(
    map_name="CollectMineralShards",
    step_mul=step_mul,
    agent_interface_format=AGENT_INTERFACE_FORMAT,
    visualize=True)
How do I combine TF-Agents and pysc2?
They are both Google products.
I recently ran into a very similar situation: I wanted to use the hanabi-learning-environment developed by DeepMind with TF-Agents. I'm afraid there is no nice solution to this.
What you must do is fork the DeepMind repo and modify the environment wrapper so that it is compatible with what TF-Agents requires. It's going to be quite some work, especially if you are not familiar with how environments are defined in TF-Agents, but it is definitely something that can be done in about a week.
If you want to get an idea of what I did, you can look at the original rl_env.py code in the Hanabi repo from DeepMind and at what I modified it into in my repo.
I have no idea why DeepMind sticks to its own structure instead of making the code more compatible, but this is how it is.
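To give a feel for what the wrapper has to do, here is a dependency-free sketch of the translation step. It is not the code from my repo and it does not import TF-Agents: StepType and TimeStep are stand-ins that mirror the fields TF-Agents uses (step_type, reward, discount, observation), and CountdownEnv is a made-up gym-style environment. In a real wrapper you would subclass tf_agents.environments.py_environment.PyEnvironment, declare action_spec and observation_spec, and build the time steps with ts.restart, ts.transition and ts.termination.

```python
from collections import namedtuple
from enum import IntEnum


class StepType(IntEnum):
    """Stand-in for the three step types TF-Agents distinguishes."""
    FIRST = 0
    MID = 1
    LAST = 2


# Stand-in for the TF-Agents TimeStep record (same four fields).
TimeStep = namedtuple("TimeStep", ["step_type", "reward", "discount", "observation"])


class CountdownEnv:
    """Hypothetical third-party environment with a gym-style API."""

    def __init__(self, start=3):
        self.start = start
        self.remaining = start

    def reset(self):
        self.remaining = self.start
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        reward = 1.0 if done else 0.0
        return self.remaining, reward, done, {}


class TimeStepAdapter:
    """Translates (obs, reward, done, info) tuples into TimeStep records."""

    def __init__(self, env, discount=1.0):
        self._env = env
        self._discount = discount

    def reset(self):
        obs = self._env.reset()
        # First step of an episode: no reward yet, discount of 1.
        return TimeStep(StepType.FIRST, 0.0, 1.0, obs)

    def step(self, action):
        obs, reward, done, _info = self._env.step(action)
        if done:
            # Terminal step: a discount of 0 marks the end of the episode.
            return TimeStep(StepType.LAST, reward, 0.0, obs)
        return TimeStep(StepType.MID, reward, self._discount, obs)


env = TimeStepAdapter(CountdownEnv(start=2))
print(env.reset().step_type.name)   # FIRST
print(env.step(0).step_type.name)   # MID
print(env.step(0).step_type.name)   # LAST
```

Most of the week of work goes into the parts this sketch leaves out: describing the observation and action spaces as specs and making the observations fit them.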
I am currently using Kubeflow as my orchestrator. The orchestrator is actually an instance of AI Platform Pipelines hosted on GCP. How do I create run-time parameters using the TensorFlow Extended (TFX) SDK? I suspect that this is the class I should use, but the documentation is not very informative and does not provide any examples: https://www.tensorflow.org/tfx/api_docs/python/tfx/orchestration/data_types/RuntimeParameter
Something like the picture below.
Say, for example, you want to add the module file location as a runtime parameter that is passed to the transform component in your TFX pipeline.
Start by setting up your setup_pipeline.py and defining the module file parameter:
# setup_pipeline.py
from typing import Text
from tfx.orchestration import data_types, pipeline
from tfx.orchestration.kubeflow import kubeflow_dag_runner
from tfx.components import Transform
_module_file_param = data_types.RuntimeParameter(
    name='module-file',
    default='/tfx-src/tfx/examples/iris/iris_utils_native_keras.py',
    ptype=Text,
)
Next, define a function that specifies the components used in your pipeline and pass along the parameter.
def create_pipeline(..., module_file):
    # setup components:
    ...
    transform = Transform(
        ...
        module_file=module_file
    )
    ...
    components = [..., transform, ...]
    return pipeline.Pipeline(
        ...,
        components=components
    )
Finally, set up the Kubeflow DAG runner so that it passes the parameter along to the create_pipeline function. See here for a more complete example.
if __name__ == "__main__":
    # instantiate a kfp_runner
    ...
    kfp_runner = kubeflow_dag_runner.KubeflowDagRunner(
        ...
    )
    kfp_runner.run(
        create_pipeline(..., module_file=_module_file_param)
    )
Then you can run python -m setup_pipeline which will produce the yaml file that specifies the pipeline config, which you can then upload to the Kubeflow GCP interface.
I'm trying to put together a demo of Neptune using Neptune workbench, but something's not working right. I've got this block set up:
from __future__ import print_function # Python 2/3 compatibility
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
cluster_url = #my cluster
remoteConn = DriverRemoteConnection( f'wss://{cluster_url}:8182/gremlin','g')
g = graph.traversal().withRemote(remoteConn)
import uuid
tmp = uuid.uuid4()
tmp_id = str(tmp)
def get_id(name):
    uid = uuid.uuid5(uuid.NAMESPACE_DNS, f"{name}.licensing.company.com")
    return str(uid)
def add_sku(name):
    tmp_id = get_id(name)
    g.addV('SKU').property('id', tmp_id, 'name', name)
    return name
def get_values():
    return g.V().properties().toList()
The problem is that calling add_sku doesn't result in a vertex being added to the graph. Doing the same operation in a cell with gremlin magic works, and I can retrieve values through python, but I can't add vertices. Does anyone see what I'm missing here?
The Python code is not working because the traversal is missing a terminal step (next() or iterate()) at the end, which is what forces it to be evaluated. If you add the terminal step, it should work:
g.addV('SKU').property('id', tmp_id, 'name', name).next()
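If the idea of a terminal step is unfamiliar, the toy class below may help. It is not gremlin_python and talks to no server: LazyTraversal is a made-up stand-in that only records steps until a terminal method is called, which is roughly how a traversal built in Python behaves before next() or iterate() sends it off for execution.

```python
class LazyTraversal:
    """Toy stand-in for a Gremlin traversal: steps are recorded, not run."""

    def __init__(self, store):
        self._store = store      # plays the role of the remote graph
        self._pending = []       # vertices described but not yet written

    def addV(self, label):
        self._pending.append({"label": label})
        return self

    def property(self, key, value):
        self._pending[-1][key] = value
        return self

    def iterate(self):
        """Terminal step: execute everything and discard the results."""
        self._store.extend(self._pending)
        self._pending = []
        return self

    def next(self):
        """Terminal step: execute and return the first result."""
        created = list(self._pending)
        self.iterate()
        return created[0]


graph = []

# Without a terminal step the traversal is only described, never executed.
LazyTraversal(graph).addV("SKU").property("name", "basic")
print(len(graph))  # 0

# With a terminal step the vertex is actually written.
LazyTraversal(graph).addV("SKU").property("name", "basic").iterate()
print(len(graph))  # 1
```

In the real driver, iterate() is the natural choice inside add_sku, since the function does not use the vertex that next() would return.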