How do I optionally set a platform/model attribute from my run script? - vlab

I'm using the VLAB MPC5xxx Toolbox.
I have a run script that is controlling my simulation, which loads the platform in the usual way, then runs it:
import vlab
import os
import sysc
image_path = os.path.join('o5e',
'firmware.open5xxxecu-e6009bbcfcd1',
'bin', 'o5e_dbg.elf')
vlab.load('mpc.mpc5674f.sim', args=['--testbench=o5e_testbench',
"--image=%s" % image_path,
"--debugger-config=GHS_MULTI",
"--trace=+src:sc_report",
])
vcd_sink = vlab.trace.sink.vcd("mpc.mpc5674f.sim.vcd")
vlab.add_trace("mpc5674f.PBRIDGE.EDMA_B", sink=vcd_sink)
for i in range(32):
vlab.add_trace("mpc5674f.PBRIDGE.ETPU.CH_OUT_A[%d]" % i, sink=vlab.trace.sink.console)
vlab.run(11, "ms", blocking=True)
vlab.exit()
I want to give this run script an argument to turn on tracing in the core, which I know you do by setting the tracing attribute on the core. And I know I can read the options of the script using python's optparse.
The problem I have is that you have to set the attribute before the end of elaboration, but the only place I can access the simulation before elaboration is in the testbench... but there seems to be no way to pass parameters (for example script arguments) to the testbench.
How should I pass the argument from my script into the testbench, so it conditionally turns on or not the core tracing?

So, I think the answer is:
1) Don't use try to use the testbench for things that aren't to do with ... a testbench
2) Use "phase breakpoints" to do things that need to happen at... specific phases in the simulation
eg:
from optparse import OptionParser
parser = OptionParser()
parser.add_option("--core-instrumentation", dest="core_instrumentation",
action = "store_true",
help="turn on core instrumentation")
(options, args) = parser.parse_args()
if options.core_instrumentation:
vlab.add_phase_breakpoint("before_end_of_elaboration",
action = lambda bp: vlab.write_attribute("mpc5467f.Core0.log_filter","+instr"))
image_path = os.path.join('o5e',
'firmware.open5xxxecu-e6009bbcfcd1',
'bin', 'o5e_dbg.elf')
vlab.load('mpc.mpc5674f.sim', args=['--testbench=o5e_testbench',
"--image=%s" % image_path,
"--debugger-config=GHS_MULTI",
])
vcd_sink = vlab.trace.sink.vcd("mpc.mpc5674f.sim.vcd")
vlab.add_trace("mpc5674f.PBRIDGE.EDMA_B", sink=vcd_sink)
for i in range(32):
vlab.add_trace("mpc5674f.PBRIDGE.ETPU.CH_OUT_A[%d]" % i, sink=vlab.trace.sink.console)
vlab.run(11, "ms", blocking=True)
vlab.exit()

Related

Vertex AI Model Batch prediction, issue with referencing existing model and input file on Cloud Storage

I'm struggling to correctly set Vertex AI pipeline which does the following:
read data from API and store to GCS and as as input for batch prediction.
get an existing model (Video classification on Vertex AI)
create Batch prediction job with input from point 1.
As it will be seen, I don't have much experience with Vertex Pipelines/Kubeflow thus I'm asking for help/advice, hope it's just some beginner mistake.
this is the gist of the code I'm using as pipeline
from google_cloud_pipeline_components import aiplatform as gcc_aip
from kfp.v2 import dsl
from kfp.v2.dsl import component
from kfp.v2.dsl import (
Output,
Artifact,
Model,
)
PROJECT_ID = 'my-gcp-project'
BUCKET_NAME = "mybucket"
PIPELINE_ROOT = "{}/pipeline_root".format(BUCKET_NAME)
#component
def get_input_data() -> str:
# getting data from API, save to Cloud Storage
# return GS URI
gcs_batch_input_path = 'gs://somebucket/file'
return gcs_batch_input_path
#component(
base_image="python:3.9",
packages_to_install=['google-cloud-aiplatform==1.8.0']
)
def load_ml_model(project_id: str, model: Output[Artifact]):
"""Load existing Vertex model"""
import google.cloud.aiplatform as aip
model_id = '1234'
model = aip.Model(model_name=model_id, project=project_id, location='us-central1')
#dsl.pipeline(
name="batch-pipeline", pipeline_root=PIPELINE_ROOT,
)
def pipeline(gcp_project: str):
input_data = get_input_data()
ml_model = load_ml_model(gcp_project)
gcc_aip.ModelBatchPredictOp(
project=PROJECT_ID,
job_display_name=f'test-prediction',
model=ml_model.output,
gcs_source_uris=[input_data.output], # this doesn't work
# gcs_source_uris=['gs://mybucket/output/'], # hardcoded gs uri works
gcs_destination_output_uri_prefix=f'gs://{PIPELINE_ROOT}/prediction_output/'
)
if __name__ == '__main__':
from kfp.v2 import compiler
import google.cloud.aiplatform as aip
pipeline_export_filepath = 'test-pipeline.json'
compiler.Compiler().compile(pipeline_func=pipeline,
package_path=pipeline_export_filepath)
# pipeline_params = {
# 'gcp_project': PROJECT_ID,
# }
# job = aip.PipelineJob(
# display_name='test-pipeline',
# template_path=pipeline_export_filepath,
# pipeline_root=f'gs://{PIPELINE_ROOT}',
# project=PROJECT_ID,
# parameter_values=pipeline_params,
# )
# job.run()
When running the pipeline it throws this exception when running Batch prediction:
details = "List of found errors: 1.Field: batch_prediction_job.model; Message: Invalid Model resource name.
so I'm not sure what could be wrong. I tried to load model in the notebook (outside of component) and it correctly returns.
Second issue I'm having is referencing GCS URI as output from component to batch job input.
input_data = get_input_data2()
gcc_aip.ModelBatchPredictOp(
project=PROJECT_ID,
job_display_name=f'test-prediction',
model=ml_model.output,
gcs_source_uris=[input_data.output], # this doesn't work
# gcs_source_uris=['gs://mybucket/output/'], # hardcoded gs uri works
gcs_destination_output_uri_prefix=f'gs://{PIPELINE_ROOT}/prediction_output/'
)
During compilation, I get following exception TypeError: Object of type PipelineParam is not JSON serializable, though I think this could be issue of ModelBatchPredictOp component.
Again any help/advice appreciated, I'm dealing with this from yesterday, so maybe I missed something obvious.
libraries I'm using:
google-cloud-aiplatform==1.8.0
google-cloud-pipeline-components==0.2.0
kfp==1.8.10
kfp-pipeline-spec==0.1.13
kfp-server-api==1.7.1
UPDATE
After comments, some research and tuning, for referencing model this works:
#component
def load_ml_model(project_id: str, model: Output[Artifact]):
region = 'us-central1'
model_id = '1234'
model_uid = f'projects/{project_id}/locations/{region}/models/{model_id}'
model.uri = model_uid
model.metadata['resourceName'] = model_uid
and then I can use it as intended:
batch_predict_op = gcc_aip.ModelBatchPredictOp(
project=gcp_project,
job_display_name=f'batch-prediction-test',
model=ml_model.outputs['model'],
gcs_source_uris=[input_batch_gcs_path],
gcs_destination_output_uri_prefix=f'gs://{BUCKET_NAME}/prediction_output/test'
)
UPDATE 2
regarding GCS path, a workaround is to define path outside of the component and pass it as an input parameter, for example (abbreviated):
#dsl.pipeline(
name="my-pipeline",
pipeline_root=PIPELINE_ROOT,
)
def pipeline(
gcp_project: str,
region: str,
bucket: str
):
ts = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
gcs_prediction_input_path = f'gs://{BUCKET_NAME}/prediction_input/video_batch_prediction_input_{ts}.jsonl'
batch_input_data_op = get_input_data(gcs_prediction_input_path) # this loads input data to GCS path
batch_predict_op = gcc_aip.ModelBatchPredictOp(
project=gcp_project,
model=training_job_run_op.outputs["model"],
job_display_name='batch-prediction',
# gcs_source_uris=[batch_input_data_op.output],
gcs_source_uris=[gcs_prediction_input_path],
gcs_destination_output_uri_prefix=f'gs://{BUCKET_NAME}/prediction_output/',
).after(batch_input_data_op) # we need to add 'after' so it runs after input data is prepared since get_input_data doesn't returns anything
still not sure, why it doesn't work/compile when I return GCS path from get_input_data component
I'm glad you solved most of your main issues and found a workaround for model declaration.
For your input.output observation on gcs_source_uris, the reason behind it is because the way the function/class returns the value. If you dig inside the class/methods of google_cloud_pipeline_components you will find that it implements a structure that will allow you to use .outputs from the returned value of the function called.
If you go to the implementation of one of the components of the pipeline you will find that it returns an output array from convert_method_to_component function. So, in order to have that implemented in your custom class/function your function should return a value which can be called as an attribute. Below is a basic implementation of it.
class CustomClass():
def __init__(self):
self.return_val = {'path':'custompath','desc':'a desc'}
#property
def output(self):
return self.return_val
hello = CustomClass()
print(hello.output['path'])
If you want to dig more about it you can go to the following pages:
convert_method_to_component, which is the implementation of convert_method_to_component
Properties, basics of property in python.

Julia using package located in .julia/dev

I am beginner to Julia though I have experience with Python and some other languages. I get that this is probably a very simple/beginner issue, but I fail to understand how it should work in Julia.
I want to create a Julia module. I saw recommendations to create it with PkgTemplates, so that is exactly what I have done. My directory structure is thus:
It is located at the default path proposed by PkgTemplates: /home/username/.julia/dev/Keras2Flux.
I want to develop it with Revise package due to the slow start-up time of the Julia REPL. However, I fail to import my module to the Julia REPL in the terminal.
So, I cd to the directory mentioned above, use julia command and try using Keras2Flux. I get the error:
ERROR: ArgumentError: Package Keras2Flux not found in current path:
I tried both using Keras2Flux and using Keras2Flux.jl, and I also tried to call it from one level above in my directory structure (i.e. /home/username/.julia/dev). All has the same problem.
What is wrong (more importantly, why?) and how to fix it?
Current contents of the module (not really relevant to the question but still):
module Keras2Flux
import JSON
using Flux
export convert
function create_dense(config)
in = config["input_dim"]
out = config["output_dim"]
dense = Dense(in, outо)
return dense
end
function create_dropout(config)
p = config["p"]
dropout = Dropout(p)
return dropout
end
function create_model(model_config)
layers = []
for layer_config in model_config
if layer_config["class_name"] == "Dense"
layer = create_dense(layer_config["config"])
elseif layer_config["class_name"] == "Dropout"
layer = create_dropout(layer_config["config"])
else
println(layer_config["class_name"])
throw("unimplemented")
end
push!(layers, layer)
end
model = Chain(layers)
end
function convert(filename)
jsontxt = ""
open(filename, "r") do f
jsontxt = read(f, String)
end
model_params = JSON.parse(jsontxt)
if model_params["keras_version"] == "1.1.0"
create_model(model_params["config"])
else
throw("unimplemented")
end
end
end
Here is a full recipe to get you going:
cd("/home/username/.julia/dev")
using Pkg
pkg"generate Keras2Flux"
cd("Keras2Flux")
pkg"activate ."
pkg"add JSON Flux"
# now copy-paste whatever you need to Keras2Flux\src\Keras2Flux.jl
using Revise
using Keras2Flux
# happy development!

Adding a Retokenize pipe while training NER model

I am currenly attempting to train a NER model centered around Property Descriptions. I could get a fully trained model to function to my liking however, I now want to add a retokenize pipe to the model so that I can set up the model to train other things.
From here, I am having issues getting the retokenize pipe to actually work. Here is the definition:
def retok(doc):
ents = [(ent.start, ent.end, ent.label) for ent in doc.ents]
with doc.retokenize() as retok:
string_store = doc.vocab.strings
for start, end, label in ents:
retok.merge(
doc[start: end],
attrs=intify_attrs({'ent_type':label},string_store))
return doc
i am adding it into my training like this:
nlp.add_pipe(retok, after="ner")
and I am adding it into the Language Factories like this:
Language.factories['retok'] = lambda nlp, **cfg: retok(nlp)
The issue I keep getting is "AttributeError: 'English' object has no attribute 'ents'". Now I am assuming I am getting this error because the parameter that is being passed through this function is not a doc but actually the NLP model itself. I am not really sure to get a doc to flow into this pipe during training. At this point I don't really know where to go from here to get the pipe to function the way I want.
Any help is appreciated, thanks.
You can potentially use the built-in merge_entities pipeline component: https://spacy.io/api/pipeline-functions#merge_entities
The example copied from the docs:
texts = [t.text for t in nlp("I like David Bowie")]
assert texts == ["I", "like", "David", "Bowie"]
merge_ents = nlp.create_pipe("merge_entities")
nlp.add_pipe(merge_ents)
texts = [t.text for t in nlp("I like David Bowie")]
assert texts == ["I", "like", "David Bowie"]
If you need to customize it further, the current implementation of merge_entities (v2.2) is a good starting point:
def merge_entities(doc):
"""Merge entities into a single token.
doc (Doc): The Doc object.
RETURNS (Doc): The Doc object with merged entities.
DOCS: https://spacy.io/api/pipeline-functions#merge_entities
"""
with doc.retokenize() as retokenizer:
for ent in doc.ents:
attrs = {"tag": ent.root.tag, "dep": ent.root.dep, "ent_type": ent.l
abel}
retokenizer.merge(ent, attrs=attrs)
return doc
P.S. You are passing nlp to retok() below, which is where the error is coming from:
Language.factories['retok'] = lambda nlp, **cfg: retok(nlp)
See a related question: Spacy - Save custom pipeline

How to open online reference from IPython?

Is there a way to have IPython open a browser pointed at the appropriate online reference?
Especially for numpy,scipy, matplotlib?
For example, the doc for numpy.linalg.cholesky is pretty hard to read in a terminal.
I don't think there is a direct way to make IPython or any shell to open up documentation online, because the primary job of shells is to let you interact with the things they are shells to.
We could however write a script to open a new tab on a browser with the documentation. Like so:
import webbrowser
docsList = {
"numpy" : lambda x: "https://docs.scipy.org/doc/numpy/reference/generated/" + x + ".html",
"scipy" : lambda x: "https://docs.scipy.org/doc/scipy/reference/generated/" + x + ".html",
"matplotlib" : lambda x: "https://matplotlib.org/api/" + x.split('.')[1] + "_api.html",
"default" : lambda x: "https://www.google.com/search?q=documentation+" + x
}
def online(method_name):
"""
Opens up the documentation for method_name on the default browser.
If the package doesn't match any entry in the dictionary, falls back to
Google.
Usage
-------
>>> lookUp.online("numpy.linalg.cholesky")
>>> lookUp.online("matplotlib.contour")
"""
try:
url = make_url(method_name)
except AttributeError:
print("Enter the method name as a string and try again")
return
webbrowser.open(url, new = 2)
return
def make_url(method_name):
package_name = method_name.split('.')[0]
try:
return docsList[package_name](method_name)
except KeyError:
return docsList["default"](method_name)
You could save the above as "lookUp.py" at a location that Python can find it in, and then import it whenever you need to use it.
Caveats:
This method takes strings as input, so if you call it on a function it'll throw an error.
>>> lookUp.online("numpy.linalg.cholesky")
Will work.
>>> lookUp.online(numpy.linalg.cholesky)
Will ask you to give it as a string.
So use autocomplete to get to the function and then wrap it in quotes to get it to work.

How do I export PSDs as PNGs with py-appscript?

I wrote a script to export PSDs as PNGs with rb-appscript. That's fine and dandy, but I can't seem to pull it off in py-appscript.
Here's the ruby code:
#!/usr/bin/env ruby
require 'rubygems'
require 'optparse'
require 'appscript'
ps = Appscript.app('Adobe Photoshop CS5.app')
finder = Appscript.app("Finder.app")
path = "/Users/nbaker/Desktop/"
ps.activate
ps.open(MacTypes::Alias.path("#{path}guy.psd"))
layerSets = ps.current_document.layer_sets.get
# iterate through all layers and hide them
ps.current_document.layers.get.each do |layer|
layer.visible.set false
end
layerSets.each do |layerSet|
layerSet.visible.set false
end
# iterate through all layerSets, make them visible, and create a PNG for them
layerSets.each do |layerSet|
name = layerSet.name.get
layerSet.visible.set true
ps.current_document.get.export(:in => "#{path}#{name}.png", :as => :save_for_web,
:with_options => {:web_format => :PNG, :png_eight => false})
layerSet.visible.set false
end
And here's the apparently nonequivalent python code:
from appscript import *
from mactypes import *
# fire up photoshop
ps = app("Adobe Photoshop CS5.app")
ps.activate()
# open the file for editing
path = "/Users/nbaker/Desktop/"
f = Alias(path + "guy.psd")
ps.open(f)
layerSets = ps.current_document.layer_sets()
# iterate through all layers and hide them
for layer in ps.current_document.layers():
layer.visible.set(False)
#... and layerSets
for layerSet in layerSets:
layerSet.visible.set(False)
# iterate through all layerSets, make them visible, and create a PNG for them
for layerSet in layerSets:
name = layerSet.name()
layerSet.Visible = True
ps.current_document.get().export(in_=Alias(path + name + ".png"), as_=k.save_for_web,
with_options={"web_format":k.PNG, "png_eight":False})
The only part of the python script that doesn't work is the saving. I've gotten all sorts of errors trying different export options and stuff:
Connection is invalid... General Photoshop error occurred. This functionality may not be available in this version of Photoshop... Can't make some data into the expected type...
I can live with just the ruby script, but all of our other scripts are in python, so it would be nice to be able to pull this off in python. Thanks in advance internet.
Bert,
I think I have a solution for you – albeit several months later. Note that I'm a Python noob, so this is by no means elegant.
When I tried out your script, it kept crashing for me too. However by removing the "Alias" in the export call, it seems to be OK:
from appscript import *
from mactypes import *
# fire up photoshop
ps = app("Adobe Photoshop CS5.1.app")
ps.activate()
# open the file for editing
path = "/Users/ian/Desktop/"
f = Alias(path + "Screen Shot 2011-10-13 at 11.48.51 AM.psd")
ps.open(f)
layerSets = ps.current_document.layer_sets()
# no need to iterate
ps.current_document.layers.visible.set(False)
ps.current_document.layer_sets.visible.set(False)
# iterate through all layerSets, make them visible, and create a PNG for them
for l in layerSets:
name = l.name()
# print l.name()
l.visible.set(True)
# make its layers visible too
l.layers.visible.set(True)
# f = open(path + name + ".png", "w").close()
ps.current_document.get().export(in_=(path + name + ".png"), as_=k.save_for_web,
with_options={"web_format":k.PNG, "png_eight":False})
I noticed, too, that you iterate through all the layers to toggle their visibility. You can actually just issue them all a command at once – my understanding is that it's faster, since it doesn't have to keep issuing Apple Events.
The logic of my script isn't correct (with regards to turning layers on and off), but you get the idea.
Ian