how to concatenate the OutputPathPlaceholder with a string with Kubeflow pipelines? - kubeflow-pipelines

I am using Kubeflow pipelines (KFP) with GCP Vertex AI pipelines. I am using kfp==1.8.5 (kfp SDK) and google-cloud-pipeline-components==0.1.7. Not sure if I can find which version of Kubeflow is used on GCP.
I am bulding a component (yaml) using python inspired form this Github issue. I am defining an output like:
outputs=[(OutputSpec(name='drt_model', type='Model'))]
This will be a base output directory to store few artifacts on Cloud Storage like model checkpoints and model.
I would to keep one base output directory but add sub directories depending of the artifact:
<output_dir_base>/model
<output_dir_base>/checkpoints
<output_dir_base>/tensorboard
but I didn't find how to concatenate the OutputPathPlaceholder('drt_model') with a string like '/model'.
How can append extra folder structure like /model or /tensorboard to the OutputPathPlaceholder that KFP will set during run time ?

I didn't realized in the first place that ConcatPlaceholder accept both Artifact and string. This is exactly what I wanted to achieve:
ConcatPlaceholder([OutputPathPlaceholder('drt_model'), '/model'])

Related

Qlik Sense: how to specify path in Google Drive?

I have a Google drive account divided into some folders (say, Folder1, Folder2, etc.), with some subfolders in it.
I successfully managed to connect my Qlik Sense app to it.
I need to make it look for files only in a given subfolder.
At the moment, I read as follows ([...] is the location)
(URL IS [[...]connectorID=GoogleDriveConnector&table=ListSpreadsheets&appID=], qvx);
It works and reloads successfully, but I need it to filter the Spreadsheets properly. How could I get what I need?
To connect to Google Drive in fact you use web connector. Once web connector is installed it can be initialized as service or manually from its folder.
Once it i installed (recent version can be downloaded from https://qliksupport.force.com/apex/QS_Home_Page but it seems that you've got it as Google Drive is part of it ) it is much nicer to configure connection to online drives there.
You just go to http://localhost:5555/web and generate ready code.
In my implementation I used following options step by step to get data which I wanted:
1) CanAuthenticate to generate permanent token
2) ListSpreadsheets
3) ListWorksheets
4) GetWorksheet
You can't just specify path. But it's possible to retrieve the path from QWC services. Please use algorithm like that:
Use tables like ListFiles/ListWorksheets
Iter through every row with 'for' cycle:
FOR i=0 to (NoOfRows('Google_ListWorksheets')-1);
Let vWorksheetKey = Peek('worksheetKey', $(i), 'Google_ListWorksheets');
Let vTitle = left(Peek('title', $(i), 'Google_ListWorksheets'),3);
Using 'if' statement find desired folder id/worksheet key by its name (stored in vTitle variable) and use it:
load * FROM [$(vQwcConnectionName)]
(URL IS [http://localhost:5555/data?connectorID=GoogleDriveConnector&table=GetWorksheet&worksheetKey=$(vWorksheetKey)&appID=], qvx);
At the end you will get your files by their location.

use local graph in noflo.asCallback

I'm trying to execute a graph located in a local graphs/ folder using the noflo.asCallback function.
I'm getting an error that the graph or component is not available with the base specified at the baseDir.
What is the correct way to us the asCallback function with a graph?
The issue was related to the package name of the project. The name I was using was noflo-test, and the graph was available as test/GraphName, without the noflo- prepended.

How to create an op like conv_ops in tensorflow?

What I'm trying to do
I'm new to C++ and bazel and I want to make some change on the convolution operation in tensorflow, so I decide that my first step is to create an ops just like it.
What I have done
I copied conv_ops.cc from //tensorflow/core/kernels and change the name of the ops registrated in my new_conv_ops.cc. I also changed some name of the functions in the file to avoid duplication. And here is my BUILD file.
As you can see, I copy the deps attributes of conv_ops from //tensorflow/core/kernels/BUILD. Then I use "bazel build -c opt //tensorflow/core/user_ops:new_conv_ops.so" to build the new op.
What my problem is
Then I got this error.
I tried to delete bounds_check and got same error for the next deps. Then I realize that there is some problem for including h files in //tensorflow/core/kernels from //tensorflow/core/user_ops. So how can I perfectely create a new op excatcly like conv_ops?
Adding a custom operation to TensorFlow is covered in the tutorial here. You can also look at actual code examples.
To address your specific problem, note that the tf_custom_op_library macro adds most of the necessary dependencies to your target. You can simply write the following :
tf_custom_op_library(
name="new_conv_ops.so",
srcs=["new_conv_ops.cc"]
)

Tensorflow: checkpoints simple load

I have a checkpoint file:
checkpoint-20001 checkpoint-20001.meta
how do I extract variables from this space, without having to load the previous model and starting session etc.
I want to do something like
cp = load(checkpoint-20001)
cp.var_a
It's not documented, but you can inspect the contents of a checkpoint from Python using the class tf.train.NewCheckpointReader.
Here's a test case that uses it, so you can see how the class works.
https://github.com/tensorflow/tensorflow/blob/861644c0bcae5d56f7b3f439696eefa6df8580ec/tensorflow/python/training/saver_test.py#L1203
Since it isn't a documented class, its API may change in the future.

How to write summaries for multiple runs in Tensorflow

If you look at the Tensorboard dashboard for the cifar10 demo, it shows data for multiple runs. I am having trouble finding a good example showing how to set the graph up to output data in this fashion. I am currently doing something similar to this, but it seems to be combining data from runs and whenever a new run starts I see the warning on the console:
WARNING:root:Found more than one graph event per run.Overwritting the graph with the newest event
The solution turned out to be simple (and probably a bit obvious), but I'll answer anyway. The writer is instantiated like this:
writer = tf.train.SummaryWriter(FLAGS.log_dir, sess.graph_def)
The events for the current run are written to the specified directory. Instead of having a fixed value for the logdir parameter, just set a variable that gets updated for each run and use that as the name of a sub-directory inside the log directory:
writer = tf.train.SummaryWriter('%s/%s' % (FLAGS.log_dir, run_var), sess.graph_def)
Then just specify the root log_dir location when starting tensorboard via the --logdir parameter.
As mentioned in the documentation, you can specify multiple log directories when running tensorboard. Alternatively, you can create multiple run subfolder in the log directory to visualize different plots in the same graph.