Using:
TensorFlow version: 2.3.1
TFX version: 0.23.1
TFDV version: 0.24.0
TFMA version: 0.24.0
with an interactive context like so:
import os

from tfx.orchestration.experimental.interactive.interactive_context import \
    InteractiveContext

context = InteractiveContext(
    pipeline_root=os.path.join(os.getcwd(), "pipeline")
)
I created an ExampleGen using:
from tfx.components import CsvExampleGen
from tfx.proto import example_gen_pb2

# Split the input 7:2:1 into train, test, and eval.
output = example_gen_pb2.Output(
    split_config=example_gen_pb2.SplitConfig(splits=[
        example_gen_pb2.SplitConfig.Split(name='train', hash_buckets=7),
        example_gen_pb2.SplitConfig.Split(name='test', hash_buckets=2),
        example_gen_pb2.SplitConfig.Split(name='eval', hash_buckets=1)
    ]))
example_gen = CsvExampleGen(input_base=os.path.join(base_dir, data_dir), output_config=output)
context.run(example_gen)
and later in the code, I tried evaluating the data using an ExampleValidator but it seems the ExampleValidator doesn't resolve the proper paths to the split data sets.
Creation of the validator works as expected:
example_validator = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
context.run(example_validator)
No warnings or errors were reported, but attempting to show the results fails because the expected path does not exist:
context.show(example_validator.outputs['anomalies'])
NotFoundError: /home/jovyan/pipeline/ExampleValidator/anomalies/16/anomalies.pbtxt; No such file or directory
The actual directory structure was like so:
.
└── anomalies
    └── 16
        ├── eval
        │   └── anomalies.pbtxt
        ├── test
        │   └── anomalies.pbtxt
        └── train
            └── anomalies.pbtxt
5 directories, 3 files
but the code seemed to expect:
└── anomalies
    └── 16
        └── anomalies.pbtxt
How do I call ExampleValidator to analyze split data sets?
Thanks to Lorin S. for sharing the solution reference. For the benefit of the community, here is the solution given by 1025KB on GitHub.
Support for splits was added in TFX 0.23, but the Colab was not updated for 0.23.
The Colab is fixed in 0.24.
The issue was resolved by upgrading TFX to 0.24.
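With split-aware output, each split gets its own anomalies file under the artifact directory, as in the tree shown in the question. The following is a minimal sketch, using only the standard library, for listing those per-split files. The function name `find_split_anomalies` is hypothetical, and the layout is assumed from the directory listing in the question rather than taken from the TFX API.

```python
from pathlib import Path


def find_split_anomalies(artifact_dir):
    """Map each split name to its anomalies.pbtxt path.

    Assumes the layout <artifact_dir>/<split>/anomalies.pbtxt seen in the
    question; this helper is illustrative and not part of TFX.
    """
    root = Path(artifact_dir)
    return {
        path.parent.name: path
        for path in sorted(root.glob("*/anomalies.pbtxt"))
    }
```

For the directory in the question, this would be called as `find_split_anomalies("/home/jovyan/pipeline/ExampleValidator/anomalies/16")`.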
I have two separate projects, but one of them must now incorporate aspects of the other, including the generation of some code, which is done by a Python script that is called by CMake.
Here is my project structure:
repo/
├── project_top/
│   ├── stuff_and_things.cpp
│   └── CMakeLists.txt
│
└── submods/
    └── project_bottom/
        ├── CMakeLists.txt
        └── tools/
            ├── build_scripts
            │   └── cmake_bits.cmake
            └── generator
                └── gen_code.py
In repo/submods/project_bottom/tools/build_scripts/cmake_bits.cmake there is a macro set_up_additional_targets(), which includes a custom target which runs repo/submods/project_bottom/tools/generator/gen_code.py in that directory. This is based on project_bottom being its own project.
add_custom_target(gen_code
    COMMAND echo "Generating code"
    COMMAND python3 gen_code.py args
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/tools/generator
)
Now, I need to make a new target in project_top dependent upon the gen_code target in project_bottom. How do I do this? The gen_code target needs to be run as part of the project_top build, but within the context of project_bottom, because for that target, ${CMAKE_CURRENT_SOURCE_DIR} needs to be repo/submods/project_bottom, not repo/project_top.
I've been trying to create portable snakemake wrappers that execute pre-created scripts from the "wrapper.py" script. So far, though, all the examples I've found call shell from snakemake.shell to run commands from the command line. So I thought the equivalent for scripts would be to use script from snakemake.script to execute them. But when I use this in a rule, it throws an error like this:
Traceback (most recent call last):
File "/home/robertlink/stack_overflow_dummy_example/.snakemake/scripts/tmpqfzkhuv_.wrapper.py", line 7, in <module>
script("scripts/foo.py")
TypeError: script() missing 19 required positional arguments: 'basedir', 'input', 'output', 'params', 'wildcards', 'threads', 'resources', 'log', 'config', 'rulename', 'conda_env', 'container_img', 'singularity_args', 'env_modules', 'bench_record', 'jobid', 'bench_iteration', 'cleanup_scripts', and 'shadow_dir'
Is there a way to easily retrieve the information required for using script? Or am I mistaken that I should even use script in this fashion? Here's a dummy example to replicate the message:
Directory structure:
.
├── Snakefile
└── wrapper
    └── path
        ├── scripts
        │   ├── bar.py
        │   └── foo.py
        └── wrapper.py
Snakefile:
rule foobar:
    output:
        "foobar.txt"
    wrapper:
        "file:wrapper/path"
wrapper.py
from snakemake.script import script
script("scripts/foo.py")
script("scripts/bar.py")
foo.py
with open("foo_intermediate.txt", 'w') as handle:
    handle.write("foo")
bar.py
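# Read the intermediate file written by foo.py.
with open("foo_intermediate.txt", 'r') as handle:
    foo = handle.read()
foo += 'bar'
# snakemake.output is a list-like object, so take its first entry.
with open(snakemake.output[0], 'w') as handle:
    handle.write(foo)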
command run:
$ snakemake --cores 3
Any insight into this would be wonderful. Thanks!
You don't have to write a wrapper to call your scripts - the scripts can be the wrapper. Maybe take a look at this wrapper based on an Rscript to get the idea:
https://snakemake-wrappers.readthedocs.io/en/latest/wrappers/tximport.html
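As a rough illustration of "the script can be the wrapper", the two steps from foo.py and bar.py could live directly in wrapper.py as plain functions. This is only a sketch: the function names `make_foo`, `add_bar`, and `run` are hypothetical, and it assumes Snakemake makes a global `snakemake` object available when it executes wrapper.py, as it does for scripts such as bar.py in the question.

```python
from pathlib import Path


def make_foo():
    """Step formerly in foo.py: produce the initial text."""
    return "foo"


def add_bar(text):
    """Step formerly in bar.py: append 'bar' to the text."""
    return text + "bar"


def run(output_path):
    """Run both steps and write the result to output_path."""
    Path(output_path).write_text(add_bar(make_foo()))


# Assumption: Snakemake injects a global `snakemake` object when it runs
# this file as a wrapper. Outside Snakemake the guard is simply skipped.
if "snakemake" in globals():
    run(snakemake.output[0])
```

Keeping the steps as functions avoids the intermediate file and lets the logic be exercised outside of Snakemake.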
I have to run a bokeh script as a module using the -m option from the top directory, because it needs to import another module under the same directory:
python -m bokeh_module.bokeh_sub_module
The directory tree is shown below. Running the above command does not show the image, no matter where the PNG file is placed. Is there a way to resolve this? Thank you for any help.
.
├── other_module
│   ├── __init__.py
│   └── other_sub_module.py
├── bokeh_module
│   ├── __init__.py
│   ├── image.png # not showing
│   └── bokeh_sub_module.py
└── image.png # not showing either
bokeh_sub_module.py
from other_module import other_sub_module
from bokeh.plotting import figure, show
# do something with other_sub_module
p = figure(match_aspect=True)
p.image_url( ['image.png'], 0, 0, 1, 1 ) # not showing
p.image_url( ['./bokeh_module/image.png'], 0, 0, 1, 1 ) # not showing either
show(p)
By the way, if I run python bokeh_sub_module.py from the bokeh_module directory, the image.png inside the same directory can be found with no problems.
image_url requires URLs, and neither of your calls to image_url use URLs.
Try to use absolute URLs and add file:// in front of them.
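One way to build such an absolute file:// URL is with the standard library's pathlib. This is a minimal sketch; the helper name `local_image_url` is hypothetical, and whether the browser actually loads a file:// URL from the generated page can depend on the browser and how the page is served.

```python
from pathlib import Path


def local_image_url(image_path):
    """Return an absolute file:// URL for a local image path."""
    # resolve() makes the path absolute; as_uri() adds the file:// scheme.
    return Path(image_path).resolve().as_uri()
```

Inside bokeh_sub_module.py this could be used as `p.image_url([local_image_url(Path(__file__).parent / "image.png")], 0, 0, 1, 1)`, which no longer depends on the directory the module is started from.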
I am trying to write a Find Module for a package that I have installed. But I am having trouble understanding the CMake functions.
Here is a snippet of my code.
find_package(PkgConfig)
pkg_check_modules(PC_zcm QUIET zcm)
find_path(zcm_INCLUDE_DIR
    NAMES zcm.h
    PATHS $ENV{PATH}
)
mark_as_advanced(zcm_FOUND zcm_INCLUDE_DIR)
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(zcm DEFAULT_MSG
    REQUIRED_VARS zcm_INCLUDE_DIR
)
find_path() is able to find my zcm_INCLUDE_DIR just fine: /usr/bin/zcm/usr/local/include
But find_package_handle_standard_args() gives
-- Could NOT find zcm (missing: REQUIRED_VARS)
My directory tree looks like this:
└── zcm
    ├── eventlog.h
    ├── json
    │   ├── json-forwards.h
    │   └── json.h
    ├── message_tracker.hpp
    ├── tools
    │   ├── IndexerPlugin.hpp
    │   └── TranscoderPlugin.hpp
    ├── transport
    │   └── generic_serial_transport.h
    ├── transport.h
    ├── transport_register.hpp
    ├── transport_registrar.h
    ├── url.h
    ├── util
    │   └── Filter.hpp
    ├── zcm-cpp-impl.hpp
    ├── zcm-cpp.hpp
    ├── zcm.h
    └── zcm_coretypes.h
My understanding is find_package_handle_standard_args() attempts to find the package at the path, which sounds like it would be straightforward as the path is already determined.
As for REQUIRED_VARS the docs just say "Specify the variables which are required for this package." Which doesn't tell much for a noobie like me.
The documentation for find_package_handle_standard_args describes two signatures of the function: one accepts the DEFAULT_MSG option and the other accepts the REQUIRED_VARS option.
You are mixing these two signatures, which is wrong.
Proper usage of the first signature:
# Everything after DEFAULT_MSG is treated as required variable.
find_package_handle_standard_args(zcm DEFAULT_MSG
    zcm_INCLUDE_DIR
)
Proper usage of the second signature:
# By default, the standard error message is used.
find_package_handle_standard_args(zcm REQUIRED_VARS
    zcm_INCLUDE_DIR
)
For my NSIS uninstaller, I want to check if a process is running. FindProcDLL is not working under Windows 7 x64, so I tried nsProcess.
I've downloaded the version 1.6 from the website: http://nsis.sourceforge.net/NsProcess_plugin
If I start the nsProcessTest.nsi in the Example folder, I get the following errors:
Section: "Find process" ->(FindProcess)
!insertmacro: nsProcess::FindProcess
Invalid command: nsProcess::_FindProcess
Error in macro nsProcess::FindProcess on macroline 1
Error in script "C:\Users\Sebastian\Desktop\nsProcess_1_6\Example\nsProcessTest.nsi" on line 14 -- aborting creation process
This is line 14 of the example script:
${nsProcess::FindProcess} "Calc.exe" $R0
Does anybody know what is wrong? How can I check if a process is running with NSIS?
NSIS does not find the plug-in, so make sure you copied its files to the correct folder.
NSIS 2.x:
NSIS/
├── Include/
│   └── nsProcess.nsh
└── Plugins/
    └── nsProcess.dll
NSIS 3.x:
NSIS/
├── Include/
│   └── nsProcess.nsh
└── Plugins/
    ├── x86-ansi/
    │   └── nsProcess.dll
    └── x86-unicode/
        └── nsProcess.dll
The file inside Plugins\x86-unicode is nsProcessW.dll renamed to nsProcess.dll (blame the author for making it overly complicated!)
More generally, refer to How can I install a plugin? on the NSIS Wiki.
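To double-check that the files ended up where the layouts above say they should, a small script can list what is missing. This is only a sketch in Python: the function name `missing_nsprocess_files` is hypothetical, the NSIS install directory must be supplied by you, and the expected paths are taken from the two layouts in this answer.

```python
from pathlib import Path


def missing_nsprocess_files(nsis_root, nsis_major):
    """Return the expected nsProcess files that are absent under nsis_root.

    nsis_major is the NSIS major version (2 or 3); the expected layout is
    the one described in this answer.
    """
    root = Path(nsis_root)
    expected = [root / "Include" / "nsProcess.nsh"]
    if nsis_major >= 3:
        expected.append(root / "Plugins" / "x86-ansi" / "nsProcess.dll")
        expected.append(root / "Plugins" / "x86-unicode" / "nsProcess.dll")
    else:
        expected.append(root / "Plugins" / "nsProcess.dll")
    return [path for path in expected if not path.is_file()]
```

An empty list means every expected file is present. Note that it cannot tell whether the DLL in x86-unicode is really the renamed nsProcessW.dll.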