Perl6: generic code to test if modules load

This is generic code in t/ to test whether the .pm6 modules in lib/ load:
use lib $*PROGRAM.sibling('../lib');
use Test;
my @dir = dir($*PROGRAM.sibling('../lib'), test => { $_ ~~ /.*pm6/ } );
plan @dir.elems;
sub module( IO $dir ) {
    $dir.basename.Str ~~ /(\w+)\.pm6/;
    return $0.Str;
}
for @dir.map(&module) -> $module {
    use-ok $module, "This module loads: $module";
}
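To run this (assuming the usual distribution layout, with this file saved as a .t under t/), a typical invocation is:

prove -e perl6 t/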
Before going any further (recursively looking at lib sub-folders), I wonder if this is the right approach.
Thanks!

If you are testing a well-formed distribution then you should be using:
use lib $*PROGRAM.parent(2);
By pointing use lib at the directory containing your META6.json instead of the lib directory you help ensure that the provides entry of the META6.json file is up to date (since files not listed in the META6.json but that do exist inside lib won't be seen).
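For reference, the provides section of a META6.json maps module names to the files under lib/ that define them; a minimal example (module names here are purely illustrative) looks like:

{
    "name": "Foo",
    "version": "0.0.1",
    "provides": {
        "Foo": "lib/Foo.pm6",
        "Foo::Bar": "lib/Foo/Bar.pm6"
    }
}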
(I'd even take it one step further and say don't use use lib '...' at all, and instead run your tests using perl6 -I .... For instance -- what if you want to run these tests (for whatever reason) on the installed copy of some distribution?)
With that said you could skip the directory recursion by using the META6 data. One method would be to read the META6.json directly, but a better way of doing so would be to get the module names from the distribution itself:
# file: zef/t/my-test.t
# cwd: zef/
use lib $*PROGRAM.parent(2); # or better: perl6 -I. t/my-test.t
use Test;
my $known-module = CompUnit::DependencySpecification.new(short-name => "Zef");
my $comp-unit = $*REPO.resolve($known-module);
my @module-names = $comp-unit.distribution.meta<provides>.keys;
use-ok($_) for @module-names;

Using @ugexe's feedback and the META6 distribution, the following code in t/ tests that the modules defined in META6.json load.
use META6;
use Test;
my $m = META6.new( file => $*PROGRAM.sibling('../META6.json') );
my @modules = $m<provides>.keys;
plan @modules.elems;

for @modules -> $module {
    use-ok $module, "This module loads: $module";
}
This test has been pulled into the META6 distribution.

Related

How do I access a module's symbol table dynamically at runtime in Raku?

I want to be able to pass my script the path to a .rakumod file, say <blah>/Mod.rakumod, and be able to access the symbol table as a hash as if I'd used the module instead:
The module:
$ cat Mod.rakumod
unit module Mod;
sub bag is export { ... }
use lib <dir-containing-Mod>;
use Mod;
say Mod::EXPORT::.keys
works, as expected, returning (ALL DEFAULT).
On the other hand:
use lib <dir-containing-Mod>;
require Mod;
say Mod::EXPORT::.keys
fails with
Could not find symbol '&EXPORT' in 'Mod'
in block <unit> at <blah>
This is despite the fact that even with require, say Mod::.keys does see EXPORT:
use lib <dir-containing-Mod>;
require Mod;
say Mod::.keys
---
(EXPORT Mod)
I need to use require to make this dynamic, as I don't know which module I'll want.
I can actually think of one thing to do, but it is absolutely disgusting:
save my module name into a variable $mod
have my script write another script using that module:
my $mod = <whatever>;
my $cd = qq:to/END/;
use v6;
use lib qq\|\$\*CWD\|;
use $mod;
say {$mod}::EXPORT::ALL::.keys;
END
'aux.p6'.IO.spurt($cd);
and then have the initial script call the auxiliary one:
shell("raku aux.p6")
What worked, per raiph's answer (which I've accepted and am here paraphrasing):
pass the path to the module and save it as $path;
use lib the relevant directory:
my $dir = $path.IO.dirname;
use lib $dir;
extract the plain file name:
my $modFile = S/(.*)\..*/$0/ with $path.IO.basename;
finally, require that and pull the .WHO trick from raiph's answer, slightly adapted:
require ::($modFile);
say ::("{$modFile}::EXPORT::ALL").WHO.keys;
Run as <script> <path>, this returned (&bag) all right.
Actually, the above doesn't quite work: use lib $dir will fail saying $dir is empty, because use lib isn't dynamic.
So instead I am now resorting to the unappealing solution of
copying the module file to a temporary directory ./TMP
having called use lib './TMP';
and then removing that directory when done.
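Putting those pieces together, a minimal sketch of that workaround (untested; the TMP juggling is precisely the unappealing part):

sub MAIN($path) {
    my $modFile = S/(.*)\..*/$0/ with $path.IO.basename;
    mkdir 'TMP';
    copy $path, "TMP/{$path.IO.basename}";
    # a literal string works here because 'use lib' runs at compile time
    use lib 'TMP';
    require ::($modFile);
    say ::("{$modFile}::EXPORT::ALL").WHO.keys;
    unlink "TMP/{$path.IO.basename}";
    rmdir 'TMP';
}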
TL;DR Use dynamic symbol lookup to get the symbol of the package whose symbols you want at run-time; then .WHO to get its stash; then .keys to get that stash's symbols.
For example:
use lib '.';
require Mod;
say ::Mod::EXPORT::('ALL').WHO.keys; # (&bag)
I'm going to make breakfast. I'll elaborate later.

cmake, pass result of external program as preprocessor definitions

I'm new to cmake, so correct me if I've messed things up and this should be solved using something other than cmake.
I have main_program, which requires multiple other subprograms in the form of bindata to be specified at build time. Right now I build it by running:
cmake -DBINDATA1="\xde\xad..." -DBINDATA2="\xbe\xef" -DBINDATA3="..."
and in code I use them as:
// main_program.cpp
int main() {
#ifdef BINDATA1
    perform_action1(BINDATA1);
#endif
#ifdef BINDATA2
    perform_action2(BINDATA2);
#endif
    [...]
This is a rather unclean method, as any time I change one of the subprograms I have to regenerate its bindata and pass it to the cmake command.
What I would like to do is have a project structure:
/
-> main_program
-> subprograms
   -> subprogram1
   -> subprogram2
   -> subprogram3
and when I run cmake, I would like it to:
compile each of the subprograms
generate shellcode from each of them, by running the generate_bindata program on them
build main_program, passing the bindata from step 2
Then let's do that. Let's first write a short script to generate a header:
#!/bin/sh
# ./custom_script.sh
# TODO: Find out proper quoting and add `"` is necessarily. Ie. details.
# Prefer to use actual real variables like `static const char *shellcode[3]`
# instead of raw macro defines.
cat > "$1" <<EOF
#define SHELLCODE1 $(cat "$2")
#define SHELLCODE2 $(cat "$3")
#define SHELLCODE3 $(cat "$4")
EOF
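With the example values from the question, the generated header would come out roughly like this (illustrative only; note the TODO above about quoting):

// shellcodes.h (generated) - illustrative
#define SHELLCODE1 "\xde\xad..."
#define SHELLCODE2 "\xbe\xef"
#define SHELLCODE3 "..."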
To be portable, write this script in CMake instead. This script will be run at build time to generate the header needed for compilation. Then "model the dependencies" - find out what depends on what exactly - and write it in CMake:
add_executable(subprogram1 sources.c...)
add_executable(subprogram2 sources.c...)
add_executable(subprogram3 sources.c...)

foreach(i IN ITEMS 1 2 3)
    add_custom_command(
        COMMENT "Generate shellcode${i}.txt with the content of the shellcode"
        # TODO: redirection in COMMAND should be removed, or the command
        # should be wrapped in `sh -c ...`.
        COMMAND $<TARGET_FILE:subprogram${i}> | generate_shellcode > ${CMAKE_CURRENT_BINARY_DIR}/shellcode${i}.txt
        OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/shellcode${i}.txt
        DEPENDS subprogram${i} generate_shellcode
    )
endforeach()
add_custom_command(
    COMMENT "Generate shellcodes.h from shellcode1.txt, shellcode2.txt and shellcode3.txt"
    COMMAND sh ${CMAKE_CURRENT_SOURCE_DIR}/custom_script.sh
            ${CMAKE_CURRENT_BINARY_DIR}/shellcodes.h
            ${CMAKE_CURRENT_BINARY_DIR}/shellcode1.txt
            ${CMAKE_CURRENT_BINARY_DIR}/shellcode2.txt
            ${CMAKE_CURRENT_BINARY_DIR}/shellcode3.txt
    OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/shellcodes.h
    DEPENDS
        ${CMAKE_CURRENT_SOURCE_DIR}/custom_script.sh
        ${CMAKE_CURRENT_BINARY_DIR}/shellcode1.txt
        ${CMAKE_CURRENT_BINARY_DIR}/shellcode2.txt
        ${CMAKE_CURRENT_BINARY_DIR}/shellcode3.txt
)
# Then compile the final executable
add_executable(main main.c ${CMAKE_CURRENT_BINARY_DIR}/shellcodes.h)
# Don't forget to add the include directory!
target_include_directories(main PUBLIC ${CMAKE_CURRENT_BINARY_DIR})

# Alternatively, you may attach the dependency to a single source file
# instead of the target - e.g. only to the shellcodewrapper.c file shown
# below. This should help build parallelization.
set_source_files_properties(main.c PROPERTIES OBJECT_DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/shellcodes.h)

# Or you may add a target for the shellcodes header file and depend on it:
add_custom_target(shellcodes DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/shellcodes.h)
add_executable(main main.c)
target_include_directories(main PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
add_dependencies(main shellcodes)
Then your main file:
#include <shellcodes.h> // compiler will find it in BINARY_DIR

int main() {
    perform_action1(SHELLCODE1);
    perform_action2(SHELLCODE2);
}
So that all your source files are not recompiled each time, I suggest writing a wrapper:
// shellcodewrapper.c
#include <shellcodes.h>

// preserve memory by not duplicating code in each TU
static const char shellcode1[] = SHELLCODE1;

// only this file will be recompiled when SHELLCODE changes
const char *get_shellcode1(void) {
    return shellcode1;
}

// shellcodewrapper.h
const char *get_shellcode1(void);

// main.c
#include <shellcodewrapper.h>

int main() {
    perform_action1(get_shellcode1());
    perform_action2(get_shellcode2());
}
That way, when you change the "SHELLCODE" generators, only shellcodewrapper.c will be recompiled, resulting in super fast compilation times.
Note how the dependency is transferred and how it works - I used files inside BINARY_DIR to transfer the result from one command to another; those files then track what has changed and carry the dependency down the chain. Track dependencies via DEPENDS and OUTPUT in add_custom_command, and CMake will build everything in the proper order.

CMake: add executable with unknown generated source files

I have a tool that generates a set of source files whose names I am not able to know beforehand.
How do I write a proper CMakeLists.txt script for this scenario? This question has been asked before (CMake Compiling Generated Files), but it does not have a proper solution.
For instance, in the first answer (https://stackoverflow.com/a/8748478/2912478), the OP can predict which files will be generated based on the input .idl files. The second answer (https://stackoverflow.com/a/39258996/2912478) shows three different ways to solve it, but I really couldn't get any of them working.
Test case
I prepared a simple test case. Suppose I have this static file where the main resides (main.cpp).
// main.cpp
void foo(void);

int main() {
    foo();
    return 0;
}
Currently, I am using this CMakeLists.txt. The custom command generates the source file under src.
# CMakeLists.txt
add_executable(a.out main.cpp)

add_custom_command(
    OUTPUT mylib.cpp
    COMMAND ${CMAKE_SOURCE_DIR}/genf.sh ${CMAKE_SOURCE_DIR}/src
    DEPENDS ${CMAKE_SOURCE_DIR}/genf.sh
)
add_custom_target(GenFile DEPENDS mylib.cpp)
add_dependencies(a.out GenFile)
Here is the hypothetical code generator genf.sh. I use a random number to mimic the fact that we do NOT know which files the generator will generate.
#!/bin/bash
rm -rf $1/*
fname=$(echo $((1 + RANDOM % 100)))
echo "Generating src$fname.cpp"
echo "void foo(void) {}" > $1/src$fname.cpp
Attempt 1
I tried to use GLOB to find all the generated files, so I put the following lines at the end of my CMakeLists.txt. This doesn't work because at the moment of running cmake .. there are no files under src, so this solution never links the generated source files.
file(GLOB GeneratedSourceFiles ${CMAKE_SOURCE_DIR}/src/*.cpp)
target_sources(a.out PUBLIC ${GeneratedSourceFiles})
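One common way to sidestep the unknown-names problem (a sketch, not from the original thread; make_aggregator.sh is a hypothetical helper) is to funnel all generated sources through a single file with a fixed name, so CMake only ever has to know about one output:

# CMakeLists.txt - sketch: aggregate unknown outputs into one known file
add_custom_command(
    OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/generated_all.cpp
    COMMAND ${CMAKE_SOURCE_DIR}/genf.sh ${CMAKE_SOURCE_DIR}/src
    # hypothetical helper that writes an '#include "..."' line for every
    # .cpp file currently under src/ into generated_all.cpp
    COMMAND ${CMAKE_SOURCE_DIR}/make_aggregator.sh
            ${CMAKE_SOURCE_DIR}/src
            ${CMAKE_CURRENT_BINARY_DIR}/generated_all.cpp
    DEPENDS ${CMAKE_SOURCE_DIR}/genf.sh
)
add_executable(a.out main.cpp ${CMAKE_CURRENT_BINARY_DIR}/generated_all.cpp)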

rpm spec file skeleton to real spec file

The aim is to have a skeleton spec file, fun.spec.skel, which contains placeholders for Version, Release and that kind of thing.
For the sake of simplicity I try to make a build target which fills in those variables, transforming fun.spec.skel into fun.spec, which I can then commit to my GitHub repo. This is done so that rpmbuild -ta fun.tar works nicely and no manual modifications of fun.spec.skel are required (people tend to forget to bump the version in the spec file, but not in the build system).
Assuming the implied question is "How would I do this?", the common answer is to put placeholders in the file like ##VERSION## and then sed the file, or get more complicated and have autotools do it.
We place a version.mk file in our project directories which defines environment variables. Sample content includes:
RELPKG=foopackage
RELFULLVERS=1.0.0
As part of a script which builds the RPM, we can source this file:
#!/bin/bash
. $(pwd)/version.mk
export RELPKG RELFULLVERS
if [ -z "${RELPKG}" ]; then exit 1; fi
if [ -z "${RELFULLVERS}" ]; then exit 1; fi
This leaves us a couple of options to access the values which were set:
We can define macros on the rpmbuild command line:
% rpmbuild -ba --define "relpkg ${RELPKG}" --define "relfullvers ${RELFULLVERS}" foopackage.spec
We can access the environment variables using %{getenv:...} in the spec file itself (though errors can be harder to deal with this way):
%define relpkg %{getenv:RELPKG}
%define relfullvers %{getenv:RELFULLVERS}
From here, you simply use the macros in your spec file:
Name: %{relpkg}
Version: %{relfullvers}
We have similar values (provided by environment variables enabled through Jenkins) which provide the build number which plugs into the "Release" tag.
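For instance (macro and variable names here are illustrative, assuming a Jenkins-provided BUILD_NUMBER environment variable):

%define relbuildnum %{getenv:BUILD_NUMBER}
Release: %{relbuildnum}%{?dist}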
I found two ways:
a) use something like
Version: %(./waf version)
where version is a custom waf target
# in the project's wscript; VERSION is the global defined at the top
from waflib.Context import Context

def version_fun(ctx):
    print(VERSION)

class version(Context):
    """Print out the version and only the version"""
    cmd = 'version'
    fun = 'version_fun'
This checks the version at RPM build time.
b) create a target that modifies the spec file itself:
from waflib.Context import Context
from waflib import Logs
import re

def bumprpmver_fun(ctx):
    spec = ctx.path.find_node('oregano.spec')
    data = None
    with open(spec.abspath()) as f:
        data = f.read()
    if data:
        data = re.sub(r'^(\s*Version\s*:\s*)[\w.]+\s*', r'\1 {0}\n'.format(VERSION),
                      data, flags=re.MULTILINE)
        with open(spec.abspath(), 'w') as f:
            f.write(data)
    else:
        Logs.warn("Didn't find that spec file: '{0}'".format(spec.abspath()))

class bumprpmver(Context):
    """Bump version"""
    cmd = 'bumprpmver'
    fun = 'bumprpmver_fun'
The latter is used in my pet project oregano on GitHub.

In Rust, what is the purpose of a mod.rs file?

In some Rust projects (e.g. pczarn/rustboot), I've seen mod.rs files in directories for whatever reason. I've not been able to find documentation about this, and I've seen it in many other Rust projects.
What is the purpose of a mod.rs file, and when should I use it?
Imagine the following directory structure:
code/
|-- main.rs
`-- something/
    `-- mod.rs
If you write mod something; in main.rs, the compiler will look in the something/mod.rs file and use its contents as the body of the module declaration for something.
The alternative to this is to have a something.rs file in the code/ directory.
So to recap, when you write an empty module declaration such as mod something;, it looks either in:
a file called something.rs in the same directory
a file called mod.rs in a folder called something in the same directory
It then uses the contents of whichever of those files it finds as the contents of the module declaration.
Modules are important to understand, but I find most documentation often leaves you scratching your head on that matter.
Coming from Python or JavaScript?
Roughly, mod.rs is kind of like __init__.py in Python or index.js in JavaScript.
But only kind of. This is a bit more complicated in Rust.
Rust is different
Folders are not immediately ready to use as modules in Rust.
You have to add a file named mod.rs in a folder to expose a new module named like that folder.
The code in mod.rs is the content of that module.
All other files in the folder may in turn be exposed as submodules (more on that below).
Wait, there is another way
You may also use a file at the same level as a folder and named after that folder (<folder_name>.rs).
This is the preferred way since rustc 1.30. (Credits to MarkusToman in the comments)
From the Rust reference:
Note: Previous to rustc 1.30, using mod.rs files was the way to load
a module with nested children. It is encouraged to use the new
naming convention as it is more consistent, and avoids having many
files named mod.rs within a project.
Complete example
src
|-- utils
|   |-- bar.rs
|   `-- foo.rs
`-- main.rs
At this point, the compiler doesn't know about src/utils/foo.rs and src/utils/bar.rs.
First, you must expose src/utils/. As seen above, you have 2 options:
add the file: src/utils/mod.rs
add the file src/utils.rs (named exactly like the folder, without the extension)
Now, relative to the src folder (aka the crate level), a module named utils is available.
Second, you must expose the files src/utils/foo.rs and src/utils/bar.rs.
To do that, the utils module must declare 2 new submodules named after these files.
So the content of src/utils/mod.rs (or src/utils.rs) should be:
pub mod bar;
pub mod foo;
Now whatever is public in those 2 files is available in other modules! 🎉
And you may write the following in src/main.rs:
mod utils;
use utils::{foo, bar};
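where, purely for illustration, src/utils/foo.rs could contain something like:

// src/utils/foo.rs (contents made up for the example)
pub fn say_hello() {
    println!("hello from utils::foo");
}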
Resulting file structure
Option 1 • mod.rs (the old way):
src
|-- utils
|   |-- bar.rs
|   |-- foo.rs
|   `-- mod.rs
`-- main.rs
Option 2 • <folder_name>.rs (the preferred way):
src
|-- utils
|   |-- bar.rs
|   `-- foo.rs
|-- utils.rs
`-- main.rs
More advanced details on how modules work
This remains a surface explanation, your next destination is the official documentation 🧑‍🎓
There is a third way to declare modules (core language):
mod utils {
    // module code goes here. That's right, inside of the file.
}
But it is also possible to just write mod utils;.
In that case, as seen above, Rust knows to search for either of src/utils.rs or src/utils/mod.rs.
See, when you try to use a module in a file (in src/main.rs for example),
you may reference it in the following ways:
from inside: src/main.rs
mod module { ... }
from nested modules inside: src/main.rs
mod module { pub mod sub_module { ... } }
from sibling files: src/*.rs
from mod.rs files in sibling folders: src/*/mod.rs
(and infinite recursive combinations of the above)
A file or a folder containing mod.rs does not become a module.
Rather, the Rust language lets you organize modules (a language feature) with a file hierarchy.
What makes it really interesting is that you are free to mix all approaches together.
For example, you may think you can't directly reference src/utils/foo.rs from main.rs.
But you can:
// src/main.rs
mod utils {
    pub mod foo;
}
Important notes:
modules declared in files will always take precedence (because you in fact never need to search the file hierarchy)
you can't use the other 2 approaches to reference the same module
For example, having both src/utils.rs and src/utils/mod.rs will raise the following error at compile time:
error[E0761]: file for module `utils` found at both "src/utils.rs" and "src/utils/mod.rs"
--> src/main.rs:1:1
|
1 | mod utils;
| ^^^^^^^^^^
|
= help: delete or rename one of them to remove the ambiguity
Let's wrap up. Modules are exposed to the compiler:
from top to bottom
by reference only
(That's why you don't have intellisense until your modules are "imported")
starting from an entry point
(which is src/main.rs or src/lib.rs by default.
But it may be anything that you configure in Cargo.toml.
This has little to do with this question however)
With our previous example we get in order:
src/main.rs -> crate
Because the crate module contains mod utils; we next get:
src/utils.rs OR src/utils/mod.rs -> crate::utils
Because the utils module contains mod foo; we next get:
src/utils/foo.rs -> crate::utils::foo
Each .rs file, except lib.rs and main.rs (which always map to the crate module), gets its own module.
There is only one way to declare a module:
/* pub */ mod sub_module1;
A module cannot be declared outside the root/crate module tree. That is, going up the module tree, a submodule must always have a parent that is declared directly in lib.rs or main.rs, so the first program submodule must always be declared there (a tree data structure, if that isn't already obvious enough).
There are 2 ways to nest a module inside the module where it is declared:
in <module_where_it_is_declared>/<module_name>.rs
in <module_where_it_is_declared>/<module_name>/mod.rs
If module_where_it_is_declared is the crate module, then this corresponding subfolder is not needed (disappears from the scheme above).
Here is an example, valid for both lib and binary crates:
src
|---lib.rs ( contains: pub mod b2; )
|---b2.rs ( contains: pub mod bb2; )
|---b2
| |---bb2.rs
. .
Alternatively:
src
|---lib.rs ( contains: pub mod b2; )
|---b2
| |---mod.rs ( contains: pub mod bb2; )
| |---bb2.rs
. .
You can see that you can mix and match (b2 uses the mod.rs way, bb2 uses the "file" way).
Here's a way to only use the file pattern that is also valid:
src
|---lib.rs ( contains: pub mod b2; )
|---b2.rs ( contains: pub mod bb2; )
|---b2
| |---bb2.rs (contains: pub mod bbb2; )
| |---bbb2.rs (contains: pub mod bbbb2; )
| |---bbb2
| | |---bbbb2.rs
. . .
I guess it depends on how you want to nest modules.
I like the mod.rs syntax for modules that just export other submodules and don't have any other (or very little) code in them, although you can put whatever you want in mod.rs.
I use mod.rs similarly to the barrel pattern from the JS/TS world, to roll up several submodules into a single parent module.
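For example, a barrel-style mod.rs might look like this (item names are made up):

// utils/mod.rs - a "barrel" that rolls up its submodules
pub mod bar;
pub mod foo;

// optionally flatten the public API with re-exports:
pub use self::bar::Bar;
pub use self::foo::run;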
Also don't forget modules can be defined (not only declared) inline by adding a scope block:
pub mod my_submodule {
// ...
}