Vim - proper handling of multiple syntax in a single shell script - awk

I write many scripts where there is complex awk logic incorporated into the script directly. I prefer the "one-stop-shop" approach and not have those logic segments as external files.
I have tried what I have seen in this reference about specifying the multiple syntaxes that are to be evaluated (and hopefully colorized per syntax colorscheme). I tried the suggested method as both a VIM command and as a setting in the .vimrc file. The use of "setfiletype sh.awk.sed" did not work in either case.
Is there a special way to make this work properly ?
Another reference seems to provide what looks like awk-related syntax declarations. However, those are presented out of context, and it is not clear if the info provided is complete and self-contained, or whether that is an extract from an unknown, unpublished larger syntax definition, and whether that is added to the sh.vim or the awk.vim syntax files ?
Can anyone shed light on this ?
This first image is a test file created to show my colorscheme use-cases for Bourne shell.
This second image is a sample of awk logic displayed by the same scheme.

Related

Snakemake: parameterized runs that re-use intermediate output files

here is a complex problem that I am struggling to find a clean solution for:
Imagine having a Snakemake workflow with several rules that can be parameterized in some way. Now, we might want to test different parameter settings for some rules, to see how the results differ. However, ideally, if these rules depend on the output of other rules that are not parameterized, we want to re-use these non-changing files, instead of re-computing them for each of our parameter settings. Furthermore, if at all possible, all this should be optional, so that in the default case, a user does not see any of this.
There is inherent complexity in there (to specify which files are re-used, etc). I am also aware that this is not exactly the intended use case of Snakemake ("reproducible workflows"), but is more of a meta-feature for experimentation.
Here are some approaches:
Naive solution: Add wildcards for each possible parameter to the file paths. This gets ugly, hard to maintain, and hard to extend really quickly. Not a solution.
A nice approach might be to name each run, and have an individual config file for that name which contains all settings that we need. Then, we only need a wildcard for such a named set of parameter settings. That would probably require to read some table of meta-config file, and process that. That doesn't solve the re-use issue though. Also, that means we need multiple config files for one snakemake call, and it seems that this is not possible (they would instead update each other, but not considered as individual configs to be run separately).
Somehow use sub-workflows, by specifying individual config files each time, e.g., via a wildcard. Not sure that this can be done (e.g., configfile: path/to/{config_name}.yaml). Still not a solution for file re-using.
Quick-and-dirty: Run all the rules up to the last output file that is shared between different configurations. Then, manually (or with some extra script) create directories with symlinks to this "base" run, with individual config files that specify the parameters for the per-config-runs. This still necessitates to call snakemake individually for each of these directories, making cluster usage harder.
None of these solve all issues though. Any ideas appreciated!
Thanks in advance, all the best
Lucas
Snakemake now offers the Paramspace helper to solve this! https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html?highlight=parameter#parameter-space-exploration
I have not tried it yet, but it seems like the solution to the issue!

How (if possible) to use PostgreSQL's parser (in C) independently?

I need a parser (mainly for the "select" type of queries) and avoid the hassle of doing it from scratch. Does anybody know how to use the scan.l/gram.y of pgsql for this purpose? I've looked up pgpool too, but it seems similar. Currently, it might be very helpful if someone could give instructions to compile the parser (using the makefile provided maybe) without errors so that it can be supplied (valid?) queries and outputs the parse tree (in whatever form)!
You probably cannot take any file from postgres source tarball and compile it separately. Parser use internal OOP structures (implemented in C). But there is some possibility (not simple) - ecpg preprocessor try to transform PostgreSQL gram file to secondary gram file - and you can use same mechanism. It use a small utility parse.pl (it is part of PostgreSQL source code (src/postgresql/src/interfaces/ecpg/preproc))
PostgreSQL compiles the language parser using yacc. Presumably you could take the yacc files and create a compatible parser with very little effort. Note you must have flex and yacc installed to do this.
Note this is not taking a .c file from source and transplanting it into your system. All you are getting is the parser, not the planner or anything else.
Given the level of detail in the question no more detail can be possible. Perhaps you could start there and post another question when you get stuck.

CDash Custom Dynamic Analysis

I'm trying to integrate custom dynamic analysis tools to CDash. Such as KWStyle, CppCheck and Visual Leak Detector.
I'v figured out that I need to generate a DynamicAnalysis.xml file and submit it to CDash, from CTest scripts.
I think I know how to run the external tool as a part of the ctest script.
Either by using these variables to change how ctest_memcheck() works
CTEST_MEMORYCHECK_COMMAND
CTEST_MEMORYCHECK_SUPPRESSIONS_FILE
CTEST_MEMORYCHECK_COMMAND_OPTIONS
or by running the tool from the execute_process() command.
But I'm a bit uncertain which one to use.
The main problem I think I have is, how can I extract errors from the output of the custom tool and include that information into the DynamicAnalysis.xml to submit?
The extreme solution i see is that i'd need to make a program that generates a valid DynamicAnalysis.xml file.
But the problem is that I don't know the syntax of the DefectList element in the XML file. I have found no answer from google and even the XML Schema for that file is unhelpful.
EDIT:
Looking at this:
http://www.cdash.org/CDash/viewDynamicAnalysis.php?buildid=987149
What draws my attention are the labels, especially the empty ones. I don't see how these would come from the DynamicAnalysis.xml file. Maybe it tracks any labels that have ever appearred? Can i create my own custom labels somehow?
Does CDash create the labels automatically, depending on the tool type? Does this block custom defect types?
I'm just guessing here, so the question is; can i create custom labels for my custom tool, just by generating a DynamicAnalysis.xml - file.
It occurred to me that the amount of different errors from CppCheck (static code analysis) is huge, compared to valgrind for instance. I'm not that certain that I should use the dynamic analysis. Maybe a custom build type (Continuous / Experimental / Nightly) thing would work better. Like this:
http://www.cdash.org/CDash/buildSummary.php?buildid=930174
I have no idea how to do this, i guess it requires meddling around with CDash code?
Which one would work better?
If you are using valgrind, you can simply set CTEST_MEMORYCHECK_COMMAND to the full path to valgrind, and ctest will generate the DynamicAnalysis.xml file for you from the valgrind output when you call ctest_memcheck.
The best way to understand the possible values that can appear in the DynamicAnalysis.xml file is to analyze the source code of CTest.
The file CMake/Source/CTest/cmCTestMemCheckHandler.cxx has the list of defect types in a variable named "cmCTestMemCheckResultLongStrings". Search through that file for references to that variable to see what the possible values are and how they are used to generate "<Defect/>" xml elements.
EDIT (for additional information):
You can also easily see what XML elements CDash is expecting by inspecting its source code. Specifically, the file "CDash/xml_handlers/dynamic_analysis_handler.php".
From what I'v learned so far, is that for a tool that runs on the tests made in the cmake script, the Dynamic Analysis is the thing.
For tools that run on the entire program, a custom Build.xml is the thing you need.
I found out that i can commit those files from the ctest_submit command by using the FILES parameter.
I also found out that you can add custom "build names" to the side of Continuous, Nightly, and others.
And that you can set the builds from certain machines to be automatically transferred under these.
The custom labels under DynamicAnalysis did come from somewhere in CDash, i can't remember where anymore.

Should I merge .pbxproj files with git using merge=union?

I'm wondering whether the merge=union option in .gitattributes makes sense for .pbxproj files.
The manpage states for this option:
Run 3-way file level merge for text files, but take lines from both versions, instead of leaving conflict markers. This tends to leave the added lines in the resulting file in random order and the user should verify the result.
Normally, this should be fine for the 90% case of adding files to the project. Does anybody have experience with this?
Not a direct experience, but:
This SO question really advices again merging .pbxproj files.
The pbxproj file isn't really human mergable.
While it is plain ASCII text, it's a form of JSON. Essentially you want to treat it as a binary file.
(hence a gitignore solution)
Actually, Peter Hosey adds in the comment:
It's a property list, not JSON. Same ideas, different syntax.
Yet, according to this question:
The truth is that it's way more harmful to disallow merging of that .pbxproj file than it is helpful.
The .pbxproj file is simply JSON (similar to XML). From experience, just about the ONLY merge conflict you were ever get is if two people have added files at the same time. The solution in 99% of the merge conflict cases is to keep both sides of the merge.
So a merge 'union' (with a gitattributes merge directive) makes sense, but do some test to see if it does the same thing than the script mentioned in the last question.
See also this question for potential conflicts.
See the Wikipedia article on Property List
I have been working with a large team lately and tried *.pbxproj merge=union, but ultimately had to remove it.
The problem was that braces would become out of place on a regular basis, which made the files unreadable. It is true that tho does work most of the time - but fails maybe 1 out of 4 times.
We are back to using *.pbxproj -crlf -merge for now. This seems to be the best solution that is workable for us.

Batch source-code aware spell check

What is a tool or technique that can be used to perform spell checks upon a whole source code base and its associated resource files?
The spell check should be source code aware meaning that it would stick to checking string literals in the code and not the code itself. Bonus points if the spell checker understands common resource file formats, for example text files containing name-value pairs (only check the values). Super-bonus points if you can tell it which parts of an XML DTD or Schema should be checked and which should be ignored.
Many IDEs can do this for the file you are currently working with. The difference in what I am looking for is something that can operate upon a whole source code base at once.
Something like a Findbugs or PMD type tool for mis-spellings would be ideal.
As you mentioned, many IDEs have this functionality already, and one such IDE is Eclipse. However, unlike many other IDEs Eclipse is:
A) open source
B) designed to be programmable
For instance, here's an article on using Eclipse's code formatting functionality from the command line:
http://www.peterfriese.de/formatting-your-code-using-the-eclipse-code-formatter/
In theory, you should be able to do something similar with it's spell-checking mechanism. I know this isn't exactly what you're looking for, and if there is a program for doing spell-checking in code then obviously that'd be better, but if not then Eclipse may be the next best thing.
This seems little old but seems to do a good job
Source Code Spell Checker