Ignore includes with #pycparser and define multiple Subgraphs in #pydot - pydot

I am new to stackoverflow, but I got a lot of help until now, thanks to the community for that.
I'm trying to create a software showing me caller depandencys for legacycode.
I'parsing a directory with c code with pycparcer, and for each file i want to create a subgraph with pydot.
Two questions:
When parsing a c file, the parser references the #includes, an i get also functions in my AST, from the included files. How can i know, if the function is included, or originaly from this actual file/ or ignore the #includes??
For each file i want to create a subgraph, an then add all functions in this file to this subgraph. I don't know how many subgraphs i have to create...
I have a set of files, where each file is a frozenset with the functions of this file
somthing like this is pssible?
for files in SetOfFiles:
#how to create subgraph with name of files?
for function in files:
self.graph.add_node(pydot.Node(funktion)) #--> add node to subgraph "files"
I hope you got my challange... any ideas?
Thanks!
EDIT:
I solved the question about pydot, it was quiet easy... So I stay with my pycparser problem :(
for files in ListOfFuncs:
cluster_x = pydot.Cluster(files, label=files)
for functions in files:
cluster_x.add_node(pydot.Node(functions))
graph.add_subgraph(cluster_x)

I can address the pycparser part. The preprocessor leaves #line directives that specify which file & line code came for, and pycparser consumes those. You can get that information from the AST it creates (see tests for an example).

Related

How to open and read a .gz file in Nim (preferably line by line)

I just sat down to write my first Nim script to parse a .vcf (Variant Call Format) file. This file format stores genetic mutations from sequencing data.
For scripting languages, I 'grew up' on Perl and later migrated to Python, but I would love to use a language with the speed that Nim offers. I realize Nim is still young, but I couldn't even find a clear example for how to open and read a .gz (gzip) file (preferably line by line).
Can anyone provide a simple example to open and read a gzip file using Nim, line by line?
In Python, I'm accustomed to the following (uber-simple) code:
import gzip
my_file = gzip.open('my_file.vcf.gz', 'w')
for line in my_file:
# do something
my_file.close()
I have seen related questions, but they're not clear. The posts are also relatively old and I hope/suspect something better has come about. Here's what I've found:
Read gzip-compressed file line by line
File, FileStream, and GZFileStream
Reading files from tar.gz archive in Nim
Really appreciate it.
P.S. I also think it would be useful if someone created a Nim tag in StackOverflow. I do not have the reputation to create tags.
Just in case you need to handle VCF rather than .gz, there's a nice wrapper for htslib written by Brent Pedersen:
https://github.com/brentp/hts-nim
You need to install the htslib in your system, and then require the library in your .nimble file with requires "hts", or install the library with nimble install hts. If you are going to do NGS analysis in Nim you'll need it.
The code you need:
import hts
var v:VCF
doAssert open(v, "myfile.vcf.gz")
# Here you have the VCF file loaded in v, and can access the headers through
# v.header property
for record in v:
# Here you get a Record object per line, e.g. extract the Ref and Alts:
echo v.REF, " ", v.ALT
v.close()
Be sure to follow the docs, because some things differ from python, specially when getting the INFO and FORMAT fields.
Checkout the whole Brent repo. It has plenty of wrappers, code samples and utilities to handle NGS problems (e.g. an ultrafast coverage tool utility called Mosdepth).
Per suggestion from Maurice Meyer, I looked at the tests for the Nim zip package. It turned out to be quite simple. This is my first Nim script, so my apologies if I didn't follow convention, etc.
import zip/gzipfiles # Import zip package
block:
let vcf = newGzFileStream("my_file.vcf.gz") # Open gzip file
defer: outFile.close() # Close file (like a 'final' statement in 'try' block)
var line: string # Declare line variable
# Loop over each line in the file
while not vcf.atEnd():
line = vcf.readLine()
# Cure disease with my VCF file
To install the zip package, I simply ran because it is already in the Nim package library:
> nimble refresh
> nimble install zip
I tried to use Nim some time ago to parse a fastq or fastq.gz file.
The code should be available here:
https://gitlab.pasteur.fr/bli/qaf_demux/blob/master/Nim/src/qaf_demux.nim
I don't remember exactly how this works, but apparently, I did an import zip/gzipfiles and used newGZFileStream on the input file name to obtain a Stream from which lines can be read using .readLine() in this piece of code:
proc fastqParser(stream: Stream): iterator(): Fastq =
result = iterator(): Fastq =
var
nameLine: string
nucLine: string
quaLine: string
while not stream.atEnd():
nameLine = stream.readLine()
nucLine = stream.readLine()
discard stream.readLine()
quaLine = stream.readLine()
yield [nameLine, nucLine, quaLine]
It is used in something that amounts to this piece of code:
let inputFqs = fastqParser(newGZFileStream($inFastqFilename))
Hopefully you can adapt this to your case.
My .nimble file has a requires "zip#head". I suppose this triggers the installation of zip/gzipfiles.

In CMake how do I deal with generated source files which number and names are not known before?

Imagine a code generator which reads an input file (say a UML class diagram) and produces an arbitrary number of source files which I want to be handled in my project. (to draw a simple picture let's assume the code generator just produces .cpp files).
The problem is now the number of files generated depends on the input file and thus is not known when writing the CMakeLists.txt file or even in CMakes configure step. E.g.:
>>> code-gen uml.xml
generate class1.cpp..
generate class2.cpp..
generate class3.cpp..
What's the recommended way to handle generated files in such a case? You could use FILE(GLOB.. ) to collect the file names after running code-gen the first time but this is discouraged because CMake would not know any files on the first run and later it would not recognize when the number of files changes.
I could think of some approaches but I don't know if CMake covers them, e.g.:
(somehow) define a dependency from an input file (uml.xml in my example) to a variable (list with generated file names)
in case the code generator can be convinced to tell which files it generates the output of code-gen could be used to create a list of input file names. (would lead to similar problems but at least I would not have to use GLOB which might collect old files)
just define a custom target which runs the code generator and handles the output files without CMake (don't like this option)
Update: This question targets a similar problem but just asks how to glob generated files which does not address how to re-configure when the input file changes.
Together with Tsyvarev's answer and some more googling I came up with the following CMakeList.txt which does what I want:
project(generated)
cmake_minimum_required(VERSION 3.6)
set(IN_FILE "${CMAKE_SOURCE_DIR}/input.txt")
set_property(DIRECTORY APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS "${IN_FILE}")
execute_process(
COMMAND python3 "${CMAKE_SOURCE_DIR}/code-gen" "${IN_FILE}"
WORKING_DIRECTORY ${PROJECT_BINARY_DIR}
INPUT_FILE "${IN_FILE}"
OUTPUT_VARIABLE GENERATED_FILES
OUTPUT_STRIP_TRAILING_WHITESPACE
)
add_executable(generated main.cpp ${GENERATED_FILES})
It turns an input file (input.txt) into output files using code-gen and compiles them.
execute_process is being executed in the configure step and the set_property() command makes sure CMake is being re-run when the input file changes.
Note: in this example the code-generator must print a CMake-friendly list on stdout which is nice if you can modify the code generator. FILE(GLOB..) would do the trick too but this would for sure lead to problems (e.g. old generated files being compiled, too, colleagues complaining about your code etc.)
PS: I don't like to answer my own questions - If you come up with a nicer or cleaner solution in the next couple of days I'll take yours!

Bazel Checkers Support

What options do Bazel provide for creating new or extending existing targets that call C/C++-code checkers such as
qac
cppcheck
iwyu
?
Do I need to use a genrule or is there some other target rule for that?
Is https://bazel.build/versions/master/docs/be/extra-actions.html my only viable choice here?
In security critical software industries, such as aviation and automotive, it's very common to use the results of these calls to collect so called "metric reports".
In these cases, calls to such linters must have outputs that are further processed by the build actions of these metric report collectors. In such cases, I cannot find a useful way of reusing Bazel's "extra-actions". Ideas any one?
I've written something which uses extra actions to generate a compile_commands.json file used by clang-tidy and other tools, and I'd like to do the same kind of thing for iwyu when I get around to it. I haven't used those other tools, but I assume they fit the same pattern too.
The basic idea is to run an extra action which generates some output for each file (aka C/C++ compilation command), and then find all the output files afterwards (outside of Bazel) and aggregate them. A reasonably complete example is here for reference. Basically, the action listener (written in Python) decodes the extra action proto and extracts the source files, compiler options, etc:
action = extra_actions_base_pb2.ExtraActionInfo()
with open(argv[1], 'rb') as f:
action.MergeFromString(f.read())
cpp_compile_info = action.Extensions[extra_actions_base_pb2.CppCompileInfo.cpp_compile_info]
compiler = cpp_compile_info.tool
options = ' '.join(cpp_compile_info.compiler_option)
source = cpp_compile_info.source_file
output = cpp_compile_info.output_file
print('%s %s -c %s -o %s' % (compiler, options, source, output))
If you give the extra action an output template, then it can write that output to a file. If you give the output files distinctive names, you can find them all in the output tree and merge them together however you want.
A more sophisticated option is to use bazel query --output=proto and write code to calculate the extra action output filenames of the targets you're interested in from there. That requires writing more code, but you don't have problems with old output files in the output tree that are accidentally included when aggregating.
FWIW, Aspects are another possibility. However, I think extra actions work acceptably for this.

Ansys multiphysics: blank output file

I have a model of a heating process on Ansys Multiphysics, V11.
After running the simulation, I have a script to plot a temperature profile:
!---------------- POST PROCESSING -----------------------
/post1 ! tdatabase postprocessor
!---define profile temperature
path,s_temp1,2,,100 ! define a path
ppath,1,,dop/2,0,0 ! create a path point
ppath,2,,dop/2,1.5,0 ! create a path point
PDEF,surf_t1,TEMP, ,noav ! print a path
plpath,surf_t1 ! plot a path
What I now need, is to save the resulting path in a text file. I have already looked online for a solution, and found the following code to do it, which I appended after the lines above:
/OUTPUT,filename,extension
PRPATH,surf_t1
/OUTPUT
Ansys generates the file filename.extension but it is empty. I tried to place the OUTPUT command in a few locations in the script, but without any success.
I suspect I need to define something else, but I have no idea where to look, as Ansys documentation online is terribly chaotic, and all internet pages I've opened before writing this question are not better.
A final note: Ansys V11 is an old version of the software, but I don't want to upgrade it and fit the old model to the new software.
For the output of the simulation (which includes all calculation steps, and sub-steps description and node-by-node results) the output must be declared in the beginning of the code, and not in the postprocessing phase.
Declaring
/OUTPUT,filename,extension
in the preamble of the main script makes such that the output is stored in the right location, with the desired extension. At the end of the scripts, you must then declare
/OUTPUT
to reset the output file location for ANSYS.
The output to the PATH call made in the postprocessing script is however not printed in the file.
It is convenient to use
*CFOPEN,file,ext
*VWRITE,Vector(1,1).Vector(1,2)
(2F12.6)
*CFCLOSE
where Vector(1,1) is a two column array created by *DIM, and stores your data to output to file
As this is a special command, run it from file i.e. macro_output.mac

Matlab can't find member functions when directory changes. What can I do?

I have a Matlab program that does something like this
cd media;
for i =1:files
d(i).r = %some matlab file read command
d(i).process();
end
cd ..;
When I change to my "media" directory I can still access member properties (such as 'r'), but Matlab can't seem to find functions like process(). How is this problem solved? Is there some kind of global function pointer I can call? My current solution is to do 2 loops, but this is somewhat deeply chagrining.
There are two solutions:
don't change directories - instead give the file path the your file read command, e.g.
d(i).r = load(['media' filesep 'yourfilename.mat']);
or
add the directory containing your process() to the MATLAB path:
addpath('C:\YourObjectsFolder');
As mentioned by tdc, you can use
addpath(genpath('C:\YourObjectsFolder'));
if you also want to add all subdirectories to your path.
Jonas already mentioned addpath, but I usually use it in combination with genpath:
addpath(genpath('path_to_folder'));
which also adds all of the subdirectories of 'path_to_folder' as well.