Is it possible to reuse docs generated by Doxygen and merge with new documentation?

Since the codebase is quite large, Doxygen takes a really long time to run. If I could obtain the modified files from some version control system and run Doxygen on them, is it possible to merge the existing documentation with the new pages generated?
If so, how can this be done?

Doxygen does not have an incremental build, though some mechanisms speed up generation slightly:
Generated images (e.g. call graphs and inheritance graphs) are "cached": an md5sum is stored for each graph, and when it has not changed the image is not regenerated.
For "independent" parts it is possible to create "tag" files (see the documentation, e.g. TAGFILES).
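To make the caching idea concrete, here is a minimal sketch of a digest-based "regenerate only when changed" check. It is not Doxygen's actual implementation; the function names and the in-memory cache dictionary are made up for illustration.

```python
import hashlib


def content_digest(text: str) -> str:
    """Return the md5 hex digest of a graph description."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def needs_regeneration(name: str, description: str, cache: dict) -> bool:
    """Report whether the image for `name` must be rebuilt.

    `cache` maps a graph name to the digest stored on the previous run
    (a hypothetical structure, not Doxygen's own cache format).
    """
    digest = content_digest(description)
    if cache.get(name) == digest:
        return False  # description unchanged: keep the existing image
    cache[name] = digest  # remember the new digest for the next run
    return True
```

The first call for a given graph always reports that work is needed; later calls with the same description are skipped, which is where the time saving comes from.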

Related

Using a static website generator with forms and data-driven content?

I am looking at using a static generator to generate up to hundreds of thousands of pages (on S3) of data-driven content from json or csv files, each of which has an html form that posts to an external API. Is this a feasible undertaking?
It depends on your requirements, but at minimum you might get away with a simple Node program that uses fs to read and write files. Going up the complexity spectrum, a Gulp setup might do. Going further still, you can use a static website generator to read and write your data files (but that is probably worth the trouble only if you already know a static generator and/or you also want a blog on S3, driven by .md files, besides the hundreds of thousands of data-driven pages).
If you go the simple Node script route, you would write your application in a local js file and run it from the command line with node. It would generate the pages locally, and you would then upload them to S3. You can use either standard callbacks or, the fancier way, promises (for example with Bluebird). This way is the most manual, but it gives you the most control over the result.
For the record, you could whip up a script in any programming language you are proficient in, for example PHP. JavaScript is popular these days, which is why I'm assuming you would use JS.
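Since any language will do, here is a rough sketch of the "simple script" route in Python rather than Node. The column names (id, title, description), the form fields and the API URL are all assumptions for illustration; adapt them to your data. It also assumes each id is safe to use as a file name.

```python
import csv
import html
import io
from pathlib import Path

# Hypothetical page layout: one heading, one paragraph, one form.
PAGE_TEMPLATE = """<!doctype html>
<html>
<head><title>{title}</title></head>
<body>
<h1>{title}</h1>
<p>{description}</p>
<form method="post" action="{api_url}">
  <input type="hidden" name="item_id" value="{item_id}">
  <input type="email" name="email">
  <button type="submit">Send</button>
</form>
</body>
</html>
"""


def render_page(row: dict, api_url: str) -> str:
    """Fill the template with one CSV row, escaping every value."""
    return PAGE_TEMPLATE.format(
        title=html.escape(row["title"]),
        description=html.escape(row["description"]),
        item_id=html.escape(row["id"]),
        api_url=html.escape(api_url),
    )


def generate_site(csv_text: str, out_dir: Path, api_url: str) -> list:
    """Write one HTML file per CSV row and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = out_dir / f"{row['id']}.html"  # assumes ids are file-name safe
        target.write_text(render_page(row, api_url), encoding="utf-8")
        written.append(target)
    return written
```

You would read the CSV file into a string, call generate_site once, and then upload the output directory to S3 with whatever tool you prefer.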
If you go the Gulp route, I imagine a custom function that reads data files from a location, parses their contents into an array, and writes the contents into files.
If you go the Hugo route, use its data-driven content features, in particular the getCSV function. You will still be working in the context of a website, which means the more you stray from the website setup, the more you will have to fight Hugo.
As I mentioned, the argument against a static website generator is that if you don't need the website part, and only want to perform operations on data and write files, it might get in the way.
Hugo is a good option for thousands of files because it's fast.
The solution also depends on whether your CSV files are going to change or this is a one-off, and on how much automation you need. The Gulp approach might be handy even if you go the Hugo route.
So, yes, it is a very feasible undertaking.

Adding two new phases to an Xcode framework project

I am building a project on Github written in Objective-C. It resolves MAC addresses down to manufacturer details. The lookup table is currently stored as text file manuf.txt (from the Wireshark project), which is parsed at run-time, which is costly. I would prefer to compile this down to archived objects at build-time, and load that instead.
I would like to amend the build phases such that I:
Build a simple compiler
Run the compiler, parsing manuf.txt and outputting archived objects
Build the framework
Copy the archived objects into the framework
I am looking for wisdom on how to achieve steps 1 and 2 using Xcode v7.3 as Xcode provides only a Copy Files phase or a Run Script phase. An example of other projects achieving similar goals would be inspiring.
I suspect that what you are asking is possible, but tricky. The reason is that you will need to write a bunch of class files and then dynamically add them to the project.
Firstly, you will need a Run Script phase that runs command-line tools to parse your file and generate a number of class files from it. I would suggest looking into templating engines. For example, appledoc uses mustache templates to generate API documentation files. You could use the same technique to generate header and implementation files.
Next, rather than generating archived objects and trying to import them into a framework, I think you may be better off generating raw source code, adding it to the project and compiling it into the framework. That is probably simpler in the long run.
To automatically include the generated code I would look into (which means I haven't actually tried this :-) adding a folder reference to the project rather than an Xcode group. Folder references are an option in the 'Add files to ...' dialog.
Folder references refer to a directory and automatically add the entire contents of that directory to a project. So you can use one to point to the directory where you have generated the source code. This is a much better option than trying to manipulate the project or injecting things into an established framework.
I would prefer to parse the file at runtime. After launch you can look for an already existing output, otherwise parse it one time.
However, I had to do something similar at Objective-Cloud: I simply added a Run Script build phase and put the compiler call into it.
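As an illustration of what such a build-time "compiler" could look like, here is a small Python sketch that a Run Script phase might invoke. Two assumptions to flag: it treats manuf.txt as tab-separated lines of the form prefix, short name, optional long name, with # comments (check this against your copy of the file), and it writes a binary property list rather than an NSKeyedArchiver archive, because a plist can be produced without Foundation and still loaded as a dictionary at run time.

```python
import plistlib


def parse_manuf(text: str) -> dict:
    """Map each address prefix to its short manufacturer name.

    Assumed line format: PREFIX<TAB>SHORT_NAME[<TAB>LONG_NAME].
    Blank lines and lines starting with '#' are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a prefix/name pair; ignore it
        table[fields[0].strip().upper()] = fields[1].strip()
    return table


def compile_manuf(text: str) -> bytes:
    """Serialize the lookup table as a binary property list."""
    return plistlib.dumps(parse_manuf(text), fmt=plistlib.FMT_BINARY)
```

The script phase would read manuf.txt, call compile_manuf, and write the bytes into the built product's resources directory, so the framework only has to load a ready-made dictionary.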

Is it possible to add a whole directory of source files to CMake command add_executable?

The documentation of CMake's add_executable gives the following specification of the command:
add_executable(<name> [WIN32] [MACOSX_BUNDLE]
               [EXCLUDE_FROM_ALL]
               source1 [source2 ...])
I now have a rather large project with a lot of sources and was wondering if it is possible to add a directory as a parameter for add_executable instead of specifying each source file individually? If not, are there any best practices or recommendations on how to approach this situation? I can't imagine the only way this would work is by adding each source file individually? How would this work for (really) large projects then, this doesn't seem like an elegant approach...
The best practice is indeed to list all files manually.
In particular, the CMake docs warn about using GLOB for this purpose:
We do not recommend using GLOB to collect a list of source files from
your source tree. If no CMakeLists.txt file changes when a source is
added or removed then the generated build system cannot know when to
ask CMake to regenerate.
This point is somewhat controversial, as many developers prefer that the build system just adjusts automatically to newly added files. The price for this automation is an increase in fragility of the build scripts.
With GLOB, you will have to remember to manually re-run CMake whenever files are added or removed. You also have to ensure that the physical layout of the files on disk matches the logical layout of the projects that you want to build. The latter point is arguably the bigger problem here. By listing files explicitly you decouple the build system from the files on disk and gain an additional safety net, but you pay for it with increased build script maintenance costs.
The biggest disadvantage of the explicit approach is, imho, that if you forget to add a new file to the CMakeLists, you might puzzle over weird linker errors for a while before realizing your mistake. I personally find the maintenance overhead for this approach acceptable. Sure, you will have a lengthy file list in your build script, but you do not have to touch it that often and the changes will usually be trivial.
Since this point is somewhat controversial, I won't blame you if you want to use a GLOB for your project. Just be aware of the consequences and be prepared that all the cool kids will laugh at you if your build breaks one day because of this.
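If you stick with the explicit list, a small check can catch the "forgot to add the file" case before the linker does. The following is a rough sketch, not a CMake feature: it naively scans the CMakeLists text for tokens that look like source files and reports files on disk that are not mentioned. It ignores variables and generator expressions, and it assumes paths are written the same way in both places; the suffix set and function names are my own.

```python
import re
from pathlib import Path

# Suffixes treated as source files (an assumption; extend as needed).
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".cxx"}


def listed_sources(cmake_text: str) -> set:
    """Collect every token in the CMake text that looks like a source file."""
    without_comments = re.sub(r"#.*", "", cmake_text)  # drop line comments
    tokens = re.split(r"[\s()\"]+", without_comments)
    return {t for t in tokens if Path(t).suffix in SOURCE_SUFFIXES}


def unlisted_sources(cmake_text: str, files_on_disk: list) -> list:
    """Return source files present on disk but absent from the CMake text."""
    listed = listed_sources(cmake_text)
    return sorted(
        f for f in files_on_disk
        if Path(f).suffix in SOURCE_SUFFIXES and f not in listed
    )
```

Run something like this in CI or a pre-commit hook, feeding it the CMakeLists.txt contents and the relative paths found under the source directory.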

CDash Custom Dynamic Analysis

I'm trying to integrate custom dynamic analysis tools into CDash, such as KWStyle, CppCheck and Visual Leak Detector.
I've figured out that I need to generate a DynamicAnalysis.xml file and submit it to CDash from CTest scripts.
I think I know how to run the external tool as a part of the ctest script.
Either by using these variables to change how ctest_memcheck() works
CTEST_MEMORYCHECK_COMMAND
CTEST_MEMORYCHECK_SUPPRESSIONS_FILE
CTEST_MEMORYCHECK_COMMAND_OPTIONS
or by running the tool from the execute_process() command.
But I'm a bit uncertain which one to use.
The main problem I think I have is, how can I extract errors from the output of the custom tool and include that information into the DynamicAnalysis.xml to submit?
The extreme solution I see is that I'd need to write a program that generates a valid DynamicAnalysis.xml file.
But the problem is that I don't know the syntax of the DefectList element in the XML file. I have found no answer on Google, and even the XML Schema for that file is unhelpful.
EDIT:
Looking at this:
http://www.cdash.org/CDash/viewDynamicAnalysis.php?buildid=987149
What draws my attention are the labels, especially the empty ones. I don't see how these would come from the DynamicAnalysis.xml file. Maybe it tracks any labels that have ever appeared? Can I create my own custom labels somehow?
Does CDash create the labels automatically, depending on the tool type? Does this block custom defect types?
I'm just guessing here, so the question is: can I create custom labels for my custom tool just by generating a DynamicAnalysis.xml file?
It occurred to me that the number of different errors from CppCheck (static code analysis) is huge compared to valgrind, for instance. I'm not that certain that I should use the dynamic analysis. Maybe a custom build type (Continuous / Experimental / Nightly) would work better. Like this:
http://www.cdash.org/CDash/buildSummary.php?buildid=930174
I have no idea how to do this; I guess it requires meddling with the CDash code?
Which one would work better?
If you are using valgrind, you can simply set CTEST_MEMORYCHECK_COMMAND to the full path to valgrind, and ctest will generate the DynamicAnalysis.xml file for you from the valgrind output when you call ctest_memcheck.
The best way to understand the possible values that can appear in the DynamicAnalysis.xml file is to analyze the source code of CTest.
The file CMake/Source/CTest/cmCTestMemCheckHandler.cxx has the list of defect types in a variable named "cmCTestMemCheckResultLongStrings". Search through that file for references to that variable to see what the possible values are and how they are used to generate "<Defect/>" xml elements.
EDIT (for additional information):
You can also easily see what XML elements CDash is expecting by inspecting its source code. Specifically, the file "CDash/xml_handlers/dynamic_analysis_handler.php".
From what I've learned so far, for a tool that runs on the tests defined in the CMake script, Dynamic Analysis is the right fit.
For tools that run on the entire program, a custom Build.xml is what you need.
I found out that I can submit those files with the ctest_submit command by using the FILES parameter.
I also found out that you can add custom "build names" alongside Continuous, Nightly, and the others,
and that you can set builds from certain machines to be automatically placed under these.
The custom labels under DynamicAnalysis did come from somewhere in CDash; I can't remember where anymore.

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
I've run 'size' on it to determine how big the hex file is.
Then 'size' again to see how big each of the object files is that link to create the hex file. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck: there are some functions which I don't call directly (e.g. _vfprintf), and I can't find what calls them so that I can remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As I can see there are functions being called which take up a lot of memory. I cannot however find what is calling it.
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
The linker is working as desired, I think, it only includes the relevant library files. How do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called.
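To make the nm step less manual, here is a small sketch that scans the text output of nm run with -A (which prefixes each line with the file name) and lists the object files that reference a given undefined symbol. It assumes GNU nm's default output format and file names without spaces; the function names are my own.

```python
from collections import defaultdict


def undefined_references(nm_output: str) -> dict:
    """Map each undefined symbol to the set of files that reference it.

    Expects lines such as 'main.o:00000000 T main' and
    'log.o:         U _vfprintf', as printed by 'nm -A'.
    """
    refs = defaultdict(set)
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[-2] != "U":
            continue  # keep only undefined ('U') symbols
        object_file = parts[0].rsplit(":", 1)[0]  # strip the address part
        refs[parts[-1]].add(object_file)
    return refs


def who_references(nm_output: str, symbol: str) -> list:
    """Return the sorted list of files with an undefined reference to symbol."""
    return sorted(undefined_references(nm_output).get(symbol, ()))
```

Feed it the output of nm -A over your object files and the libraries you link statically, then ask for the symbol you want gone, e.g. who_references(output, "_vfprintf"). It only shows direct references, so you may need to repeat the query for the functions it turns up.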
The original question included a sub-question about including only relevant functions. By default, a GCC toolchain will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work, assuming you're using GCC, is to compile with -ffunction-sections and link with -Wl,--gc-sections. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions. With one section per function, the linker can discard the ones that nothing references.
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of the standard "printf()" I use a very simple custom "printf()" that can only print numbers in hexadecimal or decimal format, nothing more. Most data structures are preallocated at compile time.
Andrew EdgeCombe has a great list, but if you really want to scrape every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, a sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference: do you use Thumb instructions? They're 16-bit versions of the normal instructions. Sometimes you might need two 16-bit instructions where one normal instruction would do, so it won't save a full 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler and linker settings to package functions for individual linking.
Ok, so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Then, when I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls... Now it works as desired... sweet!
Must be a better way to do it though.
To answer this specific need:
• I want to omit those functions (if possible) but I can't find what's calling them!! Could be called from any number of library functions I guess.
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine the library dependency tree. Among other things, it lets you easily browse up and down the call tree.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.