How to extract licensing information from a bitbake recipe - automation

I will keep it short. I have been handed a yocto repository and asked to audit it for the licences used by the build. My end goal is to:
List all the licenses used by the distro (i.e. licenses used by all the tools and utilities built with distro)
Get a copy of the license file
Get the URL on the internet, where that licence text can be found. (if someone else wants to compare it with what I have provided them)
Being the lazy "software engineer" that I am, I want to avoid doing this task by hand and just parse all the .bb files to extract that information.
I have seen some recipes, which include headers, which in turn have the license information. It'd be nice to be able to follow the trail.
This project on GitHub looks promising. But might not get me exactly what I need.
I also have the entire source code and the license file text distributed with the source code. I should be able to write a simple script to achieve this, but the text of some licenses doesn't state the type of license itself.
Any pointers will be greatly appreciated.

First of all, you probably want the licenses used in your image, not your distro: you can build all kinds of recipes within any distro, so what matters is only what you ship, which is your image. The way to find out the licenses used by software in an image is already described here, but your question differs a bit in that you also want the full license texts. That's also easy: it's all there in per-package directories in build/tmp/deploy/licenses.
As for your third subquestion, it's not that easy, because even something standard like GPLv2 has little variations from project to project: some have exceptions, some have "(c) $YEARS" written in a different way. So what the OpenEmbedded build system gives you is actually more reliable, as it's extracted from the source. What is possible is to provide the source code itself (via the archiver class) along with the license information; anyone really curious could cross-check sources and licenses that way.
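As a minimal sketch of how the per-package directories mentioned above could be inventoried, the following Python lists each package directory and the license files it contains. It assumes only the layout described in this answer (one sub-directory per package under build/tmp/deploy/licenses); the function name `collect_license_files` is made up for illustration and is not part of any Yocto tooling.

```python
from pathlib import Path


def collect_license_files(licenses_root):
    """Map each per-package directory name to the sorted file names inside it.

    Assumes one sub-directory per package directly under licenses_root;
    files sitting directly in licenses_root are ignored.
    """
    root = Path(licenses_root)
    result = {}
    for package_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(f.name for f in package_dir.iterdir() if f.is_file())
        result[package_dir.name] = files
    return result
```

Calling `collect_license_files("build/tmp/deploy/licenses")` from the top of the build tree would then give a dictionary that can be dumped to CSV or JSON for the audit report.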

You should be able to address 1) and 2) via https://www.yoctoproject.org/docs/1.8/dev-manual/dev-manual.html#maintaining-open-source-license-compliance-during-your-products-lifecycle .

Related

How could I get the complete list of Puppet modules available from the Puppet repo

I'm looking for a way to get the complete list of Puppet modules from the Puppet repo.
As far as I am aware, there is no direct way to get such a list. I would consider inquiring directly of Puppet, Inc., as it's not out of the question that they would be willing to run a one-off special query for you. I'm sure they would want to know what you want to do with the list, though. And, of course, this is by no means a sure thing.
You could also use a bot to screen-scrape the multiple HTML pages of the all-module list, and process the results to extract the list you want. But that would be a lot more work than just asking.
Note well that the Forge contents are not static. New modules are added regularly, and module versions are updated from time to time. I'm uncertain about the policy for removals, but it seems that they generally do not remove modules, but rather deprecate them. In any case, any list of the Forge contents would necessarily be a snapshot from a single point in time.
You should also understand that although the Forge itself is operated by Puppet, most of the modules are contributed and maintained by community members, if they are maintained at all. There is also an unknown but probably large number of modules in use in the world that are not available from the Forge. Thus, the list you are asking for cannot be construed as a list of official modules, nor as a list of all the modules there are.

What is meant by 'listings for your program'?

I am writing a program in Java for a university project, part of the write up report states:
'You must provide listings for your program'
Can anyone provide me with some clarification on what is meant by this?
I have looked high and wide online but nothing I've come across has helped clear this up for me. I found a definition, 'With computer programming, a program listing is the complete listing of a computer program, source code, and all files that make up the software program', but this hasn't helped with my understanding of what is being asked in the report.
Should I be providing screen-grabs of my code? Or a screen-grab of the folder with all related files?
Any help would be appreciated, thanks.
A listing of your program used to mean the code of your program rendered into printed form; i.e. on paper. These days, it could also mean that the source code is formatted and included as a PDF file, or a Word document or something else.
Should I be providing screen-grabs of my code?
It is unclear if that is what your lecturer wants. I don't expect so, because screenshots are harder to read than formatted text.
Or a screen-grab of the folder with all related files?
That is highly unlikely, IMO. If that is what your lecturer wanted they would have said "directory listing" not "listings for your program". (And that would be useless for assessment purposes.)
But my advice is to ask your lecturer if you are at all unclear what is required of you.
And if your lecturer is unwilling to explain, just do what you think is correct.
What you found is correct: you need to provide the source code and any other resources needed to build and run the software.
One option could be to:
- pack your project with some build manager (maven, gradle, etc)
- push it to some repository (like github) with a README.md for building and running
- give the github project reference.
If you prefer not to make the code public, just pack it and send an archive with the maven project.
They are looking for a nice printed output of your source code. In the olden days (pre-2K), compilers would produce a well-formatted output, often with a list of symbols and line numbers to aid in understanding the code.

Software configuration management tool for hundreds of binary files, many are large

Note: I've tried searching, but Stack Overflow's near useless. I am not sure what kind of tool I need.
At my organization we need to keep track of the software configuration for many types of computers, including the binary installers and automation scripts. Change is infrequent, but the size of the latest version of the configuration is several gigs.
We are trying to use Mercurial to store changes but it is just too slow, even without many revisions at all. I did an hg status but killed it after it took 10 minutes without finishing.
We are looking for a way to store the current configuration as well as having the old configurations there just in case. I have never done anything like this before and do not know what tools are available or even suitable for such tasks. Can someone point me in the right direction or tell me how they are solving this problem? Thanks
Since hard disk space is cheap and being able to view binary differences isn't very helpful, perhaps the best option you have is to store each configuration in a new directory that is indexed somehow. Example below:
/software/configs/2009-03-15
/software/configs/2009-09-28
/software/configs/2009-09-30
Given the size of your files and the infrequent number of changes, this would allow you to pick a configuration from a given 'tag' without the overhead of revision control.
If you pack your files into a single tar file and generate a SHA-512 hash, then you can be reasonably sure that no one has tampered with your files since they were archived.
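The tar-and-hash step could be sketched like this in Python, using only the standard library. The function names and the choice of an uncompressed tar are assumptions for illustration; the archive should be written outside the directory being packed so that it does not include itself.

```python
import hashlib
import tarfile
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks to cope with large files."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_config(config_dir, archive_path):
    """Pack config_dir into an uncompressed tar file and return the archive's SHA-512 digest."""
    config_dir = Path(config_dir)
    with tarfile.open(archive_path, "w") as tar:
        # Store paths relative to the dated directory name, e.g. 2009-09-30/...
        tar.add(config_dir, arcname=config_dir.name)
    return sha512_of(archive_path)
```

Recording the returned digest somewhere separate from the archive (a log, a wiki page, a signed file) is what makes a later `sha512_of` check meaningful.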
While I don't know specific details about how to implement this strategy in mercurial, I have been working with git and git-fat. It sets up a general procedure that is likely to be feasible on mercurial as well. Basically the idea is whenever you add a binary file to the repository, under the hood, the repo creates a symlink to the file that is actually stored in another location as a checksummed object.
This allows large files to be tracked by the repo, without storing the actual data inside. It requires the data to be stored in some other location (perhaps in a binary management system).
It might take some configuration to do it in mercurial, but I think it's an elegantly simple solution.

Best approach to perform a CMMI Physical Configuration Audit?

The organization I currently work for is moving into the whole CMMI world of documenting everything. I was assigned (along with one other individual) the title of Configuration Manager. Congratulations to me, right?
Part of the duties is to perform, on a regular basis (they are still defining "regular basis"; it will be either quarterly or monthly), a physical configuration audit. This is basically a check of the source code versions deployed in production against what we believe to be the source code versions in production.
Our project is a relatively small web application written in Java. The file types we work with are java, jsp, xml, property files, and sql packages.
The problem I have (and have expressed, but which seems to be ignored) is this: how am I supposed to physically log on to the production server and verify file versions? Even if I could, it would take a ridiculous amount of time.
The file versions are not even currently in the files (i.e. in a comment or something). It was also suggested that we place visible version numbers on each screen that is visible to the users. I thought this ridiculous too, since the screens themselves represent only a small fraction of the code we maintain.
The tools we currently use are Netbeans for our IDE and Serena Dimensions as our versioning tool.
I am specifically looking for ideas on how to perform this audit in a hopefully more automated way, that will be both accurate and not time consuming.
My current idea is to add a comment to the top of each file containing the version number of that file, plus a script that runs when a production build is created and writes an XML file (or something similar) containing the file name and version number of each file in the build. Then, when I need to do an audit, I go to the production server, grab the XML file with the info, compare it programmatically to what we believe to be in production, and output a report.
Any better ideas? I know this has to have been done already, and it seems crazy to me that I have not found any other resources.
You could compute a SHA1 hash of the source files on the production server, and compare that hash value to the versions stored in source control. If you can find the same hash in source control, then you know what version is in production. If you can't find the same hash in source control, then there are untracked modifications in production and your new job title is justified. :)
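A rough sketch of that hash comparison in Python follows. It assumes you can get two directory trees side by side: a checkout of the release you believe is deployed, and a copy (or mount) of the production files. The function names `sha1_manifest` and `audit` are illustrative only and not tied to Serena Dimensions or any other tool.

```python
import hashlib
from pathlib import Path


def sha1_manifest(root):
    """Return {relative path: SHA-1 hex digest} for every file under root."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            manifest[key] = hashlib.sha1(path.read_bytes()).hexdigest()
    return manifest


def audit(expected, deployed):
    """Compare two manifests and report modified, missing, and untracked files."""
    return {
        "modified": sorted(p for p in expected
                           if p in deployed and expected[p] != deployed[p]),
        "missing": sorted(p for p in expected if p not in deployed),
        "untracked": sorted(p for p in deployed if p not in expected),
    }
```

An audit then amounts to `audit(sha1_manifest(release_checkout), sha1_manifest(production_copy))`; three empty lists mean production matches the release. The manifest for the release side could also be generated once at build time and stored with the build.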
The typical trap organizations fall into with the CMMI is trying to overdo everything. If I could suggest anything, it'd be to start small and only do what you need. So consider any problems that you may have had in the CM area previously.
The CMMI describes WHAT an organisation should do, but leaves the HOW up to you. Chapter 2 of the CMMI specification is well worth a read: it describes the required, expected, and informative components of the specification. Basically, the goals are required, the practices are expected, and everything else is informative. This means there is only a small part of the specification which a CMMI appraiser can directly demand: the goals. At the practice level, it is permissible to have either the practices as described or acceptable alternatives to them.
In the case of configuration audits, goal SG3 is "Integrity of baselines is established and maintained". SP3.2 says "Perform configuration audits to maintain integrity of the configuration baselines." There is nothing stated here about how often these are done, or how long they may take.
In my previous organisation, FCA/PCA was usually only done as part of the product release process, and we used ClearCase as the versioning tool, with labels applied across the codebase to define baselines. We didn't have version numbers in all the source files, nor did we have version numbers on all the product's screens. The CM activity was doing the right thing and was backed up by audits, and this was never an issue in any CMMI appraisal.
We could use the deltas between labels to look at what files had changed, perform diffs to see the actual code changes. An important part of the process is being able to link those changes back to either a requirement/bug report/whatever the reason was which initiated the change.
Our auditing did use scripts to automate the process, but these were in-house scripts specific to ClearCase. Basically, they would list all the files, their versions in the CM system, and the baseline/config item to which they belonged.
Can't you use your source control for this? If you deploy a version and tag your source control with that deployment, you can then verify against the source control system.

Batch source-code aware spell check

What is a tool or technique that can be used to perform spell checks upon a whole source code base and its associated resource files?
The spell check should be source code aware meaning that it would stick to checking string literals in the code and not the code itself. Bonus points if the spell checker understands common resource file formats, for example text files containing name-value pairs (only check the values). Super-bonus points if you can tell it which parts of an XML DTD or Schema should be checked and which should be ignored.
Many IDEs can do this for the file you are currently working with. The difference in what I am looking for is something that can operate upon a whole source code base at once.
Something like a Findbugs or PMD type tool for mis-spellings would be ideal.
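To make the "source code aware" requirement concrete, here is a minimal sketch that checks only string literals and ignores identifiers and comments. It is an illustration under stated assumptions, not an existing tool: it handles Python source via the standard `tokenize` module, and the `dictionary` argument is a hypothetical set of known lowercase words that you would load from a word list.

```python
import ast
import io
import re
import tokenize

WORD = re.compile(r"[A-Za-z]+")


def string_literals(source):
    """Yield (line number, text) for each plain string literal in Python source."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        try:
            value = ast.literal_eval(tok.string)
        except (ValueError, SyntaxError):
            continue  # e.g. f-strings on older Python versions; skipped in this sketch
        if isinstance(value, str):  # ignore bytes literals
            yield tok.start[0], value


def misspellings(source, dictionary):
    """Return (line number, word) pairs for literal words not found in dictionary."""
    found = []
    for line, text in string_literals(source):
        for word in WORD.findall(text):
            if word.lower() not in dictionary:
                found.append((line, word))
    return found
```

Running `misspellings` over every file in a source tree would give the batch behaviour asked for; other languages and resource formats would need their own literal extractors.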
As you mentioned, many IDEs have this functionality already, and one such IDE is Eclipse. However, unlike many other IDEs Eclipse is:
A) open source
B) designed to be programmable
For instance, here's an article on using Eclipse's code formatting functionality from the command line:
http://www.peterfriese.de/formatting-your-code-using-the-eclipse-code-formatter/
In theory, you should be able to do something similar with its spell-checking mechanism. I know this isn't exactly what you're looking for, and if there is a program for doing spell-checking in code then obviously that'd be better, but if not then Eclipse may be the next best thing.
This seems a little old but seems to do a good job:
Source Code Spell Checker