Best approach to perform a CMMI Physical Configuration Audit? - configuration-management

The organization I currently work for is moving into the whole CMMI world of documenting everything. I was assigned (along with one other individual) the title of Configuration Manager. Congratulations to me, right?
Part of the duties is to perform a physical configuration audit on a regular basis (they are still defining "regular basis"; it will either be quarterly or monthly). This is basically a check of the source code versions deployed in production against what we believe to be the source code versions in production.
Our project is a relatively small web application written in Java. The file types we work with are Java, JSP, XML, property files, and SQL packages.
The problem I have (and have expressed, but it seems to be ignored) is: how am I supposed to physically log on to the production server and verify file versions? And even if I could, it would take a ridiculous amount of time.
The file versions are not even currently in the files (i.e. in a comment or something). It was suggested that we also place visible version numbers on each screen shown to the users. I thought this was ridiculous too, since the screens themselves represent only a small fraction of the code we maintain.
The tools we currently use are NetBeans as our IDE and Serena Dimensions as our versioning tool.
I am specifically looking for ideas on how to perform this audit in a more automated way, one that is both accurate and not time consuming.
My current idea is to add a comment to the top of each file containing that file's version number, plus a script that runs when a production build is created and generates an XML file (or something similar) listing the file name and version of each file in the build. Then, when I need to do an audit, I go to the production server, grab the XML file, compare it programmatically to what we believe to be in production, and output a report.
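A rough sketch of that build-step idea, assuming each source file carries a header comment such as "Version: 1.4" (the comment format, the extension list, and the build-manifest.xml name are all placeholders, not anything your build currently produces):

import os
import re
import xml.etree.ElementTree as ET

# Hypothetical header comment, e.g. "// Version: 1.4" or "<!-- Version: 2.0 -->"
VERSION_RE = re.compile(r"Version:\s*([\w.\-]+)")
EXTENSIONS = (".java", ".jsp", ".xml", ".properties", ".sql")

def build_manifest(source_root, manifest_path):
    """Walk the build's source tree and record each file's declared version."""
    root = ET.Element("manifest")
    for dirpath, _dirs, files in os.walk(source_root):
        for name in sorted(files):
            if not name.endswith(EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "r", errors="ignore") as fh:
                head = fh.read(2048)  # the version comment should be near the top
            match = VERSION_RE.search(head)
            entry = ET.SubElement(root, "file")
            entry.set("path", os.path.relpath(path, source_root))
            entry.set("version", match.group(1) if match else "UNKNOWN")
    ET.ElementTree(root).write(manifest_path, encoding="utf-8", xml_declaration=True)

if __name__ == "__main__":
    # Run as the last step of the production build and ship the manifest with the release.
    build_manifest("src", "build-manifest.xml")

At audit time you would pull the manifest from production and diff it against the manifest generated from the baseline you believe was deployed.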
Any better ideas? I know this has to have been done before, and it seems crazy to me that I have not found any other resources.

You could compute a SHA-1 hash of each source file on the production server and compare those hashes against the file versions stored in source control. If you can find the same hash in source control, then you know which version is in production. If you can't find the same hash in source control, then there are untracked modifications in production and your new job title is justified. :)
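A minimal sketch of that comparison, assuming you can export the baseline you believe is in production from source control into a local directory (the "production" and "baseline" directory names are placeholders):

import hashlib
import os

def sha1_of(path):
    """Return the SHA-1 digest of a single file."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def hash_tree(root):
    """Map relative path -> SHA-1 for every file under root."""
    return {
        os.path.relpath(os.path.join(dirpath, name), root): sha1_of(os.path.join(dirpath, name))
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    }

if __name__ == "__main__":
    # "production" is a copy of what is deployed; "baseline" is the export from source control.
    prod, base = hash_tree("production"), hash_tree("baseline")
    for path in sorted(set(prod) | set(base)):
        if path not in base:
            print("UNTRACKED IN PRODUCTION:", path)
        elif path not in prod:
            print("MISSING FROM PRODUCTION:", path)
        elif prod[path] != base[path]:
            print("CONTENT MISMATCH:", path)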

The typical trap organizations fall into with the CMMI is trying to overdo everything. If I could suggest anything, it'd be to start small and only do what you need. So consider any problems that you may have had in the CM area previously.
The CMMI describes WHAT an organisation should do, but leaves the HOW up to you. The CMMI specification, chapter 2, is well worth a read - it describes the required, expected, and informative components of the specification: basically, the goals are required, the practices are expected, and everything else is informative. This means there is only a small part of the specification that a CMMI appraiser can directly demand - the goals. At the practice level, it is permissible to have either the practices as described or acceptable alternatives to them.
In the case of configuration audits, goal SG3 is "Integrity of baselines is established and maintained". SP3.2 says "Perform configuration audits to maintain integrity of the configuration baselines." There is nothing stated here about how often these are done, or how long they may take.
In my previous organisation, FCA/PCA was usually only done as part of the product release process. We used ClearCase as the versioning tool, with labels applied across the codebase to define baselines. We didn't have version numbers in all the source files, nor did we have version numbers on all the product's screens - the CM activity was doing the right thing and was backed up by audits, and this was never an issue in any CMMI appraisal.
We could use the deltas between labels to see which files had changed and perform diffs to see the actual code changes. An important part of the process is being able to link those changes back to the requirement, bug report, or whatever the reason was that initiated the change.
Our auditing did use scripts to automate the process, but these were in-house scripts specific to ClearCase - basically they would list all the files, their versions in the CM system, and the baseline/config item to which they belonged.

Can't you use your source control for this? If you tag your source control with each deployment when you deploy a version, you can then verify what's in production against the source control system.

Optionally leave old version of component on upgrade

I've been trying to set up a WiX component such that the user can specify that the installer should not upgrade that component on a MajorUpgrade. I had the following code, but this means that if the condition is met then the new version is not installed, but the old version is also removed.
<Component Id="ExampleComponent" Guid="{GUID here}">
    <Condition>NOT(KEEPOLDFILE="TRUE")</Condition>
    <File Id="ExampleFile" Name="File.txt" KeyPath="yes" Source="File.txt"/>
</Component>
Ideally, if the user specifies "KEEPOLDFILE=TRUE", then the existing version of "File.txt" should be kept. I've looked into using the Permanent attribute, but this doesn't look relevant.
Is this possible to achieve without using CustomActions?
A bit more background information would be useful, however:
If your major upgrade is sequenced early (e.g. afterInstallInitialize) the upgrade is an uninstall followed by a fresh install, so saving the file is a tricky proposition because you'd save it, then do the new install, then restore it.
If the upgrade is late, then file overwrite rules apply during the upgrade, therefore it won't be replaced anyway. You'd need to do something such as make the creation and modify timestamps identical so that Windows will overwrite it with the new one. The solution in this case would be to run a custom action conditioned on "keep old file", so you'd do the reverse of this:
https://blogs.msdn.microsoft.com/astebner/2013/05/23/updating-the-last-modified-time-to-prevent-windows-installer-from-updating-an-unversioned-file/
And it's also not clear if that file is ALWAYS updated, so if in fact it has not been updated then why bother to ask the client whether to keep it?
It might be simpler to ignore the Windows Installer behavior by setting its component id to null, as documented here:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa368007(v=vs.85).aspx
Then you can do what you want with the file. If you've already installed it with a component guid it's too late for this solution.
There are better solutions that require the app to get involved where you install a template version of this file. The app makes a copy of it that it always uses. At upgrade time that template file is always replaced, and when the app first runs after the upgrade it asks whether to use the new file (so it copies and overwrites the one it was using) or continue to use the existing file. In my opinion delegating these issues to the install is not often an optimal solution.
Setting attributes like Permanent is typically not a good idea because they are not project attributes you can turn on and off on a whim - they apply to that component id on the system, and permanent means permanent.
I tried to make this a comment, but it became too long. I prefer option 4 that Phil describes. Data files should not be meddled with by the setup, but managed by your application exe (if there is one) during its launch sequence. I don't know about others, but I feel like a broken record repeating this advice, so hear us out...
There is a description of a way to manage your data file's overwriting or preservation here. Essentially you update your exe to be "aware" of how your data file should be managed - if it should be preserved or overwritten, and you can change this behavior per version of your application exe if you like. The linked thread describes registry keys, but the concept can be used for files as well.
So essentially:
Template: Install your file per-machine as a read-only template
Launch Sequence: Copy it into place with the application.exe launch sequence (see the sketch after this list)
Complex File Revision: Update the logic for file overwrite or preservation for every release as you see fit, along the lines the linked thread proposes
Your setup will "never know" about your data file, only the template file. It will leave your data file alone in all cases and deal only with the template file.
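The launch-sequence piece can be as small as "copy the read-only template into the user's data area if no working copy exists yet". A sketch of that logic in Python (your application would do this in whatever language it is written in; the folder and file names here are made up):

import os
import shutil

# Hypothetical locations: the read-only template installed per-machine by the MSI,
# and the writable per-user working copy the application actually uses.
TEMPLATE = r"C:\Program Files\MyApp\Templates\File.txt"
USER_COPY = os.path.join(os.environ["APPDATA"], "MyApp", "File.txt")

def ensure_data_file(force_refresh=False):
    """Copy the template into place on first run; optionally refresh it after an upgrade."""
    os.makedirs(os.path.dirname(USER_COPY), exist_ok=True)
    if force_refresh or not os.path.exists(USER_COPY):
        shutil.copyfile(TEMPLATE, USER_COPY)
    return USER_COPY

if __name__ == "__main__":
    # After an upgrade the app could prompt the user and pass force_refresh=True
    # to overwrite the working copy with the new template.
    ensure_data_file()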
Liberating your data files from the setup has many advantages:
Setup.exe bugs: No unintended, accidental file overwrites or file-reset problems from a problematic major upgrade, etc. This is a very common problem with MSI.
Setup bugs are hard to reproduce and debug since the conditions found on the target systems can generally not be replicated and debugging involves a lot of unusual technical complexity.
This is not great - it is messy - but here is a list of common MSI problems: How do I avoid common design flaws in my WiX / MSI deployment solution? - "a best effort in the interest of helping sort of thing". Let's be honest, it is a mess, but maybe it is helpful.
Application.exe Bugs: Keep in mind that you can make new bugs in your application.exe file, so you can still see errors - obviously. Bad ones too - if you are not careful - but you can easily implement a backup feature as well - one that always runs in a predictable context.
You avoid the complicated sequencing, conditioning and impersonation concerns that make custom actions and modern setups so complicated to do right and make reliable.
Following from that and other, technical and practical reasons: it is much easier to debug problems in the application launch sequence than bugs in your setup.
You can easily set up test conditions and test them interactively. In other words you can re-create problem conditions easily and test them in seconds. It could take you hours to do so with a setup.
Error messages can be interactive and meaningful and be shown to the user.
QA people are more familiar with testing application functionality than setup functionality.
And I repeat it: you are always in the same impersonation context (user context) and you have no installation sequence to worry about.

Source control in SSIS and Concurrent work on dtsx file

I am working on building a new SSIS project from scratch. I want to work on it with a couple of my teammates. I was hoping to get a suggestion on how we can set up some source control so that a few of us can work concurrently on the same SSIS project (same dtsx file, building new packages).
Version:
SQL Server Integration Services v11
Microsoft Visual Studio 2010
It is my experience that there are two opportunities for any source control system and SSIS projects to get out of whack: adding new items to the project and concurrent changes to an existing package.
Adding new items
An SSIS project has the .dtproj extension. Inside there, it's "just" XML defining what all belongs to the project. At least for 2005/2008 and 2012+ on the package deployment model. The 2012+ project deployment model carries a good bit more information about the state of the packages in the project.
When you add new packages (or project-level connection managers or .biml files), the internal structure of the .dtproj file is going to change. Diff tools generally don't handle merging XML well - or at all, really. So, to prevent the need for merging the project definition, you need to find a strategy that works for your team.
I've seen two approaches work well. The first is to define up front all the packages you think you'll need: DimFoo, DimDate, DimBar, FactBlee. Check that project and the associated empty packages in, and everyone works on what is out there. When the initial cut of packages is complete, you make sure everyone is synced up and then add more empty packages to the project. The idea here is that there is one person, usually the lead, who is responsible for changing the "master" project definition, and everyone consumes from their change.
The other approach requires communication between team members. If you discover a package needs to be added, communicate with your mates: "I need to add a new package - has anyone modified the project?" The answer should be no. Once you've notified them that a change to the project definition is coming, make it and immediately commit it. The idea here is that people commit and sync (check in, whatever your terminology) with great frequency. If you as a developer don't keep your local repository up to date, you're going to be in for a bad time.
Concurrent edits
Don't. Really, that's about it. The general problem with concurrent changes to an SSIS package is that, in addition to the XML diff issue above, SSIS also stores layout data alongside tasks - I can invert the layout and make things flow from bottom to top or right to left with no material change to the SSIS package. And as Siyual notes, "Merging changes in SSIS is nightmare fuel."
If you find your packages are so large and that developers need to make concurrent edits, I would propose that you are doing too much in there. Decompose your packages into smaller, more tightly focused units of work and then control their execution through a parent package. That would allow a better level of granularity to your development and debugging process in addition to avoiding the concurrent edit issue.
A dtsx file is basically just an XML file. Compare it to a bunch of people trying to write the same book. The solution I suggest is to use Team Foundation Server as source control; that way everyone can check in and out and merge packages. If you really don't have that option, try to split your ETL process into logical parts and at the end create a master package that calls each sub-package in the right order.
An example: let's say you need to import stock data from one source, branches and other company information from an internal server, and sale amounts from different external sources. After you have gathered all the information, you want to connect it and run some analyses.
You first design the target database entities you need and their relations. One of your members creates a package that does all the imports to staging tables. Another maybe handles the external sources and parallelizes/optimizes the loading. You would build a package that merges your staging and production tables, maybe with historization, and so on.
At the end you have a master package that calls each of the mentioned packages and maybe some additional logging or such.
In our multi-developer operation, we follow this rough plan:
Each dev has their own branch, separate from master branch
Once a week, devs push all their changes to remote
One of us pulls all changes, and merges all branches into master, manually resolving .dtproj conflicts as we go
Merge master into all dev branches - now all branches agree
Test in VS
Push all branches to remote, other devs can now pull and keep working
It's not a perfect solution, but it helps quarantine the amount of merge pain we have to experience.
We have large SSIS solutions with 20+ packages in one solution, with TFS Git. One project required adding a bunch of new packages to the existing solution. We thought we were smart and knew to assign only one person to work on each new package - two people working on the same package would be suicide. That wasn't good enough. When two people tried to add differently named new packages at the same time, each showed dtproj as a file that had changed and needed to be checked in, and suddenly I found myself looking at the XML for dtproj and trying to figure out which lines to keep (Microsoft should never ask end users to manually edit internal files that only they wrote and understand). Billinkc's solutions here are very good, and the problem is very real. You may think that Microsoft is the great Wise One and that your team can always add new packages to an existing solution without conflicts, but you'd be wrong. It also doesn't work to put dtproj in .gitignore. If you do that, you won't see other people's new packages (the .dtsx file will actually come down in Git, but you won't see that package in Solution Explorer, because dtproj is what feeds Solution Explorer). This is a current problem (2021) and we are using Visual Studio 2017 Enterprise with SSDT.
To explain this problem: Git can obviously handle a group of independent, individual files in a directory (say, .bat files) and can add, change, and delete those files easily. The problem comes in when you have a file that names, describes, and counts all the files in a directory (which is what dtproj does). With a file like dtproj, two people adding a new package at the same time create a conflict on dtproj itself. Your dtproj file has a line that shows the package you added, my dtproj file shows the package I added, and TFS/Git sees that as a conflict.
Some are suggesting ways to deal with this if you have to add a lot of new packages, my idea is a little different. For the people who have to add new packages, don't work in the primary solution where this problem is, work somewhere else. Probably best to work in the "Projects" directory you get when you install Visual Studio, outside of TFS/Git. Obviously follow all the standards, Variable naming, and Package Configuration conventions for the target Solution. Then when the new packages are ready, give the .dtsx files to your Solution Gatekeeper for them to check in. Only the Gatekeeper can check in new packages using Add From Existing, avoiding conflicts. Once the package is checked in, developers can work on them in the main Solution.

What is the MSI component generation best practice?

Visual Studio Installer states that it is a best practice to install each file as its own installer component. The heat utility provided with WiX also seems to follow the practice of putting every file in its own component.
InstallShield's component wizard follows InstallShield's setup best practice of placing portable executable files in their own components but grouping all other files (e.g. unversioned files) by common destination folder.
The advantage of practice one (each file in its own component) is that each file is set up as a key file, which is important if you want these files to trigger repairs. It also makes automating component creation (e.g. with heat) easier, since you are simply creating a component per file.
The disadvantages of practice one include the overhead of managing so many components and the bloating of the registry after the application is installed.
An advantage of practice two can be seen in an install that puts hundreds of graphics files into one directory. If you do not care about repair functionality, is there any reason to create hundreds of components for this install?
These two practices conflict, and I want to know which one people actually use and why.
I always use the Microsoft approach (something similar to what InstallShield does):
http://msdn.microsoft.com/en-us/library/aa368269(VS.85).aspx
I think it's the best because:
- important files (EXE, DLL etc.) have their own component, so they can be repaired easily
- resource files are grouped together
- it allows an optimal component count (not so many that the install takes a long time, but enough to allow easy repair)
I also noticed that most commercial setup authoring tools use this approach.
I've written about this in the past and I'll try to find a link to it. I think you already understand the question and it's just time for you to decide what is important to you.
For me, I work on installs with 15,000+ files and we only service with major upgrades. For "Program Executables" we follow 1:1 principles (a must for COM, services, shortcuts and so on anyway), but for content/data files we actually take a one-to-many, no-key-file approach to cut down on our number of components. Sure, that means we won't be able to create an MSP that services just one or two content files here and there, but for our business needs that's simply not important to us.
Resiliency was a bit of a four-letter word to us, so having fewer key files makes us happier anyway. :-) BTW, VDPROJ also makes every registry key the key file of its own component, and that was quite painful for us, triggering unneeded repairs.
All of this aside, for anyone who doesn't fully understand all of this, I'd stick to the 1:1 pattern until you come across a situation where you don't want to anymore and you understand the impact of making that choice.

What is your review process for Rhapsody development?

My team is using the IBM's Rhapsody tool to do real-time embedded development. Unfortunately, we are unhappy with our current review process.
More specifically, we've had difficulty because:
there is a lack of a good diff tool for diagram changes
the Rhapsody diff tool doesn't generate reports that you can use in a review
source file history is spotty because source files are generated products in MDD and thus not configured in a VCS at a fine granularity
running diffs on source code sometimes pulls in unrelated changes made by other devs
sometimes changing a property of a model element changes dozens of source files
it's easy to change a source file through a property change and not know it
Does anyone have any tips for making peer reviews on Rhapsody development robust but low-hassle? Any best practices and lessons learned you would like to share? I'm not looking for a mature process write-up; tidbits I didn't know about would be great.
We use Rhapsody for the same purpose at my workplace. Reviews of model changes are done with a script that opens diffmerge on two copies of our repository (one at the start of the changes, one at the latest). That shows all of the pertinent changes, without any of the internal cruft Rhapsody adds.
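For what it's worth, that kind of script boils down to exporting the model at the two revisions of interest and pointing a folder-diff tool at the results. A hedged sketch, assuming a Git repository and a diffmerge executable on the PATH (both are assumptions - substitute your own VCS commands and diff tool):

import os
import subprocess
import tempfile
import zipfile

def export_revision(rev, dest):
    """Export the repository contents at `rev` into `dest` using `git archive`."""
    archive = os.path.join(dest, "_snapshot.zip")
    subprocess.check_call(["git", "archive", "--format=zip", "-o", archive, rev])
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    os.remove(archive)

def review(base_rev, head_rev):
    before = tempfile.mkdtemp(prefix="model_before_")
    after = tempfile.mkdtemp(prefix="model_after_")
    export_revision(base_rev, before)
    export_revision(head_rev, after)
    # Point the folder-diff tool at the two snapshots; reviewers see only real model changes.
    subprocess.call(["diffmerge", before, after])

if __name__ == "__main__":
    review("origin/master", "HEAD")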
Our repo doesn't track the generated sources, but we see plenty of irrelevant changes in Rhapsody's sbs files frequently. We've started setting sbs files as read-only on the filesystem, and then changing them to read/write from the properties panel in Rhapsody. That doesn't stop the files you mark as read/write from having cruft inserted, but it prevents unrelated files from being modified.
I still haven't found a way to make Rhapsody stop inserting irrelevant changes (for example: it sometimes adds and removes filename fields between saves, despite minimal changes to the model). It creates a lot of merge conflicts, and I've personally started taking 5 or so minutes per commit to only add the changes that matter.
We have been using Rhapsody for development for the past 5 years. Our current process involves using the Rhapsody COM interface and the Microsoft Word COM interface to dump review packages to Word for design reviews. We also do this to generate the reference manual portion of our SUM.
For code we review the generated source.
We put the model into our version control system, and lock down model elements after they have been reviewed. If your version control tool makes things read only when they are checked in, it prevents you from accidentally changing a model element.
The COM interface is also good for dumping the model to make PowerPoint slides of diagrams if you want to present your design to a customer. You will have to tweak the slides after they are generated, as the pictures usually end up looking a little funny, but it gives a quick starting point.
It is also possible to prevent Rhapsody from writing timestamps to the sbs files by setting the property CG::General::IncrementalCodeGenAcrossSession to false. This can help reduce the amount of unnecessary data.
See this link

Software configuration management tool for hundreds of binary files, many are large

Note: I've tried searching, but Stack Overflow's search is near useless. I am not sure what kind of tool I need.
At my organization we need to keep track of the software configuration for many types of computers, including the binary installers and automation scripts. Change is infrequent, but the latest version of the configuration is several gigabytes.
We are trying to use Mercurial to store changes, but it is just too slow, even without many revisions at all. I ran hg status but killed it after it went 10 minutes without finishing.
We are looking for a way to store the current configuration as well as keep the old configurations around just in case. I have never done anything like this before and do not know what tools are available or even suitable for such tasks. Can someone point me in the right direction or tell me how they are solving this problem? Thanks.
Since hard disk space is cheap and being able to view binary differences isn't very helpful, perhaps the best option you have is to store each configuration in a new directory that is indexed somehow. Example below:
/software/configs/2009-03-15
/software/configs/2009-09-28
/software/configs/2009-09-30
Given the size of your files and the infrequent number of changes, this would allow you to pick a configuration from a given 'tag' without the overhead of revision control.
If you pack your files into a single tar file and generate a SHA-512 hash, then you can be reasonably sure that no one has tampered with your files since they were archived.
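A small sketch of that pack-and-fingerprint step, assuming the dated directory layout shown above (the paths are illustrative):

import hashlib
import tarfile

def archive_config(config_dir, archive_path):
    """Pack a configuration directory into a tar file and record its SHA-512."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(config_dir, arcname=".")
    digest = hashlib.sha512()
    with open(archive_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    with open(archive_path + ".sha512", "w") as out:
        out.write(digest.hexdigest() + "\n")
    return digest.hexdigest()

if __name__ == "__main__":
    # Later, re-hash the tar file and compare it against the stored .sha512
    # to confirm nothing has been tampered with since it was archived.
    archive_config("/software/configs/2009-09-30", "/software/archives/2009-09-30.tar")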
While I don't know the specific details of how to implement this strategy in Mercurial, I have been working with git and git-fat, which sets up a general procedure that is likely to be feasible in Mercurial as well. Basically, whenever you add a binary file to the repository, under the hood the repo creates a symlink to the file, which is actually stored in another location as a checksummed object.
This allows large files to be tracked by the repo, without storing the actual data inside. It requires the data to be stored in some other location (perhaps in a binary management system).
It might take some configuration to do it in mercurial, but I think it's an elegantly simple solution.