How can I find out what's causing differences in generated Sandcastle docs? - documentation

In Noda Time, we generate our documentation using Sandcastle and SHFB. We then commit the documentation back into the source repository - primarily because that makes it easy to view the latest (and historical) docs.
I'm the primary developer for the project, but I use two computers - and unfortunately, at the moment they're building different documentation even though they're both updated to the same source.
The two computers are the same in every important way I can think of:
Sandcastle 2.7.2.0
SHFB 1.9.6.0
VS 2012 Professional (both reported version 11.0.50727.1 in "Programs", both "Version 11.0.51106.01 Update 1" in the "About" page)
Latest version of local help content for .NET Framework 4.5 (and no local help content for other framework versions)
Steps taken to ensure a clean build:
Deleted the SHFB cache folder (C:\Users\Jon\AppData\Local\EWSoftware\Sandcastle Help File Builder\Cache)
Deleted the folder the documentation is generated into
Deleted the user settings file related to the SHFB project file
Deleted the symbol cache in Visual Studio
Still the differences remain. They appear to be limited to documentation inherited from MSDN itself, in particular Object.Finalize.
Version 1 (generated on machine "Chubby"):
<div class="summary">Allows an object to try to free resources and perform
other cleanup operations before it is reclaimed by garbage collection.</div>
Version 2 (generated on machine "Sandy"):
<div class="summary">Allows an <a
href="http://msdn2.microsoft.com/en-us/library/e5kfa45b" target="_blank">
Object</a> to attempt to free resources and perform other cleanup operations
before the <a href="http://msdn2.microsoft.com/en-us/library/e5kfa45b"
target="_blank">Object</a> is reclaimed by garbage collection.</div>
Both link to the same MSDN documentation, which looks like version 1 (no links to Object).
Looking at a few of the changed files, the change is consistent and restricted to this member.
Where might Sandcastle be getting this documentation from, and how can I get both computers to behave the same way?
EDIT: One more fragment of information - after cleaning the cache and rebuilding the docs on both machines, there are three files in the SHFB Cache directory:
Reflection.cache has the same size on both machines
MsdnUrl.cache has the same size on both machines
.NETFramework_4.0.0319_E8879A28.cache has size 13,377,733 bytes on Chubby, and 13,337,949 bytes on Sandy
EDIT: Significant progress! I've found where the difference is probably coming from...
The file c:\Windows\Microsoft.NET\Framework\v2.0.50727\en\mscorlib.xml:
On Chubby is 8,005,263 bytes with a date of 12th December 2011, and has the non-linked text for Finalize
On Sandy is 9,740,370 bytes with a date of 31st August 2009, and has the text for Finalize which includes links
On both machines, mscorlib.dll itself is the same size (4,550,656 bytes) and has a modified date of 13th September 2012.
But how can I get them to be the same? Where does that difference come from? (Service packs?)
EDIT: Okay, the version in c:\Windows was a red herring - it's the version in c:\Program Files (x86)\Reference Assemblies\Microsoft\Framework that's to blame. I'm going to see if I can find out why that might be different between installations...

A couple of ideas considering your recent edits, although I agree it is a bit of a shot in the dark...
I would use a tool like "Beyond Compare" to compare the .Net Framework files and XML files on both machines ("folder compare" profile).
Favour the binary level comparison to be perfectly sure... if both of your machines are local, it should be very fast.
You can also try to run Mark Russinovich's Process Monitor ( http://live.sysinternals.com/procmon.exe ) on both machines and run the documentation building process.
This way, you will see which files are being read from and involved in the help file building process, and where they are coming from...
You will get a lot of output as it will show everything that happens in your system; you may want to disable registry and network monitoring, to only leave file monitoring, and also exclude any process unrelated to the documentation building process.
I'm not a help generation expert, but I would think that the text comes from the XML files, so you may also want to add a filter that shows only the XML files.
If you can identify the files involved, then you might just have to copy them from one machine over to the other.
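If you don't have a comparison tool to hand, the binary comparison can be approximated with a small script. This is only a minimal sketch: the function names are made up for illustration, and it assumes you run it on each machine and then compare the two results.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, pattern="*.xml"):
    """Map each matching file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hash_file(path)
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }


def diff_manifests(left, right):
    """Return (only_left, only_right, changed) as sorted lists of relative paths."""
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    changed = sorted(p for p in set(left) & set(right) if left[p] != right[p])
    return only_left, only_right, changed
```

You could call build_manifest on the Reference Assemblies folder mentioned in the question on each machine, save the result (as JSON, say), and feed both into diff_manifests to list the XML files that differ.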

Related

Quick backup system for large projects

I've always backed up all my source code into .zip files, put them on my USB drive and uploaded them to a server of mine somewhere else in the world. However, I only do this once every two weeks, because my project is fairly big.
Right now my project directories (I have a few of them) contain a hierarchy of C++ files, interspersed with .o files that would make backing up take a while if not ignored.
What tools exist that will let me back things up efficiently and conveniently, and let me specify which file types to back up (lots of .png, .jpg and some text types in there) and which directories to ignore (especially the build dirs)?
Or are there any ingenious methods out there that people use?
Though not a backup solution, a version control manager on a remote server responds to most of your needs:
only changes are saved, not the whole project
you can filter out what you don't want to save
Moreover, you can create archives of your repository for true backup purposes.
If you want to learn about version control, take a look at Eric Sink's weblog, in particular:
Source Control HOWTO, for the basics of source control
Mercurial, Subversion, and Wesley Snipes for the links to articles on distributed version control systems
I use Dropbox; I'm a single developer developing software. In some projects I work directly out of my Dropbox folder, which means they synchronize every time I build. For other projects I copy the source code there myself. But most important is that I can work on all my computers with Dropbox installed on them... works for my simple needs.
Agree with mouviciel. If you do not want that, consider rsync or unison to efficiently keep an up-to-date copy, be it on the same or a different machine.
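If you'd rather keep the zip-file habit from the question, the filtering part can be scripted. The following is a rough sketch, not a replacement for version control or rsync; the suffix and directory sets are hypothetical and would need adjusting to the project.

```python
import zipfile
from pathlib import Path

# Hypothetical filter settings; adjust to the project.
INCLUDE_SUFFIXES = {".cpp", ".h", ".png", ".jpg", ".txt"}
EXCLUDE_DIRS = {"build", ".git"}


def files_to_back_up(root):
    """Yield files under root with an allowed suffix, skipping excluded directories."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip anything that lives below an excluded directory.
        if any(part in EXCLUDE_DIRS for part in relative.parts[:-1]):
            continue
        if path.is_file() and path.suffix.lower() in INCLUDE_SUFFIXES:
            yield path


def back_up(root, archive_path):
    """Write the selected files into a compressed zip; return the stored names."""
    root = Path(root)
    stored = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files_to_back_up(root):
            name = path.relative_to(root).as_posix()
            archive.write(path, name)
            stored.append(name)
    return stored
```

Because .o is not in the include set, object files are left out without needing a separate ignore rule. Unlike rsync or a VCS, this rewrites the whole archive each time.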

What is your review process for Rhapsody development?

My team is using IBM's Rhapsody tool to do real-time embedded development. Unfortunately, we are unhappy with our current review process.
More specifically, we've had difficulty because:
there is a lack of a good diff tool for diagram changes
the Rhapsody diff tool doesn't generate reports that you can use in a review
source file history is spotty because source files are products in MDD thus not configured in a VCS at a high granularity
running diffs on source code sometimes pulls in unrelated changes made by other devs
sometimes changing a property of a model element changes dozens of source files
it's easy to change a source file through a property change and not know it
Does anyone have any tips for making peer reviews on Rhapsody development robust but low-hassle? Any best practices and lessons learned you would like to share? I'm not looking for a mature process write-up; tidbits I didn't know about would be great.
We use Rhapsody for the same purpose at my workplace. Reviews of model changes are done with a script that opens diffmerge on two copies of our repository (one at the start of the changes, one at the latest). That shows all of the pertinent changes, without any of the internal cruft Rhapsody adds.
Our repo doesn't track the generated sources, but we see plenty of irrelevant changes in Rhapsody's sbs files frequently. We've started setting sbs files as read-only on the filesystem, and then changing them to read/write from the properties panel in Rhapsody. That doesn't stop the files you mark as read/write from having cruft inserted, but it prevents unrelated files from being modified.
I still haven't found a way to make Rhapsody stop inserting irrelevant changes (for example: it sometimes adds and removes filename fields between saves, despite minimal changes to the model). It creates a lot of merge conflicts, and I've personally started taking 5 or so minutes per commit to only add the changes that matter.
We have been using Rhapsody for development for the past 5 years. Our current process involves using the Rhapsody COM interface and the Microsoft Word COM interface to dump review packages to Word for design reviews. We also do this to generate the reference manual portion of our SUM.
For code we review the generated source.
We put the model into our version control system, and lock down model elements after they have been reviewed. If your version control tool makes things read only when they are checked in, it prevents you from accidentally changing a model element.
The COM interface is also good for dumping the model to make PowerPoint slides of diagrams if you want to present your design to a customer. You will have to tweak the slides after they are generated, as the pictures usually end up looking a little funny, but it gives a quick starting point.
It is also possible to prevent Rhapsody from writing timestamps to the sbs files by setting the property CG::General::IncrementalCodeGenAcrossSession to false. This can help reduce the amount of unnecessary data.
See this link

Software configuration management tool for hundreds of binary files, many are large

Note: I've tried searching, but Stack Overflow's search is near useless. I am not sure what kind of tool I need.
At my organization we need to keep track of the software configuration for many types of computers, including the binary installers and automation scripts. Change is infrequent, but the size of the latest version of the configuration is several gigs.
We are trying to use Mercurial to store changes but it is just too slow, even without many revisions at all. I did an hg status but killed it after it took 10 minutes without finishing.
We are looking for a way to store the current configuration as well as having the old configurations there just in case. I have never done anything like this before and do not know what tools are available or even suitable for such tasks. Can someone point me in the right direction or tell me how they are solving this problem? Thanks
Since hard disk space is cheap and being able to view binary differences isn't very helpful, perhaps the best option you have is to store each configuration in a new directory that is indexed somehow. Example below:
/software/configs/2009-03-15
/software/configs/2009-09-28
/software/configs/2009-09-30
Given the size of your files and the infrequent number of changes, this would allow you to pick a configuration from a given 'tag' without the overhead of revision control.
If you pack your files into a single tar file and generate a SHA-512 hash, then you can be reasonably sure that no one has tampered with your files since they were archived.
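As a minimal sketch of that last step, using only the Python standard library (the function names are invented for illustration, and the directory layout follows the example above):

```python
import hashlib
import tarfile
from pathlib import Path


def archive_config(config_dir, tar_path):
    """Pack config_dir into an uncompressed tar file at tar_path."""
    with tarfile.open(tar_path, "w") as archive:
        archive.add(config_dir, arcname=Path(config_dir).name)


def sha512_of(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(tar_path, expected_digest):
    """Return True if the archive still matches the digest recorded earlier."""
    return sha512_of(tar_path) == expected_digest
```

The digest only helps if it is recorded somewhere the archive's editors cannot also change, for example in a separate log or on paper.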
While I don't know specific details about how to implement this strategy in Mercurial, I have been working with git and git-fat. It sets up a general procedure that is likely to be feasible in Mercurial as well. Basically the idea is whenever you add a binary file to the repository, under the hood, the repo creates a symlink to the file that is actually stored in another location as a checksummed object.
This allows large files to be tracked by the repo, without storing the actual data inside. It requires the data to be stored in some other location (perhaps in a binary management system).
It might take some configuration to do it in Mercurial, but I think it's an elegantly simple solution.
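To illustrate the general idea only (this is not git-fat's actual implementation or file format), here is a sketch of a checksummed object store. The pointer format and function names are hypothetical, and it uses a small pointer file rather than a symlink to stay portable.

```python
import hashlib
import shutil
from pathlib import Path

POINTER_PREFIX = "pointer:sha256:"  # hypothetical pointer format, not git-fat's


def stash(file_path, store_dir):
    """Copy a file's content into store_dir under its digest; leave a pointer behind."""
    file_path, store_dir = Path(file_path), Path(store_dir)
    # Reads the whole file into memory for brevity; chunk it for multi-gigabyte files.
    digest = hashlib.sha256(file_path.read_bytes()).hexdigest()
    store_dir.mkdir(parents=True, exist_ok=True)
    target = store_dir / digest
    if not target.exists():
        shutil.copyfile(file_path, target)
    file_path.write_text(POINTER_PREFIX + digest + "\n")
    return digest


def restore(file_path, store_dir):
    """Replace a pointer file with the content it refers to, checking the digest."""
    file_path, store_dir = Path(file_path), Path(store_dir)
    digest = file_path.read_text().strip()[len(POINTER_PREFIX):]
    data = (store_dir / digest).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("stored object does not match its digest")
    file_path.write_bytes(data)
```

The repository then only ever sees the short pointer file, while the store directory can live on a file server or any other location that suits large binaries.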

WIX MSBuild automation help - solution best practices

I know there are many questions out there regarding this same information. I have read them all, but my brain is all turned around and I don't know which way to go. Plus the lack of documentation really hurts.
Here is my scenario. We are trying to use WIX to create an installer for our application that goes out to our dealers for our product information. The app includes about 2000 images and documents of our products and a SQL CE database that are updated via Microsoft Sync Framework. The data changes so often that keeping these 2000 files as content files in the app's project is very undesirable. The app relies on .NET Framework 3.5 SP1, SQL Server CE 3.5, Microsoft Sync Framework 1.0 and ADO.NET Sync Services 2.0.
Here are the requirements for the app:
The dealers will be given the app on a CD every year for any updates (app or data updates).
The app must update itself from the internet to get any new images, documents or data.
The prerequisites must be installed if they do not exist on the client machine.
The complete installer should be generated from an MSBuild script with as little human interaction as possible (we don't want to be manually updating the 2000+ file list).
What we have accomplished so far is that we have a Votive project in our solution. We have manually specified the binaries in a .wxs file. We have modified the .wixproj file to use the HeatDirectory task to gather our data (images, documents and database) from a specified location (this is broken and giving an ICE38 error). This seems all right, but is still a lot of work. We have to manually update our data by running the program in release mode and copying it to the specified directory.
I am looking to see what other people would do in this situation.
How would you arrange your solution with regards to the 2000+ data files? Would you create a custom build script that gets the current data from the server or would you include them as content files in the main project?
How would you get WIX to include all of the project output (including the referenced assemblies) and all of the data files? If you have any complete samples, that would be great. All I have found are little clips here and there and not an entire example from start to finish.
How would you deal with the version numbers? Would you put them as a constant in the build script and reference them through the $(var.VersionNumberName)? Would you have the version number automatically picked up from the project being deployed? If so, how?
If there is any better information than what I am finding, please include it. I have read numerous articles, blogs, Stack Overflow questions, the tutorial, the wiki, etc. Everything seems to be in bits and pieces. The tutorial is nice, but doesn't explain anything about MSBuild and Votive. I would like to see a start to finish tutorial on using MSBuild and Votive and all the WIX MSBuild targets. If no one knows of a tutorial like this I may put one together. I have already spent the entire week gathering info and reading. I'm new to MSBuild as well, so if anyone has any great articles on MSBuild, please include them.
The key is to isolate the different types of complexity into separate merge modules and put them all together into an MSI as part of the build. That way things that change often can change without impacting things that hardly change at all.
1) For the data files:
We use Paraffin to generate the WiX and hence the merge modules for an html + Flash based help system consisting of thousands of files (I can't convince the customer to go to CHM).
Compile these into a merge module all by themselves.
2) Assemblies: assuming that this is a set that changes less often just make a merge module by hand or with WixEdit with the correct files and dependencies.
3) For the version number there are a lot of ways to manage this depending on your build system. The AssemblyInfoTask is a pretty straightforward way to make sure all your assemblies are versioned appropriately. The MSBuild Extension Pack has some versioning stuff if you are using TFS.
I had a similar scenario and was unable to find a drop in solution so ended up with the following:
I wrote a custom command line program called wixgen.exe for generating wxs manifest files. It is pretty specific to our implementation in that it only knows how to create 2 types of wxs files. One for IIS Website/Virtual Directory deployments and another for Windows Service deployments.
Each time a build is triggered by our continuous integration server a post-build task runs wixgen with the right args to generate a new manifest.wxs for the project being changed. It automatically includes all the files needed for the deployment. These builds also version the dlls using a variation of the technique at: http://richardsbraindump.blogspot.com/2007/07/versioning-builds-with-tfs-and-msbuild.html
A separate build, which is manually triggered, is then used to build the wixproj projects containing the generated wxs files and produce the MSIs.
I would ditch the CD delivery (so 90's) and go with ClickOnce. This solution seems to fit well since you already use the .NET framework. With ClickOnce you should be able to just keep updating the content of your solution and make updates available to your heart's content. Let me know if you need sample ClickOnce deployment code.
You can find more ClickOnce information here.
Similar to dkackman's answer, you should separate your build into several components, isolating build components to be built separately.
I come from a mainly Java background; however, for building MSIs and .NET executables we use Maven, with the 'maven-wix-plugin' plugin for building the installers and the NMaven plugin for compiling any .NET code. However, as we're only performing very basic development in .NET, with most development in Java, we don't need too much complexity from the NMaven plugin (which is probably a 'good thing' (TM) as it's only at version 0.17).
If you're a purely .NET house, you could also look into Byldan (http://www.codeplex.com/byldan), which seems to be the focus of development there at the moment (it's the same team for NMaven and Byldan).
If you do want more information on NMaven or Byldan, raise another question and I'll give as much info as I can (which is not a huge amount; as stated, I only do very limited .NET development).

Best approach to perform a CMMI Physical Configuration Audit?

The organization I currently work for is moving into the whole CMMI world of documenting everything. I was assigned (along with one other individual) the title of Configuration Manager. Congratulations to me, right?
Part of the duties is to perform a physical configuration audit on a regular basis (they are still defining "regular basis"; it will be either quarterly or monthly). This is basically a check of the source code versions deployed in production against what we believe to be the source code versions in production.
Our project is a relatively small web application written in Java. The file types we work with are java, jsp, xml, property files, and sql packages.
The problem I have (and have expressed, but it seems to be ignored) is this: how am I supposed to physically log on to the production server and verify file versions? Even if I could, it would take a ridiculous amount of time.
The file versions are not even currently in the files (i.e. in a comment or something). It was also suggested that we place version numbers on each screen, visible to the users. I thought this ridiculous as well, since the screens themselves represent only a small fraction of the code we maintain.
The tools we currently use are Netbeans for our IDE and Serena Dimensions as our versioning tool.
I am specifically looking for ideas on how to perform this audit in a hopefully more automated way, that will be both accurate and not time consuming.
My current idea is to add a comment to the top of each file containing that file's version number, plus a script that runs when a production build is created and writes an XML file (or something similar) listing the file name and version number of each file in the build. Then, when I need to do an audit, I go to the production server, grab the XML file, compare it programmatically to what we believe to be in production, and output a report.
Any better ideas? I know this has to have been done already, and it seems crazy to me that I have not found any other resources.
You could compute a SHA1 hash of the source files on the production server, and compare that hash value to the versions stored in source control. If you can find the same hash in source control, then you know what version is in production. If you can't find the same hash in source control, then there are untracked modifications in production and your new job title is justified. :)
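A rough sketch of how that comparison might be automated follows. It assumes you can export a mapping of relative file path to SHA-1 digest for the release baseline from source control; that export step, and the function names here, are hypothetical and not a feature of any particular tool.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(production_root, expected):
    """Compare deployed files with expected {relative_path: sha1}; return findings."""
    root = Path(production_root)
    deployed = {
        path.relative_to(root).as_posix(): sha1_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }
    return {
        # Present in both places, but the content differs from the baseline.
        "modified": sorted(
            name for name in expected
            if name in deployed and deployed[name] != expected[name]
        ),
        # In the baseline but not found on the server.
        "missing": sorted(name for name in expected if name not in deployed),
        # On the server but not in the baseline.
        "untracked": sorted(name for name in deployed if name not in expected),
    }
```

An audit report is then just the three lists; if all of them are empty, production matches the baseline.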
The typical trap organizations fall into with the CMMI is trying to overdo everything. If I could suggest anything, it'd be to start small & only do what you need. So consider any problems that you may have had in the CM area previously.
The CMMI describes WHAT an organisation should do, but leaves the HOW up to you. The CMMI specification, chapter 2 is well worth a read - it describes the required, expected, and informative components of the specification - basically the goals are required, the practices are expected, and everything else is informative. This means there is only a small part of the specification which a CMMI appraiser can directly demand - the goals. At the practice level, it is permissible to have either the practices as described, or acceptable alternatives to them.
In the case of configuration audits, goal SG3 is "Integrity of baselines is established and maintained". SP3.2 says "Perform configuration audits to maintain integrity of the configuration baselines." There is nothing stated here about how often these are done, or how long they may take.
In my previous organisation, FCA/PCA was usually only done as part of the product release process, and we used ClearCase as the versioning tool, with labels applied across the codebase to define baselines. We didn't have version numbers in all the source files, nor did we have version numbers on all the products screens - the CM activity was doing the right thing & was backed up by audits, and this was never an issue in any CMMI appraisal.
We could use the deltas between labels to look at what files had changed, perform diffs to see the actual code changes. An important part of the process is being able to link those changes back to either a requirement/bug report/whatever the reason was which initiated the change.
Our auditing did use scripts to automate the process, but these were in-house developed scripts specific to ClearCase - basically they would list all the files, their versions in the CM system, and the baseline/config item to which they belonged.
Can't you use your source control for this? If you deploy a version and tag your source control with that deployment, you can then verify against the source control system.