Strategies for maintaining a Symbol Store for nightly builds & release builds - symstore

I am trying to set up a central symbol server for my organization and its various products. Each product has a nightly build, as well as "one-off" beta, RC, and release builds.
The goal I have is to keep about a month's worth of nightly-build symbols; we do a lot of "dogfooding" here, so people use internal builds, and we'd like to be able to easily debug the files we get from our internal WinQual when possible.
I also need to be able to permanently keep all beta, RC, and release build symbols.
After doing much research, I think the best approach here is to have two symbol servers: one for the nightly builds (keeping roughly the previous 30 builds registered), and another to permanently store the beta, RC, and release symbols. I would have the build scripts add to the symbol store using the product and version tags to record the product and build number. After a successful build, a script would use history.txt from the symbol server to identify the oldest build not yet deleted, then delete that transaction from the store.
In the case of the "one off" builds for betas, RCs, and release versions, they would be identified by a build & install person once they're created, and added to the 2nd symbol server (for permanent storage) as well.
So I have a few questions: Does this seem at all reasonable? Is there an easier way to do this? Surely most organizations with a symbol server have had to tackle this problem.
Secondly, if I go ahead with this approach, is there a fool-proof way to identify the oldest symbol set registered with the server? I had thought about using last-modified dates; history.txt seems the most appropriate source, but a script parsing it may be error-prone. I was hoping it would be possible simply to add a symbol set with product and version info, and likewise delete one by product and version info.
Thanks in advance for any help. I'll gladly answer any questions anyone may have, or provide any clarifications.

I think two separate symbol stores is indeed going to be your best bet. For managing the store for your nightly builds, I'd recommend taking a look at AgeStore: http://msdn.microsoft.com/en-us/library/ff560046(v=vs.85).aspx
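If it's useful, here is a minimal sketch of scripting that pruning step (hedged: it assumes AgeStore from the Debugging Tools for Windows is on the PATH, and the store path and 30-day window are placeholders):

    import subprocess

    # Delete everything in the store that is older than ~30 days.
    # -s recurses into subdirectories; -y deletes without prompting.
    # Running with -l instead lists what would be deleted without deleting it.
    subprocess.run(
        ["agestore", r"D:\SymStore\Nightly", "-days=30", "-s", "-y"],
        check=True,
    )

AgeStore prunes by file age rather than by your product/version tags, which fits the nightly store, where "older than a month" is the only criterion you care about.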

I wrote a widget in Wise Script that runs with every build and does the following.
Assuming our versioning is 1.0.1.0
1.) 1.0.1.0 through 1.0.6.0 symbols are added to the store (six runs)
2.) every time there's a build, the symbol store history file is parsed...
3.) when 1.0.7.0 is built and the symbols are added, 1.0.1.0 symbols are deleted from the symbol store.
I'm basically parsing out the version number, and if the third field is more than 5 behind the current build's, I parse out the transaction ID and run symstore.exe del /i %TRANS_ID%
This prunes my symbols for daily/CI builds to only the last 5 build symbols.
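For what it's worth, here is a minimal Python sketch of that pruning logic (hedged: it assumes the comma-separated layout of 000Admin\history.txt - id, add/del, file-or-ptr, date, time, "product", "version", ... - a four-part version with the build counter in the third field, and symstore.exe on the PATH; the store path and product name are placeholders):

    import csv
    import subprocess

    STORE = r"D:\SymStore\Nightly"   # placeholder store path
    PRODUCT = "MyApp"                # placeholder product name
    KEEP = 5                         # how many builds back to keep

    adds, deleted = [], set()
    with open(STORE + r"\000Admin\history.txt", newline="") as history:
        for row in csv.reader(history):
            if len(row) >= 7 and row[1] == "add" and row[5] == PRODUCT:
                # Version looks like "1.0.7.0"; the third field is the build counter.
                build = int(row[6].split(".")[2])
                adds.append((build, row[0]))
            elif len(row) >= 3 and row[1] == "del":
                deleted.add(row[2])      # this transaction is already gone

    if adds:
        current = max(build for build, _ in adds)
        for build, trans_id in adds:
            if trans_id not in deleted and current - build > KEEP:
                # Delete the whole transaction from the store.
                subprocess.run(
                    ["symstore", "del", "/i", trans_id, "/s", STORE],
                    check=True,
                )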
For any notable symbols, such as a release, hotfix, or patch, I simply change the product name and add the symbols manually; that way, I am only pruning the dailies.
If you'd like, I could cut/paste the code in here as SMS Installer code (same as Wise). I also wrote a similar widget that keeps my local archive 5 builds deep, so I'm not wasting space in my local archive for CI builds. I use both, but you could use either. They both use a simple .INI file for runtime navigation; that way, I can put them both in my Jenkins/Jobs/ folder and simply edit the .INI file for each. They're very lightweight, as the .EXE is only 161K for the archive_prunerator and 161K for the symstore_widget, plus a 4 or 5 line .INI file.
AJ

Related

Source control in SSIS and Concurrent work on dtsx file

I am working on building a new SSIS project from scratch, and I want to work on it with a couple of my teammates. I was hoping to get a suggestion on how we can set up some source control so that a few of us can work concurrently on the same SSIS project (same .dtsx file, building new packages).
Version:
SQL Server Integration Services v11
Microsoft Visual Studio 2010
It is my experience that there are two opportunities for any source control system and SSIS projects to get out of whack: adding new items to the project and concurrent changes to an existing package.
Adding new items
An SSIS project has the .dtproj extension. Inside there, it's "just" XML defining what all belongs to the project. At least for 2005/2008 and 2012+ on the package deployment model. The 2012+ project deployment model carries a good bit more information about the state of the packages in the project.
When you add new packages (or project-level connection managers or .biml files), the internal structure of the .dtproj file is going to change. Diff tools generally don't handle merging XML well. Or at all, really. So, to prevent the need for merging the project definition, you need to find a strategy that works for your team.
I've seen two approaches work well. The first is to define, up front, all the packages you think you'll need: DimFoo, DimDate, DimBar, FactBlee. Check that project and the associated empty packages in, and everyone works on what is out there. When the initial cut of packages is complete, you ensure everyone is synced up and then add more empty packages to the project. The idea here is that there is one person, usually the lead, who is responsible for changing the "master" project definition, and everyone consumes from their change.
The other approach requires communication between team members. If you discover a package needs to be added, communicate with your mates: "I need to add a new package - has anyone modified the project?" The answer should be No. Once you've notified that a change to the project definition is coming, make it and immediately commit it. The idea here is that people commit and sync/check in (whatever the terminology) with great frequency. If you as a developer don't keep your local repository up to date, you're going to be in for a bad time.
Concurrent edits
Don't. Really, that's about it. The general problem with concurrent changes to an SSIS package is that, in addition to the XML diff issue above, SSIS also stores layout data alongside the tasks. I can invert the layout and make things flow from bottom to top or right to left with no material change to the SSIS package, but, as Siyual notes, "Merging changes in SSIS is nightmare fuel."
If you find your packages are so large and that developers need to make concurrent edits, I would propose that you are doing too much in there. Decompose your packages into smaller, more tightly focused units of work and then control their execution through a parent package. That would allow a better level of granularity to your development and debugging process in addition to avoiding the concurrent edit issue.
A .dtsx file is basically just an XML file. Compare it to a bunch of people trying to write the same book. The solution I suggest is to use Team Foundation Server for source control. That way everyone can check packages in and out and merge them. If you really don't have that option, try to split your ETL process into logical parts and, at the end, create a master package that calls each sub-package in the right order.
An example: let's say you need to import stock data from one source, branch and other company information from an internal server, and sales amounts from different external sources. After you have gathered all the information, you want to connect it all and run some analyses.
You first design the target database entities that you need and their relations. One of your members creates a package that does all the imports to staging tables. Another perhaps handles the external sources and parallelizes/optimizes the loading. You would build a package that then merges your staging and production tables, perhaps with historization, and so on.
At the end you have a master package that calls each of the mentioned packages, plus maybe some additional logging and such.
In our multi-developer operation, we follow this rough plan:
Each dev has their own branch, separate from master branch
Once a week, devs push all their changes to remote
One of us pulls all changes, and merges all branches into master, manually resolving .dtproj conflicts as we go
Merge master into all dev branches - now all branches agree
Test in VS
Push all branches to remote, other devs can now pull and keep working
It's not a perfect solution, but it helps quarantine the amount of merge pain we have to experience.
We have large SSIS solutions with 20+ packages in one solution, with TFS Git. One project required adding a bunch of new packages to the existing solution. We thought we were smart and knew to assign only one person to work on each new package; 2 people working on the same package would be suicide. That wasn't good enough. When 2 people tried to add differently named, new packages at the same time, each showed the .dtproj as a file that had changed/needed to be checked in, and suddenly I found myself looking at the XML for the .dtproj and trying to figure out which lines to keep (Microsoft should never ask end users to manually edit their internal files, which only they wrote and understand). Billinkc's solutions here are very good, and the problem is very real. You may think that Microsoft is the great Wise One, and that your team can always add new packages to an existing solution without conflicts, but you'd be wrong. It also doesn't work to put the .dtproj in .gitignore. If you do that, you won't see other people's new packages (the .dtsx file will actually come down in Git, but you won't see that package in Solution Explorer, because the .dtproj is what feeds Solution Explorer). This is a current problem (2021), and we are using Visual Studio 2017 Enterprise with SSDT.
To explain this problem to people: Git can obviously handle a group of independent, individual files in a directory (say, .bat files) and can add, change, and delete those files easily. The problem comes in when you have a file that names, describes, and counts all the files in a directory (which is what the .dtproj does). With a file like the .dtproj, you create a conflict on the .dtproj itself when 2 people try to add a new package at the same time: your .dtproj has a line that shows the package you added, my .dtproj shows the package I added, and TFS/Git sees that as a conflict.
Some are suggesting ways to deal with this if you have to add a lot of new packages; my idea is a little different. The people who have to add new packages shouldn't work in the primary solution where this problem lives - work somewhere else. It's probably best to work in the "Projects" directory you get when you install Visual Studio, outside of TFS/Git. Obviously, follow all the standards, variable naming, and package configuration conventions for the target solution. Then, when the new packages are ready, give the .dtsx files to your solution gatekeeper to check in. Only the gatekeeper can check in new packages, using Add From Existing, which avoids the conflicts. Once a package is checked in, developers can work on it in the main solution.

What are your build and release steps? When to increment build numbers?

I am having trouble defining and automating my build process despite simple requirements:
Every build should have a unique build number.
Every tagged release should be reproducible.
What I have:
A C++, Red Hat Enterprise Linux 5.x, Subversion development environment.
A build machine (actually a virtual machine)
A version.h file with #defines for major, minor, and buildnumber.
A script for incrementing the version.h buildnumber.
A rpmbuild spec file that exports the tagged Subversion source, builds, and makes the rpm installer packages.
Questions:
Assuming multiple developers per project, when should the build number be incremented and the version.h file be checked in? On the build machine? Via some sort of Subversion hook? Pre-build or post-build?
Thanks in advance for those willing to take the time to share their experience with build processes.
-Ed
Linux newbie. Former Windows C++/.NET developer.
Why not modify your build process so that it grabs the latest revision number from the repository and uses that as the build number?
Assuming svn incorporates all of the elements that go into a build of your product, that should give you a unique number per potential differing build, and make it easy to match up what the state of the codebase was at the time of building. If there are other elements that can vary with time, you could add another item concatenated to the revision number - perhaps a date/time value.
You'll never have to worry about manually incrementing it, because each time a developer commits they'll increment the revision number automatically.
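A hedged sketch of that approach (it assumes svnversion is available and is run from the working copy; the header path and macro names are placeholders for whatever your version.h already defines):

    import re
    import subprocess

    # svnversion prints e.g. "1234", "1234M" (modified), or "1234:1240" (mixed).
    raw = subprocess.run(["svnversion"], capture_output=True, text=True,
                         check=True).stdout.strip()
    # Take the highest revision mentioned and drop any trailing status letters.
    build = int(re.search(r"(\d+)[A-Za-z]*$", raw).group(1))

    with open("version.h", "w") as header:
        header.write("#define VERSION_MAJOR 1\n")
        header.write("#define VERSION_MINOR 0\n")
        header.write(f"#define VERSION_BUILD {build}\n")

Generating version.h during the build rather than checking it in means there is nothing to increment, no commit from the build machine, and no Subversion hook required.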
Don't store the build number directly in your file; use the Subversion revision number (or some other monotonically increasing value, like the date/time) as your build identifier. In the past, I've used the value of date -u +"%Y%m%d%H%M%S", as we were using CVS rather than SVN.
We have teams of up to 40+ developers adding code to our product. Each developer submits their change to a central location for the product, which forms an ordered list of code submissions. Scripts then take each submission and integrate it into a test build based on the last released configuration plus any previously submitted changes in the current round. Unit tests are run after each code submission is compiled, as are any acceptance tests added in the current round. At the end of each day, unit tests and regression tests are run against the evolving build.
Twice a week, the list of submitted code changes is wrapped up; that becomes a new released configuration of the product, and the codebase is updated.

WIX MSBuild automation help - solution best practices

I know there are many questions out there regarding this same information. I have read them all, but my brain is all turned around and I don't know which way to go. Plus the lack of documentation really hurts.
Here is my scenario. We are trying to use WiX to create an installer for our application, which goes out to our dealers with our product information. The app includes about 2000 images and documents for our products, plus a SQL CE database, all updated via Microsoft Sync Framework. The data changes so often that keeping these 2000 files as content files in the app's project is very undesirable. The app relies on .NET Framework 3.5 SP1, SQL Server CE 3.5, Microsoft Sync Framework 1.0 and ADO.NET Sync Services 2.0.
Here are the requirements for the app:
The dealers will be given the app on a CD every year for any updates (app or data updates).
The app must update itself from the internet to get any new images, documents or data.
The prerequisites must be installed if they do not exist on the client machine.
The complete installer should be generated from an MSBuild script with as little human interaction as possible (we don't want to be manually updating the 2000+ file list).
What we have accomplished so far: we have a Votive project in our solution, and we have manually specified the binaries in a .wxs file. We have modified the .wixproj file to use the HeatDirectory task to gather our data (images, documents, and the database) from a specified location (this is broken and giving an ICE38 error). This seems all right, but it is still a lot of work. We have to manually update our data by running the program in release mode and copying the output to the specified directory.
I am looking to see what other people would do in this situation.
How would you arrange your solution with regards to the 2000+ data files? Would you create a custom build script that gets the current data from the server or would you include them as content files in the main project?
How would you get WIX to include all of the project output (including the referenced assemblies) and all of the data files? If you have any complete samples, that would be great. All I have found are little clips here and there and not an entire example from start to finish.
How would you deal with the version numbers? Would you put them as a constant in the build script and reference them through the $(var.VersionNumberName)? Would you have the version number automatically picked up from the project being deployed? If so, How?
If there is any better information than what I am finding, please include it. I have read numerous articles, blogs, Stack Overflow questions, the tutorial, the wiki, etc. Everything seems to be in bits and pieces. The tutorial is nice, but it doesn't explain anything about MSBuild and Votive. I would like to see a start-to-finish tutorial on using MSBuild and Votive and all the WiX MSBuild targets. If no one knows of a tutorial like this, I may put one together. I have already spent the entire week gathering info and reading. I'm new to MSBuild as well, so if anyone has any great articles on MSBuild, please include them.
The key is to isolate the different types of complexity into separate merge modules and put them all together into an MSI as part of the build. That way, things that change often can change without impacting things that hardly change at all.
1) For the data files:
We use Paraffin to generate the WiX and hence the merge modules for an html + Flash based help system consisting of thousands of files (I can't convince the customer to go to CHM).
Compile these into a merge module all by themselves.
2) Assemblies: assuming this is a set that changes less often, just make a merge module by hand, or with WixEdit, with the correct files and dependencies.
3) For the version number, there are a lot of ways to manage this, depending on your build system. The AssemblyInfoTask is a pretty straightforward way to make sure all your assemblies are versioned appropriately. The MSBuild Extension Pack also has some versioning support if you are using TFS.
I had a similar scenario and was unable to find a drop-in solution, so I ended up with the following:
I wrote a custom command line program called wixgen.exe for generating wxs manifest files. It is pretty specific to our implementation in that it only knows how to create 2 types of wxs files. One for IIS Website/Virtual Directory deployments and another for Windows Service deployments.
Each time a build is triggered by our continuous integration server a post-build task runs wixgen with the right args to generate a new manifest.wxs for the project being changed. It automatically includes all the files needed for the deployment. These builds also version the dlls using a variation of the technique at: http://richardsbraindump.blogspot.com/2007/07/versioning-builds-with-tfs-and-msbuild.html
A separate, manually triggered build is then used to build the .wixproj projects containing the generated .wxs files and produce the MSIs.
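To make the shape of the idea concrete, here is a minimal sketch of such a generation step (hedged: this is not wixgen.exe itself; the source directory, DirectoryRef id, and ComponentGroup name are invented, a flat directory is assumed, and it relies on WiX 3's Guid="*" auto-generation so component GUIDs stay stable across builds):

    import os
    from xml.sax.saxutils import quoteattr

    SOURCE_DIR = r"..\Data"    # placeholder: where the 2000+ data files live
    OUTPUT = "DataFiles.wxs"   # generated fragment, referenced by the .wixproj

    components, refs = [], []
    for i, name in enumerate(sorted(os.listdir(SOURCE_DIR))):
        comp_id = f"DataFile{i}"
        source = quoteattr(os.path.join(SOURCE_DIR, name))
        # One component per file; Guid="*" derives a stable GUID from the path.
        components.append(f'      <Component Id="{comp_id}" Guid="*">')
        components.append(f'        <File Id="{comp_id}_f" Source={source} KeyPath="yes" />')
        components.append('      </Component>')
        refs.append(f'      <ComponentRef Id="{comp_id}" />')

    lines = ['<?xml version="1.0"?>',
             '<Wix xmlns="http://schemas.microsoft.com/wix/2006/wi">',
             '  <Fragment>',
             '    <DirectoryRef Id="DataDir">']
    lines += components
    lines += ['    </DirectoryRef>',
              '  </Fragment>',
              '  <Fragment>',
              '    <ComponentGroup Id="DataFiles">']
    lines += refs
    lines += ['    </ComponentGroup>',
              '  </Fragment>',
              '</Wix>']

    with open(OUTPUT, "w") as out:
        out.write("\n".join(lines) + "\n")

Run as a post-build step, this regenerates the fragment from whatever is on disk, and the main .wxs only needs a single ComponentGroupRef to "DataFiles", so the 2000+ files never get listed by hand.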
I would ditch the CD delivery (so 90's) and go with ClickOnce. This solution seems to fit well, since you already use the .NET Framework. With ClickOnce you should be able to just keep updating the content of your solution and make updates available to your heart's content. Let me know if you need sample ClickOnce deployment code.
You can find more ClickOnce information here.
Similar to dkackman's answer, you should separate your build into several components, isolating build components to be built separately.
I come from a mainly Java background; however, for building MSIs and .NET executables we use Maven, with the maven-wix-plugin for building the installers and the NMaven plugin for compiling any .NET code. However, as we're only performing very basic development in .NET, with most development in Java, we don't need too much complexity from the NMaven plugin (which is probably a 'good thing' (TM), as it's only at version 0.17).
If you're a purely .NET house, you could also look into Byldan (http://www.codeplex.com/byldan), which seems to be the focus of development there at the moment (it's the same team behind NMaven and Byldan).
If you do want more information on NMaven or Byldan, raise another question and I'll give as much info as I can (which is not a huge amount; as stated, I only do very limited .NET development).

Does MSBuild know if a project needs to be recompiled?

First, I have a base assumption from watching Visual Studio compile things with its default .*proj files that, if you build the same solution twice in a row, it detects that nothing has changed and seems to fly through the solution build. Does this mean it knows that nothing was changed in a project and doesn't have to make a new DLL output?
If that's the case, I have a question. Say I have a solution with multiple class libraries, and an MSBuild task in each project that automatically increments the build's version by modifying AssemblyInfo.cs. Thing is (if my previous assumption is correct), it does this every time and triggers a new rebuild of each class library. Is there a target or property in MSBuild that can tell whether the project needs recompilation, so I can skip my versioning step when it doesn't?
I ask because let's say I update project A, but not project B in a solution. If I run a build on the solution, I want it to update the version on project A, but since project B hasn't changed, I want to leave it alone.
Found something: http://msdn.microsoft.com/en-us/library/ms171483.aspx
MSBuild can compare the timestamps of the input files with the timestamps of the output files and determine whether to skip, build, or partially rebuild a target. In the following example, if any file in the @(CSFile) item collection is newer than the hello.exe file, MSBuild will run the target; otherwise it will be skipped:

<Target Name="Build" Inputs="@(CSFile)" Outputs="hello.exe">
    <Csc Sources="@(CSFile)" OutputAssembly="hello.exe" />
</Target>
...that worked. But then it got me thinking: what if someone pulls the code down from source control without the assemblies (which is how we do it)? Since there's no output to compare against, it'll compile and increment the version anyway. I think the complexities might lead me to abandon this approach.
It doesn't really matter if you increment on a developer's box - what's important is that your daily/CI build is only incremented when needed. So, what I've done in the past is have a small XML file contain the next build number, and have an MSBuild task take this XML file and create a file called Version.cs (containing the versioning attributes you'd usually find in AssemblyInfo.cs).
Version.cs is never checked into your source control - it's generated by the build.
Developers will sync the current XML file, build their binaries, and get the current version number. The continuous integration build may also do the same thing. But a daily/official build will check out the XML file, increment the version information, and then check it in. From that moment on, the version number has officially changed.
There are variations on this theme, but the general idea works.
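A minimal sketch of that scheme (hedged: the XML layout, file names, and attribute set here are invented for illustration):

    import xml.etree.ElementTree as ET

    # build.xml is the small checked-in file,
    # e.g. <version major="1" minor="0" build="42" />
    tree = ET.parse("build.xml")
    node = tree.getroot()
    major, minor = node.get("major"), node.get("minor")
    build = int(node.get("build"))

    # Only the daily/official build increments the counter and checks
    # build.xml back in; developer and CI builds just consume the value.
    OFFICIAL_BUILD = False
    if OFFICIAL_BUILD:
        build += 1
        node.set("build", str(build))
        tree.write("build.xml")

    # Version.cs is generated output - never checked into source control.
    with open("Version.cs", "w") as cs:
        cs.write("using System.Reflection;\n\n")
        cs.write(f'[assembly: AssemblyVersion("{major}.{minor}.{build}.0")]\n')
        cs.write(f'[assembly: AssemblyFileVersion("{major}.{minor}.{build}.0")]\n')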

Best approach to perform a CMMI Physical Configuration Audit?

The organization I currently work for is moving into the whole CMMI world of documenting everything. I was assigned (along with one other individual) the title of Configuration Manager. Congratulations to me, right?
Part of the duties is to perform, on a regular basis (they are still defining "regular basis"; it will be either quarterly or monthly), a physical configuration audit. This is basically a check of the source code versions deployed in production against what we believe to be the source code versions in production.
Our project is a relatively small web application written in Java. The file types we work with are Java, JSP, XML, property files, and SQL packages.
The problem I have (and have expressed, though I seem to be getting ignored) is: how am I supposed to physically log on to the production server and verify file versions? And even if I could, it would take a ridiculous amount of time.
The file versions are not even currently in the files (i.e., in a comment or something). It was suggested that we also place visible version numbers on each screen that is visible to the users. I thought this was ridiculous too, since the screens themselves represent only a small fraction of the code we maintain.
The tools we currently use are Netbeans for our IDE and Serena Dimensions as our versioning tool.
I am specifically looking for ideas on how to perform this audit in a hopefully more automated way, that will be both accurate and not time consuming.
My current idea is to add a comment to the top of each file that contains the version number of that file, plus a script that runs when a production build is created and produces an XML file (or something similar) containing the file name and version of each file in the build. Then, when I need to do an audit, I go to the production server, grab the XML file with the info, compare it programmatically to what we believe to be in production, and output a report.
Any better ideas? I know this has to have been done already, and it seems crazy to me that I have not found any other resources.
You could compute a SHA1 hash of the source files on the production server, and compare that hash value to the versions stored in source control. If you can find the same hash in source control, then you know what version is in production. If you can't find the same hash in source control, then there are untracked modifications in production and your new job title is justified. :)
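A hedged sketch of that comparison (the directory paths and report format are invented; the "expected" tree would be an export of the release tag from source control):

    import hashlib
    import os

    def hash_tree(root):
        """Map each file's path relative to root to its SHA-1 hex digest."""
        hashes = {}
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    digest = hashlib.sha1(f.read()).hexdigest()
                hashes[os.path.relpath(path, root)] = digest
        return hashes

    expected = hash_tree("export-of-release-tag")   # from source control
    actual = hash_tree("copy-of-production")        # pulled from the server

    # Report every path whose hash differs or that exists on only one side.
    for path in sorted(set(expected) | set(actual)):
        if expected.get(path) != actual.get(path):
            print("MISMATCH:", path,
                  expected.get(path, "<missing>"),
                  actual.get(path, "<missing>"))

Anything this prints is either an untracked modification in production or a file missing from the baseline - exactly what the audit needs to flag.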
The typical trap organizations fall into with the CMMI is trying to overdo everything. If I could suggest anything, it'd be to start small and only do what you need. So consider any problems that you may have had in the CM area previously.
The CMMI describes WHAT an organisation should do, but leaves the HOW up to you. Chapter 2 of the CMMI specification is well worth a read - it describes the required, expected, and informative components of the specification: basically, the goals are required, the practices are expected, and everything else is informative. This means there is only a small part of the specification which a CMMI appraiser can directly demand - the goals. At the practice level, it is permissible to have either the practices as described or acceptable alternatives to them.
In the case of configuration audits, goal SG3 is "Integrity of baselines is established and maintained". SP3.2 says "Perform configuration audits to maintain integrity of the configuration baselines." There is nothing stated here about how often these are done, or how long they may take.
In my previous organisation, FCA/PCA was usually only done as part of the product release process, and we used ClearCase as the versioning tool, with labels applied across the codebase to define baselines. We didn't have version numbers in all the source files, nor did we have version numbers on all the product's screens - the CM activity was doing the right thing and was backed up by audits, and this was never an issue in any CMMI appraisal.
We could use the deltas between labels to see which files had changed and perform diffs to see the actual code changes. An important part of the process is being able to link those changes back to a requirement/bug report/whatever the reason was that initiated the change.
Our auditing did use scripts to automate the process, but these were in-house-developed scripts specific to ClearCase - basically they would list all the files, their versions in the CM system, and the baseline/config item to which they belonged.
Can't you use your source control for this? If you deploy a version and tag your source control with that deployment, you can then verify against the source control system.