WiX: Paraffin and repository/build server integration - wix

Short version: How can I make sure that my component GUIDs remain stable using Paraffin on a build server?
I am currently working on a project that should be deployed via WiX. As this is a web project, it contains many files (still in early stage and already almost 200 files). Also, during development, files are constantly added and deleted, so maintaining the WiX component lists manually is simply not an option.
Since I read a lot about component rules and that people breaking them go to hell, I decided to go with Paraffin as a harvester. This tool is capable of updating an existing component list, thus not re-creating new GUIDs for existing components.
However, when a new component is created, the tool assigns a new GUID. Even if the component files are identical, then initial GUIDs will be different on different machines or even only at different times.
So, obviously, I need a central authority for fixing the initial GUIDs. My idea was to commit empty component lists, which are then filled by the build server calling Paraffin on build. So when I only distribute the MSIs created by the build server, I can be sure that component rules are being followed.
However, the problem with this approach is, that I have no means of tracking my GUIDs, should the build server crash or empty its local repository. I was thinking about having the build server commit the generated component list to my repository, but that doesn't seem like a clean idea.
Another solution I thought of was having all developers build (and thus call Paraffin) before commiting. Thus, each developer would create the initial GUIDs for their newly added files and commit them to the component list.
The obvious problem with this approach: People (e.g. developer A) will forget to build before they commit. So in these cases the build server will create the initial GUIDs for the new files, but those will also only be stored locally. A few commits later, developer B will come along and build the solution, creating a new GUID for the files created by developer A. He will then commit the component list containing this GUID and the build server will check it out. Now the build server has obtained a GUID (created by developer A) for a package, for which it had previously used a different (self-created) GUID, even though the files didn't change in the meantime.
So, how can I make sure, that my GUIDs remain stable between builds without relying on developers to build their solution before they commit? The approaches outlined above both seem unsatisfying to me, but are all I can think of right now.

As far as I am concerned component rules only really come into play when you have multiple installers that share components with the same guids (which should then be exactly the same resource(s)) or you are using a wixlib or a merge module which is then included as part of different installers.
From what you have said above, to me it doesn't sound like you will so, there is no harm in having different component guids for each build. It will just mean that when you upgrade the website, files that have not changed will be removed and re-installed under a different component guid. IMHO that doesn't really matter as long as the installer correctly installs all files that are required for the site to function and doesn't remove components from other products.
If you use the MajorUpgrade element, the old product will be completely removed before the new one is installed so any component guid's that are shared between the two versions will be removed and then re-installed anyway.
I always just leave my guid elements as Guid='*' that way I know that the there will never^ be any guid clashes in any of my components across my multiple products.
^ I know this is not theoretically true but in this use case it is.

Not entirely true. Changing your component GUIDs from build to build is fine if and only if you schedule RemoveExistingProducts early so that the files are off the system before you reinstall the new GUIDs. This approach works well for smallish installers with not so many files, but as your installer grows you will feel the pinch of having twice as much IO to do, as you remove and then reinstall, rather than just overwrite your files. In short, it's up to you, but you should think carefully about how large your application is likely to get before jumping in with the suggested approach.

Related

Optionally leave old version of component on upgrade

I've been trying to set up a WiX component such that the user can specify that the installer should not upgrade that component on a MajorUpgrade. I had the following code, but this means that if the condition is met then the new version is not installed, but the old version is also removed.
<Component Id="ExampleComponent" GUID="{GUID here}">
<Condition>NOT(KEEPOLDFILE="TRUE")</Condition>
<File Id="ExampleFile" Name="File.txt" KeyPath="yes" Source="File.txt"/>
</Component>
Ideally, if the user specifies "KEEPOLDFILE=TRUE", then the existing version of "File.txt" should be kept. I've looked into using the Permanent attribute, but this doesn't look relevant.
Is this possible to achieve without using CustomActions?
A bit more background information would be useful, however:
If your major upgrade is sequenced early (e.g. afterInstallInitialize) the upgrade is an uninstall followed by a fresh install, so saving the file is a tricky proposition because you'd save it, then do the new install, then restore it.
If the upgrade is late, then file overwrite rules apply during the upgrade, therefore it won't be replaced anyway. You'd need to do something such as make the creation and modify timestamps identical so that Windows will overwrite it with the new one. The solution in this case would be to run a custom action conditioned on "keep old file", so you'd do the reverse of this:
https://blogs.msdn.microsoft.com/astebner/2013/05/23/updating-the-last-modified-time-to-prevent-windows-installer-from-updating-an-unversioned-file/
And it's also not clear if that file is ALWAYS updated, so if in fact it has not been updated then why bother to ask the client whether to keep it?
It might be simpler to ignore the Windows Installer behavior by setting its component id to null, as documented here:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa368007(v=vs.85).aspx
Then you can do what you want with the file. If you've already installed it with a component guid it's too late for this solution.
There are better solutions that require the app to get involved where you install a template version of this file. The app makes a copy of it that it always uses. At upgrade time that template file is always replaced, and when the app first runs after the upgrade it asks whether to use the new file (so it copies and overwrites the one it was using) or continue to use the existing file. In my opinion delegating these issues to the install is not often an optimal solution.
Setting attributes like Permanent is typically not a good idea because they are not project attributes you can turn on and off on a whim - they apply to that component id on the system, and permanent means permanent.
I tried to make this a comment, it became to long. I prefer option 4 that Phil describes. Data files should not be meddled with by the setup, but managed by your application exe (if there is one) during its launch sequence. I don't know about others, but I feel like a broken record repeating this advice, but hear us out...
There is a description of a way to manage your data file's overwriting or preservation here. Essentially you update your exe to be "aware" of how your data file should be managed - if it should be preserved or overwritten, and you can change this behavior per version of your application exe if you like. The linked thread describes registry keys, but the concept can be used for files as well.
So essentially:
Template: Install your file per-machine as a read-only template
Launch Sequence: Copy it in place with application.exe launch sequence magic
Complex File Revision: Update the logic for file overwrite or preservation for every release as you see fit along the lines as the linked thread proposes
Your setup will "never know" about your data file, only the template file. It will leave your data file alone in all cases. Only the template file it will deal with.
Liberating your data files from the setup has many advantages:
Setup.exe bugs: No unintended accidental file overwrites or file reset problems from problematic major upgrade etc... this is a very common problem with MSI.
Setup bugs are hard to reproduce and debug since the conditions found on the target systems can generally not be replicated and debugging involves a lot of unusual technical complexity.
This is not great - it is messy - but here is a list of common MSI problems: How do I avoid common design flaws in my WiX / MSI deployment solution? - "a best effort in the interest of helping sort of thing". Let's be honest, it is a mess, but maybe it is helpful.
Application.exe Bugs: Keep in mind that you can make new bugs in your application.exe file, so you can still see errors - obviously. Bad ones too - if you are not careful - but you can easily implement a backup feature as well - one that always runs in a predictable context.
You avoid the complicated sequencing, conditioning and impersonation concerns that make custom actions and modern setups so complicated to do right and make reliable.
Following from that and other, technical and practical reasons: it is much easier to debug problems in the application launch sequence than bugs in your setup.
You can easily set up test conditions and test them interactively. In other words you can re-create problem conditions easily and test them in seconds. It could take you hours to do so with a setup.
Error messages can be interactive and meaningful and be shown to the user.
QA people are more familiar with testing application functionality than setup functionality.
And I repeat it: you are always in the same impersonation context (user context) and you have no installation sequence to worry about.

How exactly are files removed during MSI uninstall?

I'd like to know what exactly happens to installed files / components during the uninstall procedure.
For the install and upgrade procedure, reliable documentation exists at MSDN (see File Versioning Rules and Default File Versioning, for example).
Anyways, I couldn't find a documentation of the uninstall remove logic at MSDN or at WiX´s documentation.
So, my question is simple: I would like to know when exactly a file is removed from the system (which isn't always the case - for example if a SharedDLLRefCount exists/remains for that file).
The closest I found was the following MSDN link, which gives some advice, but basically says: "Test it yourself".
This isn't satisfying for me, because I would like to know if I can rely on a - maybe current - behaviour of MSI before I ship any installers using this behaviour to customers.
I'm searching for reliable answers to the following questions:
Under which circumstances - aside from an explicit "permanent" definition or using SharedDllRefCount - will a file/component survive an uninstall Action?
If a DLL has now a higher Version than at the time it was installed (because of hot patching or so) will it safely be removed? Note: I tested this and the current answer is yes, but I need to know if this is expected behaviour and if I can rely on it.
Component referencing for MSI files is done on a Windows Installer component basis - and not based on the old SharedDLL ref count found in the registry here: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\SharedDLLs.
SharedDLL: Strangely this SharedDLL ref-counter is sometimes also used with MSI, but this is only to provide compatibility with legacy installer and deployment technologies - I will clarify later. Legacy technologies used this SharedDLL counter as the sole way to determine if a file could be uninstalled. As soon as ref-count dropped to 0, the file could be removed.
The actual reference counting for Windows Installer is done based on Windows installer components and not shared dll ref-counters. These components are "atomic installation bundles" of files and registry settings. It is always installed or uninstalled as a whole unit. A component can basically contain "anything", but there are rules with regards to best practice when decomposing the files and registry settings you want deployed into a collection of components. Personally I always use one file per component since this avoids all kinds of problems during Windows Installer upgrades.
Key Path: Essentially each component has a "key-path" - a single file or registry key / value which is used to determine if the component is installed or not. The overall concept of MSI is that there is a 1-to-1 mapping between this absolute component key path and a unique component GUID. The GUID essentially reference counts that absolute path. I explained this in an answer several years ago, and it seems to be a comprehensible explanation for people, perhaps give it a quick read to understand this component referencing in more detail: Change my component GUID in wix?
Merge Modules: This component GUID - for that particular absolute disk or registry location - should be used by all setups that seek to deploy the file or component in question. Window Installer's mechanism to allow this is referred to as a "merge module". This is a partial MSI database that can be merged into several MSI files at build time - allowing the same components to be shared between MSI files with the correct component GUID used in all of them so that reference counting is possible. This allows these shared components to increment the ref-count each time they are installed by different products, and then the component will remain on the system until the ref-count is reduced to 0 as the products that use it are uninstalled in sequence. It should be noted that if the legacy ref-counter at HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\SharedDLLs is not 0 at the same time, then the component will not be uninstalled. Nor will it be uninstalled if the component is set permanent or if it was installed with a blank component GUID (a special feature to "install and forget" a component - it is never dealt with again).
GUIDs: So to repeat yet again, one GUID for one absolute path (one GUID to rule them all) - one GUID has an associated ref-counter which counts the number of products that have registered the component, with its associated absolute key-path, for its use. So, as an example three products might register use of a certain component GUID, making its reference count 3, and hence making its associated key-path file or registry value stick around until all 3 products are uninstalled.
SharedDLL Enabled?: Note that the legacy SharedDLL ref-counter is not necessarily enabled for your MSI components. Some tools, such as Installshield, enable a flag to increment the legacy, shared DLL ref-counter for all files installed and you actually have to turn it off for each component to get rid of this behavior. This is in contrast to other tools, such as WiX, which does not default the shared dll ref-counter to be on for all files (I am not sure for what files they do enable it - if any). Advanced Installer also does not enable the SharedDLL ref-count flag for all components (thanks to Bogdan Mitrache for verifying this - see his comment below).
Messed up legacy reference counters - which can happen during development and test installation - may cause a Windows Installer
component that should be uninstalled to be left on disk unexpectedly.
If you see this, check on a clean system to determine if messed up
legacy ref-counters is the problem on your main machine. You then need
to manually tweak the registry to fix the ref-count for your
development machine. That will involve all applicable items under this
key: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\SharedDLLs.
Not a fun job - I had it happen when using Installshield Developer 7
way back in the day.
Failing to keep a consistent component GUID for each absolute key path will cause mysterious and unpredictable problems such as an MSI
uninstall removing files that are still shared with other products,
but the reference counting has been messed up. The MSI files
mistakenly believe they "own" the shared component and happily deletes
them. A case of mistaken identity (the same absolute path has multiple
component GUIDs pointing to it - each one reference counted to 1).
This is one of the key problems people face with Windows Installer -
hence the advice to stick with one file per component.
Update: Let's be concrete about your specific questions.
You have already pretty much answered the question. The file will remain on disk if the component's ref-count (for the component GUID) is higher than 0 after your MSI file has decremented its "registration" of it on uninstall. It will also remain if its MSI component is set to be permanent, or if it had a blank component GUID, or if the legacy SharedDLL ref-count was enabled for the component (it may not be) and the ref-count here is higher than 0: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\SharedDLLs. Those are all the conditions I know about. I suppose there are other aspects such as advertised products, but I am honestly not sure how they will affect uninstall. Advertised products are not really installed, but "installable" by the user on demand. Reading Phil's answer I also recall that transitive components can also be uninstalled in the fashion he describes in his answer - by having the associated condition evaluate to false during reinstall.
Yes, as long as the component GUID has remained stable throught the lifetime of a specific absolute path (the full installation path for a file), then that file can have gone through any number of updates, and the reference count is only incremented if another MSI with another product code also installs it. In other words: if you have delivered 4 updates to your original MSI, and you have maintained a consistent component GUID for a specific file and updated it each time with a new version, then the component ref-count for this file is still 1 - as long as no other MSI has also installed the component - in which case it will be 2 or more, and an uninstall of your product will not uninstall it, but reduce the ref-count by 1.
Please do try to read this answer since it seems to have clarified things for other people: Change my component GUID in wix? (same as recommended above).
Finally I should note that shared components can also be installed via WiX include files - a new concept entirely introduced with WiX. These are like regular include files in C++, you simply define them once and they can be included in several WiX source files at compile time. I have honestly never made use of this, but conceptually it is similar to merge modules - the built-in Windows Installer concept to deal with shared components. There is one important difference though: merge modules are versioned as a whole, whereas WiX include files include files dynamically from your source folder. I find this better, but this is certainly a large discussion and a matter of preference. I will not elaborate this here.
If you are using WiX I would recommend you try to use WiX include files to manage your shared components. They seem - to me - to be a more flexible implementation of merge modules. On a subjective note I have never been a huge fan of merge modules, though they are essential to use if you have lots of shared files to install with different products. Why do I not like merge modules? They seem like an extra complication and a binary blob that needs extra maintenance. In effect they amount to a weird form of static linking - with all the problems we know from regular static linking. This may be getting too subjective, so I will end on that note, but for shared files and components use either merge modules or WiX include files or whatever other construct exist to achieve the same that I am not aware of.
There really are only a few rules that apply, but the difficulty is that they apply in a number of sometimes complicated scenarios.
A resource (a file) is removed if its component id ref count goes to zero and:
a) there is no remaining SharedDllRefcount for it.
b) the component has never been marked permanent in the installing MSI (because that makes it stick and can't be turned off).
c) the component was set to Transitive and an install/maintenance operation takes place that sets the associated property value to "false". Again, this is sticky to the system, not a project setting.
d) it has not been marked as msidbComponentAttributesShared by an install.
Permanent, transitive, component shared, and shared Dll can be set on a component.
Versioned shared files do not revert to the previous binary if product A installs version 1 and product B installs version 2 and then product B is uninstalled. By definition (it not always in practice) shared files need to support older clients.
During a major upgrade with RemoveExistingProducts (REP) at the "end", the upgrade applies file versioning rules for files, and increases the ref counts of components that are already installed (or installs the component if it's new with a ref count of 1). In this kind of upgrade the components are effectively shared with the same component ids that are already installed by the older product. When REP uninstalls the older product the ref counts are decremented.
So in the simplest case of an upgrade containing all identical component ids, no files are removed: if component ids A, B, C and D are in the older installed product and component ids A, B, C, and D are in the new upgrade then file versioning rules were applied and decrementing ref count leaves the files there, perhaps with higher version, when REP removes the older product. This why component rules must be followed with this kind of upgrade, or patch, or upgrade by reinstalling with REINSTALLMODE=vomus REINSTALL=ALL.
If component ids A, B, C and D are in the older installed product and component ids A, B, C, and E are in the new upgrade then the same happens except D will be removed because its ref count is zero now, assuming there are no other clients and the rules above do not apply.
Major upgrades with REP "early" are simple in the best case because it's an uninstall of the old product followed by an install of the new, so older versions of files could be installed, and again files are removed or not according to the above rules. None of the component ids need to be the same in the simplest case that there are no shared files used by other products.
Common issues involve setting components to permanent or Shared Dll (which is required ONLY if the file is shared with a non-MSI install). There seems to be the idea that these can be turned off by doing another install that changes them, but these are system settings, not project settings.

Source control in SSIS and Concurrent work on dtsx file

I am working on building a new SSIS project from scratch. I want to work with couple of my teammates. I was hoping to get a suggestion on how we can have some have some source control, so that few of us can work concurrently on the same SSIS project (same dtsx file, building new packages.)
Version:
SQL Server Integration Service v11
Microsoft Visual Studio 2010
It is my experience that there are two opportunities for any source control system and SSIS projects to get out of whack: adding new items to the project and concurrent changes to an existing package.
Adding new items
An SSIS project has the .dtproj extension. Inside there, it's "just" XML defining what all belongs to the project. At least for 2005/2008 and 2012+ on the package deployment model. The 2012+ project deployment model carries a good bit more information about the state of the packages in the project.
When you add new packages (or project level connection managers or .biml files) the internal structure of the .dtproj file is going to change. Diff tools generally don't handle merging XML well. Or at all really. So, to prevent the need for merging the project definition, you need to find a strategy that works for you team.
I've seen two approaches work well. The first is to upfront define all the packages you think you'll need. DimFoo, DimDate, DimFoo, DimBar, FactBlee. Check that project and the associated empty packages in and everyone works on what is out there. When the initial cut of packages is complete, then you'll ensure everyone is sync'ed up and then add more empty packages to the project. The idea here is that there is one person, usually the lead, who is responsible for changing the "master" project definition and everyone consumes from their change.
The other approach requires communication between team members. If you discover a package needs to be added, communicate with your mates "I need to add a new package - has anyone modified the project?" The answer should be No. Once you've notified that a change to the project definition is coming, make it and immediately commit it. The idea here is that people commit and sync/check in whatever terminology with great frequency. If you as a developer don't keep your local repository up to date, you're going to be in for a bad time.
Concurrent edits
Don't. Really, that's about it. The general problem with concurrent changes to an SSIS package is that in addition to the XML diff issue above, SSIS also includes layout data alongside tasks so I can invert the layout and make things flow from bottom to top or right to left and there's no material change to SSIS package but as Siyual notes "Merging changes in SSIS is nightmare fuel"
If you find your packages are so large and that developers need to make concurrent edits, I would propose that you are doing too much in there. Decompose your packages into smaller, more tightly focused units of work and then control their execution through a parent package. That would allow a better level of granularity to your development and debugging process in addition to avoiding the concurrent edit issue.
A dtsx file is basically just an xml file. Compare it to a bunch of people trying to write the same book. The solution I suggest is to use Team Foundation Server as a source control. That way everyone can check in and out and merge packages. If you really dont have that option try to split your ETL process in logical parts and at the end create a master package that calls each sub packages in the right order.
An example: Let's say you need to import stock data from one source, branches and other company information from an internal server and sale amounts from different external sources. After u have all information gathered, you want to connect those and run some analyses.
You first design the target database entities that you need and the relations. One of your member creates a package that does all the import to staging tables. Another guy maybe handles external sources and parallelizes / optimizes the loading. You would build a package that in merges your staging and production tables, maybe historicizing and so on.
At the end you have a master package that calls each of the mentioned packages and maybe some additional logging or such.
In our multi-developer operation, we follow this rough plan:
Each dev has their own branch, separate from master branch
Once a week, devs push all their changes to remote
One of us pulls all changes, and merges all branches into master, manually resolving .dtproj conflicts as we go
Merge master in all dev branches - now all branches agree
Test in VS
Push all branches to remote, other devs can now pull and keep working
It's not a perfect solution, but it helps quarantine the amount of merge pain we have to experience.
We have large ssis solutions with 20+ packages in one solution, with TFS Git. One project required adding a bunch of new packages to the existing solution. We thought we were smart and knew to assign only one person to work on each new package, 2 people working on the same package would be suicide. Wasn't good enough. When 2 people tried add a different named, new, package at the same time, each showed dtproj as a file that had changed/needed to be checked in and suddenly I found myself looking at the xml for dtproj and trying to figure out which lines to keep (Microsoft should never ask end users to manually edit their internal files, which only they wrote and understand). Billinkc's solutions here are very good and the problem is very real. You may think that Microsoft is the great Wise One, and that your team can always add new packages to an existing solution without conflicts, but you'd be wrong. It also doesn't work to put dtproj in .gitignore. If you do that, you won't see other peoples new packages (actually the .dtsx file will come down in git, but you won't see that package in Solution Explorer because dtproj is what feeds Solution Explorer). This is a current problem (2021) and we are using Visual Studio 2017 Enterprise with SSDT.
To explain this problem to people, git obviously can handle a group of independent, individual files in a directory (like say .bat files) and can add, change, and delete those files easily. The problem comes in when you have a file that is naming, describing, and counting all the files in a directory (what dtproj does). When you have a file like dtproj you are creating a conflict on dtproj itself, when 2 people try to a add a new package at the same time. Your dtproj file has a line that shows the package you added, and my dtproj file shows the package I added, and tfs/git sees that as a Conflict.
Some are suggesting ways to deal with this if you have to add a lot of new packages, my idea is a little different. For the people who have to add new packages, don't work in the primary solution where this problem is, work somewhere else. Probably best to work in the "Projects" directory you get when you install Visual Studio, outside of TFS/Git. Obviously follow all the standards, Variable naming, and Package Configuration conventions for the target Solution. Then when the new packages are ready, give the .dtsx files to your Solution Gatekeeper for them to check in. Only the Gatekeeper can check in new packages using Add From Existing, avoiding conflicts. Once the package is checked in, developers can work on them in the main Solution.

What are the limitations/benefits on using MSM instead of MSI?

I'm currently building a product distributed through MSI Windows Installer. That product is being integrated by our customers using different forms such us inside their own MSI, using bootstrapper/chainner like WiX Burn or authoring tools like InstallShield.
Having this scenario in mind, I always wanted to know what are the limitations and/or benefits to use Merge Modules (MSM) instead of keeping an MSI, and also what are the nowadays recommended approach about choosing one against the other.
On paper merge modules are fine, but in the real world I find them clunky to update and hence error prone since they may be merged into many setups before being discovered to be defective. As a result I do not recommend merge modules at all. I prefer a single MSI that can be run as a batch process via a bootstrapper or batch file and that can also be updated easily. This avoids all kinds of problems that are not generally intuitive.
I want to add that merge modules work well for truly shared files installed in locations in the file system that are meant for shared files and that change infrequently. These are generally OS-runtimes. These merge modules are generally heavily tested and work ok. However, often I see people use merge modules for files that end up changing frequently and that they then end up installing in different locations in different flavors in an ad-hoc fashion. This kind of use is a total mess and a hugely wasted effort.
Having said all that - I have indeed used merge modules successfully when I have needed advanced release management with repetitive, non-changing inclusion of a set of files via a merge module into several setups. Even then I ran into a version issue after a while with a couple of files needing update, and subsequent, minor errors with the wrong merge module being used when I left the project to someone else. I also experienced having to rebuild all setups due to a minor merge module bug fix. All setups then had to go through QA again. Very frustrating with such tight coupling.
If your requirements are simple and you are not taking on a huge multi-product release project sharing a bunch of files, use MSI instead of MSM. Easier to comprehend, generally less work to deal with, more atomic updates and less risk of introducing the same error in many setups due to merge module update or design problems.
There's nothing that wrong with merge modules. Their primary use (which hasn't been mentioned) is sharing. If you want the same set of shared files in multiple MSI files, they need the same set of component guids to preserve the sharing rules. Or if you are giving files to clients for them to use (like Microsoft) in their MSI builds then give them merge modules. That's one of the reasons MS and other vendors redistribute merge modules so that everyone can build their MSIs and install them on the same system without file sharing disasters. I've also seen merge modules used as a common UI for MSI files. But mainly they are essential to make sure that shared files are used correctly. I'll tell you from experience that the disaster resulting from incorrect use of shared files is much worse that any perceived difficulty with using merge modules. Note also that they are universal and can be included in all tools that build MSI files.
I've never found merge modules difficult to patch, version, or fix. Major upgrades aren't a problem. The only potential issue I've seen is with build processes that rebuild all the binaries in a merge module during creation of a patch (.msp) build. If only one binary needs a fix but you compile them all, their versions and internals may change enough so that the patch process (the delta between two MSI files and their content) will tell you that they need to be included in the patch because they've changed, but this issue can be avoided if it is in fact an issue.

What is the MSI component generation best practice?

Visual Studio Installer states that it is a best practice to install each file as an installer component. The heat utility provided with Wix also seems to follow the practice of putting every file in its own component.
InstallShield's component wizard uses InstallShield's setup best practice of placing portable executable files in their own component but groups all other files (e.g. unversioned files) by the common destination folder.
The advantage of practice one (each file in its own component) is that each file is set up as a key file which is important if you want these files to trigger repairs. It also allows automation of creating the components (e.g. heat) easier since you are creating a component for each file.
The disadvantages of practice one include the overhead of managing so many components and the bloating of the registry after the application is installed.
An advantage of practice 2 could be seen in an install that installs hundreds of graphics files to one directory. If you do not care about repair functionality, is there any reason to create hundreds of components for this install?
These 2 different practices are conflicting and I want to know which one that people actually use and why.
I always use the Microsoft approach (something similar to what InstallShield does):
http://msdn.microsoft.com/en-us/library/aa368269(VS.85).aspx
I think it's the best because:
- important files (EXE, DLL etc.) have their own component, so they can be repaired easily
- resource files are grouped together
- it allows an optimum components count (not too many to get a long install, but enough to allow an easy repair)
I also noticed that most commercial setup authoring tools use this approach.
I've written about this in the past and I'll try to find a link to it. I think you already understand the question and it's just time for you to decide what is important to you.
For me, I work on installs with 15,000+ files and we only service with major upgrades. For "Program Executables" we follow 1:1 principals ( a must for COM, Services, ShortCuts and so on anyways ) but for content/data files we actually do a 1 to many with no key file approach to cut down on our number of components. Sure, that means we won't be able to create an MSP that services just one or two content files here and there but for our business needs that's simply not important to us.
Resilency was a bit of a 4 letter word to us so having less key files makes us happier anyways. :-) BTW, VDPROJ also makes every registry key a keyfile of it's own component and that was quite painful for us triggering unneeded repairs.
All of this aside, for anyone who doesn't fully understand all of this, I'd stick to the 1:1 pattern until you come across a situation where you don't want to anymore and you understand the impact of making that choice.