Maven + SSDM Build and Runtime Environment Automation - maven-2

Preface:
My company, like most, has several run-time environments and several release versions, which are themselves composed of different versions of various jars.
For example, let us consider release versions 1.1, 1.2, and 1.3 of Software X, which may be deployed to a developer computer, testing, or production.
Software-x-1.1 is itself composed of jarA-0.9.1 and jarB-0.7.5, but software-x-1.3 is composed of jarA-1.7.31 and jarB-0.8.1.
Currently we use Spring's PropertyPlaceholderConfigurer to configure run-time variables (such as database credentials); however, these properties also change with release versions.
We also use Maven 2 (POM model version 4.0.0) to specify which versions of our code need to be used. We place the version numbers of our jars as properties within profiles (dev, test, prod) inside the parent pom and then reference those version numbers in all project poms.
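For illustration, a stripped-down sketch of how we do this in the parent pom (the coordinates, property names and version numbers below are only examples, not our real ones):

    <!-- parent pom sketch: jar versions kept as properties inside environment profiles -->
    <project xmlns="http://maven.apache.org/POM/4.0.0">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.example</groupId>
      <artifactId>software-x-parent</artifactId>
      <version>1.3-SNAPSHOT</version>
      <packaging>pom</packaging>
      <profiles>
        <profile>
          <id>dev</id>
          <properties>
            <jarA.version>1.7.31</jarA.version>
            <jarB.version>0.8.1</jarB.version>
          </properties>
        </profile>
        <!-- test and prod profiles look the same, with their own property values -->
      </profiles>
    </project>

    <!-- a project pom then references the property in its dependencies -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>jarA</artifactId>
      <version>${jarA.version}</version>
    </dependency>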
As of right now, we have no way to specify which project versions pertain to a given release other than the most current one. Moreover, we deploy our run-time configurations to the SSDM pickup which then configures and creates the services defined by the built versions of our software.
--
Questions:
Is there any procedure/tool we can use to build our product by merely providing the run-time environment and version number, i.e. "build 1.1 dev"?
Is there any way we can store the required jar versions for each release build? We are currently versioning all files, including the parent pom, but merely versioning the parent pom does not record which release version that parent pom pertains to.
What else can we do to further automate the process of builds?
For example, if we could manage run-time configurations within the parent pom that would be a step in the right direction, but that seems like a violation of scope.
Any tool outside of our framework is inconceivable at this point, but not in the far future.
Summary:
How can we automate our build process to the fullest extent without being error prone?

Based on the part about release versions 1.1, 1.2, and 1.3 of Software X, using profiles to handle the differences between the test, production, etc. environments seems to be the right way.
The software itself is another story. I assume you are using a version control tool (VCT) to store the state of your development. So during the preparation of Software-x-1.1 you change your root pom and define the dependencies (jarA-0.9.1, jarB-0.7.5), make a tag for Release 1.1, and then continue to Release 1.2. During the development of Release 1.3 you decide to change the dependencies (to jarA-1.7.31 and jarB-0.8.1), which results in a change to the poms (or to your root pom only). Maybe I am overlooking your real problem.
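As a rough sketch (the groupIds are invented), the root pom tagged as Release 1.1 would pin the dependency versions, e.g. in dependencyManagement:

    <!-- root pom at the Release 1.1 tag (sketch) -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.example</groupId>
          <artifactId>jarA</artifactId>
          <version>0.9.1</version>
        </dependency>
        <dependency>
          <groupId>com.example</groupId>
          <artifactId>jarB</artifactId>
          <version>0.7.5</version>
        </dependency>
      </dependencies>
    </dependencyManagement>
    <!-- at the Release 1.3 tag the same block would say 1.7.31 and 0.8.1 -->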

If I summarize your problem: you want to manage the release of versions across multiple environments, and your release distribution is an aggregate of executables (jars) as well as environment properties. Different versions of these deployable distributions propagate to different environments at different stages, each with its own set of environment properties, and you are looking for a common roll-out (or maybe release) process to handle all of this.
It seems the first problem you have is that you run a build per release per environment when you are propagating a release. If I am not wrong, you should first look at your app architecture to see if there is a way you can create environment-independent binaries. In some cases projects prefer keeping properties as a separate module which is deployed along with the jars, together with a property manager of sorts which reads the files. So you may have a Maven module called properties, which bundles one zip for each environment's set of property files. Your deployer script can then be given a parameter telling it which zip file to extract to the location from which the properties are read into the application. What you gain this way is that you "create one release distribution per release - which has contents to run on all environments".
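Just as a sketch of that idea (the module layout, descriptor paths and ids are invented), the properties module could use the maven-assembly-plugin with one descriptor per environment, so each environment's property files end up in their own classified zip:

    <!-- properties module pom (sketch): builds properties-<version>-dev.zip, -test.zip, -prod.zip -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <executions>
        <execution>
          <id>env-property-zips</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
          <configuration>
            <descriptors>
              <!-- each descriptor has <id>dev</id>, <id>test</id> or <id>prod</id>
                   (the id becomes the zip classifier) and includes that
                   environment's property files, e.g. src/main/config/dev/** -->
              <descriptor>src/assembly/dev.xml</descriptor>
              <descriptor>src/assembly/test.xml</descriptor>
              <descriptor>src/assembly/prod.xml</descriptor>
            </descriptors>
          </configuration>
        </execution>
      </executions>
    </plugin>

The deployer script then only has to pick the zip whose classifier matches the target environment.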
Also, is it the case that your release version is not the version that you have in the POM? If so, aligning your release version to the POMs should be done, i.e. the POM should be 1.3-SNAPSHOT while you are working on the development phase of that release, and be bumped to 1.3 in a branch when you are releasing it.
There are no one size fits all solutions for such things but practices similar to this one do help to a good extent.
PS: Do let me know if I got your problem right, or whether I have ended up beating around the bush ;-) DS.

Related

Repository for storing derived information (build artifacts)

I'm looking for a "repository" to store derived information (build artifacts).
We have a repository (currently Mercurial) to store our source code. When something is pushed to the source repository, the code goes through a continuous integration server and we do an incremental build, and as a result some DLLs will change. These should be added to some "repository" so that everybody can use that version without needing to do the build again.
I'm looking for the following features:
It should be easy to update the source code and get the corresponding binaries (we could probably make a script for that)
You should easily get all binaries at once (not only those that changed during the last incremental build).
Binaries that weren't changed should only be stored once in the repository.
When updating the source code and the binaries only the changed binaries should be transferred (and not all binaries). This is similar to what happens for source code.
When updating to some version, only that version should be stored locally, not the complete history.
We should be able to remove certain versions from the binary "repository" after a while. However, if the DLLs are still necessary for subsequent incremental builds, these DLLs should of course not be completely removed from the "repository".
What would fit these requirements?
I agree with Manfred: what you are looking for is a binary repository manager. Besides the Nexus repository manager, you should consider Artifactory.
As for the feature list you asked about:
As you have mentioned, the CI server should be responsible for identifying a change in version control and starting a build process which creates the binaries. The CI server/build tool should also deploy the generated binaries to the repository manager if the build was successful (see the sketch after this list). Artifactory offers a build integration feature which takes care of deploying the binaries together with the build metadata.
Using the build integration feature of Artifactory, you can get a list of all the binaries generated by a specific build and download them as an archive. Artifactory provides a REST API for those actions.
There are different approaches to storing the artifacts in a repository manager. Some tools store multiple copies of the same binary. Others, for example Artifactory, use checksum-based storage which keeps only one copy per binary (based on its checksum). This pays off if you keep multiple copies of the same binary in different repositories, especially if you are dealing with large binaries (war files, docker images, ISOs, etc.). Another benefit is cheap copies/moves between repositories, which is a common practice for promotion workflows.
The Artifactory build integration uses checksum-based deployment, which deploys only binaries that do not already exist in Artifactory. For binaries which do exist and have not changed, it only creates a new reference to the existing binary, saving the need to send the actual bytes.
Artifactory provides multiple options for cleaning up binaries, including built-in cleanup policies and the option to develop your own custom logic using user plugins and the Artifactory Query Language (AQL).
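Regarding the deployment step mentioned above: if the build is Maven-based, pointing distributionManagement at the repository manager is usually all the build tool needs (host names and repository names below are placeholders):

    <!-- pom sketch: deploy target for "mvn deploy" after a successful CI build -->
    <distributionManagement>
      <repository>
        <id>releases</id>
        <url>https://repo.example.com/artifactory/libs-release-local</url>
      </repository>
      <snapshotRepository>
        <id>snapshots</id>
        <url>https://repo.example.com/artifactory/libs-snapshot-local</url>
      </snapshotRepository>
    </distributionManagement>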
In addition, I highly recommend taking a look at the binary repository comparison matrix.
Disclaimer: I work for JFrog, the company behind Artifactory.
You are basically asking for a repository manager like the Nexus Repository Manager as you have correctly identified with the tags.
In terms of the specific requirements from your question, here are a couple of ideas.
Binary components are typically identified via coordinates that most of the time include some sort of name and version (see the sketch after this list). A release and build process changes those and deploys the binaries to the repository. This allows you to match source code with binaries. You can also embed information like git refs in the produced binaries.
Accessing the binaries is typically done via HTTP, so it's easy. You then just have to determine what it means to get "all binaries".
Not duplicating binaries that are essentially the same can be supported by the underlying file system or the build tool; I have seen both approaches work. Often it is not worth the effort, however, since storage is cheap.
There are various ways to automatically clean up repositories, including scheduled tasks that do it regularly. Worst case, you have to implement your own logic in an extension.
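To make the coordinates point concrete (the values below are made up), a consumer resolves a binary purely by group, name and version, which is also what ties it back to a tag or ref in source control:

    <!-- sketch: a binary identified only by its coordinates -->
    <dependency>
      <groupId>com.example.app</groupId>
      <artifactId>billing-service</artifactId>
      <version>2.4.1</version>
    </dependency>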
Disclaimer: I work as community advocate and trainer for the Nexus Repository Manager with Sonatype.

How to avoid a build and deployment of dependencies which have no code changes

I'm doing a proof of concept on continuous integration and whether our development team will benefit from automated builds and automated deployments to reduce human error.
I've already come quite far in the process but have some questions on how to configure our incremental builds to avoid rebuilding of dependencies that had no code changes.
In addition I’d like our deployment tool to identify and deploy only assemblies rebuilt as a result of a code change.
We already use Microsoft products like TFS for source control, Visual Studio for development and Team Foundation Build for continuous integration builds. We're currently leaning toward InRelease for deployment as it seems to integrate well with Team Foundation Build.
But first, here is our current setup...
There are 200+ C# solution files, each containing one or more projects. It is not practical in our environment to combine these projects into fewer solutions; this is by design. Projects within a solution use project references to resolve dependencies, and file references to projects in other solutions. As far as I know, this is the approach recommended by Microsoft when dealing with a large number of projects.
We use a "branch by feature" strategy e.g. isolated development on concurrent features branches which is merged up to a stable Main branch when complete. When it's time for a release, a release is branched from main and isolated for hotfixes and deployment. The feature branches and main branch have a CI build triggered by code check-ins. Releases will mostly like be manually executed from InRelease against a selected release branch. A release will be deployed through various environments including INTEGRATION/TEST, UAT and ultimately to all our clients. We're still fleshing out the details of branching strategy, but that's a question for another time.
The current problems to solve:
1. Avoid rebuilding of dependencies that have no code changes...
When we deploy new functionality or a patch to a client, we want to push the absolute minimum in files. Our company has a very large customer base (thousands of customers), sometimes with slow internet connections, so doing a full deployment of all assemblies (200+) to every customer is not an option. I've partially solved the problem by setting up incremental builds which correctly rebuild only changed projects as expected, but they also rebuild all the dependent projects even though NO CODE CHANGES were made to them. This results in both the changed assemblies and their dependents having new timestamps. If we use the change of timestamp to identify which assemblies to deploy, then this would result in deployment of functionally unchanged assemblies. The goal here is to deploy only assemblies where the code has changed and assemblies where breaking changes occur.
For example:
Solution B, has a project called Project B
Solution A, has a project called Project A
Project B -> Project A (where Project B has a file reference to Project A's built assembly)
When a non-breaking change is made in Project A, say to the interior of a method, the expected result is: only A is built and is therefore a candidate for deployment.
When a breaking change is made in Project A that will break Project B, the expected result is: both A and B are built and are therefore candidates for deployment.
Currently MSBuild rebuilds all dependents regardless, which is not what we want.
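For reference, the cross-solution dependency in this example is wired up as a plain file reference, roughly like this (the path is simplified and only illustrative):

    <!-- ProjectB.csproj (sketch): file reference to Project A's built assembly -->
    <ItemGroup>
      <Reference Include="ProjectA">
        <HintPath>..\..\SolutionA\ProjectA\bin\Release\ProjectA.dll</HintPath>
        <Private>True</Private>
      </Reference>
    </ItemGroup>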
2. Automatically identify which assemblies should be deployed...
I have a partial solution to the problem.
When a build is performed, my build process template is configured to run a MSBuild script containing a list of solutions to build in a particular order.
This operation is performed in the build agent's workspace. Every time a new build is performed, the build process template creates a uniquely named drop folder and copies the binaries from the build agent workspace to the drop folder. This is out-of-the-box functionality taken care of by the standard build process template. The build has been configured not to clear the build agent workspace, so the first time it runs it will build all projects within a solution, but subsequent builds will only build projects that have code changes or that depend on changed projects (incremental build). Therefore unchanged assemblies keep their original timestamps and changed assemblies get new timestamps.
We have a tool that can do folder comparisons between drop folders and output the results to a txt file. This allows us to identify which binaries have been added/changed/removed since the last deployment. It also gives us the added benefit of comparing the list of actual artefacts to a manifest of expected artefacts as defined by the developer. This will ensure that no assemblies get deployed that have not been specified and proven to be unit tested.
The question is: how can we leverage InRelease to deploy only the required files as per the example above, and not all files in the drop folder?
Install a TFS Proxy in front of your build machine; this reduces network traffic.
Start with a branching strategy like Service Pack; you can read documentation about it in the ALM Rangers guidance. Then adapt your build process template to build just the part of the code that changed. I think you will find more information in BRD Lite, another piece of guidance by the ALM Rangers.

Trying to quickly resurrect an old Maven built project

First day on a project and first day with Maven and I've already wasted a lot of time trying to get it to build.
It appears the issue is that this old project has config, POMs, etc. with many broken URLs embedded in them, i.e. the stack traces Maven generates contain lots of URLs that are broken when it tries to download the project dependencies.
I have been given only the project source which includes Maven config files. I have not been supplied with existing Maven repositories, project dependent libraries or any build environment, etc.
I have been hacking away at these files but I don't get very far with each build attempt.
Am I doing something fundamentally wrong or is this Maven config really stuck in 2008?
Update:
My POM really was stuck in 2008, i.e. by virtue of versioning, it is a snapshot in time while the rest of the Java world moves on.
Some of the dependencies were no longer in any repositories; most of them belonged to defunct projects, so I've ceased to use them. I had to rewrite the entire POM. I had to spend a lot of time tweaking versions to ensure compatibility between dependencies and between plugins. After much battling, some plugins just wouldn't coexist and kept clobbering each other.
All in all, it was many, many hours of effort...too many for this project with only one developer, and I believe I only now know enough to be dangerous.
The good ol' IDE build system would have been a better choice in this instance.
ftr's advice (in the comments section) is right: Maven can't download certain dependencies, but that doesn't necessarily mean that those dependencies don't exist anymore. It could just be that the extra-repos section of the Maven configuration is now missing certain repositories, and/or there's some other connection issue (like bad proxy config - which may lead to you being able to access certain repos but not others).
I've been in a similar situation, and found out that while initially Maven reported errors when trying to download about 80% of the dependencies, after various tweaks on Maven's config I ended up making it download all of the dependencies (well except one which was really just a custom jar somebody did and which was fetched directly from the local file system, but that's besides the point).
Here's what I'd do:
Of all the dependencies that Maven says it can't download, try to spot 2 or 3 which are "well known" (for example, if it says it can't download Servlet or some Spring library, write down the exact URLs it's trying to contact for those).
Manually check if those URLs are indeed accessible (via a browser). If so, make sure that the dependencies exist for the version Maven is looking for. Maybe they have been updated since the project was created, and the old version is no longer kept. In this case, 90% of the time the solution is to simply update Maven's pom to point to the new version.
If manually checking the dependency's URL shows you that the dependency does in fact exist for the version Maven is looking for, make sure there's no proxy or other form of "extra" internet connection config which is in place for your browser but not for Maven. If that's the case, just update Maven's config with all those extra parameters (proxy, proxy authentication, etc.) - see the sketch after this list.
If the dependency URL doesn't exist at all, try googling to see if that dependency now lives in some other repo. For example, many of the JBoss dependencies (like Hibernate, etc.) changed repo location somewhere around 2007-2009. If that's the case, just add the new repo to Maven's repo list (and remove the old one if it no longer exists) - again, see the sketch below.
Finally, the good old shameful way to fix this is to go to a colleague who has (or had) something to do with your project at some point, and copy their local Maven repo to your machine :)
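To make points 3 and 4 a bit more concrete, the proxy settings live in settings.xml and a relocated repository can be added to settings.xml or to the pom. A rough sketch (hosts, credentials and repository ids are placeholders, and the JBoss URL is only an example of the kind of entry you might end up adding):

    <!-- ~/.m2/settings.xml (sketch): proxy that the browser uses but Maven did not know about -->
    <settings>
      <proxies>
        <proxy>
          <id>corp-proxy</id>
          <active>true</active>
          <protocol>http</protocol>
          <host>proxy.example.com</host>
          <port>8080</port>
          <username>user</username>
          <password>secret</password>
        </proxy>
      </proxies>
    </settings>

    <!-- pom.xml (sketch): an additional repository for dependencies that moved -->
    <repositories>
      <repository>
        <id>jboss-public</id>
        <url>https://repository.jboss.org/nexus/content/groups/public/</url>
      </repository>
    </repositories>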

Archivable, replicable releases when building with Maven: is there a right way?

We have a largish standalone (i.e. not Java EE) commercial Java project (10,000+ classes, four or five SVN repositories, ten or twenty third-party libraries) that's in the process of switching over to Maven. Unfortunately only one engineer (in a team of a dozen or so distributed across three countries) has any prior Maven experience, so we're kind of figuring it out as we go.
In the old Ant way of doing things, we'd:
check out source code from three or four repositories
compile it all into a single monolithic JAR
release that (as part of a ZIP file with library JARs, an installer, various config files, etc.)
check the JAR into SVN so we had a record of what the customers had actually got.
Now, we've got a Maven repository full of artifacts, and a build process that depends on Maven having access to that repository. So if we need to replicate what we actually shipped to a customer, we need to do a build against a Maven repository that has all the proper versions of everything. This is doable, I guess, if in (some version of) the (SVN-controlled) POM files we set all the dependencies to released versions?
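i.e. the shipped POMs would contain only fixed, released versions, something like this (groupId and version numbers purely illustrative):

    <!-- release pom sketch: every dependency pinned, no SNAPSHOTs or version ranges -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>foo-api</artifactId>
      <version>1.2.3</version>
    </dependency>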
But it gives our release engineer the creepy-crawlies, because there doesn't seem to be any way:
to make sure that somebody doesn't clobber the copy of foo-api-1.2.3.jar on the WebDAV server by mistake (the WebDAV server has access control, but that wouldn't stop a buggy build script)
to detect it if they did
to recover afterwards
His idea is, for release builds, to use a local file system as the repository rather than the WebDAV server, and put that local repository under SVN control.
Our one Maven-experienced engineer doesn't like that -- I guess because he doesn't like putting binaries under version control? -- and suggests that maybe the professional version of the Nexus server can solve the clobbering or clobber-tracking/recovery problem.
Personally, I'm not happy (sorry, Sonatype readers) with shelling out money for a non-free build system when we haven't even seen any benefit from the free version yet, and there's no guarantee it will actually solve the problem.
So our choices seem to be:
WebDAV server
Pros: only one server, also accessible by devs, ...?
Cons: easy clobbering, no clobber-tracking/recovery
Local file system
Pros: can be placed under revision control
Cons: only works with the distribution script
Frankly, both of these seem like hacks to me, and I have to wonder if there isn't a better way to do this.
So: Is there a right thing to do here?
I'm not sure I get everything, but I would:
Use the maven-release-plugin (which automates the release process, i.e. executes all the steps documented in release:prepare) - see the sketch below.
Use WebDAV with an anonymous read-only and authenticated-write policy (so only the release engineer can actually deploy released artifacts to the corporate repo).
There is no need to put generated artifacts under version control (if you have the poms under version control). I don't see the benefit of using the local file system instead of WebDAV (it doesn't provide more security; you can secure WebDAV as well), and I don't see what the commercial version of Nexus would solve here.
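A minimal sketch of that setup (the SCM and repository URLs are placeholders; deploying over WebDAV from Maven 2 also needs the wagon-webdav build extension):

    <!-- pom.xml (sketch) -->
    <scm>
      <developerConnection>scm:svn:https://svn.example.com/repo/trunk/myapp</developerConnection>
    </scm>

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-release-plugin</artifactId>
          <configuration>
            <tagBase>https://svn.example.com/repo/tags</tagBase>
          </configuration>
        </plugin>
      </plugins>
    </build>

    <distributionManagement>
      <repository>
        <id>corporate-releases</id>
        <url>dav:https://webdav.example.com/maven/releases</url>
      </repository>
    </distributionManagement>

    <!-- the release engineer then runs: mvn release:prepare release:perform -->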
Nexus has a setting which prevents you from clobbering an already released artefact in a release repository.
For a team of about a dozen, the free version of Nexus should be enough.

Should we store JRE in CVS/SVN?

I want to bundle JRE 6.0 together with my java application. All my source code reside in CVS. My client will check-out the code and build it themselves. Should I store JRE in CVS?
I normally advocate putting almost everything in source control, but this seems a little excessive. Why?
the JRE is readily available from http://java.sun.com
it doesn't change that often. I'd expect you to specify a minimum version for your code to run against (e.g. 1.5, 1.6 etc.)
I would not put a JDK or JRE into a source code repository:
It is bad practice to put externally versioned things into your version control because it usually leads to over-constraining, obscuring and/or hard-wiring your app's external dependencies. (Maven or Ivy are good solutions for dealing with external dependencies, though not in this case.)
Putting binaries into version control is a bad idea for some version control systems.
But I think your real problem (actually, your user's organization's problem) is the IT folks who refuse to contemplate upgrading the JRE:
They need to be made aware of the fact that they can install multiple JRE versions on the one machine, and configure apps to launch with the JRE version they require. (It is trivial on Linux ...)
They need to be made aware of the fact that their policy is an impediment to progress.
They need to be made aware of the fact that their policy is a potential security issue. If they force users to deploy their own copies of JDKs / JREs in random places, it will be difficult to ensure that JRE security patches get applied. (Besides, 1.4.2 is due to be end-of-life'd soonish, and security patches for it will cease.)
EDIT: and there is also the legal question of whether "redistributing" a JRE out of your source code repository is a violation of Sun's click-through JRE/JDK download license. (I don't know ...)
As a best practice, you shouldn't keep any binary files in the source control system. For Java developers there is Maven, which does a better job of versioning jar files. The reason is that we want to keep our source repository as small as possible, so it is faster for those who check out our code for the first time.
But if you still want to keep binary files in source control, it would be best to avoid using CVS, because CVS is bad at versioning binary files. You can search with Google for why it is bad. If you use SVN, then it is still okay, because SVN handles binary files much better than CVS.
I see nothing wrong with storing the JRE in CVS.
However, it's not so important whether you do or not as long as your script can pull it as part of the build. For example, if you want to host a downloadable jre.zip on an HTTP server, or point to it in a Maven repo, that's just as good.
Well, won't your client already have the JRE if you expect them to compile the code before running it? The JDK contains the JRE.
Depends a lot on what you use to handle dependencies. If you use Maven, then create a Maven package with the stuff you need, and host it in a local repository (see the sketch below).
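A rough sketch of what that could look like (all coordinates and URLs are invented): upload the JRE archive once to the local repository, then let the build pull it in and unpack it.

    <!-- one-off upload of the JRE archive to the local repo, e.g.:
         mvn deploy:deploy-file -Dfile=jre-6.0.zip -DgroupId=com.example.thirdparty
             -DartifactId=jre -Dversion=6.0 -Dpackaging=zip
             -Durl=https://repo.example.com/releases -DrepositoryId=corporate-releases -->

    <!-- in the application's pom (sketch): depend on it and unpack it at package time -->
    <dependency>
      <groupId>com.example.thirdparty</groupId>
      <artifactId>jre</artifactId>
      <version>6.0</version>
      <type>zip</type>
    </dependency>

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <id>unpack-jre</id>
          <phase>package</phase>
          <goals>
            <goal>unpack-dependencies</goal>
          </goals>
          <configuration>
            <includeArtifactIds>jre</includeArtifactIds>
            <outputDirectory>${project.build.directory}/bundled-jre</outputDirectory>
          </configuration>
        </execution>
      </executions>
    </plugin>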
If you just have CVS (like we do) then it is fine to create big binary packages (since you will need them) which you can then put in CVS. Just be aware that they should be static for best CVS performance.
Also note that the JSmooth package can create an EXE file of your jar with a JRE embedded in it. This might solve your deployment problem.
For remote compilation, Eclipse can work with a plain JRE. You just need to tell Eclipse where the JRE you prepared above is located on disk. There is also a folder inside the Eclipse distribution where the launcher looks automatically.
I'm wondering about the client building the application themselves. It will require some kind of Java compiler, most probably javac which is part of the JDK. So your client will not only need a JRE, but a JDK as well (unless they will be using Jikes or another alternative compiler).
javac is capable of generating bytecode for previous versions of Java, so using a newer compiler should not pose any problems.
Personally, I would not include large binaries like a JRE as part of my own repository. The JRE can be considered very stable and just listing the minimum version required should be enough. Installing a JRE is also something quite different than installing a single Java application. The two activities should not be mixed.