Maven artifact and groupId naming - maven-2

I'm currently in the process of moving a project from Ant to Maven. Conformist as I am, I want to use well-established conventions for choosing the groupId and artifactId, but I can't find any detailed conventions (there are some, but they don't cover the points I'm wondering about).
Take this project for instance, first the Java package: com.mycompany.teatimer
Tea timer is actually two words, but the Java package naming conventions forbid the insertion of underscores or hyphens, so I'm writing it all together.
I chose the groupId identical to the package ID because I think that's a good idea. Is it?
Finally, I have to pick an artifactId, I currently went for teatimer. But when I look at other Maven projects, they use hyphens to split words in artifactIds, like this: tea-timer. But it does look weird when concatenated to the groupId: com.mycompany.teatimer.tea-timer.
How would you do this?
Another example:
Package name: com.mycompany.awesomeinhouseframework
groupId: com.mycompany.awesomeinhouseframework (?)
artifactId: awesome-inhouse-framework (?)

Weirdness is highly subjective; I'd just suggest following the official recommendation:
Guide to naming conventions on groupId, artifactId and version
groupId will identify your project uniquely across all projects, so we need to enforce a naming schema. It has to follow the package name rules, what means that has to be at least as a domain name you control, and you can create as many subgroups as you want. Look at More information about package names.
eg. org.apache.maven, org.apache.commons
A good way to determine the granularity of the groupId is to use the project structure. That is, if the current project is a multiple module project, it should append a new identifier to the parent's groupId.
eg. org.apache.maven, org.apache.maven.plugins, org.apache.maven.reporting
artifactId is the name of the jar without version. If you created it then you can choose whatever name you want with lowercase letters and no strange symbols. If it's a third party jar you have to take the name of the jar as it's distributed.
eg. maven, commons-math
version if you distribute it then you can choose any typical version with numbers and dots (1.0, 1.1, 1.0.1, ...). Don't use dates as they are usually associated with SNAPSHOT (nightly) builds. If it's a third party artifact, you have to use their version number whatever it is, and as strange as it can look.
eg. 2.0, 2.0.1, 1.3.1

Your convention seems to be reasonable. If I were searching for your framework in the Maven repo, I would look for awesome-inhouse-framework-x.y.jar in com.mycompany.awesomeinhouseframework group directory. And I would find it there according to your convention.
Two simple rules work for me:
reverse-domain-packages for groupId (since such are quite unique) with all the constraints regarding Java package names
project name as artifactId (keeping in mind that it should be jar-name friendly, i.e. not contain characters that may be invalid in a file name or just look weird)
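As a rough sketch of how these two rules could look for the tea timer project from the question (the version number is an assumption, chosen only for illustration):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- groupId: reversed domain plus project, following Java package rules -->
  <groupId>com.mycompany.teatimer</groupId>
  <!-- artifactId: jar-name-friendly project name; hyphens are fine here -->
  <artifactId>tea-timer</artifactId>
  <!-- version is a placeholder for this example -->
  <version>1.0.0</version>
  <packaging>jar</packaging>
</project>
```

With these coordinates the jar would be named tea-timer-1.0.0.jar and would sit under com/mycompany/teatimer/tea-timer/1.0.0/ in a repository, so the groupId and artifactId are never literally concatenated with a dot.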

Consider the following for building a basic first Maven application:
groupId: com.companyname
artifactId: project
version: 0.0.1

However, I disagree with the official Guide to naming conventions on groupId, artifactId, and version, which proposes that the groupId must start with a reversed domain name you control.
com means the project belongs to a company, and org means it belongs to a social organization. These are all right, but for unusual domains like xxx.tv, xxx.uk, or xxx.cn, it does not make sense to start the groupId with "tv." or "cn."; the groupId should convey basic information about the project rather than the domain.

Related

How to Group Plug-ins into Features

We are struggling hard with how to use features the correct way.
Let’s say we have the plug-in org.acme.module which depends on org.thirdparty.specific and org.acme.core.
And we have the plug-in org.acme.other which depends on org.acme.core.
We want to create an application from these, which includes a target file and a product file. We have the following options:
One feature per module:
org.acme.core.feature
    org.acme.core
org.acme.module.feature
    org.acme.module
org.acme.other.feature
    org.acme.other
org.thirdparty.specific.feature
    org.thirdparty.specific
This makes the target and product files gigantic, and the dependencies are very hard to manage manually.
One feature per dependency group:
org.acme.module.feature
    org.acme.core
    org.acme.module
    org.thirdparty.specific
org.acme.other.feature
    org.acme.core
    org.acme.other
This approach makes the dependencies very easy to manage, and the target and product files are easy to read and maintain. However it does not work at all. The moment org.acme.core changes, you need to change ALL the features. Furthermore, the application has no say in what to package, so it can’t even decide to update org.acme.core (because of a bugfix or something).
Platform Feature:
org.acme.platform.feature
    org.acme.core
    org.acme.other
    org.thirdparty.specific (but could be its own feature)
org.acme.module.feature
    org.acme.module
This is the approach used for Hello World applications and Eclipse add-ons - and it only works for those. Since all modules' target platforms would point to org.acme.platform.feature, every time anything changes for any platform plug-in, you'd have to update org.acme.platform.feature accordingly.
We actually tried that approach with only about 50 platform plug-ins. It's not feasible to have a developer change the feature for every bugfix. (And while Tycho supports version "0.0.0", Eclipse does not, so it's another bag of problems to use that. Also, we need reproducibility, so having PDE choose versions willy-nilly is out of the question.)
Again, it all comes down to "I can't use org.acme.platform.feature and override org.acme.core's version for two weeks until the new feature gets released."
The entire problem is made even more difficult since sometimes more than one configuration of plug-ins is possible (let's say for different database providers), and then there are high-level modules that need other child modules to work correctly, which has to be managed somehow.
Is there something we are missing? How do other companies manage these problems?
The Eclipse guys seem to use the “one feature per module” approach. Not surprisingly, since it’s the only one that works. But they don’t use target platforms nor product files.
The key to a successful grouping is when to use "includes" in features and when to just use dependencies. The difference is that "includes" are really included, i.e. p2 will install included bundles and/or included features all the time. That's the reason why you need to update a bundle in every feature if it's included. If you don't update it, you will end up with multiple versions in the install.
Also, in the old days one had to specify dependencies in features. These days, p2 will mostly figure out dependencies from the bundles. Thus, I would actually stop specifying dependencies in features and use just includes. Think of features as a way to specify what gets aggregated.
Another key point to grouping is: less is more. If you have as many features as bundles, chances are pretty high that you have a granularity issue. Instead, think about what a user would install separately. There is no need to have four features for things that a user would never install alone. Features should not be understood as a way of grouping development/project structures; that's where folders in SCM or different SCM repos are OK. Think of features as deployment structures.
With that approach, I would recommend a structure similar to the following example.
my.product.base
    base feature containing the bare minimum of the product
    could be org.acme.core plus a few minimum
my.product.base.dependencies
    features with 3rd party libraries for my.product.base
my.addon.xyz
    feature bundling an add-on
    separate features for things that can be installed separately
my.addon.xyz.dependencies
    3rd party libraries for add-on dependencies
Now in the product definition I would list just my.product.base. There is no need to also list the dependencies features. p2 will fetch and install the dependencies automatically. However, if you want to bind your product to specific versions of the dependencies and don't want p2 to select any matching one, then you must include the my.product.base.dependencies feature.
In the target definition I would include a "my.product.sdk" feature. That feature is an aggregation feature of all other features. It makes target platform management easier. I typically create an sdk feature with everything.
Another feature that is also very often seen is a "master" feature. This is an "everything" feature that may be used for creating a p2 repository during the build. The resulting p2 repository is then used for assembling products.
For a more real world example see here:
http://git.eclipse.org/c/gyrex/gyrex-server.git/tree/releng/features
Features and Continuous Delivery
There was a comment regarding frequent updates to feature.xml. A feature.xml only needs to be modified when there is a change in structure. No updates need to happen when a bundle version changes. You should reference bundles in features with version 0.0.0. That makes Tycho fill in the proper version at build time. Thus, all you need to do is commit a change to any bundle and then kick off a rebuild. Tycho also takes care of updating the feature qualifier based on the qualifiers of the contained bundles. Thus, the new feature qualifier will be different from the one in the previous build.
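A minimal sketch of what such a feature.xml might look like, assuming the my.product.base feature from the example structure contains only org.acme.core (the label and the feature version are made up for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- label and version are placeholders; ".qualifier" is replaced at build time -->
<feature id="my.product.base" label="My Product Base" version="1.0.0.qualifier">
  <!-- version 0.0.0 is a placeholder: the build fills in the bundle's
       actual version, so this file does not change on a bugfix -->
  <plugin id="org.acme.core"
          download-size="0"
          install-size="0"
          version="0.0.0"
          unpack="false"/>
</feature>
```

With this shape, the file only needs editing when a plugin is added to or removed from the feature.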

Setting ivy conflict managers

I am trying to set conflict managers in Ivy, but I can't find a concrete example of how to set them. For example, to set the "strict" manager, what would this look like?
<conflict-managers>
???
</conflict-managers>
<rant>
Yeah, isn't Ivy documentation a hoot! I mean, does it have to be well organized and complete? Does it really have to make sense. I mean, it's not like my job depends upon it!
Wait a second, it does...
</rant>
Sorry, I have to get the state of Ivy documentation off my chest. It makes Maven documentation look wonderful in comparison.
The best book on Ivy I've found is Manning's Ant in Action. It's a seven-year-old book that's out of print (but is still available as an ebook). If it weren't for this book (which uses Ivy 1.4), I would have been completely lost. Unfortunately, it doesn't delve deep into the Ivy settings.
There is a listing of all of the possible conflict managers buried deep in the Ivy documentation.
all: this conflicts manager resolve conflicts by selecting all revisions. Also called the NoConflictManager, it doesn't evict any modules.
latest-time: this conflict manager selects only the 'latest' revision, latest being defined as the latest in time. Note that latest in time is costly to compute, so prefer latest-revision if you can.
latest-revision: this conflict manager selects only the 'latest' revision, latest being defined by a string comparison of revisions.
latest-compatible: this conflict manager selects the latest version in the conflicts which can result in a compatible set of dependencies. This means that in the end, this conflict manager does not allow any conflicts (similar to the strict conflict manager), except that it follows a best effort strategy to try to find a set of compatible modules (according to the version constraints)
strict: this conflict manager throws an exception (i.e. causes a build failure) whenever a conflict is found.
I haven't played around with them, but I believe you simply do the following in the ivy-settings.xml:
<conflict-managers>
<latest-revision/>
</conflict-managers>
You can also define conflict management in your ivy.xml too which might be a bit more practical since it can be defined on a module-by-module basis.
Of course a few examples would have gone a long way with this, but the Ivy documentation doesn't provide many.
The best book on Ivy I've found is Manning's Ant in Action.
That was me. Ivy has moved on a lot since then, and so have builds.
One issue with the Ivy conflict managers is that they differ from Maven, whose policy is "shallowest on the graph first", which picks the closest one. This is good if you explicitly ask for a version, bad if you have more than one transitive dependency and "closest" isn't what you want.
With Ivy you can opt for the strict resolve, which says "you have to explicitly resolve every single conflict in your dependencies". This adds extra work at build time, but has a key result: if you explicitly declare the versions of things you want, you are now in control of what you have in your classpath.
The Ivy reference documentation strictly follows the XML tag structure of the ivy.xml and ivy-settings.xml files. You are expected to extract the information required directly from the document structure.
Decoding from the Ivy docs:
The conflict-managers tag is for declaring what conflict managers a project may use and configuring them if they accept configuration, not for setting the conflict manager to use.
<conflict-managers>
<latest-cm name="mylatest-conflict-manager" latest="my-latest-strategy"/>
<compatible-cm name="my-latest-compatible-conflict-manager" latest="my-latest-strategy"/>
</conflict-managers>
The settings tag has an attribute for choosing the default conflict manager:
<settings defaultConflictManager="strict"/>
Or in an ivy.xml:
<dependencies>
<dependency.../>
<conflict manager="strict"/>
</dependencies>
Note that most of the conflict managers are more liberal in their interpretation of your intentions than you would expect. Two examples:
* Branches are considered irrelevant: if a dependency is available on two branches, the "latest" family of resolvers will pick the latest available from either.
* Both the "latest-time" and "latest-revision" resolvers ignore version constraints except to set boundaries on the matching space, e.g. if a depends on b-1.0 and c-1.0 but c-1.0 depends on b-5.0, then you will get b-5.0 despite it not meeting the constraint requested.
I assume your need is the result of discovering one of these design flaws.
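To put the ivy.xml fragment in context, here is a sketch of a complete module descriptor that selects the strict manager. The organisation and module names are borrowed from the tea timer example, and the commons-lang dependency is only a stand-in; both are assumptions for illustration.

```xml
<ivy-module version="2.0">
  <!-- organisation and module are placeholder names -->
  <info organisation="com.mycompany" module="teatimer"/>
  <dependencies>
    <!-- stand-in dependency, not taken from the question -->
    <dependency org="commons-lang" name="commons-lang" rev="2.6"/>
    <!-- any version conflict in this module's dependency graph
         now fails the resolve instead of being settled silently -->
    <conflict manager="strict"/>
  </dependencies>
</ivy-module>
```

Setting defaultConflictManager in the settings file applies the choice to every module resolved with those settings, whereas the conflict element above only affects this one module.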

Examples of Semantic Version Names

I have been reading about semver. I really like the general idea. However, when it comes to putting it to practice, I feel like I'm missing some key pieces of information. I'm not sure where the name of a library exists, or what to do with file variants. For instance, is the file name something like [framework]-[semver].min.js? Are there popular JavaScript frameworks that use semver? I don't know of any.
Thank you!
Let me try to explain.
If you are not developing a library that you'd like to keep for years to come, don't bother about it. If you prefer to version every development, read on.
Suppose you are an architect or developer building a library that is meant to be used by hundreds of developers over time, in a distributed manner. You really need to be cautious about what you are doing and about what your developers are adding (features so interesting that you are tempted to push them straight into the currently distributed file). You don't know how to tell your library's users when to upgrade, or in which scenarios. People have followed some sort of versioning all along and, interestingly, their approaches all work fine.
Then why do you need semver?
It says "There should be a concrete specification for anything that a group of people is to follow collectively, even if they already know it in their minds". With that thought, they wrote a specification. They gathered the best practices for versioning software and listed them on a single website: semver.org. Its main principles are:
Imagine you have already released your library with a version "lib.1.0.98". Now follow these rules for subsequent development.
Say your library is bundled and named xyz.
Given a version number MAJOR.MINOR.PATCH, (like xyz.MAJOR.MINOR.PATCH), increment the:
1. MAJOR version when you make incompatible API changes
(existing code of users of your library breaks if they adopt this without code changes in their programs),
2. MINOR version when you add functionality in a backwards-compatible manner
(existing code works, and some improvements in performance and features also), and
3. PATCH version when you make backwards-compatible bug fixes.
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
If you are not a developer or are not in a position to develop a library of a standard, you need not worry at all about semver.
Finally, the famous d3 library follows this practice.
Semantic Versioning only defines how to name your versions. It does not specify what you will do with your version number afterwards. You can put the version numbers in package names, you can store it in a properties file inside your application, or just publish it in a wiki. All those options are open to discussion and not part of the problem space addressed by SemVer.
semver is used by npm and bower (and perhaps some other tools) for dependency management. Using semver it is possible to decide which versions of which packages to use if multiple libraries used depend on the same library.
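The same idea can be expressed in Maven, the tool discussed elsewhere on this page, with a version range. This is only a sketch: com.example and xyz are hypothetical coordinates, and the range assumes the library really does follow semver.

```xml
<dependency>
  <!-- hypothetical coordinates, for illustration only -->
  <groupId>com.example</groupId>
  <artifactId>xyz</artifactId>
  <!-- accept 1.2.0 or any later release below 2.0.0, i.e. everything
       that semver promises is backwards compatible with 1.2.0 -->
  <version>[1.2.0,2.0.0)</version>
</dependency>
```

One caveat: in Maven's version ordering a pre-release such as 2.0.0-alpha sorts below 2.0.0, so a range like this may still admit it.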
As others have said, semantic versioning is a standard versioning scheme that tells your users which versions of your library should be compatible with each other, and which ones are not.
The idea is to give your users more confidence that it's safe to upgrade to a newer patch or minor version, because it's tried, tested, and true to being backwards compatible with the previous version. At least, that's what you're telling your users.
As far as tooling goes, I don't do much in JavaScript, but I typically let my build server handle stamping my assemblies etc. with the correct version. I have a static major number I bump whenever I make breaking changes, a static minor number I bump every time I add new features, and an auto-incrementing patch number for whenever I check in bug fixes.
Especially if this is a JavaScript library you plan to share on a public repository of some kind (nuget, gem, etc.), you probably want some form of automated packaging system, and you put the logic for specifying your version number in there (in the package metadata and in the name of the JavaScript file, which is typically the standard I've seen).
Take a look at sbt which is the Scala Build Tool. In it, we write dependencies like this:
val scalatest = "org.scalatest" %% "core" % "2.1.7" % "test"
val jodatime = "org.joda" % "jodatime" % "1.4.5"
Here the %% operator tells sbt to append the Scala binary version you're building against to the artifact name. Packaging things this way generally creates JAR files with names like <my project>_<scala version>-<library version>.jar, which is quite handy for semantically naming things automagically. The plain % operator leaves the artifact name as written, without a Scala version suffix.
That said, this came about because the same library compiled against different Scala versions was not binary compatible. So it was more a consequence of the binary incompatibilities than a conscious design choice.

In a Maven project, what are reasons for either a nested or a flat directory layout?

As my Maven project grows, I'm trying to stay on top of the project structure. So far, I have a nested directory layout with 2-3 levels, where there's a POM on each level with module entries corresponding to the directories at that level. POM inheritance (parent property) does not necessarily follow this, and is not relevant for the purpose of this question.
Now, while the nested structure seems pretty natural to Maven, and it's nice and clean as long as you are on one particular level, I'm starting to get confused by what I look at in my IDE (Eclipse and IntelliJ IDEA).
I had a look at the Apache Felix sources, and they have a pretty complex project in what seems to be a flat directory structure, so I'm wondering if this would be a better way to go.
What are some pros and cons for either approach that you have experienced in practice?
Note that this question (which I found meanwhile) seems to be very similar. I'll leave it to the community to decide whether this should be closed as a duplicate.
I use a kind of mixed approach. Things with distinct lifecycle (from a release and thus VCS point of view) are flat, things with the same lifecycle are nested. And I use svn:externals for the checkout. I wrote about this approach in this previous answer.
I vote for nesting. I'm using IDEA 9 which shows the nesting in the project pane, so the presentation mirrors your logical project structure. (This wasn't the case in 8.1 - it was flattened out.)
I prefer keeping things nested, especially if the names are very similar - makes navigation much easier when using a command prompt. I have a project with names like myapp-layer-component, so they all start with the same prefix, and many have the same -layer-, so using autocomplete on the commandline is next to useless. Separating these out into a nested structure is then much easier because each part of the name (appname, layer or component) is repeated just once at each level in the directory structure.
If building from the command line, it's much easier to build a subset of the project, e.g. if I'm working on the db model, then often I need to build all projects in that area. This is tricky to do when the files are flattened out; the only way I know is to use the -pl argument to Maven and specify the projects to build. With the nested directories, I just cd to the db directory and run mvn.
For example, instead of
myapp-web-gui1
myapp-web-gui2
myapp-web-base
myapp-svc-clustered
myapp-svc-clustered-integrationtest
myapp-svc-simple
myapp-db-model
myapp-db-hibernate
We have the structure
\myapp
    \web
        \gui1
            pom.xml
        \gui2
            pom.xml (other poms omitted to keep it short)
        \base
    \svc
        \clustered
        \clustered-it
        \simple
    \db
        \model
        \hibernate
You could also add nesting for the integration tests, but this seems like driving the point too far.
With nesting, you also get all the benefits of inheritance (and some of its pains...)
The only issue I've had with this is that the directory name doesn't match the artifact id. (I'm still using full artifactIds.) And so each project must explicitly define SCM paths, since these can no longer be inferred from the parent pom. Of course, each directory can be made the same as the artifactId, and then the SCM details can be inferred from the parent, but I found the long directory names a bit unwieldy.
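For the nested layout above, the POM in the db directory could look roughly like the following sketch. The groupId, artifactId and version are assumptions for illustration; only the module directory names come from the tree above.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- coordinates are placeholders for this example -->
  <groupId>com.mycompany.myapp</groupId>
  <artifactId>myapp-db</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <!-- an aggregator builds its modules and produces no jar of its own -->
  <packaging>pom</packaging>
  <modules>
    <!-- module entries are directory paths relative to this POM -->
    <module>model</module>
    <module>hibernate</module>
  </modules>
</project>
```

Running mvn from the db directory then builds just these two modules. In a flat layout the entries would instead be sibling paths such as ../myapp-db-model.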

Maven: Unofficial artifact naming scheme?

I'm creating some Maven artifacts for various dependencies for our projects, and while I'm taking my best guess at group / artifact IDs, I'd like to add something to flag them as "unofficial" and created by us for compilation, so that should we find official sources for the same thing in the future there's no confusion and we can simply change to point to the identifiers. Is there a best/common/recommended practice for doing so?
I was just thinking something like setting groupId="org.providername.unofficial", but since Maven's all about "doing it our way" I just want to see if there's a precedent for something different already...
Maven coordinates, which uniquely identify any artifact, include groupId, artifactId and version. So changing any of those would let you differentiate your artifact from another one. However, if you want to be able to use your version in place of a standard artifact as a dependency of some other component, you would have to keep the same groupId and artifactId, or else you'd have to deal with excludes in that dependency. So for that it would be best to change just the version, e.g. add a qualifier like 1.0mycompanyname.
Is there a best/common/recommended practice for doing so?
To my knowledge, there is no official recommended practice for this (since the artifacts are non public after all) but I find that using a flag is a good idea that we also use (with an "internal" qualifier) and we also put such artifacts in a special "third-party" group of our repository manager.
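As a sketch of the qualifier approach, a dependency on such a repackaged artifact might be declared like this. The artifactId somelib and the exact version string are hypothetical; the point is that the groupId and artifactId match what the official artifact would presumably use, and only the version carries the flag.

```xml
<dependency>
  <!-- best guess at the coordinates an official release would use -->
  <groupId>org.providername</groupId>
  <artifactId>somelib</artifactId>
  <!-- "internal" marks this as our own unofficial packaging of 1.0 -->
  <version>1.0-internal</version>
</dependency>
```

If an official 1.0 appears later, only the version line needs to change.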