How should I lay out my repository?

I'm moving an application out of an svn repository it shares with a bunch of other stuff into its own, brand new one. So, I have a chance to make a fresh start with layout.
The app itself has two components - a reasonably standard Java webapp, that talks to a database, and a backend component, also Java, that polls the db, and kicks off long-running processing tasks based on what it finds - essentially, the DB is being used as a queue. The code is broken up into three packages:
org.blah.common - code, such as DAOs, that is shared between web app and back end
org.blah.webapp - The web app; this depends on org.blah.common, and builds out to a .war file.
org.blah.backend - The back end process; this depends on org.blah.common, and builds out to a tar file containing a jar and some scripts.
I'd also like to get other bits of tomcat and apache config into svn as well.
Right now, all three packages are in svn under a src dir, and there's an ant script with different targets that build the different parts. It's all a bit scrappy - the svn:ignore property has gotten quite big, and it's not immediately apparent that the scripts in one dir are related to the code in some package down under src, while those in another are for starting and stopping tomcat.
I'm drawn to the maven standard directory layout, but I've not used it before. I've come up with this:
common/
  src/
    main/
      java/
      resources/
    test/
      java/
      resources/
  target/ # Not checked in
    common.jar
webapp/
  src/
    main/
      java/
      resources/
      webapp/
    test/
      java/
      resources/
  target/ # Not checked in
    webapp.war
backend/
  src/
    main/
      java/
      perl/
      resources/
    test/
      java/
      resources/
  target/ # Not checked in
    backend.tar
infra/
  tomcat/
    bin/
    conf/
  apache/
    bin/
    conf/
  db/
    tables/
    procs/
    triggers/
Note that right now, I don't intend to migrate to maven - I'll adapt the existing ant scripts, since they work. I'd like to keep the option of moving to maven (or something like buildr, that uses the maven layout) at some point in the future though.
So:
Does this seem like a reasonable way of laying out the repository? Is there anything that will trip me up further down the line?
Is this going to be obvious to people new to the app?
Would this be compatible with maven, should I decide to use it? (I know that theoretically, you can make maven work with any layout, but I believe they recommend a standard for a reason.)
Are IDEs going to have any problems with this? (Depending on which computer I'm on, I use intellij or eclipse. Other people on my team - who helpfully don't have opinions on this - use netbeans.)

Does this seem like a reasonable way of laying out the repository? Is there anything that will trip me up further down the line?
Well, Maven captures industry best practices, including the layout, so this seems a very good choice even if you're not using Maven right now. Actually, this is the recommended migration strategy when moving from another technology to Maven: first move to the Maven layout and update the existing build scripts, then introduce Maven. In your case, if all projects have the same lifecycle (that is, if they are all released together), I don't have any particular remarks, except maybe about the infra project, which may not be managed this way with Maven. But that is nothing blocking right now.
Is this going to be obvious to people new to the app?
I find it pretty clear and, honestly, if some people have a problem with it and can't adapt, maybe they are the ones who need to be fixed :)
Would this be compatible with maven, should I decide to use it? (I know that theoretically, you can make maven work with any layout, but I believe they recommend a standard for a reason.)
It seems almost entirely compatible with Maven (except for the infra part, as I said, but this is really not an issue). And yes, it's obviously simpler if you can use the default conventions and don't have to modify Maven's configuration. Note that you could set up a Maven build in parallel with the Ant build to make the move seamless.
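For illustration, if the three directories later become Maven modules, a parent (aggregator) POM at the repository root might look roughly like the sketch below. The groupId, artifactId, and version are assumptions made up for this example; only the module names come from the layout in the question. The infra directory is left out because it does not map onto a standard Maven packaging.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates, chosen only for this sketch -->
  <groupId>org.blah</groupId>
  <artifactId>blah-parent</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- "pom" packaging marks this as an aggregator, not a buildable artifact -->
  <packaging>pom</packaging>

  <!-- Each module is a directory containing its own pom.xml -->
  <modules>
    <module>common</module>
    <module>webapp</module>
    <module>backend</module>
  </modules>
</project>
```

With something like this in place, webapp and backend would each declare a dependency on common in their own pom.xml, and Maven would work out the build order from those dependencies.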
Are IDEs going to have any problems with this? (Depending on which computer I'm on, I use intellij or eclipse. Other people on my team - who helpfully don't have opinions on this - use netbeans.)
It's been a long time since I last imported an Ant project into one of these IDEs, but I think they should all be able to deal with this layout (100% sure when using Maven). The best way to answer this question would be to do some testing, of course :)

The only issue I would have is that I expect src directories to contain source code directly, rather than the extra main/java and test/java levels used in the layout above. However, I think that is a way of thinking I could overcome quite quickly, especially within Eclipse.

Why are the target directories in the repository? I am a fan of not checking in build results, as they can be reproduced easily. If they can't be reproduced easily, then that is the issue that should be solved instead of checking in binaries.
Other than that I don't see any issues with this layout. Except for the tomcat directory it is the standard maven layout.

Related

How to use Ivy/Ant to build using intermediate artifacts

I am trying to revise my build process to use ant with apache ivy for my personal projects. These consist of a few shared modules, and a few application modules that depend on the shared modules. For the sake of this post, let's simplify and say I have a shared module (common), and an application module (application) which depends on common. Each module effectively has its own svn repository:
svn_repo_1/common/trunk
                 /branches
                 /tags
svn_repo_2/application/trunk
                      /branches
                      /tags
I check out the relevant revision into a common workspace, in a flat structure:
workspace/common
workspace/application
In general, application will depend on a published version of common, so there will be no need to build common when building application.
However, when I need to add new functionality to common that is required by application, I would then like application to depend on the latest common build from my workspace (without needing to publish common to my repository).
I assumed this is what latest.integration meant (i.e. changing application's ivy.xml to specify latest.integration for the common revision). My intention was to use the ivy buildlist task to find the local modules that needed to be built before application could be built. This does not work however, because the buildlist task seems to include the common/build.xml entry regardless of whether application's ivy.xml file specifies latest.integration or some other published revision.
I would appreciate any suggestions. I am struggling with ivy's documentation and samples, so any real-world examples would also be helpful. Note: I am not interested in a Maven solution here.

Wow, this is truly deja vu! Go back to some of my first questions on this site from 3 - 4 months ago and they're almost all Ivy-related! I empathize with you 100% that Ivy is a difficult beast to learn and tame, but after using it professionally for a few months now, I'll never develop without it again. So my first piece of advice: keep going. Sooner or later, what little (practical) documentation you find on Apache Ivy will all start to make sense and fall into place.
I can understand there may be extenuating reasons for why you don't want to publish your common to your repo. However, if you are a newcomer to transitive dependency management, the first piece of practical advice I can give you is that you should always publish your JARs/WARs/whatever to your repo, not to an intermediary "integration" area local to your workspace.
The reason for this is simple: Ivy only has the ability to crawl the repositories you define in your settings file (basically). If you deliberately keep a JAR like common outside of one of these defined repositories, then: (a) Ivy has no way to resolve transitive dependencies (its primary job), and (b) "downstream" (dependent) JARs fail to be dynamically updated every time you tweak common. Thus, using Ivy only to not publish JARs is a bit counter-productive; I'm surprised Ivy even includes it as a feature.
I guess I would need to understand your motivation for not publishing common. If you're simply having problems getting the ivy:publish task to work, no worries, I can provide plenty of examples to help get you started. But if there are some other reasons, then I ask you to consider this solution: set up multiple repositories.
Perhaps you have one "primary" repository where mostly everything gets published; and then you have a "secondary" or "intermediary" repository where you publish common to whenever it makes sense (for you) to do that. You can then configure your Ant build with two different publish tasks, such as publish-main and publish-integration.
That way you get the best of both worlds: you get your intermediary staging area, and you get to keep everything inside of Ivy's powerful control.
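As a rough sketch of the two publish targets described above, the build.xml for common might contain something like the following. The resolver names main and integration are assumptions and would have to match resolvers defined in your own ivysettings.xml; the version property and the build/ artifact pattern are likewise made up for illustration. It also assumes the Ivy jar is available to Ant so that the antlib namespace resolves.

```xml
<project name="common" default="publish-integration"
         xmlns:ivy="antlib:org.apache.ivy.ant">

  <!-- Hypothetical revision; in practice this usually comes from a properties file -->
  <property name="version" value="1.0.0"/>

  <!-- Ivy expects a resolve to have run before it can publish -->
  <target name="resolve">
    <ivy:resolve file="ivy.xml"/>
  </target>

  <!-- Publish a finished build to the primary repository -->
  <target name="publish-main" depends="resolve">
    <ivy:publish resolver="main" pubrevision="${version}"
                 status="release" overwrite="false">
      <artifacts pattern="build/[artifact].[ext]"/>
    </ivy:publish>
  </target>

  <!-- Publish work in progress to the intermediary repository,
       overwriting whatever was published there before -->
  <target name="publish-integration" depends="resolve">
    <ivy:publish resolver="integration" pubrevision="${version}"
                 status="integration" overwrite="true">
      <artifacts pattern="build/[artifact].[ext]"/>
    </ivy:publish>
  </target>
</project>
```

The application module would then list both resolvers in its settings, so that a latest.integration dependency on common can be satisfied from the intermediary repository.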

A layout for maven project with a patched dependency

Suppose I have an open-source project that depends on a library that must be patched in order to fix some issues. How do I do that? My ideas are:
1. Have that library's sources set up as a module and keep them in my vcs. Pros: simple. Cons: some third-party sources in my repo, might slow down the build process, hard to find the patched places (though this can be fixed in a README).
2. Have a module, like in 1, but keep the patched source files only, compile them with the original library jar in the classpath and somehow replace the *.class files in the library jar on build. Pros: builds faster, easy to find the patched places. Cons: hard to configure, and the jar hackery is non-obvious (the library jar in the repository and the one in my project assembly would be different).
3. Keep patched *.class files in main/resources, and replace them on packaging like in 2. Pros: almost none. Cons: binaries in vcs, hard to recompile a patched class as patch compilation is not automated.
One nice solution is to create a distinct project with patched library sources, and deploy it on local/enterprise repository with -patched qualifier. But that would not fit for an opensourced project that is meant to be easily buildable by anyone who checks out its sources. Or should I just say "and also, before you build my project, please check out that stuff and run mvn install".

One nice solution is to create a distinct project with patched library sources, and deploy it on local/enterprise repository with -patched qualifier. But that would not fit for an opensourced project that is meant to be easily buildable by anyone who checks out its sources. Or should I just say "and also, before you build my project, please check out that stuff and run mvn install".
This is what I would do (and actually what I do) for both corporate and open-source projects. Get the sources, put them under version control in a distinct project, patch them, rebuild the patched library (and include this information in the version, something like X.Y.Z-patched), deploy it to a repository (you could use SVN for this, a la Google Code [1]), declare the repository in your POM and update the dependency to point to your patched version.
With this approach, you can say to your users: check out my code and run mvn install, and they will just get the patched version without any extra action. This is IMHO the cleanest way (not error prone, no class path order mess, no increase of the build time, etc).
[1] Lots of people are deploying their code to their hosted subversion repository (how-to in this post).
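To make the last two steps concrete, here is a minimal sketch of a POM that declares the extra repository and depends on the patched build. Every coordinate and the repository URL are placeholders invented for this example, not values from the original answer.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates of your own project -->
  <groupId>org.example</groupId>
  <artifactId>my-project</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- Repository that hosts the patched library (placeholder URL) -->
  <repositories>
    <repository>
      <id>patched-libs</id>
      <url>https://example.org/maven/repo</url>
    </repository>
  </repositories>

  <dependencies>
    <!-- The "-patched" suffix makes it obvious this is not the upstream release -->
    <dependency>
      <groupId>org.example.thirdparty</groupId>
      <artifactId>somelib</artifactId>
      <version>1.2.3-patched</version>
    </dependency>
  </dependencies>
</project>
```

Because the repository is declared in the POM itself, a user who checks out the project needs no changes to their own settings.xml to resolve the patched artifact.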

One nice solution is to create a distinct project with patched library sources, and deploy it on local/enterprise repository with -patched qualifier. But that would not fit for an opensourced project that is meant to be easily buildable by anyone who checks out its sources. Or should I just say "and also, before you build my project, please check out that stuff and run mvn install".
I'd agree with this and Pascal's answer. Some additional notes:
you may use dependency:unpack on the original artifact and then combine that with your compiled classes if you don't want to rebuild the whole dependent project
in either case, your pom.xml will need to correctly represent the dependencies of that library
you can still integrate this as part of your project's build to avoid the 'deploy to a repository' step
make sure you honour the constraints of the project's license when doing all this!
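The dependency:unpack idea from the first note could be wired up along these lines. This is only a sketch under assumptions: the library coordinates and the excluded class pattern are hypothetical, and binding the goal to process-resources so that the original classes land in the output directory before your patched sources are compiled is one possible arrangement, not the only one.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack-original-library</id>
      <!-- Runs before compile, so the patched classes are added afterwards -->
      <phase>process-resources</phase>
      <goals>
        <goal>unpack</goal>
      </goals>
      <configuration>
        <artifactItems>
          <artifactItem>
            <!-- Hypothetical coordinates of the unpatched library -->
            <groupId>org.example.thirdparty</groupId>
            <artifactId>somelib</artifactId>
            <version>1.2.3</version>
            <outputDirectory>${project.build.outputDirectory}</outputDirectory>
            <!-- Skip the classes you are replacing (hypothetical pattern) -->
            <excludes>org/example/thirdparty/Broken*.class</excludes>
          </artifactItem>
        </artifactItems>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The resulting jar then contains the original classes plus your patched ones, so the unpatched library should no longer appear as a separate runtime dependency.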

Creating one big War file out of 2 war projects

Does anyone know a decent way to merge 2 war modules into one big war file?
Maybe through some custom maven plugin, or maven-war-plugin configuration?
Thanks

It's not that simple.
The basic problem is that each WAR is its own namespace within the container, so simply mashing them together could readily produce conflicts if WAR A overwrites something in WAR B (index.jsp is a good example).
The prudent thing is to take each WAR and put them in to their own subtree of a new WAR, but even still you have "global" artifacts that would need to be resolved, notably the contents of the web.xml, but also things like properties files that tend to be "one per WAR", log4j.properties for example.
Finally, a portable WAR doesn't "hard code" its own name into its links, but rather relies on getting the context path from the request. However, if you merge two WARs underneath a master WAR, the context path only points to the root of the application, not to the specific subdirectory of each individual WAR. So, you'll need to hunt down all of those references, as well as any references where the path was hard coded, and correct them.
So, there's really no automated way to merge WARs.

The maven Cargo plugin can merge WAR files but I've never used this outside a testing context (where I had full control on what I wanted to merge).
For simpler scenarios, you can maybe use overlays.
But none of these solutions will magically solve collisions. You'll have to make some choices.
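For the overlay approach, a minimal sketch of the relevant parts of the "master" WAR's pom.xml might look like this. The coordinates of the second WAR are placeholders, and excluding its web.xml is just one example of the kind of choice you have to make yourself.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates of the WAR that receives the overlay -->
  <groupId>com.example</groupId>
  <artifactId>master-webapp</artifactId>
  <version>1.0</version>
  <packaging>war</packaging>

  <dependencies>
    <!-- The other WAR is declared as a dependency of type "war" -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>other-webapp</artifactId>
      <version>1.0</version>
      <type>war</type>
      <scope>runtime</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-war-plugin</artifactId>
        <configuration>
          <overlays>
            <overlay>
              <groupId>com.example</groupId>
              <artifactId>other-webapp</artifactId>
              <!-- Keep this project's own web.xml instead of the overlay's -->
              <excludes>
                <exclude>WEB-INF/web.xml</exclude>
              </excludes>
            </overlay>
          </overlays>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
```

Files in the current project take precedence over files with the same path in the overlay, which is how name clashes such as index.jsp get resolved; whether that is the result you want is still your call.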

Maven - installing artifacts to a local repository in workspace

I'd like to have a way in which 'mvn install' puts files in a repository folder under my source (checkout) root, while using 3rd party dependencies from ~/.m2/repository.
So after 'mvn install', the layout is:
/work/project/
  repository
    com/example/foo-1.0.jar
    com/example/bar-1.0.jar
  foo
    src/main/java
  bar
    src/main/java
~/.m2/repository
  log4j/log4j/1.2/log4j-1.2.jar
(In particular, /work/project/repository does not contain log4j)
In essence, I'm looking for a way of creating a composite repository that references other repositories.
My intention is to be able to have multiple checkouts of the same source and work on each without overwriting each other in the local repository with 'install'. Multiple checkouts can be because of working on different branches in cvs/svn but in my case it is due to cloning of the master branch in git (in git, each clone is like a branch). I don't like the alternatives which are to use a special version/classifier per checkout or to reinstall (rebuild) everything each time I switch.

Maven can search multiple repositories (local, remote, "fake" remote) to resolve dependencies, but there is only ONE local repository where artifacts get installed during install. It would be a real nightmare to install artifacts into specific locations and to maintain this list without breaking anything. That would just not work; you don't want to do this.
But, TBH, I don't get the point. So, why do you want to do this? There might be alternative and much simpler solutions, like installing your artifacts in the local repository and then copying them under your project root. Why wouldn't this work? I'd really like to know the final intention though.
UPDATE: Having read the update of the initial question, the only solution I can think of (given that you don't want to use different versions/tags) would be to use two local repositories and to switch between them (very error prone though).
To do so, either use different user accounts (as the local repository is user specific by default).
Or update your ~/.m2/settings.xml each time you want to switch:
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                              http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <localRepository>${user.home}/.m2/repository</localRepository>
  <!--localRepository>${user.home}/.m2/repository2</localRepository-->
  ...
</settings>
Or have another settings.xml and point to it using the --settings option:
mvn install --settings /path/to/alternate/settings.xml
Or specify the alternate location on the command line using the -Dmaven.repo.local option:
mvn install -Dmaven.repo.local=/path/to/repo
These solutions are all error prone, as I said, and none of them is very satisfying. Even if you might have very good reasons to work on several branches in parallel, your use case (not rebuilding everything) is not very common. Here, using distinct user accounts might be the least bad solution IMO.

This is INDEED possible with the command line, and in fact is quite useful. For example, if you want to create an additional repo under your Eclipse project, you just do:
mvn install:install-file -DlocalRepositoryPath=repo \
-DcreateChecksum=true -Dpackaging=jar \
-Dfile=%2 -DgroupId=%3 -DartifactId=%4 -Dversion=%5
It's the "localRepositoryPath" parameter that will direct your install to any local repo you want.
I have this in a batch file that I run from my project root, and it installs the file into a "repo" directory within my project (hence the % parameters). So why would you want to do this? Well, let's say you are a professional services consultant, and you regularly go into customer locations where you are forced to use their security-hardened laptops. You copy your self-contained project to their laptop from a USB stick, and presto, you can do your maven build no problem.
Generally, if you are using YOUR laptop, then it makes sense to have a single local repo that has everything in it. But to you who got cocky and said things like "why would you want to do that", I have some news...the world is a bigger place with more options than you might realize. If you are using laptops that are NOT yours, and you need to build your project on that laptop, get the resulting artifact, and then remove your project directory (and the local repo you just used), this is the way to go.
As to why you would want to have 2 local repos, the default .m2/repository is where the company's standard stuff goes, and the local "in project" repo is where YOUR stuff goes.

This is not possible with the command line client but you can create more complex repository layouts with a Maven repository server like Nexus.
The reason why it's not possible is that Maven allows projects to be nested, and most of them will reference each other, so installing each artifact in a different repository would lead to lots of searches on your local hard disk (or to failed builds when you start a build in a sub-project).

FYI: symlinks work in Windows 7 and above, so this kind of thing is easy to achieve if all your code goes in the same place in the local repo, e.g. /com/myco/.
Type mklink for details.

I can see that you do not want to use special versions or classifiers, but that is one of the best solutions to this problem. I work on different versions of the same project, and each mvn install takes half an hour to build. The best option is to append the name of the change I'm working on to the pom version, for example 1.0.0-SNAPSHOT-change1, thereby having multiple versions of the same project with different code bases.
It has made my life very easy in the long run. It helps run multiple builds at the same time without issues. Even during SCM push, we can skip the pom file from staging so there can always be 2 versions for you to work on.
In case you have a huge project with multiple sub-modules and want to change all the versions together, you can use the below command to do just that
mvn versions:set -DnewVersion=1.0.0-SNAPSHOT-change1 -DprocessAllModules
And once done, you can revert using
mvn versions:revert
I know this might be not what you are looking for, but it might help someone who wants to do this.

Maven repository configurations

I've asked a similar question in which part of this was addressed, but I'd like to expand in more detail.
When configuring maven to look at internal repositories, is it best to put that information in the project pom or in a user's settings.xml? An explanation on why would be really helpful here.
thanks,
Jeff

You should always try to make the maven project so that it compiles from a clean checkout from source control in your local environment; without a settings.xml. In my opinion this means that you place any overrides to sensible default values in the user's settings.xml file. But the pom should contain sensible values that will work for everyone.

I encourage you to put the repository definition in the POM. This way any developer can just grab a copy of the code and run Maven to get it compiled, without having to change things in their settings file.
I find the settings.xml file useful just for hacking Maven's behaviour in special situations, for example when one repository is not accessible due to a firewall and you need to use a mirror. But that's my personal opinion. Maven documentation gives you more freedom:
The settings element in the settings.xml file contains elements used to define values which configure Maven execution in various ways, like the pom.xml, but should not be bundled to any specific project, or distributed to an audience. These include values such as the local repository location, alternate remote repository servers, and authentication information.
If you have a local repository which is used in every single project, you may add it to the settings.xml. Just be sure that the configuration is well documented: in my current project it's not, and new developers struggle at the beginning when they try to compile something.
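The firewall case mentioned above is the kind of thing that fits naturally in a user's ~/.m2/settings.xml. A minimal sketch, assuming an internal mirror at a placeholder URL and an id chosen only for this example:

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                              http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <mirrors>
    <mirror>
      <!-- Hypothetical id and URL of a mirror reachable from inside the firewall -->
      <id>internal-mirror</id>
      <!-- Redirect requests for the "central" repository to the mirror -->
      <mirrorOf>central</mirrorOf>
      <url>https://repo.example.com/maven2</url>
    </mirror>
  </mirrors>
</settings>
```

This keeps the environment-specific workaround out of the project, while the POM continues to describe the repositories everyone needs.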

We use the user's settings.xml and include info in the README about what possible other repos may be needed.
In theory a given group-artifact-version is the same no matter which repo it comes from. It works pretty well for us. If you find yourself with two different assets that have the same group-artifact-version identifier, then that indicates you're doing something really bad.
