Apache Ivy: Where do I put all these JARs? - apache

I'm trying to convince the higher-ups at my workplace to migrate to Apache Ivy. I've managed to get a few sandbox projects working using Ivy to power the build, and now I have a green light to put together a migration proposal.
We all agree on one thing: we don't want to trust JARs that are located in public directories! I know, I know, a bit paranoid, yes. But we'd like to have a setup where we pull a JAR from a trusted source (either downloading it from the open source project itself, or most likely, gulp, a public repo), and use it for some time before we "certify" it (give it our blessing as a safe artifact to use).
Then we want to have a common repository for all JARs used by our many projects.
My original thinking was to place this repository up in version control (we have an SVN server). But I wasn't sure what best practices dictate. It might make more sense to put our JARs on a file server and FTP to them in the Ivy script.
Either way, SVN (HTTPS) or FTP, all of our servers are authenticated. So, a small number of questions:
Where should we be publishing all of our "certified" JARs (everything from `log4j` to any homegrown JARs we produce)? What do best practices dictate?
The "ivyrep" resolver type does not take username or passwd attributes. If our "JAR server" (FTP, SVN, etc.) is authenticated, how do I configure the Ivy scripts to log in?

I must echo Brian's recommendation to use a repository manager like Nexus. It's a lot less work in the long run. You'll also discover that the professional version of Nexus enables you to create approval processes around repositories which you plan to use in your build. See the procurement suite functionality.
If, on the other hand, you are determined to build your own repository, then ivy has the tools for the job. You need to become very familiar with the ivy settings file and how it declares and uses resolvers.
If the repository is accessible via HTTPS, then the url resolver should be able to access it. The resolver assumes that each version of an artifact is in a different directory, and you'll need to specify the URL patterns that ivy should use when accessing the repository:
<url name="two-patterns-example">
<ivy pattern="http://ivyrep.mycompany.com/[module]/[revision]/ivy-[revision].xml" />
<artifact pattern="http://ivyrep.mycompany.com/[module]/[revision]/[artifact]-[revision].[ext]" />
</url>
The pattern is fully flexible to how you store the artifacts.
Authentication is also handled in the settings file using the credentials tag.
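As a rough sketch of how the url resolver and the credentials tag might fit together in one settings file: the host name reuses the one from the patterns above, while the realm string, the resolver name company-repo, and the repo.user / repo.password variables are placeholders made up for illustration. The realm has to match whatever your server sends in its authentication challenge.

```xml
<ivysettings>
  <!-- Assumed values: realm and the two variables are placeholders, not real settings -->
  <credentials host="ivyrep.mycompany.com"
               realm="Company Ivy Repository"
               username="${repo.user}"
               passwd="${repo.password}"/>

  <!-- Hypothetical resolver name; any name works as long as both places agree -->
  <settings defaultResolver="company-repo"/>

  <resolvers>
    <url name="company-repo">
      <ivy pattern="https://ivyrep.mycompany.com/[module]/[revision]/ivy-[revision].xml"/>
      <artifact pattern="https://ivyrep.mycompany.com/[module]/[revision]/[artifact]-[revision].[ext]"/>
    </url>
  </resolvers>
</ivysettings>
```

When Ivy is driven from Ant, the two variables could be supplied as Ant properties (for example from a properties file that is not checked in), so the password does not have to live in the settings file itself.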
Finally, the FTP protocol is also supported. It's hard to find in the doco, but it's supported by the vfs resolver.
I think that's enough information on an option I don't recommend :-) Having said that I once created an FTP based repository for managing releases to clients. It's useful to have a tool this powerful :-)

Why not use something like Sonatype's Nexus? I've seen it used for Maven, and I believe it'll work for Ivy.
You can set it up to download from remote repositories into (say) a 'test' repository. You can then evaluate those .jars, and if they're good, upload them into an 'approved' repository for general consumption. There's some authentication surrounding this, but you'd have to evaluate that in greater depth. Certainly you can restrict the uploading into repositories via a username/password pair.

Related

Cache Credentials During SVN Merge

A merge from a feature branch to trunk took over 45 minutes to complete.
The merge included a whole lot of jars (~250MB), however, when I did it on the server with the file:// protocol the process took less than 30 seconds.
SVN is being served up by Apache over https.
The version of SVN on the server is
svn, version 1.6.12 (r955767)
compiled Sep 3 2013, 17:49:49
My local version is
svn, version 1.7.7 (r1393599)
compiled Oct 8 2012, 20:42:17
On checking the Apache logs I made over 10k requests and apparently each of these requests went through an authentication layer.
Is there a way to configure the server so that it caches the credentials for a period and doesn't make so many authentication requests?
I guess the tricky part is making sure the credentials are only cached for the life of single svn 'request'. If svn merge makes lots of unique individual https requests, how would you determine how long to store the credential for without adding potential security holes?
First of all, I'd strongly suggest you upgrade the server to version 1.7 or 1.8, since 1.7 and newer servers support an updated version of the protocol that requires fewer requests for many actions.
Second, if you're using path-based authorization, you probably want SVNPathAuthz short_circuit in your configuration. Without it, the authorization check for secondary paths (i.e. paths not in the request URI), which come up in many recursive requests (especially log), runs back through the entire Apache httpd authentication infrastructure. With the setting, we simply ask mod_authz_svn to authorize the action against the path instead. Running through the entire httpd infrastructure can be especially painful if you're using LDAP and it needs to go back to the LDAP server to check credentials. The only reason not to use the short_circuit setting is if you have some other authentication module that depends on the path; I've yet to see an actual setup like this in the wild, though.
Finally, if you are using LDAP, then I suggest you configure the caching of credentials, since this can greatly speed up authentication. Apache httpd provides the mod_ldap module for this, and I suggest you read the documentation for it.
If you provide more details of the server side setup I might be able to give more tailored suggestions.
The comments suggesting that you not put jars in the repository are valuable, but with some configuration improvements you can help resolve some of your slowness anyway.
The merge included a whole lot of jars (~250MB)
That's your problem! If you go through your network via http://, you have to send those jars via http://, and that can be painfully slow. You can increase the cache size of Apache httpd, or you can set up a parallel svn:// server, but you're still sending 1/4 gigabyte of jars through the network. It's why file:// was so much faster.
You should not be storing jars in your Subversion repository. Here's why:
Version control gives you a lot of power:
It helps you merge differences between branches
It helps you follow the changes taking place.
It helps identify a particular change and why a particular change took place.
Storing binary files like jars provides you none of that. You can't merge binary files, and you can't track their changes.
Not only that, but version control systems usually use diffs to track changes. This saves a lot of space. Imagine a 1 kilobyte text file. In 5 revisions, six lines are changed. Instead of taking up 6K of space, only 1K plus those six changes are stored.
When you store a jar, and then a new version of that jar, you can't easily do a diff, and since the jar format is zip, you can't really compress them either. Store five versions of a jar in Subversion, and you store pretty close to five times the size of that jar. If a jar file is 10K, you're storing 50K of space for that jar.
So, not only are jar files taking up a lot of space while giving you none of the power of version control, they can quickly take over your repository. I've seen sites where over 90% of an 8 gigabyte repository is nothing but compiled code and third party jars. And, the useful life of these binary files is really quite limited too. So, in these places, 80% of their Subversion repository is wasted space.
Even worse, you tend to lose where you got that jar, and what is in it. When users put in a jar called commons-beans.jar, I don't know what version that jar is, whether that jar was built by someone, and whether it was somehow munged by that person. I've seen users merge two separate jars into a single jar for ease of use. If someone calls that jar commons-beanutils-1.5.jar because it was version 1.5, it's very likely that someone will update it to version 1.7, but not change the name. (It would affect the build, you have to add and delete, there is always some reason.)
So, there's a massive amount of wasted space with little benefit and almost no information. Storing jars is just plain bad news.
But your build needs jars! What should you do?
Get a jar repository like Nexus or Artifactory. Both of these repository managers are free and open source.
Once you store your jars in there, you can fetch the revision of the jar you want either through Maven, Gradle, or if you use Ant and want to keep your Ant build system, Ivy. You can also, if you don't feel like being that fancy, fetch the jars via an Ant <get/> task. If you use Jenkins, Jenkins can easily deploy the built jars for other projects to use in your Maven repository.
So, get rid of the jars. Merging will then be a simple diff between text files. Merging branches will be much quicker, and less information has to be sent over the network. If you don't want to switch to Maven, then use Ivy, or simply update your builds with the <get> task to fetch the jars and the versions you need.
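For the simplest of those options, a hedged sketch of fetching one jar with Ant's get task might look like the following. The repository URL, the lib directory, and the log4j version are placeholders chosen for illustration, not values from the question.

```xml
<project name="fetch-jars" default="fetch-libs">
  <!-- Hypothetical repository URL; point this at your own repository manager -->
  <property name="repo.url" value="https://nexus.example.com/content/groups/public"/>
  <property name="lib.dir" value="lib"/>

  <target name="fetch-libs">
    <mkdir dir="${lib.dir}"/>
    <!-- The version is part of the path and the file name, so it stays visible in the build -->
    <get src="${repo.url}/log4j/log4j/1.2.17/log4j-1.2.17.jar"
         dest="${lib.dir}/log4j-1.2.17.jar"
         usetimestamp="true"/>
  </target>
</project>
```

With usetimestamp set, Ant only downloads the file again when the remote copy is newer than the local one, so repeated builds don't keep pulling the same jar.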

Managing commit rights in svn by delegating to project managers

We have multiple projects in one SVN repo, and each project has many users. Because the number of users is large, it's troublesome to manage their commit rights using the "Auth file".
I have read somewhere that we can delegate users' rights to their managers by creating a text file, but I am not sure how to achieve this, and perhaps hooks need to be configured for it. I am new to SVN, so I need your expert advice. Please guide me on how to achieve this, and if you already have a hook configured, kindly provide it.
How to setup access control in SVN?
I have seen this link, and the answer by VonC is great and perfect for me. But I don't know how to start. Can anybody help me out here? I am not a pro in SVN and Unix.
Thanks in advance
Preface
Using a single repository for multiple projects is a Bad Idea (tm): one repo, one project.
Forget about the ancient SVN 1.5 immediately: use at least 1.6 on both client and server (1.8 may be the best choice).
Face
Simplified user management for SVN users can be achieved by using LDAP-based authentication instead of an ordinary password file, with different groups for different repositories in the Require ldap-group directive. With a repository per project, the <Location> from the answer will be the location of each repo with SVNPath; with the old structure, a <Location> must be linked to the root of every project. See also the mod_authnz_ldap section of the Apache 2.2 docs. From a management point of view, LDAP authentication and permissions mean that each developer must be in the LDAP tree and included in one or more groups related to the repositories.
If you additionally need path-based authorization within repositories and use groups inside the authz file, you may find the LDAP Groups to Subversion Authz Groups Bridge useful; it allows you to regenerate authz groups from LDAP data.
As a result, most (if not all) SVN-related ACLs can be managed on the LDAP side only.

Make Maven Proxy/Server settings configurable based on location?

So I'm not sure what the best way to accomplish this is, but basically I have a laptop that I use at work for Maven projects. It works fine when I'm at work, but as soon as I walk out of the door of their corporate proxy and maven server, I often have to do a lot of hand-fudging of the settings.xml file when I'm at home if I'm not VPN'ed in:
We have a corporate-installed Maven Repository proxy server to store some of our own artifacts and handle being the middle-man for our commonly used artifacts.
We have an http proxy that we use for connecting to the outside world.
Both configurations have been handled by my settings.xml file for setting a single Nexus group and maven proxies. If I'm not connected to the VPN while away from the office, I have to muck around with the settings.xml each time I'm not on it, then switch it back when I am on it.
What solutions has anyone else found to handle this? I've been trying profiles to manage the proxy, but I can't seem to get it to work correctly, and it's starting to look pretty ugly. Are there some settings configurations that can detect when I'm not behind the proxy at work and not use the corporate proxy server or Maven server?
While I can think of some profile-based solution to handle the proxy (basically, reading the <active> value from a property defined in a profile), this wouldn't be fully automated (profile activation does not support network-based triggers) unless you can find a file that is present or not depending on your location (in which case, you could use an existing/missing file trigger, but this is kinda hacky). Anyway, this would solve only one part of the problem because mirrors can't be declared in profiles (see MNG-3525).
So, instead of trying to control this with a profile, my suggestion would be to use two settings.xml and to pass your settings-home.xml file with the -s command line option when you're at home.
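A minimal sketch of what such a settings-home.xml could contain, assuming it simply leaves out the corporate parts. The file name comes from the suggestion above; keeping it under ~/.m2 is my own assumption.

```xml
<!-- settings-home.xml: used only when off the corporate network (location is an assumption) -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <!-- Deliberately no <proxies> and no <mirrors>: Maven connects directly
       and resolves against the repositories declared in the POMs and Maven's defaults -->
</settings>
```

At home you would then run something like mvn -s ~/.m2/settings-home.xml clean install, while the regular settings.xml with the Nexus mirror and the http proxy stays untouched for use at work.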
Another option would be to automate the changes in your settings.xml with a script (Groovy would be a good choice as someone reported in MNG-3525).
I found using environment variables to set nonProxyHosts, together with proxy and noproxy shell aliases, to be the most convenient solution when switching between networks with and without a proxy.
In settings.xml, configure proxy with
<host>proxy.corporation.int</host>
<port>8080</port>
<nonProxyHosts>${env.MAVEN_NONPROXY}</nonProxyHosts>
Then in ~/.profile set
export MAVEN_NONPROXY_PROXY='*.corporation.int|local.net|some.host.com'
export MAVEN_NONPROXY_NOPROXY='*'
alias proxy="export MAVEN_NONPROXY=\"$MAVEN_NONPROXY_PROXY\" && export all_proxy=http://proxy.corporation.int:8080"
alias noproxy="export MAVEN_NONPROXY=\"$MAVEN_NONPROXY_NOPROXY\" && unset all_proxy"
To do the switch when roaming, you would just execute from a shell:
[me@linuxbox me]$ proxy
or
[me@linuxbox me]$ noproxy
Obviously, both aliases proxy and noproxy can include many more changes than just the setup of MAVEN_NONPROXY and all_proxy.
I was frustrated by the same problem: having to manually edit settings.xml when roaming between networks. So much in fact, that I wrote a Maven plugin that enables automatic discovery of proxy settings. The current implementation uses the proxy-vole library written by Bernd Rosstauscher to detect proxy settings based on OS configuration, browser, and environment settings.
I've just released the source code of the plugin on Github, under an Apache 2.0 license: https://github.com/volkertb/autoproxy-maven-plugin
You're welcome to give it a try and to see if it meets your needs. Any feedback or contributions are welcome!
(Note: you don't necessarily have to add the plugin to your project's POM. You can invoke it from the command line as well, after you've installed it. See the README on the site for more details.)
You can set MAVEN_OPTS when you need to activate a proxy:
export MAVEN_OPTS="-Dhttp.proxyHost=my-proxy-server -Dhttp.proxyPort=80 -Dhttp.nonProxyHosts=*.my.org -Dhttps.proxyHost=my-proxy-server -Dhttps.proxyPort=80 -Dhttps.nonProxyHosts=*.my.org"

How do I backup a nexus repository manager

The Nexus book (http://www.sonatype.com/books/nexus-book/reference/) does not seem to spend any time on how one should go about backing up a Nexus repository. If I am installing my snapshots and releases into this local repository, it seems that it would behoove me to back it up. However, I'm not really interested in backing up anything that can easily be downloaded from a remote repository.
Some google searches do not seem to reveal the canonical answer either, so perhaps for posterity it can be recorded here.
Thanks,
Nathan
When you install Nexus, you'll end up with two directories:
nexus-webapp-1.3.1.1/
sonatype-work/
We've separated the application from the data and configuration. The Nexus application is in nexus-webapp-1.3.1.1/ and the data and configuration is in sonatype-work/nexus. This was mainly done to facilitate easier upgrades, but it also has the side-effect of making it very easy to backup a Nexus installation.
The Simple Answer
Nexus doesn't store repositories in a database or do anything that would preclude a simple backup of the file system under sonatype-work/nexus. If you need to create a complete backup, just archive the contents of the sonatype-work/nexus.
Better Answer
If you want a more intelligent approach to backing up a Nexus installation, you will certainly want to back up everything under sonatype-work/nexus/conf, sonatype-work/nexus/storage, and sonatype-work/nexus/template-store. If you want to back up the metadata and file attributes that Nexus keeps for proxy repositories, back up sonatype-work/nexus/proxy, although this isn't required, as the information about a proxy repository will be generated on demand as attributes are requested.
You don't need to backup sonatype-work/nexus/logs and you don't need to backup the Lucene indexes in sonatype-work/nexus/indexer.
Nexus Pro Answer
There is a Nexus Professional plugin which can automate the process of creating a backup of the Nexus configuration data. This plugin is going to address the contents of the sonatype-work/nexus/conf directory. If you need to backup the sonatype-work/nexus/storage directory, you will need to configure some backup system to backup the contents of that filesystem. Once again, as with Nexus Open Source, there is currently no real benefit in backing up the contents of sonatype-work/nexus/indexer or sonatype-work/nexus/logs.
Excluding Storage for Remote Repositories
In your question you mention that you want to exclude the storage devoted to the local cache of a remote repository. If you are interested in doing this, you'll have to take a further level of granularity and just exclude the directories under sonatype-work/nexus/storage that correspond to the remote repositories.
Do you need to shut Nexus down for a backup?
Brian Fox told me no, the only real chance for file contention is going to be the files in the indexer/ directory. You shouldn't have a problem backing up the sonatype-work filesystem with a running instance of Nexus.
BTW, thanks for the question, this answer will likely be incorporated into the next version of the Nexus book.
AFAIK Nexus (free version) does not have any backup features, but it should be as simple as knowing your company's groupId and grabbing it from the storage directories in Nexus.
But I would schedule a complete repository backup too; you never know when the remote repositories will be down just when you need them the most.

Should I use an FTP server as a maven host?

I would like to host a Maven repository for a framework we're working on and its dependencies. Can I just deploy my artifacts to my FTP host using mvn deploy, or should I manually deploy and/or set up some things before being able to deploy artifacts? I only have FTP access to the server I want to host the Maven repo on.
The online repository I want to use is not hosted by myself. As I say, I only have FTP access, so if possible, I would like to use that FTP space as a Maven repository. The tools mentioned seem to work when you have full control over the host machine, or at least more than just FTP access since you need to configure the local directories where the repositories will be placed. Is this possible?
You might want to have a look at Nexus, a Maven repository manager. We've replaced our local Maven repository with a Nexus-based one and find it tremendously useful.
I've successfully used Archiva as my repository for several years ... see http://archiva.apache.org/. It's easy to administer and allows you to configure as many repositories as you need (SNAPSHOT, internal, external, etc).
According to the book "Better Builds with Maven", the most common type of repository is HTTP. This paragraph describes what I think you need:
This chapter will assume the repositories are running from http://localhost:8081/ and that artifacts are deployed to the repositories using the file system. However, it is possible to use a repository on another server with any combination of supported protocols including http, ftp, scp, sftp and more. For more information, refer to Chapter 3.
A Maven 2 repository is simply a specific directory structure, so once you get the transport and server specifications right for the repository and deployment portion of your POMs, it should be completely transparent to your users.
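To make "the repository and deployment portion of your POMs" a bit more concrete, here is a sketch of the relevant POM fragment for an FTP target. The project coordinates, host name, path, repository id, and wagon-ftp version are all assumptions for illustration, and whether deployment over FTP actually works will depend on the server and the wagon version you end up with.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- Hypothetical coordinates for the framework being published -->
  <groupId>com.example</groupId>
  <artifactId>my-framework</artifactId>
  <version>1.0.0</version>

  <distributionManagement>
    <repository>
      <!-- The id must match a <server> entry in settings.xml holding the FTP username and password -->
      <id>ftp-repo</id>
      <url>ftp://ftp.example.com/maven2</url>
    </repository>
  </distributionManagement>

  <build>
    <extensions>
      <!-- FTP transport is not built in; the version below is an assumption, check for the current one -->
      <extension>
        <groupId>org.apache.maven.wagon</groupId>
        <artifactId>wagon-ftp</artifactId>
        <version>3.5.3</version>
      </extension>
    </extensions>
  </build>
</project>
```

Consumers of the repository would only need a matching <repository> entry pointing at a URL that serves the same directory tree, which is what makes the transport transparent to them.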
You can even use Dropbox. All that you need is a public address to access the files generated with mvn deploy, with any of the protocols in the accepted answer.
I guess there are more services that can work in the same way, but I'm not certain about the URL schemes that alternatives to Dropbox may use.
https://maven.apache.org/wagon/wagon-providers/wagon-ftp/ will tell you that you can use ftp to read from an existing repository, but not to create a new one. I don't think that it is impossible in principle, but no one has cared to write all the fiddly code to do the directory management via ftp.