Accessing Java BigQuery Tools in Maven with Nexus - google-bigquery

I'm at the very beginning of building a BigQuery uploader in Java. The goal is to download a full Twitter stream and upload it into hourly buckets, then process these in Dremel for a topic detection and tracking project. This is all Java on Mac OS X in Eclipse Juno, plus Maven with a local Nexus proxy.
I'm stuck at the starting gate: finding and compiling a simple Java sample that authenticates and uploads a CSV file. The closest I've found is bigquery-appengine-sample, although I don't see why I need App Engine for a BigQuery project. I will explore that later.
The problem is that the project won't build in Maven. No error flags are shown in Eclipse, but Eclipse error flags are often unreliable. The POM does show a red Eclipse flag on this element:
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-bigquery</artifactId>
<version>${bigquery.version}</version>
</dependency>
Maven install fails with
[ERROR] Failed to execute goal on project bigquery-appengine-sample: Could not resolve dependencies for project com.google.api.client:bigquery-appengine-sample:war:1.0.0-SNAPSHOT: The following artifacts could not be resolved: com.google.apis-samples:shared-sample-appengine:jar:1.3.2, com.google.apis:google-api-services-bigquery:jar:v2-rev18-1.7.2-beta: Failure to find com.google.apis-samples:shared-sample-appengine:jar:1.3.2 in ...SOF won't allow URLs... was cached in the local repository, resolution will not be reattempted until the update interval of nexus has elapsed or updates are forced -> [Help 1]
Nexus proxies are defined for http://mavenrepo.google-api-java-client.googlecode.com/hg/ and http://google-gson.googlecode.com/svn/mavenrepo/ (both set as SNAPSHOT). Neither of these seems to be working. Browse storage and indexes both come up empty (just an archetype catalog). However, Browse Remote does seem to work AFAICT; it shows a Maven repository tree that seems complete.
So my question: how do I build the BigQuery samples in Java with Maven?

There are a few non-App Engine Java samples here: http://code.google.com/p/google-bigquery-tools/source/browse/samples/java/
I use Sonatype's Maven plugin, and at minimum, your pom.xml should look like this:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.google</groupId>
<artifactId>google</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>google</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>1.10.3-beta</version>
</dependency>
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-bigquery</artifactId>
<version>v2-rev19-1.7.2-beta</version>
</dependency>
<dependency>
<groupId>com.google.oauth-client</groupId>
<artifactId>google-oauth-client</artifactId>
<version>1.8.0-beta</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>google-api-services</id>
<url>http://mavenrepo.google-api-java-client.googlecode.com/hg</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
</project>

Based on what I can observe from outside Google, the answer to the question as posed is:
No Java example is available (yet?) that covers uploading. The sample at the address in the posted answer only covers querying sample data that is already present in the account.
The closest samples to what I need seem to be the command line and Python samples. But using them is complicated by the fact that each sample is hard-wired to work only with specific Maven dependency versions, samples aren't being updated when new API versions are released, and older API versions are often not available via Maven.
The Maven problem seems to originate from using a wildly idiosyncratic version numbering system (such as the v2beta1-rev17-1.7.1-beta-beta example above). Maven relies on strict adherence to convention, and such version numbers definitely don't comply. For rapidly changing beta code, version numbers should be major#.minor#.micro-SNAPSHOT as documented in the Maven book. NO EXCEPTIONS, or Maven breaks as explained in the comments.
URGENT REQUEST: Please provide a working Java sample that covers all BigQuery features, particularly step 1: uploading data. Please bundle the samples with the API code to ensure they stay in sync. And adopt Maven conventions throughout, including retroactively.

I was able to get this to work after testing out the URL http://google-api-client-libraries.appspot.com/mavenrepo for these libraries. The repository does not allow browsing, so Nexus attempts to index it and fails. When this happens, Nexus blocks access to the repository, and the local build then fails due to missing dependencies.
To alleviate this I changed two settings in Nexus for this repository:
Download Remote Indexes = false
AutoBlock Enabled = false
With these set, I was able to build and download dependencies correctly.
Note: We also use the GCM-Server repo for Android and it had the same issue; the same solution was applied and we were able to cache the new dependency.

Related

how to download the pdfbox-app jar?

I'm working with pdfbox-app-2.0.0-20140226.103319-176.jar, but I notice that development is continuous and Apache PDFBox publishes new versions frequently. On the official site I can see pdfbox-app-2.0.0-**-182,183,184.jar at the following URL. I tried to get the pdfbox-app-2.0.0-**-177,178,179,180,181.jars using the pom.xml file, but no luck. Could you please help me to get pdfbox-app-2.0.0-**-177,178,179,180,181.jars.
Could you please help me to get pdfbox-app-2.0.0-**-177,178,179,180,181.jars.
No, no one can (unless the source repository was tagged for every snapshot, or you are lucky enough to find somebody who has saved that specific version).
You should work with the public release repository, but if you need functionality that has not yet been published, you can work with the snapshot repository. In that case, be aware that you are using an unstable, rapidly evolving version, so a program that works today might not work tomorrow because the code has evolved. In general this "problem" is intended behavior when working with unstable versions.
In fact, the problem tells you that you are relying on functionality that will not exist in the future in the same form as in the (old) version you used, and the earlier you discover the error, the more easily you can change your work without too much pain.
In order to work with a snapshot repository add (in your case):
<project>
<modelVersion>4.0.0</modelVersion>
<repositories>
<repository>
<id>ApacheSnapshot</id>
<name>Apache Repository</name>
<url>https://repository.apache.org/content/groups/snapshots/</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
...
</project>
then add the dependency:
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox-app</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
More generally, working effectively with SNAPSHOT versions presupposes a development process in which developer teams communicate and project managers are in charge of making sure deadlines are respected.

Running unit tests in Tycho fails: resolves google-collections instead of Guava

I am having an issue running tests using Tycho due to an incorrect dependency resolution that, somehow, is placing the old Google Collections .jar on the classpath and not the Guava one, despite the fact that at no point in any of my POMs do I specify a dependency on Collections (only Guava).
My unit tests fail due to things like NoSuchMethodError (ImmutableList.copyOf) and NoClassDefFoundError (Joiner), which I have pretty much narrowed down to 'finding the wrong jar'. These same tests pass when run manually in Eclipse.
Here is the relevant part of the pom:
<dependencies>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>14.0.1</version>
</dependency>
...
</dependencies>
The phrase 'google collections' appears nowhere. The only other repository I specify is:
<repositories>
<repository>
<id>helios</id>
<layout>p2</layout>
<url>http://download.eclipse.org/releases/helios</url>
</repository>
</repositories>
My plugin imports 'com.google.common.base' and 'com.google.common.collect' as imported packages. I have my own bundled version of Guava 14 in my workspace for debugging, but in the POM I elect to not use my local module.
I followed Sean Patrick Floyd's answer on this question (JUnit throws java.lang.NoSuchMethodError For com.google.common.collect.Iterables.tryFind), and had my test throw an exception with the location of the .jar that the Iterables class was loaded from. It spat back out:
java.lang.IllegalArgumentException: file:/C:/Documents and Settings/Erika Redmark/.m2/repository/p2/osgi/bundle/com.google.collect/0.8.0.v201102150722/com.google.collect-0.8.0.v201102150722.jar
This is where I am now stuck. This google-collections jar is seemingly coming out of nowhere, and I don't know how to stop it. As long as it is being resolved, my unit tests will fail. How can I stop Tycho from trying to get the old Google Collections?
Just to clarify, this has not stopped building and deployment; the plugin update site is on a CI platform and we have been able to install the plugin on different Eclipse IDEs, so this issue is only affecting the tests.
Please let me know if additional information is needed.
The plug-in com.google.collect 0.8.0.v201102150722 is part of the Helios p2 repository that you have configured in your POM. This means that this plug-in is part of the target platform and so may be used to resolve dependencies.
If you want to ensure that the bundle is not used, make sure that it is not part of the target platform. In your case, the easiest way to do this is to explicitly remove the plug-in from the target platform:
<plugin>
<groupId>org.eclipse.tycho</groupId>
<artifactId>target-platform-configuration</artifactId>
<version>${tycho-version}</version>
<configuration>
<filters>
<filter>
<type>eclipse-plugin</type>
<id>com.google.collect</id>
<removeAll />
</filter>
</filters>
</configuration>
</plugin>
Next, you need to make sure that the guava plug-in is part of the target platform. You can add an artifact from a Maven repository to the target platform in the following way:
Declare a Maven dependency to the artifact in the dependencies section of the POM. You already have done this correctly.
Set the configuration parameter <pomDependencies> to consider on Tycho's target-platform-configuration plug-in.
Note that this will generally only work if the referenced artifact is already an OSGi bundle. This is the case here: com.google.guava:guava:14.0.1 seems to have all manifest headers needed by OSGi.
This should give you the result you wanted: In the test runtime, guava should now be used to match your com.google.common.* package imports.
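The second step above can be sketched as follows. This is a minimal fragment, assuming the same ${tycho-version} property used in the filter example; in a real POM it would be merged into the existing target-platform-configuration declaration (alongside the filters) rather than declared a second time.

```xml
<plugin>
  <groupId>org.eclipse.tycho</groupId>
  <artifactId>target-platform-configuration</artifactId>
  <version>${tycho-version}</version>
  <configuration>
    <!-- Let POM dependencies that are OSGi bundles join the target platform -->
    <pomDependencies>consider</pomDependencies>
  </configuration>
</plugin>
```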
And another general remark on declaring dependencies in Tycho: In Tycho, you can only declare dependencies in the PDE source files META-INF/MANIFEST.MF, feature.xml, etc.
The normal Maven-style dependencies declared in the POM do not add dependencies to the project. As explained above, the POM dependencies may only add artifacts to the target platform, i.e. the set of artifacts that may be used by Tycho to resolve the dependencies declared in the PDE source files. So in the end, the POM dependency may become part of the resolved dependencies, but only if the dependency resolver picks it for matching one of the declared dependencies.
By default, Tycho adds any p2 artifacts you have installed in your local Maven repository to the target platform. If the bundle com.google.collect exports a package that you import, it may be wired.
To stop Tycho from including any locally installed artifacts, you can use -Dtycho.localArtifacts=ignore (or remove the unwanted bundle from your local Maven repository).
See http://wiki.eclipse.org/Tycho/Release_Notes/0.16#Improvements_and_Fixes

EasyWSDL has missing dependencies

I am trying to add a WSDL module to my existing application, but I'm struggling to get the dependencies resolved.
According to their website, this is the correct dependency
<dependency>
<groupId>org.ow2.easywsdl</groupId>
<artifactId>easywsdl-wsdl</artifactId>
<version>2.1</version>
</dependency>
After a search (search.maven.org), I changed the version to 2.3, and a bunch of files are now downloaded into my local repository, but when running the application (with the website's demo code), I bump into this error:
java.lang.ClassNotFoundException: com.ebmwebsourcing.easycommons.uri.UriManager
And I believe it has something to do with the missing artifacts :
com.ebmwebsourcing.easycommons:easycommons.uri:jar:1.1
com.ebmwebsourcing.easycommons:easycommons.logger:jar:1.1
In particular the first one. Now, I'm relatively new to using Maven... How would I go about solving this?
Thanks.
The solution is to add the petalslink repository. Apparently the standard Maven repository doesn't contain the easycommons dependencies, while the petalslink repository does.
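A repository is added to the POM along these lines. This is only a sketch: the id is arbitrary, and the URL is deliberately a placeholder because the answer does not give it; substitute the address published by Petals Link.

```xml
<repositories>
  <repository>
    <!-- id is arbitrary; the URL is a placeholder, not the real address -->
    <id>petalslink-public</id>
    <url>REPLACE_WITH_PETALSLINK_REPOSITORY_URL</url>
  </repository>
</repositories>
```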

Error while using JSFUnit/HtmlUnit/CSSParser

We've just recently converted our project to using Maven for builds and dependency management, and after the conversion I'm getting the following exception while trying to run any JSFUnit tests in my project.
Exception class=[java.lang.UnsupportedOperationException]
com.gargoylesoftware.htmlunit.ScriptException: CSSRule com.steadystate.css.dom.CSSCharsetRuleImpl is not yet supported.
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:527)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:537)
...
All the dependencies and JARs for JSFUnit were pulled with Maven using the JBoss repository (http://repository.jboss.com/maven2/).
We're using the following dependencies in the project:
jboss-jsfunit-core 1.2.0.Final
jboss-jsfunit-richfaces 1.2.0.Final
richfaces-ui 3.3.2.GA
openfaces 2.0
JSF 1.2_12
Facelets 1.1.14
Before the dependencies were being managed by Maven, we were able to run our JSFUnit tests just fine. I was able to semi-fix the issue by using a ss_css2.jar file that someone had tucked into our WEB-INF/lib directory (from before the Maven conversion). I'm hoping to find out if there's something else I can do to fix the dependencies in Maven rather than resorting to managing some of the dependencies myself.
You're very likely getting an "incompatible" version of HtmlUnit or another JAR (pulled transitively). Try with the version you were using previously and declare it under the dependencyManagement section, e.g.
<dependencyManagement>
<dependencies>
<dependency>
<groupId>net.sourceforge.htmlunit</groupId>
<artifactId>htmlunit</artifactId>
<version>2.7</version><!-- put "your" version here -->
</dependency>
</dependencies>
</dependencyManagement>
Or, if you changed any version, try to revert to the exact previous state (by the way, could you clarify the differences between the previous versions and the one currently used?).
Update: It appears that the problem was related to the version of the cssparser artifact. I didn't have all the information needed to figure this out, but the OP did :)
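If the culprit is the cssparser version, it can be pinned in the same way as HtmlUnit above. This is a sketch that assumes the artifact's coordinates are net.sourceforge.cssparser:cssparser; the version is a placeholder property, to be set to whichever version worked before the Maven conversion.

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <!-- coordinates are an assumption; verify them in your dependency tree -->
      <groupId>net.sourceforge.cssparser</groupId>
      <artifactId>cssparser</artifactId>
      <!-- placeholder: define cssparser.version with the version that worked before -->
      <version>${cssparser.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Running mvn dependency:tree before and after the change shows which cssparser version is actually resolved.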

Maven automatic SNAPSHOT update

Let's say I have one project with the following POM:
<groupId>com.mine</groupId>
<artifactId>coreJar</artifactId>
<packaging>jar</packaging>
<version>0.0.1-SNAPSHOT</version>
And then in another project I always want to reference the latest SNAPSHOT:
<dependencies>
<dependency>
<groupId>com.mine</groupId>
<artifactId>coreJar</artifactId>
<version>0.0.1-SNAPSHOT</version>
</dependency>
...
</dependencies>
But instead of 0.0.1-SNAPSHOT, I want it to always grab the latest SNAPSHOT version. In the past you could use LATEST, but this has since been deprecated (for reasonable reasons).
I do understand you can specify versions, such as:
[1.5,)
But I could never get it to work with a "-SNAPSHOT":
[0.0.1,)-SNAPSHOT // Doesn't work!
The question then is how do I get maven to grab the latest SNAPSHOT in my other project?
Another option (which I use) is to include the following in your pom.xml. The updatePolicy tag forces Maven to always check this repository for the latest snapshot.
<repositories>
<repository>
<id>you-snapshots</id>
<url>http://host/nexus/repos/snapshots</url>
<snapshots>
<updatePolicy>always</updatePolicy>
</snapshots>
<releases>
<updatePolicy>always</updatePolicy>
</releases>
</repository>
</repositories>
p.s. I always configure all repos in pom.xml because we use several CI servers and it will be quite hard to configure all of them (I am lazy...)
Doc on settings.xml updatePolicy for reference.
The frequency for downloading updates - can be "always", "daily" (default), "interval:XXX" (in minutes) or "never" (only if it doesn't exist locally).
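As a sketch of the interval form, the fragment below checks for new snapshots at most once an hour. The repository id and URL are copied from the example above; the 60-minute value is an arbitrary illustration.

```xml
<repository>
  <id>you-snapshots</id>
  <url>http://host/nexus/repos/snapshots</url>
  <snapshots>
    <enabled>true</enabled>
    <!-- re-check the remote repository once the last check is older than 60 minutes -->
    <updatePolicy>interval:60</updatePolicy>
  </snapshots>
</repository>
```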
Use mvn -U (long form: --update-snapshots).
This forces a check for updated releases and snapshots on remote repositories.
A few words about dependency ranges and SNAPSHOT dependencies (quoting the Dependency Mediation and Conflict Resolution design document):
Incorporating SNAPSHOT versions into the specification
Resolution of dependency ranges should not resolve to a snapshot (development version) unless it is included as an explicit boundary. There is no need to compile against development code unless you are explicitly using a new feature, under which the snapshot will become the lower bound of your version specification. As releases are considered newer than the snapshot they belong to, they will be chosen over an old snapshot if found.
So, to answer your question, the only way to use a SNAPSHOT with dependency ranges is as boundary and you won't get higher SNAPSHOT versions automatically by design (which really makes sense).
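To illustrate a snapshot used as an explicit lower boundary, here is a sketch that reuses the com.mine:coreJar coordinates from the question. What the range resolves to depends on what the repositories contain, so treat it as an illustration rather than a recommendation.

```xml
<dependency>
  <groupId>com.mine</groupId>
  <artifactId>coreJar</artifactId>
  <!-- the snapshot is the explicit lower boundary; there is no upper bound -->
  <version>[0.0.1-SNAPSHOT,)</version>
</dependency>
```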
Personally, I don't like to use dependency ranges because I find that it can lead to build reproducibility issues and makes the build more fragile. I do not recommend them.
Just in case, upgrading the SNAPSHOT version typically means that you are releasing some code and the maven release plugin provides support for that (see the Updating POM Versions).
There is a Versions plugin for Maven which allows you to update your pom to the latest greatest SNAPSHOTS in visible repositories. It does this by inspecting your pom and comparing against remote repositories and then modifying as required.
It is a useful tool but I would definitely like to see an equivalent to the deprecated LATEST option. I find this kind of dependency particularly useful in continuous integration scenarios.
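A sketch of declaring the plugin in the build, assuming its usual coordinates org.codehaus.mojo:versions-maven-plugin; the version property is a placeholder you would define yourself.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>versions-maven-plugin</artifactId>
      <!-- placeholder property: define versions-plugin.version in <properties> -->
      <version>${versions-plugin.version}</version>
    </plugin>
  </plugins>
</build>
```

With this in place, a goal such as versions:use-latest-versions (which has an allowSnapshots option) can be run from the command line; check the plugin documentation for the goals and options available in your version.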
Use mvn install -U
You must use this to force Maven to get the latest snapshots.
It's
<version>[0.0-SNAPSHOT,)</version>
In case you want to update your SNAPSHOT releases inside Eclipse (when using m2e / m2eclipse), right click the affected project, then select "Maven" -> "Update Project..." -> "OK" (with selected project causing problems).