PDFBox JPEG2000 rendering with e.g. twelvemonkeys-jpeg to avoid patent issues

I have a PDF file which I open with PDFBox (version 2.0.20, but this should not be version related). The file has a page which is actually a JPEG2000 image.
First I got the well-known error: Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed.
I added the JAI core tools and the corresponding jpeg2000 plugin to my POM:
<dependency>
<groupId>com.github.jai-imageio</groupId>
<artifactId>jai-imageio-core</artifactId>
<version>1.4.0</version>
</dependency>
<dependency>
<groupId>com.github.jai-imageio</groupId>
<artifactId>jai-imageio-jpeg2000</artifactId>
<version>1.3.0</version>
</dependency>
And everything works fine!
BUT: the internet says that using jai-imageio-jpeg2000 without paying infringes patents.
Therefore my question is: can I make PDFBox use a different module? I understood that twelvemonkeys is a good choice and I have read some threads where it was tested. But I have found no how-to that explains how to make PDFBox switch to e.g. twelvemonkeys.
I removed the above from the POM and added twelvemonkeys, but that does not work (I got the error message from above again):
<dependency>
<groupId>com.twelvemonkeys.imageio</groupId>
<artifactId>imageio-jpeg</artifactId>
<version>3.8.2</version>
</dependency>

So finally I used the JDeli library. It is a commercial library and at the time of writing you need to pay either $800 per year or a one-time payment of $4000. But given that the patent problem of the JJ2000 code might lead to even bigger issues, I decided to go with it for our project.
Money is one topic, but do my JPEG2000 problems with PDFBox disappear? Yes!
I followed the instructions on the web page (https://support.idrsolutions.com/jdeli/tutorials/add-jdeli-as-a-maven-dependency):
I downloaded the trial lib, added it to my Maven repository and added this to my pom.xml:
<dependency>
<groupId>com.idrsolutions</groupId>
<artifactId>jdeli</artifactId>
<version>1.0</version>
</dependency>
As I wanted to use the product as an ImageIO plugin, I also checked out the git project for the plugin: https://github.com/idrsolutions/JDeli_ImageIO_Plugin
After checkout I ran mvn install and the plugin was in my Maven repo. I then also added the plugin as a dependency to my pom.xml:
<dependency>
<groupId>com.idrsolutions</groupId>
<artifactId>JDeli_ImageIO_Plugin</artifactId>
<version>1.0</version>
</dependency>
From here on, my PDFs with JPEG2000 images inside could be loaded with PDFBox as expected.
So this does not answer my question of how to use twelvemonkeys to read PDFs with JPEG2000 inside with PDFBox, as that is not possible (see above), but it provides an alternative which worked at least for me, as long as you can accept paying for the library.

Related

PDFBox 2.0.4 has different JAR files when downloaded from its site and when taken from Maven

If I use
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>2.0.4</version>
</dependency>
as instructed at https://pdfbox.apache.org/2.0/getting-started.html I don't get the classes in org.apache.pdfbox.tools and org.apache.pdfbox.tools.imageio (such as ImageIOUtil, JPEGUtil, MetaUtil, TIFFUtil and others).
However, if I download the JAR file from http://www.gtlib.gatech.edu/pub/apache/pdfbox/2.0.4/pdfbox-app-2.0.4.jar as directed from https://pdfbox.apache.org/download.cgi#20x, I get them all.
What you got from maven is the pdfbox artifact. What you got from the download URL (where you might notice 10 different downloads) is pdfbox-app, which is for the command line tools (and contains everything). These are different downloads. If you want ImageIOUtil, JPEGUtil, MetaUtil, TIFFUtil, then get pdfbox-tools in addition to the pdfbox artifact.
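A minimal sketch of that addition, assuming pdfbox-tools is taken in the same version as the pdfbox artifact from the question (2.0.4):

```xml
<!-- core library, as in the question -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>2.0.4</version>
</dependency>
<!-- adds org.apache.pdfbox.tools and org.apache.pdfbox.tools.imageio -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox-tools</artifactId>
<version>2.0.4</version>
</dependency>
```

Keeping both artifacts on the same version avoids mixing classes from different PDFBox releases.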

Accessing Java BigQuery Tools in Maven with Nexus

I'm at the very beginning of building a BigQuery uploader in Java. The goal is to download a full Twitter stream and upload it into hourly buckets, then process these in Dremel for a topic detection and tracking project. This is all Java on MacOSX in Eclipse Juno, plus Maven with a local Nexus proxy.
I'm stuck at the starting gate: finding and compiling a simple Java sample that authenticates and uploads a CSV file. The closest I've found is bigquery-appengine-sample, although I don't see why I need appengine for a bigquery project. Will explore that later.
Problem is, the project won't build in maven. No error flags shown in eclipse, but often eclipse error flags are unreliable. The pom shows a red eclipse flag on this element:
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-bigquery</artifactId>
<version>${bigquery.version}</version>
</dependency>
Maven install fails with
[ERROR] Failed to execute goal on project bigquery-appengine-sample: Could not resolve dependencies for project com.google.api.client:bigquery-appengine-sample:war:1.0.0-SNAPSHOT: The following artifacts could not be resolved: com.google.apis-samples:shared-sample-appengine:jar:1.3.2, com.google.apis:google-api-services-bigquery:jar:v2-rev18-1.7.2-beta: Failure to find com.google.apis-samples:shared-sample-appengine:jar:1.3.2 in ...SOF won't allow URLs... was cached in the local repository, resolution will not be reattempted until the update interval of nexus has elapsed or updates are forced -> [Help 1]
Nexus proxies are defined for http://mavenrepo.google-api-java-client.googlecode.com/hg/ and http://google-gson.googlecode.com/svn/mavenrepo/ (both set as SNAPSHOT). Neither of these seems to be working. Browse storage and indexes both come up empty (just an archetype catalog). However, Browse Remote does seem to work AFAICT; it shows a maven repository tree that seems complete.
So my question: how to build bigquery samples in Java with maven?
There are a few non-App Engine Java samples here: http://code.google.com/p/google-bigquery-tools/source/browse/samples/java/
I use Sonatype's Maven plugin, and at minimum, your pom.xml should look like this:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.google</groupId>
<artifactId>google</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>google</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>1.10.3-beta</version>
</dependency>
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-bigquery</artifactId>
<version>v2-rev19-1.7.2-beta</version>
</dependency>
<dependency>
<groupId>com.google.oauth-client</groupId>
<artifactId>google-oauth-client</artifactId>
<version>1.8.0-beta</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>google-api-services</id>
<url>http://mavenrepo.google-api-java-client.googlecode.com/hg</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
</project>
Based on what I can observe from outside Google, the answer to the question as posed is:
No Java example is available (yet?) that covers uploading. The sample at the address in the posted answer only covers querying sample data that is already present in the account.
The closest samples to what I need seem to be the command line and Python samples. But using them is complicated by the fact that each sample is hard-wired to work only with specific maven dependency versions, samples aren't being updated when new API versions are released, and older API versions are often not available via maven.
The maven problem seems to originate from using a wildly idiosyncratic version numbering system (such as the v2beta1-rev17-1.7.1-beta-beta example above). Maven relies on strict adherence to convention, and such version numbers definitely don't comply. For rapidly changing beta code, version numbers should be major#.minor#.micro-SNAPSHOT as documented in the maven book. NO EXCEPTIONS or maven breaks as explained in the comments.
URGENT REQUEST: Please provide a working java sample that covers all BigQuery features, particularly step 1; uploading data. Please bundle the samples with the API code to ensure they stay in sync. And adopt maven conventions throughout, including retroactively.
I was able to get this to work after testing out the url http://google-api-client-libraries.appspot.com/mavenrepo for these libraries. It does not allow browsing so nexus will attempt to index this repo and fail. When this happens Nexus blocks access to this repo and thus local build fails due to missing dependencies.
To alleviate this I changed two settings in Nexus for this repository.
Download Remote Indexes = false
AutoBlock Enabled = false
With this set I was able to build and download dependencies correctly.
Note: We also use GCM-Server Repo for Android and it had this same issue, the same solution was applied and we were able to cache the new dependency.

Source of org.hibernate.validator.engine.ConfigurationImpl

OK, I'm trying to get a validator working with a jsp form.. Have just started using maven, so it's all a bit new... what's the best way of locating which repository I should be selecting for the above class? I already have the following entries for validation:
<dependency>
<groupId>javax.validation</groupId>
<artifactId>validation-api</artifactId>
<version>1.0.0.GA</version>
</dependency>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-validator-annotation-processor</artifactId>
<version>4.1.0.Final</version>
</dependency>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-validator</artifactId>
<version>4.1.0.Final</version>
</dependency>
I tried loading hibernate itself, but that broke the build: I'm using eclipse for persistence, and building fails after dragging in a huge number of prerequisite libraries.
I figured that I should get a better strategy than just stabbing away with repositories..
Just to be clear: I'm getting the error:
java.lang.NoClassDefFoundError: Could not initialize class org.hibernate.validator.engine.ConfigurationImpl
This should be in the hibernate-validator dependency, so not behaving as I would expect.
Do you have slf4j-api on the classpath (this should be pulled in by HV, but just to make sure)?
The NoClassDefFoundError doesn't mean that ConfigurationImpl is not available, but that it couldn't be loaded (typically due to problems in the static initializer or because classes imported by the loaded class are not available).
A side note on using the annotation processor: instead of adding it as project dependency you can also use it via the Maven annotation processor plugin. That way you can't accidentally import classes from it in your project. The set-up is described in detail in the Hibernate Validator reference guide.
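As a rough sketch of that set-up: the fragment below assumes the maven-processor-plugin (org.bsc.maven) is the plugin meant here, and the plugin version shown is an assumption, so check the Hibernate Validator reference guide for the exact configuration. The processor is declared as a plugin dependency and is therefore not on the project's compile classpath.

```xml
<build>
<plugins>
<plugin>
<!-- assumed plugin and version; see the reference guide -->
<groupId>org.bsc.maven</groupId>
<artifactId>maven-processor-plugin</artifactId>
<version>1.3.4</version>
<executions>
<execution>
<id>process</id>
<goals>
<goal>process</goal>
</goals>
<phase>process-sources</phase>
</execution>
</executions>
<dependencies>
<!-- the processor is visible to the plugin only, not to project code -->
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-validator-annotation-processor</artifactId>
<version>4.1.0.Final</version>
</dependency>
</dependencies>
</plugin>
</plugins>
</build>
```

With this in place, the hibernate-validator-annotation-processor entry can be removed from the project's own dependencies section.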
OP probably won't need it anymore, but maybe somebody else will.
I've had a similar problem with the combination of Hibernate Validator 4.3 Final and slf4j 1.5.11.
Finally I found a pair of versions that are willing to work together, namely:
- Hibernate Validator 4.2 Final
- slf4j 1.6.6
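Pinned in a pom, that pair might look like the following sketch. It assumes that "4.2 Final" corresponds to the artifact version 4.2.0.Final and that slf4j-api is the slf4j module in question:

```xml
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-validator</artifactId>
<!-- assumed to be the version meant by "4.2 Final" -->
<version>4.2.0.Final</version>
</dependency>
<dependency>
<!-- declared explicitly so a transitive dependency cannot pull in an older slf4j -->
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.6.6</version>
</dependency>
```

Running mvn dependency:tree afterwards shows which slf4j version actually ends up on the classpath.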

exclude dependencies when running sonar analysis

I have a test project requiring some heavy jars which I put in ${M2_HOME}\test\src\main\resources\ and add to the pom.xml using:
<dependency>
<groupId>server</groupId>
<artifactId>server</artifactId>
<version>1.0</version>
<scope>system</scope>
<systemPath>${M2_HOME}\test\src\main\resources\server.jar</systemPath>
</dependency>
<dependency>
<groupId>client</groupId>
<artifactId>client</artifactId>
<version>6.0</version>
<scope>system</scope>
<systemPath>${M2_HOME}\test\src\main\resources\client.jar</systemPath>
</dependency>
I want to know if it is possible to exclude them during sonar analysis, or more generally to analyze just the java sources folder.
If the problem is that these JARs are included in sonar analysis because they are located in src/main/resources, then one obvious solution would be to put them somewhere else (see the post scriptum). If for whatever reason this is not possible, please clarify (I'd really like to know why you put these JARs under resources).
If the problem is that these JARs are declared as dependencies, you could use a specific profile not including them to run sonar.
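One way to sketch the profile idea: move the two system-scoped dependencies out of the main dependencies section into a profile that is active unless a property is set. The profile id and the property name skipHeavyJars are hypothetical.

```xml
<profiles>
<profile>
<!-- hypothetical id; active unless -DskipHeavyJars is passed -->
<id>heavy-jars</id>
<activation>
<property>
<name>!skipHeavyJars</name>
</property>
</activation>
<dependencies>
<dependency>
<groupId>server</groupId>
<artifactId>server</artifactId>
<version>1.0</version>
<scope>system</scope>
<systemPath>${M2_HOME}\test\src\main\resources\server.jar</systemPath>
</dependency>
<dependency>
<groupId>client</groupId>
<artifactId>client</artifactId>
<version>6.0</version>
<scope>system</scope>
<systemPath>${M2_HOME}\test\src\main\resources\client.jar</systemPath>
</dependency>
</dependencies>
</profile>
</profiles>
```

A normal build would then behave as before, while mvn sonar:sonar -DskipHeavyJars would run without these dependencies. Whether the analysis is still complete without them on the classpath depends on the project, so treat this as a starting point.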
PS: Note that using the system scope is a bad practice and has several drawbacks (as mentioned here or here). A cleaner approach would be to use a file based repository as suggested in this answer.
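A file-based repository could look roughly like this. The repository id and the repo directory under the project root are hypothetical, and the JARs have to be stored there in the standard Maven layout (for example repo/server/server/1.0/server-1.0.jar):

```xml
<repositories>
<repository>
<!-- hypothetical id and location inside the project -->
<id>project-local-repo</id>
<url>file://${project.basedir}/repo</url>
</repository>
</repositories>

<dependencies>
<!-- same coordinates as before, but resolved like any other artifact -->
<dependency>
<groupId>server</groupId>
<artifactId>server</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>client</groupId>
<artifactId>client</artifactId>
<version>6.0</version>
</dependency>
</dependencies>
```

The directory can be populated with mvn deploy:deploy-file, pointing its url parameter at the same file:// location.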

Attaching source to a system scoped dependency

I have a dependency which is scoped as "system".
I'd like to know if there's a way to define the attached source and javadoc for the dependency. This seems like something that should've been taken care of, but I can't seem to find any documentation on it or on why it was neglected.
I am specifically looking for the configuration solution, not installing it to my local repo, or deploying it to a common repo. For the sake of this discussion, those options are out.
Do you mean attach sources using m2eclipse?
If so, you just need to ensure the sources jar is in the same directory as the dependency jar. I tried this by copying commons-io-1.4.jar to some other directory and setting the system path; if commons-io-1.4-sources.jar is in the same directory, m2eclipse finds and attaches the sources.
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>1.4</version>
<scope>system</scope>
<systemPath>C:\test\lib\commons-io-1.4.jar</systemPath>
</dependency>
And the source jar is
C:\test\lib\commons-io-1.4-sources.jar
I guess it'll work the same for javadoc, but I haven't tried it.