Exception in deserializing avro object in map reduce - apache

I am trying to run a MapReduce job that takes an Avro file as input and does some processing. I followed the sample program from the Apache documentation here:
http://avro.apache.org/docs/1.7.6/mr.html
But I keep running into this exception:
java.lang.Exception: java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
Any idea what I may be doing wrong? My pom configuration is below. I am using MapR version 4.
<repositories>
<repository>
<id>MapR</id>
<url>http://repository.mapr.com/maven/.</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<version>1.7.6</version>
<classifier>hadoop2</classifier>
</dependency>
</dependencies>

A common cause of such errors is this:
Your software was compiled against Avro 1.7.6, but at runtime classes from an older version were probably loaded.
Make sure that 1.7.6 is the actual version of the Avro artifacts on your runtime classpath. Print out the classpath at the start of your mapper. If you're using Oozie, the classpath jars are listed in the launcher job output.
The first Avro jar on the classpath is the one used to load the classes, so if it isn't 1.7.6, that's the problem.
You can force your own artifacts to come first in the task's classpath by setting the mapreduce.job.user.classpath.first configuration property to true.
Your pom also has another error that may well cause problems, possibly the very ones you're seeing. You are using the avro-mapred artifact compiled for hadoop2, while the Hadoop artifact you depend on (hadoop-core) is for hadoop1. These should not be compatible. If you're using hadoop1, drop the hadoop2 classifier on avro-mapred; if you're using hadoop2, remove hadoop-core and use hadoop-mapreduce-client-core instead.
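To check which jar a class was actually loaded from, a small helper along these lines can be called at the start of the mapper. This is a minimal sketch: the ClassOrigin class and its locationOf method are hypothetical names, and the demo in main falls back to a JDK class so that it runs without Avro on the classpath. In the job itself you would pass org.apache.avro.generic.GenericData.class instead.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Avro or Hadoop): reports where a class came from.
class ClassOrigin {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK classes loaded by the bootstrap loader usually have no code source.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name as the first argument to probe it.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
        // The full classpath, in the order the JVM was given it.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the reported location is an Avro jar other than 1.7.6, that points to the classpath ordering problem described above.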

I solved this by injecting the right Avro jar in a bootstrap action, as described here:
https://stackoverflow.com/a/40235289/3487888

Related

Resolving depending war file with shrinkwrap maven resolver

I'm setting up Arquillian testing. I want to deploy a WAR to JBoss using Arquillian. This WAR is defined as a dependency in my pom.xml:
<dependency>
<groupId>my.project</groupId>
<artifactId>mywar</artifactId>
<version>1.0</version>
<type>war</type>
<scope>runtime</scope>
</dependency>
But when I try to resolve this dependency using ShrinkWrap, it throws a NoResolvedResultException:
PomEquippedResolveStage resolver = Maven.configureResolver().workOffline().loadPomFromFile("pom.xml");
File war = resolver.resolve("my.project:mywar:war").withoutTransitivity().asSingleFile();
It seems that the resolver somehow can't deal with WAR files. I experimented with org.jboss.shrinkwrap.resolver.api.ResolveWithRangeSupportStage.resolveVersionRange(String) as well, and it seems to interpret the ":war" in the coordinates as the version, which obviously won't work.
If I supply the version, it works:
Maven.resolver().resolve("my.project:mywar:war:1.0").withoutTransitivity().asSingleFile();
But I need it to work without the version, because the version will change over time and I don't want to update it on each release.
Any ideas?
Since your artifact is not a JAR, I think you have to add a question mark in place of the version. Your resolver call should look like: .resolve("my.project:mywar:war:?")
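The behaviour described in the question is consistent with coordinates being read by position. The sketch below is only a simplified model of that idea, not ShrinkWrap's actual parser; the CoordinateShape class and describe method are hypothetical. It assumes the shapes groupId:artifactId, groupId:artifactId:version, groupId:artifactId:packaging:version and groupId:artifactId:packaging:classifier:version.

```java
// Simplified, hypothetical model of positional coordinate parsing.
// It is NOT ShrinkWrap's implementation; it only illustrates why a
// three-part coordinate ending in "war" can be read as a version.
class CoordinateShape {

    /** Labels each colon-separated token according to its position. */
    static String describe(String coordinate) {
        String[] p = coordinate.split(":");
        String base = p.length >= 2 ? "groupId=" + p[0] + " artifactId=" + p[1] : "";
        switch (p.length) {
            case 2:
                return base;
            case 3:
                // The third of three tokens is taken as the version.
                return base + " version=" + p[2];
            case 4:
                return base + " packaging=" + p[2] + " version=" + p[3];
            case 5:
                return base + " packaging=" + p[2] + " classifier=" + p[3] + " version=" + p[4];
            default:
                throw new IllegalArgumentException("Unsupported coordinate: " + coordinate);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("my.project:mywar:war"));
        System.out.println(describe("my.project:mywar:war:?"));
    }
}
```

Under this model, the fourth token acts as a version placeholder, which keeps "war" in the packaging position.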

Apache Common IO FileUtils Issue

I am trying to use the FileUtils.writeStringToFile() method of the Apache Commons IO. Every bit of documentation says that I can do this:
FileUtils.writeStringToFile(File, String with data, boolean append);
I want this method, because I want the data to be written to the end of the file each time.
However, in Eclipse, it keeps telling me that this method does not exist. The only two I have are:
FileUtils.writeStringToFile(File, String with data);
FileUtils.writeStringToFile(File, String with data, String encoding);
I corrected my POM file to now have this dependency:
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.4</version>
</dependency>
Can someone please tell me what I am doing wrong?
Version 1.3.2 doesn't have this method; use a newer version of commons-io:
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.4</version>
</dependency>
Check the FileUtils 2.4 javadoc
It turns out I had added the Tomcat library files as well as the JRE library files to my project. When I deleted commons-io from my POM, FileUtils was still available, so presumably an older copy was being picked up from the Tomcat libraries.
I had to remove the Tomcat library files from my build path, and once I put commons-io back in, it worked.
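When the IDE and the documentation disagree about an overload, one way to see what is really on the classpath is to ask for the method by reflection. The following is a minimal sketch; OverloadCheck and hasPublicMethod are hypothetical names, and the demo uses JDK classes so it runs without commons-io. With commons-io present, the call from the question would be hasPublicMethod(FileUtils.class, "writeStringToFile", File.class, String.class, boolean.class).

```java
// Hypothetical helper: checks whether a public method with the given
// parameter types exists on the class that is actually loaded.
class OverloadCheck {

    static boolean hasPublicMethod(Class<?> type, String name, Class<?>... params) {
        try {
            type.getMethod(name, params);
            return true;
        } catch (NoSuchMethodException e) {
            // The loaded version of the class does not have this overload.
            return false;
        }
    }

    public static void main(String[] args) {
        // JDK-only demonstration of an existing and a non-existing overload.
        System.out.println(hasPublicMethod(String.class, "substring", int.class, int.class));
        System.out.println(hasPublicMethod(String.class, "substring", boolean.class));
    }
}
```

A false result for an overload that the declared version should have suggests that a different copy of the library is being loaded first.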

apache mime4j maven dependency for 0.7.2

Trying to use the Apache Mime4J dependency for Version 0.7.2 like this:
<repositories>
<repository>
<id>org.apache.james</id>
<url>http://repo1.maven.org/maven2/</url>
</repository>
</repositories>
<dependency>
<groupId>org.apache.james</groupId>
<artifactId>apache-mime4j</artifactId>
<version>0.7.2</version>
</dependency>
I got an error message that the dependency could not be downloaded.
After checking, I found that
http://uk.maven.org/maven2/org/apache/james/apache-mime4j/0.7.2/apache-mime4j-0.7.2.jar
indeed does not exist, but
http://uk.maven.org/maven2/org/apache/james/apache-mime4j/0.7.2/
has .bin.tar.gz files. I worked around the problem using:
<dependency>
<groupId>org.apache.james</groupId>
<artifactId>apache-mime4j</artifactId>
<version>0.6.1</version>
</dependency>
This therefore does not reference the more current 0.7.2 release.
My questions:
1. Why does the 0.7.2 release not contain a JAR file?
2. How should I reference the dependency to get the latest JAR?
3. Do I need it anyway?
4. What differences are there between the 0.7.2 and the 0.6.1 release?
Question 1: Why an artifact might not exist
According to the changelog, there has been some refactoring to split the functionality into three parts: core, dom and storage.
Question 2: How to get the latest artifact
Modify the dependencies to:
<dependency>
<groupId>org.apache.james</groupId>
<artifactId>apache-mime4j-core</artifactId>
<version>0.7.2</version>
</dependency>
<dependency>
<groupId>org.apache.james</groupId>
<artifactId>apache-mime4j-dom</artifactId>
<version>0.7.2</version>
</dependency>
<dependency>
<groupId>org.apache.james</groupId>
<artifactId>apache-mime4j-storage</artifactId>
<version>0.7.2</version>
</dependency>
Question 3: Do I need it?
If you'd like to use the improved DOM API: yes. You will need to modify your import statements and cannot use new Message() any more. Use
MessageServiceFactory.newInstance().newMessageBuilder().newMessage();
instead. The multipart.getBodyParts() function has also changed and returns an Entity now.
There is no isMimeType() for the Entity. You might want to use getMimeType() instead.
Question 4: What changed between versions?
See the change log between 0.7.2 and 0.6.1.

Why does Maven2 check for updates of stax-ex at every build?

Maven2 checks for updates of stax-ex at every build. It does this only for this single dependency; all other dependencies are checked only once per day.
Maven2 output:
artifact org.jvnet.staxex:stax-ex: checking for updates from java.net
stax-ex (groupid: org.jvnet.staxex, version: 1.2) is included as part of jaxws-rt (groupid: com.sun.xml.ws, version: 2.1.3). We have an artifactory repository as intermediary.
What can I do? (Building offline would be an unpopular workaround.)
I had the same problem, and wanted to get to the bottom of it!
The problem is in the pom.xml file of streambuffer (a dependency of jaxws-rt), which doesn't specify a version for stax-ex. Instead, it uses RELEASE, meaning the latest released version:
<dependency>
<groupId>org.jvnet.staxex</groupId>
<artifactId>stax-ex</artifactId>
<version>RELEASE</version>
</dependency>
This forces Maven to check constantly for the latest release of stax-ex (even if jaxws-rt itself requests version 1.2), by downloading its corresponding maven-metadata.xml.
An easy workaround is to force the version of stax-ex in a dependencyManagement section of your pom.xml:
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.jvnet.staxex</groupId>
<artifactId>stax-ex</artifactId>
<version>1.2</version>
</dependency>
</dependencies>
</dependencyManagement>
And then Maven will stop bothering you about this warning...
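To find out whether other POMs in a dependency tree use floating versions in the same way, a small scan like the following can help. It is a sketch under stated assumptions: FloatingVersionFinder and findFloatingVersions are hypothetical names, the POM text is passed in as a string, and only the RELEASE and LATEST meta-versions inside dependency elements are looked for.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists dependencies whose version is RELEASE or LATEST.
class FloatingVersionFinder {

    static List<String> findFloatingVersions(String pomXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pomXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse POM", e);
        }
        List<String> hits = new ArrayList<>();
        NodeList deps = doc.getElementsByTagName("dependency");
        for (int i = 0; i < deps.getLength(); i++) {
            Element dep = (Element) deps.item(i);
            String version = childText(dep, "version");
            if ("RELEASE".equals(version) || "LATEST".equals(version)) {
                hits.add(childText(dep, "groupId") + ":" + childText(dep, "artifactId") + ":" + version);
            }
        }
        return hits;
    }

    // Text of the first descendant element with the given tag, or "" if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String pom = "<project><dependencies><dependency>"
                + "<groupId>org.jvnet.staxex</groupId><artifactId>stax-ex</artifactId>"
                + "<version>RELEASE</version></dependency></dependencies></project>";
        System.out.println(findFloatingVersions(pom));
    }
}
```

Any dependency this reports is a candidate for pinning in dependencyManagement, as done for stax-ex above.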
It looks like you have remote repository declarations in your POMs that bypass your enterprise repository. If you are using Artifactory, you can either have remote repository references in POMs automatically stripped at the virtual repository level, or configure mirror-any in your settings to force artifact resolution to go strictly through your Artifactory.

Maven - how to add the build dependency jar files?

I ran a simple CXF Maven project (http://cxf.apache.org/docs/using-cxf-with-maven.html) and got the error below:
[INFO] [cxf-codegen:wsdl2java {execution: generate-sources}]
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] org/springframework/core/io/support/ResourcePatternResolver
org.springframework.core.io.support.ResourcePatternResolver
[INFO] ------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: org/springframework/core/io/support/ResourcePatternResolver
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
I don't know how to add the Spring Framework core dependency.
I tried the following, as most answers suggest:
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-core</artifactId>
<version>2.5.6</version>
</dependency>
</dependencies>
but it didn't help. I also don't know why it depends on Spring.
It works if I put the jar file under $M2_HOME/lib, but is that the correct way? Once I solve this, it requires adding more libraries there. Can I put it into pom.xml somewhere?
I tried putting <dependencies/> inside the <build> tag, but it doesn't work.
My Maven is 2.2.1 on Windows.
It works if I put the jar file under $M2_HOME/lib, but is that the correct way?
No, definitely not. To add a dependency, you need to declare it in your pom.xml, something like this:
<project>
...
<dependencies>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-core</artifactId>
<version>???</version>
</dependency>
...
</dependencies>
</project>
But I don't understand why you would have to add this dependency: spring-core is a dependency of CXF, so you should get it transitively. You haven't provided enough context for a more precise answer, though.
You have to define it in the pom.xml
Read the docs at http://maven.apache.org/pom.html#Dependencies
You'll need to add this to your Maven pom.xml file in the <dependencies> section:
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-core</artifactId>
<version>???</version>
</dependency>
The version number will depend on which version of spring you're using. (I'm using 2.0 as the version #, along with spring 2.0.8).
I finally found it myself: the springframework-2.5.5 package in my local repository was broken. The jar file was not correct. I noticed this later in Eclipse.
Pascal's answer is also correct.
springframework-2.5.5 is downloaded automatically by Maven; unfortunately the copy was broken, so Maven still complained about the class. Even when I added springframework-2.5.6, it was downloaded but not used, because Maven still loaded springframework-2.5.5 into its classpath.
If I put the jar into %M2_HOME%/lib, it is certainly on Maven's classpath, but that is the wrong way to do it.
Since I have met this kind of problem before, I now know what it is.
Summary: check your dependency files to see whether the package is correct.
BTW: Thanks to all, especially Pascal.
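One way to check whether a jar in the local repository is intact is to open it and look for the class the build complains about. This is a minimal sketch; JarCheck and containsClass are hypothetical names, and the jar path and class name are supplied by the caller (for the error above, the class would be org.springframework.core.io.support.ResourcePatternResolver).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper: verifies that a jar opens and contains a given class.
class JarCheck {

    /** Returns true if the file is a readable jar that contains the named class. */
    static boolean containsClass(Path jar, String className) {
        String entry = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entry) != null;
        } catch (IOException e) {
            // Missing, truncated, or otherwise corrupt archive.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: JarCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(containsClass(Paths.get(args[0]), args[1]));
    }
}
```

If this prints false for a jar that should contain the class, deleting that artifact from the local repository so that Maven downloads it again is a reasonable next step.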