Pig: STORE with MongoInsertStorage doesn't work - apache-pig

I'm executing this simple code in a Pig script:
REGISTER /home/myuser/mongodb/mongo-2.10.1.jar
REGISTER /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p0.30/lib/mongo-hadoop-cdh4-1.2.0/mongo-hadoop-core_cdh4.3.0-1.2.0.jar
REGISTER /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p0.30/lib/mongo-hadoop-cdh4-1.2.0/mongo-hadoop-pig_cdh4.3.0-1.2.0.jar
set mapred.map.tasks.speculative.execution false;
set mapred.reduce.tasks.speculative.execution false;
col = LOAD 'mongodb://localhost:27017/mydb.mycollection' using com.mongodb.hadoop.pig.MongoLoader ('id:chararray, companyId:chararray, ts:chararray', 'id');
STORE col INTO 'mongodb://localhost:27017/mydb.mycollection2' USING com.mongodb.hadoop.pig.MongoInsertStorage ('', '');
It returns the following error:
Location Config: Configuration: For URI: file:/tmp/temp449583595/tmp-109467318
2014-04-04 14:30:40,913 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
Details at logfile: /home/myuser/pig/pig_1396614639609.log
The end of the logfile pig_1396614639609.log:
... at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.IllegalArgumentException: Invalid URI Format. URIs must begin with a mongodb:// protocol string.
at com.mongodb.hadoop.pig.MongoInsertStorage.setStoreLocation(MongoInsertStorage.java:159)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:576)
... 17 more
I can't see where the error is, since the "mongodb://" protocol string is written correctly.

I have a similar issue when running LOAD and STORE using mongo-hadoop in the same Pig script.
It throws
java.net.UnknownHostException: localhost:27017 is not a valid Inet address
at org.apache.hadoop.net.NetUtils.verifyHostnames(NetUtils.java:587)
at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:734)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3890)
at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
I didn't investigate further; it is either a bug or some locking-related parameter, I don't know which.
If I run the same code but do the loading and storing in different scripts, it runs without a problem.
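A minimal sketch of that two-script workaround, staging through an intermediate HDFS path (the staging path and the file split are illustrative; each script would still need the REGISTER statements and speculative-execution settings from the question):
-- script 1: read from MongoDB and stage to HDFS
col = LOAD 'mongodb://localhost:27017/mydb.mycollection'
    USING com.mongodb.hadoop.pig.MongoLoader('id:chararray, companyId:chararray, ts:chararray', 'id');
STORE col INTO '/tmp/mycollection_staged';
-- script 2: load the staged data and insert into MongoDB
staged = LOAD '/tmp/mycollection_staged'
    AS (id:chararray, companyId:chararray, ts:chararray);
STORE staged INTO 'mongodb://localhost:27017/mydb.mycollection2'
    USING com.mongodb.hadoop.pig.MongoInsertStorage('', '');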

Related

How to handle "Couldn't get any response" in Mule?

I am trying to implement proper error handling in my code. There is a request component that hits a specific API, and I have to handle the scenario where a "server couldn't send a response" exception occurs.
Mule does not identify it as an HTTP:NOT_FOUND error; instead it identifies it as MULE:UNKNOWN.
How should I handle this case? I want Mule to identify it as an HTTP error.
EDIT:
How do I handle the MULE:UNKNOWN error type? I don't want to handle it under ANY, and deployment fails when I reference the type MULE:UNKNOWN:
org.mule.runtime.deployment.model.api.DeploymentException: Failed to deploy artifact []
Caused by: org.mule.runtime.api.exception.MuleRuntimeException: org.mule.runtime.deployment.model.api.DeploymentInitException: MuleRuntimeException: Could not find ErrorType for the given identifier: 'MULE:UNKNOWN'
Caused by: org.mule.runtime.deployment.model.api.DeploymentInitException: MuleRuntimeException: Could not find ErrorType for the given identifier: 'MULE:UNKNOWN'
Caused by: org.mule.runtime.core.api.config.ConfigurationException: Could not find ErrorType for the given identifier: 'MULE:UNKNOWN'
Caused by: org.mule.runtime.api.lifecycle.InitialisationException: Could not find ErrorType for the given identifier: 'MULE:UNKNOWN'
Caused by: org.mule.runtime.api.lifecycle.LifecycleException: Could not find ErrorType for the given identifier: 'MULE:UNKNOWN'
Caused by: org.mule.runtime.api.exception.MuleRuntimeException: Could not find ErrorType for the given identifier: 'MULE:UNKNOWN'
HTTP:NOT_FOUND means that the server returned an HTTP 404 (Not Found) response. If the server aborted the response for any reason, the HTTP Requester is not expected to return NOT_FOUND. That said, the MULE:UNKNOWN error indicates an error the component cannot handle, and it cannot be referenced in an error handler. You could try updating the HTTP Connector to the latest version to see whether it handles that particular situation better; check the release notes to see if a newer one has been released.
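Until then, a sketch of one way to handle it (the flow name, config-ref, and path are placeholders; the assumption, worth verifying against your connector version, is that recent HTTP Connector releases raise aborted responses as HTTP:CONNECTIVITY, leaving ANY to catch whatever still surfaces as MULE:UNKNOWN):
<flow name="callApiFlow">
    <http:request method="GET" config-ref="HTTP_Request_configuration" path="/some-api"/>
    <error-handler>
        <!-- connection-level failures (e.g. a remotely closed response) in newer connector versions -->
        <on-error-continue type="HTTP:CONNECTIVITY">
            <logger level="WARN" message="#['No response from server: ' ++ error.description]"/>
        </on-error-continue>
        <!-- MULE:UNKNOWN cannot be declared as a handler type, so catch the remainder here -->
        <on-error-continue type="ANY">
            <logger level="ERROR" message="#['Unhandled error: ' ++ error.description]"/>
        </on-error-continue>
    </error-handler>
</flow>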

Error when connecting Hive with Kibi?

I am using the kibi-community-demo-full-4.6.4-linux-x64 version.
In the datasource:
"connection_string": "jdbc:hive://localhost:10000/root",
"libpath": "/home/pare/Downloads/jar/",
"drivername": "org.apache.hadoop.hive.jdbc.HiveDriver",
"libs": "hive-jdbc-0.11.0.jar,hive-metastore-0.11.0.jar,libthrift-0.9.1.jar,hive-service-0.13.1.jar,hive-jdbc-1.2.1.2.3.2.0-2950-standalone.jar,hadoop-common-2.7.1.2.3.2.0-2950.jar",
After that, when I write a query in the Queries editor, it shows an error like:
Queries Editor: Error 400 Bad Request: Error running static method java.lang.IllegalArgumentException: Bad URL format at org.apache.hive.jdbc.Utils.parseURL(Utils.java:185) at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:84)
What is the error? Can anyone explain how to solve it?
I was able to connect after changing the jar versions and also changing the driver name to "org.apache.hive.jdbc.HiveDriver".
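Putting that together, a datasource sketch that parses correctly might look like this (the jar names are taken from the question's own list; the key point is that org.apache.hive.jdbc.HiveDriver expects a jdbc:hive2:// URL, which is exactly what the Utils.parseURL call in the trace rejects):
"connection_string": "jdbc:hive2://localhost:10000/root",
"libpath": "/home/pare/Downloads/jar/",
"drivername": "org.apache.hive.jdbc.HiveDriver",
"libs": "hive-jdbc-1.2.1.2.3.2.0-2950-standalone.jar,hadoop-common-2.7.1.2.3.2.0-2950.jar",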

Amazon S3 connector: IllegalArgumentException: Empty key when using getAuthorization

When using the getAuthorization function from the Amazon S3 connector, I get the following trace:
[2016-04-28 11:42:14,155] ERROR - AmazonS3AuthConnector Error occured in connector
java.lang.IllegalArgumentException: Empty key
at javax.crypto.spec.SecretKeySpec.<init>(SecretKeySpec.java:94)
at org.wso2.carbon.connector.amazons3.auth.AmazonS3Authentication.getAuthorizationHeaderValue(AmazonS3Authentication.java:79)
at org.wso2.carbon.connector.amazons3.auth.AmazonS3AuthConnector.connect(AmazonS3AuthConnector.java:102)
at org.wso2.carbon.connector.core.AbstractConnector.mediate(AbstractConnector.java:32)
at org.apache.synapse.mediators.ext.ClassMediator.mediate(ClassMediator.java:78)
at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:81)
at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:48)
at org.apache.synapse.mediators.template.TemplateMediator.mediate(TemplateMediator.java:97)
at org.apache.synapse.mediators.template.InvokeMediator.mediate(InvokeMediator.java:129)
at org.apache.synapse.mediators.template.InvokeMediator.mediate(InvokeMediator.java:78)
at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:81)
at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:48)
at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:149)
at org.apache.synapse.core.axis2.ProxyServiceMessageReceiver.receive(ProxyServiceMessageReceiver.java:175)
at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:180)
at org.apache.axis2.transport.base.AbstractTransportListener.handleIncomingMessage(AbstractTransportListener.java:328)
at org.apache.synapse.transport.vfs.VFSTransportListener.processFile(VFSTransportListener.java:751)
at org.apache.synapse.transport.vfs.VFSTransportListener.scanFileOrDirectory(VFSTransportListener.java:407)
at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:177)
at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:124)
at org.apache.axis2.transport.base.AbstractPollingTransportListener$1$1.run(AbstractPollingTransportListener.java:67)
at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Looking through the code, it seems the key should be there, since the connector's custom InvalidKeyException is not thrown; instead, javax.crypto itself throws the java.lang.IllegalArgumentException.
My mediator config:
<amazons3.getAuthorization>
<accessKeyId>********************</accessKeyId>
<secretAccessKey>****************************************</secretAccessKey>
<methodType>POST</methodType>
<contentType>multipart/form-data</contentType>
<bucketName>*********</bucketName>
<uriRemainder>/</uriRemainder>
<isXAmzDate>true</isXAmzDate>
</amazons3.getAuthorization>
What am I doing wrong? Does anyone have experience with this? Does this function work for others?
This error has been resolved. Apparently setting a property with the key fields fixes the problem.
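A minimal sketch of that workaround, assuming the connector can read the credentials from context properties via {$ctx:...} expressions (the property names here are illustrative and may differ by connector version):
<property name="accessKeyId" value="********************"/>
<property name="secretAccessKey" value="****************************************"/>
<amazons3.getAuthorization>
    <accessKeyId>{$ctx:accessKeyId}</accessKeyId>
    <secretAccessKey>{$ctx:secretAccessKey}</secretAccessKey>
    <methodType>POST</methodType>
    <contentType>multipart/form-data</contentType>
    <bucketName>*********</bucketName>
    <uriRemainder>/</uriRemainder>
    <isXAmzDate>true</isXAmzDate>
</amazons3.getAuthorization>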

Configuration values for hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode in Hive

I am trying to add data to an external table using Apache Hive. I am getting the following error in the Hive logs:
2015-06-15 17:27:44,614 ERROR [LocalJobRunner Map Task Executor #0]: mr.ExecMapper (ExecMapper.java:map(171)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"transactiondate":"05-01-2015 08:26:21","transactiontype":"CASHOUT","transactionid":144590889,"sourcenumber":null,"destnumber":null,"amount":19000,"assumedfield1":880,"customerid":33394093,"transactionstatus":"COMPLETED","assumedfield2":325,"assumedfield3":175870}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 256
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:933)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:709)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 10 more
I googled this error and came across this link, which says that the hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode variables must be set to higher values. What are the optimum configurations for these variables on a single-node Hadoop installation? None of the configuration values I have tried are working for me. Please help.
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=250;
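For context, a fuller session sketch (the table and column names are placeholders; hive.exec.dynamic.partition.mode=nonstrict is assumed to be required when no static partition value is supplied, and on a single node the pernode limit effectively has to cover every partition the statement creates, since that one node writes them all):
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=1000;
SET hive.exec.max.dynamic.partitions.pernode=1000;
-- placeholder tables; each distinct transactiondate value becomes one partition
INSERT OVERWRITE TABLE transactions_ext PARTITION (transactiondate)
SELECT transactiontype, transactionid, amount, customerid, transactiondate
FROM transactions_staging;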
Please do not try to increase the Hive partition limits to very high values; it may cause a NameNode crash. If possible, change the partition column and apply new logic over it.

WSO2 Identity Server: Error while loading identity configurations after editing identity.xml

I'm trying to set up a WSO2 Identity Server. I've downloaded the version 4.0.0 binary. It started correctly and I was able to use it with LDAP.
However, when I insert the correct ServerURL into the identity.xml file, I get an error.
I inserted the following into identity.xml:
<OpenIDServerUrl>https://server.vm.uni-freiburg.de:9443/openidserver</OpenIDServerUrl>
<OpenIDUserPattern>https://server.vm.uni-freiburg.de:9443/openid/</OpenIDUserPattern>
and when starting the WSO2 IS, the following error is thrown:
[2012-11-23 10:34:41,510] ERROR {org.wso2.carbon.identity.core.util.IdentityConfigParser} - Error while loading Identity Configurations
org.apache.axiom.om.OMException: com.ctc.wstx.exc.WstxIOException: Stream Closed
at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:296)
at org.apache.axiom.om.impl.llom.OMElementImpl.getNextOMSibling(OMElementImpl.java:336)
at org.apache.axiom.om.impl.traverse.OMChildElementIterator.next(OMChildElementIterator.java:104)
at org.wso2.carbon.identity.core.util.IdentityConfigParser.readChildElements(IdentityConfigParser.java:154)
at org.wso2.carbon.identity.core.util.IdentityConfigParser.<init>(IdentityConfigParser.java:60)
at org.wso2.carbon.identity.core.util.IdentityConfigParser.getInstance(IdentityConfigParser.java:71)
at org.wso2.carbon.identity.core.util.IdentityUtil.populateProperties(IdentityUtil.java:58)
at org.wso2.carbon.identity.sso.saml.ui.internal.SAMLSSOUIBundleActivator.start(SAMLSSOUIBundleActivator.java:33)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:782)
at java.security.AccessController.doPrivileged(Native Method)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:773)
at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:754)
at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:352)
at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:370)
at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1068)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:557)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:464)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:248)
at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:445)
at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:220)
at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:330)
Caused by: com.ctc.wstx.exc.WstxIOException: Stream Closed
at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
at org.apache.axiom.util.stax.wrapper.XMLStreamReaderWrapper.next(XMLStreamReaderWrapper.java:225)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.parserNext(StAXOMBuilder.java:681)
at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:214)
... 20 more
Caused by: java.io.IOException: Stream Closed
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:214)
at com.ctc.wstx.io.ISOLatinReader.read(ISOLatinReader.java:79)
at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1053)
at com.ctc.wstx.sr.StreamScanner.getNextCharFromCurrent(StreamScanner.java:811)
at com.ctc.wstx.sr.BasicStreamReader.readEndElem(BasicStreamReader.java:3206)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2832)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
... 23 more
However, if I leave the default values:
<OpenIDServerUrl>https://localhost:9443/openidserver</OpenIDServerUrl>
<OpenIDUserPattern>https://localhost:9443/openid/</OpenIDUserPattern>
the server starts normally.
I've found a bug report at https://wso2.org/jira/browse/IDENTITY-407 which may be related to this problem, but it is not the same issue.
What am I doing wrong? Thanks in advance!