Pig and Jython - Can't Register UDF - apache-pig

I am trying to write a Python UDF for Pig; I am using the DataStax package for that. When I try to write a simple UDF such as:
@outputSchema("word:chararray")
def helloworld():
    return 'Hello, World'
And then register it in the grunt shell:
REGISTER 'pig.py' USING org.apache.pig.scripting.jython.JythonScriptEngine as myfuncs;
I get the following error:
ERROR 2998: Unhandled internal error. org/python/core/PyObject
java.lang.NoClassDefFoundError: org/python/core/PyObject
at org.apache.pig.scripting.jython.JythonScriptEngine.registerFunctions(JythonScriptEngine.java:304)
at org.apache.pig.PigServer.registerCode(PigServer.java:534)
at org.apache.pig.tools.grunt.GruntParser.processRegister(GruntParser.java:423)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:419)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:190)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:490)
at org.apache.pig.Main.main(Main.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: Class org.python.core.PyObject not found in modules [ModuleClassLoader:Ana$
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:120)
at com.datastax.bdp.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:38)
at com.datastax.bdp.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:32)
... 14 more
Does anyone know what could be causing this error?

Add $PIG_HOME/lib/jython.jar to your PIG_CLASSPATH environment variable.
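
Before changing PIG_CLASSPATH, it can help to confirm that the jar you are about to add really contains the class named in the stack trace. The following is a minimal sketch, not part of Pig itself: the helper name jar_contains_class is made up for illustration, and the jar location is taken from the answer above (some distributions ship the jar under a different file name, so adjust the path as needed).

```python
import os
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the fully qualified class_name."""
    # A jar is a zip archive; org.python.core.PyObject is stored as
    # org/python/core/PyObject.class inside it.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    # Assumption: PIG_HOME is set and the jar is named jython.jar, as in the answer.
    jar = os.path.join(os.environ.get("PIG_HOME", "."), "lib", "jython.jar")
    if os.path.exists(jar):
        print(jar, jar_contains_class(jar, "org.python.core.PyObject"))
    else:
        print("jar not found:", jar)
```

If the check prints False, or the jar is not found, adding that path to PIG_CLASSPATH is unlikely to resolve the NoClassDefFoundError.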

Related

impala catalogd cannot connect to thrift

I am trying to install Impala from source. When I try to run catalogd, this error shows up:
E0629 10:33:01.143334 4439 CatalogServiceCatalog.java:416] Unable to fetch the current
notification event id from metastore.Metastore event processing will be disabled.
Java exception follows:
org.apache.thrift.TApplicationException: Internal error processing get_current_notificationEventId
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_current_notificationEventId(ThriftHiveMetastore.java:6512)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_current_notificationEventId(ThriftHiveMetastore.java:6500)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getCurrentNotificationEventId(HiveMetaStoreClient.java:3532)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
at com.sun.proxy.$Proxy4.getCurrentNotificationEventId(Unknown Source)
at org.apache.impala.catalog.CatalogServiceCatalog.getEventsProcessor(CatalogServiceCatalog.java:412)
at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:348)
at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:362)
at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:133)
E0629 10:33:01.143791 4439 catalog.cc:87] CatalogException: Fatal error while initializing metastore event processor
CAUSED BY: TApplicationException: Internal error processing get_current_notificationEventId
. Impalad exiting.
The Hive metastore is also running.

Weblogic Domain creation error through script in putty

I am trying to create a WebLogic domain in silent mode through PuTTY. I used the command below:
./config.sh -mode=silent -silent_xml=/home/ec2-user/createdomain.xml
I get the following error message when executing it:
Exception in thread "Thread-1" java.lang.IllegalStateException: No able to create the instance of the template catalog class com.oracle.cie.domain.template.catalog.impl.GlobalTemplateCat
at com.oracle.cie.domain.template.catalog.TemplateCatalogFactory.createGlobalTemplateCatalog(TemplateCatalogFactory.java:138)
at com.oracle.cie.domain.template.catalog.TemplateCatalogFactory.getGlobalCatalog(TemplateCatalogFactory.java:78)
at com.oracle.cie.domain.template.catalog.TemplateCatalogFactory.getGlobalCatalog(TemplateCatalogFactory.java:33)
at com.oracle.cie.wizard.domain.silent.tasks.LoadTemplateCatalogTask$1.run(LoadTemplateCatalogTask.java:23)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.oracle.cie.domain.template.catalog.TemplateCatalogFactory.createGlobalTemplateCatalog(TemplateCatalogFactory.java:133)
... 4 more
Caused by: com.oracle.cie.domain.env.EnvironmentServiceException: Failed to get inventory for /home/ec2-user/oracle/middleware/oracle_common/common/bin
at com.oracle.cie.domain.env.EnvironmentServiceImpl.init(EnvironmentServiceImpl.java:425)
at com.oracle.cie.domain.env.EnvironmentServiceImpl.<init>(EnvironmentServiceImpl.java:89)
at com.oracle.cie.domain.env.EnvironmentServiceImpl.getInstance(EnvironmentServiceImpl.java:364)
at com.oracle.cie.domain.env.EnvironmentServiceFactory.getEnvironmentService(EnvironmentServiceFactory.java:35)
at com.oracle.cie.domain.template.catalog.impl.OracleHomeLocator.getProductInstalDirs(OracleHomeLocator.java:31)
at com.oracle.cie.domain.template.catalog.impl.GlobalTemplateCat.populateProductCatalogs(GlobalTemplateCat.java:446)
at com.oracle.cie.domain.template.catalog.impl.GlobalTemplateCat.<init>(GlobalTemplateCat.java:90)
at com.oracle.cie.domain.template.catalog.impl.GlobalTemplateCat.<init>(GlobalTemplateCat.java:83)
... 9 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.oracle.cie.common.ReflectionHelper.process(ReflectionHelper.java:48)
at com.oracle.cie.domain.env.EnvironmentServiceImpl.init(EnvironmentServiceImpl.java:384)
... 16 more
Caused by: com.oracle.cie.gdr.external.InventoryException: com.oracle.cie.gdr.utils.GdrException: The gdr meta-data directory /home/ec2-user/oracle/middleware/oracle_common/common/bin/inventory is invalid or does not exist.
at com.oracle.cie.gdr.external.impl.OracleHomeInventoryImpl.<init>(OracleHomeInventoryImpl.java:55)
at com.oracle.cie.gdr.external.impl.OracleHomeInventoryFactory.createInventory(OracleHomeInventoryFactory.java:60)
at com.oracle.cie.gdr.external.InventoryFactory.getOracleHomeInventory(InventoryFactory.java:99)
... 22 more
Caused by: com.oracle.cie.gdr.utils.GdrException: The gdr meta-data directory /home/ec2-user/oracle/middleware/oracle_common/common/bin/inventory is invalid or does not exist.
at com.oracle.cie.gdr.MetaDataHome.init(MetaDataHome.java:206)
at com.oracle.cie.gdr.MetaDataHome.<init>(MetaDataHome.java:188)
at com.oracle.cie.gdr.MetaDataHome.<init>(MetaDataHome.java:172)
at com.oracle.cie.gdr.MetaDataHome.<init>(MetaDataHome.java:157)
at com.oracle.cie.gdr.MetaDataHome.<init>(MetaDataHome.java:144)
at com.oracle.cie.gdr.MetaDataHome.<init>(MetaDataHome.java:86)
at com.oracle.cie.gdr.Home.getMetaDataHome(Home.java:619)
Which WebLogic version are you using? I have not seen a silent script used to create domains for a while. If you are trying to do this on WebLogic 12c, it won't work: as far as I remember, this kind of script was only available in older versions such as 8 and 9.
To automate domain provisioning for versions such as 12c, you should use a newer approach. Here are some options:
You can use Ansible, WLST, and Python to create the domain. There is an example at https://github.com/textanalyticsman/ansible-soa
You can use WebLogic Deploy Tooling, an open-source tool provided by Oracle, available at https://github.com/oracle/weblogic-deploy-tooling
Combining WebLogic Deploy Tooling with Ansible is also a good option, as shown in https://github.com/textanalyticsman/ansible-soa-wldt
You can also try the WebLogic Kubernetes Operator: https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/domain-resource/

WildFly + IntelliJ: Error running admin process

I'm using wildfly-8.2.0.Final. It works fine on my work PC, but when I try to deploy on my home PC I get this error:
Error running admin process:
Message: java.lang.NoClassDefFoundError: org/wildfly/security/password/Password
Stack trace:
com.intellij.javaee.process.common.JavaeeProcessUtilException: java.lang.NoClassDefFoundError: org/wildfly/security/password/Password
at com.intellij.javaee.process.common.MethodInvocator.invoke(MethodInvocator.java:47)
at com.intellij.javaee.oss.process.JavaeeProcess.processRequest(JavaeeProcess.java:112)
at com.intellij.javaee.oss.process.JavaeeProcess.run(JavaeeProcess.java:52)
at com.intellij.javaee.oss.process.JavaeeProcess.main(JavaeeProcess.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.CommandLineWrapper.main(CommandLineWrapper.java:66)
Caused by: java.lang.NoClassDefFoundError: org/wildfly/security/password/Password
at com.intellij.javaee.oss.jboss.agent.WildFly11Agent.createAuthHandler(WildFly11Agent.java:15)
at com.intellij.javaee.oss.jboss.agent.JBoss7Agent.doConnect(JBoss7Agent.java:49)
at com.intellij.javaee.oss.agent.SimpleAgentBase$1.doJob(SimpleAgentBase.java:24)
at com.intellij.javaee.oss.agent.SimpleAgentBase$1.doJob(SimpleAgentBase.java:20)
at com.intellij.javaee.oss.agent.SimpleAgentJob.perform(SimpleAgentJob.java:12)
at com.intellij.javaee.oss.agent.SimpleAgentBase.connect(SimpleAgentBase.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.javaee.process.common.MethodInvocator.invoke(MethodInvocator.java:41)
... 8 more
Caused by: java.lang.ClassNotFoundException: org.wildfly.security.password.Password
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 19 more
19:06:24,746 INFO [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-4) JBAS010400: Bound data source [java:jboss/datasources/ehospital]
I am using:
IntelliJ IDEA,
wildfly-8.2.0.Final,
jdk1.8.0_231
Has anyone faced this error?
The problem was with the PostgreSQL version and its driver. I downloaded the driver version for my PostgreSQL installation and
replaced the driver in JBOSS_HOME\modules\org\postgres\main. That fixed it for me.

java.lang.NoClassDefFoundError spark-submit in yarn cluster mode, cluster being setup using Ambari

I'm using the spark-submit command as below:
spark-submit --class com.example.hdfs.spark.RawDataAdapter --master yarn --deploy-mode cluster --jars /home/hadoop/emr/deployment/server/emr-core-1.0-SNAPSHOT.jar home/hadoop/emr-spark-1.0-SNAPSHOT.jar hdfs://111.11.11.111:8020/user/hdfsinputfile.zip 8000
However, it gives me the error java.lang.NoClassDefFoundError: com/example/emr/parser/IParser3, even though IParser3.class is present in emr-core-1.0-SNAPSHOT.jar. I don't understand why it throws that error. I have tried several approaches without success. How can I resolve this?
I am able to run the same command in client mode and also as a standalone Spark application. I get this error only in YARN cluster mode.
Exception from container-launch.
Container id: container_e37_1526066605784_0014_02_000001
Exit code: 15
Container exited with a non-zero exit code 15. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
g.ClassLoader.defineClass(ClassLoader.java:763)
at java.lang.ClassLoader.defineClass(ClassLoader.java:642)
at com.example.hdfs.spark.utils.SimpleClassLoader.loadJarFile(SimpleClassLoader.java:126)
at com.example.hdfs.spark.utils.SimpleClassLoader.<init>(SimpleClassLoader.java:38)
at com.example.hdfs.spark.input.RawInputFormat.loadPlugins(RawInputFormat.java:71)
at com.example.hdfs.spark.RawDataAdapter.run(RawDataAdapter.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at com.example.hdfs.spark.RawDataAdapter.main(RawDataAdapter.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$anon$3.run(ApplicationMaster.scala:646)
18/05/14 14:00:13 ERROR ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
at org.apache.spark.deploy.yarn.ApplicationMaster$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
at org.apache.spark.deploy.SparkHadoopUtil$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.util.concurrent.ExecutionException: Boxed Error
at scala.concurrent.impl.Promise$.resolver(Promise.scala:55)
at scala.concurrent.impl.Promise$.scala$concurrent$impl$Promise$resolveTry(Promise.scala:47)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:244)
at scala.concurrent.Promise$class.tryFailure(Promise.scala:112)
at scala.concurrent.impl.Promise$DefaultPromise.tryFailure(Promise.scala:153)
at org.apache.spark.deploy.yarn.ApplicationMaster$anon$3.run(ApplicationMaster.scala:664)
Caused by: java.lang.NoClassDefFoundError: com/example/emr/parser/IParser3
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.lang.ClassLoader.defineClass(ClassLoader.java:642)
at com.example.hdfs.spark.utils.SimpleClassLoader.findClass(SimpleClassLoader.java:152)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.lang.ClassLoader.defineClass(ClassLoader.java:642)
at com.example.hdfs.spark.utils.SimpleClassLoader.loadJarFile(SimpleClassLoader.java:126)
at com.example.hdfs.spark.utils.SimpleClassLoader.<init>(SimpleClassLoader.java:38)
at com.example.hdfs.spark.input.RawInputFormat.loadPlugins(RawInputFormat.java:71)
at com.example.hdfs.spark.RawDataAdapter.run(RawDataAdapter.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at com.example.hdfs.spark.RawDataAdapter.main(RawDataAdapter.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$anon$3.run(ApplicationMaster.scala:646)
Failing this attempt. Failing the application.
Quoting from the Spark documentation:
http://spark.apache.org/docs/latest/running-on-yarn.html
In client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
In cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application
So in cluster mode, the driver can run on any available node, and the dependency jar must be reachable from that node. You can try these two approaches:
1) Copy the dependency jar to each node.
2) Copy the jar to the distributed file system (HDFS) and reference it from there.
For more details, have a look at:
https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
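
As a rough sketch of the second option, the snippet below assembles a cluster-mode spark-submit command whose --jars value points at a copy of the dependency jar in HDFS instead of a local path. The helper name build_spark_submit and the HDFS location used for the jar are assumptions made for illustration; the class name, application jar path, and arguments are copied from the question. It only builds the argument list and does not upload anything or launch a job.

```python
def build_spark_submit(main_class, app_jar, dependency_jars, app_args):
    """Return the argument list for a YARN cluster-mode spark-submit call."""
    cmd = [
        "spark-submit",
        "--class", main_class,
        "--master", "yarn",
        "--deploy-mode", "cluster",
    ]
    if dependency_jars:
        # --jars expects one comma-separated value, not repeated flags.
        cmd += ["--jars", ",".join(dependency_jars)]
    cmd.append(app_jar)
    cmd.extend(app_args)
    return cmd


if __name__ == "__main__":
    cmd = build_spark_submit(
        "com.example.hdfs.spark.RawDataAdapter",
        "home/hadoop/emr-spark-1.0-SNAPSHOT.jar",
        # Hypothetical HDFS location; upload the jar there first.
        ["hdfs://111.11.11.111:8020/user/libs/emr-core-1.0-SNAPSHOT.jar"],
        ["hdfs://111.11.11.111:8020/user/hdfsinputfile.zip", "8000"],
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run, or printed and pasted into a shell. Note that the stack trace goes through a custom SimpleClassLoader, so a jar that is visible to Spark may still need to be visible to that class loader as well.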

PredictionIO - Error when training KMeans clustering

I followed the guidance below to train and deploy KMeans clustering.
But I got an error from pio train:
[WARN] [Template$] template.json does not exist. Template metadata will not be available. (This is safe to ignore if you are not working on a template.)
[INFO] [Runner$] Submission command: /home/lavalamp/PredictionIO/vendors/spark-1.4.1/bin/spark-submit --class io.prediction.workflow.CreateWorkflow --jars file:/home/lavalamp/PredictionIO/MyKmeans/target/scala-2.10/template-scala-parallel-vanilla_2.10-0.1-SNAPSHOT.jar,file:/home/lavalamp/PredictionIO/MyKmeans/target/scala-2.10/template-scala-parallel-vanilla-assembly-0.1-SNAPSHOT-deps.jar --files file:/home/lavalamp/PredictionIO/conf/log4j.properties --driver-class-path /home/lavalamp/PredictionIO/conf file:/home/lavalamp/PredictionIO/lib/pio-assembly-0.9.4.jar --engine-id gYCE4NX4ODPQkryp9Jq9by3OEXxa4fxQ --engine-version b972fa8f340c142fb6dffbebc6d276b3bb32eeda --engine-variant file:/home/lavalamp/PredictionIO/MyKmeans/engine.json --verbosity 0 --json-extractor Both
--env PIO_ENV_LOADED=1,PIO_STORAGE_SOURCES_MYSQL_PASSWORD=123456,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/lavalamp/.pio_store,PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://192.168.1.73/pio,PIO_HOME=/home/lavalamp/PredictionIO,
PIO_FS_ENGINESDIR=/home/lavalamp/.pio_store/engines,PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=MYSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=MYSQL,
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_MYSQL_USERNAME=root,PIO_FS_TMPDIR=/home/lavalamp/.pio_store/tmp,
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=MYSQL,
PIO_CONF_DIR=/home/lavalamp/PredictionIO/conf
Exception in thread "main" java.lang.ClassCastException: com.biglabs.VanillaEngine$ cannot be cast to io.prediction.controller.EngineFactory
at io.prediction.workflow.WorkflowUtils$.getEngine(WorkflowUtils.scala:69)
at io.prediction.workflow.CreateWorkflow$.liftedTree1$1(CreateWorkflow.scala:193)
at io.prediction.workflow.CreateWorkflow$.main(CreateWorkflow.scala:192)
at io.prediction.workflow.CreateWorkflow.main(CreateWorkflow.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Can anyone help me with this issue?
Try this template: https://github.com/singsanj/KMeans-parallel-template
I hope it solves your issue.
Just don't forget to update scripts/loadData.py with your newly created app access key, and engine.json with your appId.
If you still have issues, I'm happy to help.