I have a DataWeave message transformer, let's say:
%dw 2.0
import fail from dw::Runtime
output application/java
fun isValuePresent(value, message: String) = if ( value == null or isEmpty(value) ) fail(message) else value
---
{
brand: isValuePresent(payload.document[0].brand, p('import.error.missing.brand')),
...
I also have an error handler for this kind of error.
Errors in Mule have properties such as description and detailedDescription.
Normally, when I catch other errors (like those from an Is True validation component), everything is fine: error.description holds my error message.
But when an error is produced by fail(), I get a very long error description:
"my error message here
Trace:
at fail (Unknown)
at isValuePresent (line: 13, column: 85)
at main (line: 23, column: 7)" evaluating expression: "%dw 2.0
import fail from dw::Runtime
output application/java
fun isValuePresent(value, message: String) = if ( value == null or isEmpty(value) ) fail(message) else value
---
{
brand: isValuePresent(payload.document[0].brand, p('import.error.missing.brand')),
...
...
etc, etc
It looks like the whole content of my DataWeave script is appended to the trace, and I just want to have:
my error message here
Trace:
at fail (Unknown)
at isValuePresent (line: 13, column: 85)
at main (line: 23, column: 7)
Is it possible to achieve this? Or have I made some mistake in designing this behaviour? Is there a way to fix it?
I don't think you can do anything about that. It is the way errors raised with fail() are reported, and it is not customizable. It probably depends on how DataWeave itself reports errors rather than on fail() specifically.
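If all you need in the error handler is the short message, one possible workaround (a hedged sketch, not a documented feature; it assumes the full text, message plus trace plus script, ends up in error.description) is to trim the description yourself, for example in a Transform Message inside the on-error scope:
%dw 2.0
import splitBy from dw::core::Strings
output application/java
---
// Keep only the text before the "Trace:" marker; adjust the marker if the
// description format differs in your Mule runtime version.
trim((error.description splitBy "Trace:")[0])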
I'm new to MuleSoft and I'm following the Quickstart guide. In Step 2 (https://developer.mulesoft.com/guides/quick-start/developing-your-first-mule-application), I need to receive variables from the URI in this way:
[{'id' : attributes.uriParams.productId}]
But when I try my GET, I get the following error in the console:
Message : "Cannot coerce Array ([{id: "2" as String {class: "java.lang.String"}}]) to Object
1| [{'id' : attributes.uriParams.productId}]
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Trace:
at main (line: 1, column: 1)" evaluating expression: "[{'id' : attributes.uriParams.productId}]".
Error type : MULE:EXPRESSION
Element : get:\products(productId):test_daniel-config/processors/1 # test6_db_connection:test_daniel.xml:133 (Select)
Element XML : SELECT product.*, CONCAT('["', (GROUP_CONCAT(variant.picture SEPARATOR '","')), '"]') AS pictures, CONCAT('[', GROUP_CONCAT('{"', variant.identifierType, '":"', variant.identifier, '"}'), ']') AS identifiers FROM product INNER JOIN variant ON product.uuid = variant.productUUID WHERE product.uuid = :id; #[[{'id' : attributes.uriParams.productId}]]
Any ideas? Thanks!
The "Cannot coerce Array to Object" error pops up when you are using an array where you were supposed to use an object.
In the exception above, the URI parameter should be passed as an object, i.e. enclosed in {}, but it is being passed as an array of objects [{}].
This is what causes the error.
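A minimal sketch of the corrected expression (assuming the Select operation's input parameters expect a single object rather than an array):
{'id' : attributes.uriParams.productId}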
We have successfully executed the DatabaseTablesPreparer and initialized the tables in the DB, but when we try to initialize the indexes on the table with SQLScriptPreparer, we get the following exception:
ES1 dbinit [] [] com.intershop.platform.cartridge.internal.CartridgeImpl [] [] [] [] "main" Neither Ivy descriptor nor cartridge properties found for cartridge 'app_core_a1'!
ES1 dbinit [] [app_core_a1:Class1 DatabaseIndexesPreparer [hr/a1/core/dbinit/scripts/dbindex.ddl] Version:null] com.intershop.beehive.core.dbinit.preparer.database.DatabaseIndexesPreparer [] [] [] [] "main" [core] Exception java.lang.NullPointerException: null
at com.intershop.beehive.core.dbinit.preparer.database.SQLScriptPreparer.getCommand(SQLScriptPreparer.java:158)
at com.intershop.beehive.core.dbinit.preparer.database.SQLScriptPreparer.process(SQLScriptPreparer.java:353)
We had a similar problem with DatabaseTablesPreparer (the cartridge was null), and we solved it by adding a cartridge.properties file, but now we are getting the same error ("Neither Ivy descriptor nor cartridge properties found for cartridge 'app_core_a1'") even though the cartridge.properties file is defined.
These are the lines in the decompiled preparer code where the NullPointerException occurs:
getCartridge().getVersion() + (getCartridge().getBuild().isEmpty() ? "" : new StringBuilder().append(".").append(getCartridge().getBuild()).toString()) };
This is the preparer from dbinit.properties:
Class1 = com.intershop.beehive.core.dbinit.preparer.database.DatabaseIndexesPreparer \
hr/a1/core/dbinit/scripts/dbindex.ddl
And this is the dbinit command we are executing:
dbinit.bat --exec-id=app_core_a1:Class1
The DatabaseTablesPreparer from the same cartridge, defined in the same dbinit.properties, executes successfully.
The problem was fixed by publishing the cartridge. It seems that the Ivy descriptor had been deleted and the cartridge had to be republished.
UPDATE: revised problem statement below
We are using Spark 1.2.0 (Hadoop 2.4). We have defined SchemaRDDs from data files in HDFS and would like to make them queryable as tables via HiveServer2. We are encountering runtime exceptions when calling saveAsTable and would like guidance on how to proceed.
Source code:
package foo.bar
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import org.apache.spark._
import org.apache.spark.sql.hive._
object HiveDemo {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Demo")
    val sc = new SparkContext(conf)
    // sc is an existing SparkContext.
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    // Create an RDD
    val zipRDD = sc.textFile("/model-inputs/all_zip_state.csv")
    // The schema is encoded in a string
    val schemaString = "ODSMEMBERID,ZIPCODE,STATE,TEST_SUPPLIERID,ratio_death_readm_low,ratio_death_readm_high,regions"
    // Generate the schema based on the string of schema
    val schema =
      StructType(
        schemaString.split(",").map(fieldName => StructField(fieldName, StringType, true)))
    // Convert records of the RDD (zip) to Rows.
    val rowRDD = zipRDD.map(_.split(",")).map(p => Row(p(0), p(1), p(2), p(3), p(4), p(5), ""))
    // Apply the schema to the RDD.
    val zipSchemaRDD = hiveContext.applySchema(rowRDD, schema)
    // HiveContext's save as Table
    zipSchemaRDD.saveAsTable("allzipstable")
  }
}
spark-submit Command:
./bin/spark-submit --class foo.bar.HiveDemo --master yarn-cluster --jars /usr/lib/hive/lib/hive-metastore.jar,/usr/lib/spark-1.2.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/usr/lib/spark-1.2.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/usr/lib/spark-1.2.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar --num-executors 3 --driver-memory 4g --executor-memory 2g --executor-cores 1 lib/datapipe_2.10-1.0.jar 10
Exception at runtime on Node:
15/01/29 22:35:50 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Unresolved plan found, tree:
'CreateTableAsSelect None, allzipstable, false, None
LogicalRDD [ODSMEMBERID#0,ZIPCODE#1,STATE#2,TEST_SUPPLIERID#3,ratio_death_readm_low#4,ratio_death_readm_high#5,regions#6], MappedRDD[3] at map at HiveDemo.scala:30
)
Exception in thread "Driver" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan found, tree:
'CreateTableAsSelect None, allzipstable, false, None
LogicalRDD [ODSMEMBERID#0,ZIPCODE#1,STATE#2,TEST_SUPPLIERID#3,ratio_death_readm_low#4,ratio_death_readm_high#5,regions#6], MappedRDD[3] at map at HiveDemo.scala:30
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:78)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:78)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:76)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412)
at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412)
at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413)
at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
at org.apache.spark.sql.SchemaRDDLike$class.saveAsTable(SchemaRDDLike.scala:126)
at org.apache.spark.sql.SchemaRDD.saveAsTable(SchemaRDD.scala:108)
at com.healthagen.datapipe.ahm.HiveDemo$.main(HiveDemo.scala:36)
at com.healthagen.datapipe.ahm.HiveDemo.main(HiveDemo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)
15/01/29 22:35:50 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
Another attempt:
package foo.bar
import org.apache.spark.{ SparkConf, SparkContext }
import org.apache.spark.sql._
case class AllZips(
  ODSMEMBERID: String,
  ZIPCODE: String,
  STATE: String,
  TEST_SUPPLIERID: String,
  ratio_death_readm_low: String,
  ratio_death_readm_high: String,
  regions: String)

object HiveDemo {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("HiveDemo")
    val sc = new SparkContext(conf)
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import hiveContext._
    val allZips = sc.textFile("/model-inputs/all_zip_state.csv").map(_.split(",")).map(p => AllZips(p(0), p(1), p(2), p(3), p(4), p(5), ""))
    val allZipsSchemaRDD = createSchemaRDD(allZips)
    allZipsSchemaRDD.saveAsTable("allzipstable")
  }
}
Exception on node:
15/01/30 00:28:19 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Unresolved plan found, tree:
'CreateTableAsSelect None, allzipstable, false, None
LogicalRDD [ODSMEMBERID#0,ZIPCODE#1,STATE#2,TEST_SUPPLIERID#3,ratio_death_readm_low#4,ratio_death_readm_high#5,regions#6], MapPartitionsRDD[4] at mapPartitions at ExistingRDD.scala:36
)
Exception in thread "Driver" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan found, tree:
'CreateTableAsSelect None, allzipstable, false, None
LogicalRDD [ODSMEMBERID#0,ZIPCODE#1,STATE#2,TEST_SUPPLIERID#3,ratio_death_readm_low#4,ratio_death_readm_high#5,regions#6], MapPartitionsRDD[4] at mapPartitions at ExistingRDD.scala:36
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:78)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:78)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:76)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)
at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412)
at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412)
at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413)
at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
at org.apache.spark.sql.SchemaRDDLike$class.saveAsTable(SchemaRDDLike.scala:126)
at org.apache.spark.sql.SchemaRDD.saveAsTable(SchemaRDD.scala:108)
at com.healthagen.datapipe.ahm.HiveDemo$.main(HiveDemo.scala:22)
at com.healthagen.datapipe.ahm.HiveDemo.main(HiveDemo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)
15/01/30 00:28:19 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
You need to use a HiveContext.
Here are the Java/Scala docs:
* Note that this currently only works with SchemaRDDs that are created from a HiveContext as
* there is no notion of a persisted catalog in a standard SQL context.
@Experimental
def saveAsTable(tableName: String): Unit =
  sqlContext.executePlan(CreateTableAsSelect(None, tableName, logicalPlan, false)).toRdd
So in your code, build the SQL context as a HiveContext; note that it wraps the existing SparkContext, not the SparkConf:
val sqlc = new HiveContext(sc)
FYI, here is more info about registering tables (in SQLContext); note that tables registered this way are transient:
/**
 * Temporary tables exist only
 * during the lifetime of this instance of SQLContext.
 *
 * @group userf
 */
def registerRDDAsTable(rdd: SchemaRDD, tableName: String): Unit = {
  catalog.registerTable(Seq(tableName), rdd.queryExecution.logical)
}
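For example, a minimal sketch (the identifier names are illustrative, taken from your second attempt) of registering the SchemaRDD as a transient table and querying it:
// The temporary table only exists for the lifetime of this HiveContext instance.
allZipsSchemaRDD.registerTempTable("allzips_tmp")
// Query it back through the same context.
hiveContext.sql("SELECT ZIPCODE, STATE FROM allzips_tmp").collect().foreach(println)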
UPDATE: Your new stack trace includes the following phrase:
Unresolved plan found, tree:
That typically means you have a column that does not match the underlying table. I will look further to see if I am able to isolate it, but in the meantime you might also investigate it from that perspective.
The createSchemaRDD code snippet from above works fine on Spark 1.2.1.
There was a CTAS defect in 1.2.0.
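If upgrading is not an option, one possible workaround on 1.2.0 (an untested sketch; it assumes a Hive metastore is configured for the HiveContext) is to bypass saveAsTable and issue the CTAS through HiveQL directly, using a temporary table registered as in the snippet above:
hiveContext.sql("CREATE TABLE allzipstable AS SELECT * FROM allzips_tmp")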
I have recently started working with JSON files and processing data using Pig scripts. I am using Pig version 0.9.3. I have come across PiggyBank, which I thought would be useful to load and process JSON files in Pig scripts.
I have built piggybank.jar through Ant.
Later, I compiled the Java file and updated piggybank.jar, and was trying to run the given example JSON file.
I have written a simple Pig script and the corresponding JSON as follows.
REGISTER piggybank.jar;
a = LOAD 'file3.json' using org.apache.pig.piggybank.storage.JsonLoader() AS (json:map[]);
b = foreach a GENERATE flatten(json#'menu') AS menu;
c = foreach b generate flatten(menu#'popup') as popup;
d = foreach c generate flatten(popup#'menuitem') as menu;
e = foreach d generate flatten(menu#'value') as val;
DUMP e;
file3.json
{ "menu" : {
"id" : "file",
"value" : "File",
"popup": {
"menuitem" : [
{"value" : "New", "onclick": "CreateNewDoc()"},
{"value" : "Open", "onclick": "OpenDoc()"},
{"value" : "Close", "onclick": "CloseDoc()"}
]
}
}}
I get the following exception during runtime:
org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error while reading input - Could not json-decode string: { "menu" : {
at org.apache.pig.piggybank.storage.JsonLoader.parseStringToTuple(JsonLoader.java:127)
Pig log file:
Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias e
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias e
at org.apache.pig.PigServer.openIterator(PigServer.java:901)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:655)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:561)
at org.apache.pig.Main.main(Main.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
at org.apache.pig.PigServer.openIterator(PigServer.java:893)
... 12 more
================================================================================
Please correct me if I am wrong. Thanks
You can handle nested JSON loading with Twitter's Elephant Bird: https://github.com/kevinweil/elephant-bird
a = LOAD 'file3.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');
This will parse the JSON into a map (http://pig.apache.org/docs/r0.11.1/basic.html#map-schema); JSON arrays get parsed into a DataBag of maps.
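For example, a sketch (untested; the jar name is an assumption, and Elephant Bird also needs its core and hadoop-compat dependency jars registered) that loads the nested structure above and pulls out the menuitem values:
REGISTER elephant-bird-pig.jar;
a = LOAD 'file3.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (json:map[]);
b = FOREACH a GENERATE json#'menu' AS menu;
c = FOREACH b GENERATE menu#'popup' AS popup;
-- the JSON array becomes a DataBag of maps, so FLATTEN yields one map per menu item
d = FOREACH c GENERATE FLATTEN(popup#'menuitem') AS menuitem;
e = FOREACH d GENERATE menuitem#'value' AS val;
DUMP e;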