I currently have a transformation that includes a UserDefinedJavaClass step. It runs as expected in Spoon, but when I try to run this transformation as part of a web application using the Kettle jars, I get the following error:
2017/01/20 10:04:56 - Load questionnaires.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Error initializing UserDefinedJavaClass:
2017/01/20 10:04:56 - Load questionnaires.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : org.pentaho.di.core.exception.KettleException:
2017/01/20 10:04:56 - Load questionnaires.0 - null
2017/01/20 10:04:56 - Load questionnaires.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Error initializing step [Load questionnaires]
2017/01/20 10:04:57 - evalue-di-risk - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Step [Load questionnaires.0] failed to initialize!
Because this works in Spoon, I can only assume I am missing a library. I am already including kettle-core, kettle-engine (Kettle version 6.1.0.1-196) and janino, which I would have thought was all I needed to get this running.
EDIT: I have taken every jar from the Spoon lib folder and dumped it into my webapp, and this didn't work either.
EDIT AGAIN: It turns out the problem is the combination of a UserDefinedJavaClass step and a Table Input step containing variables that should be replaced during the transformation; together they do not work and cause the above error.
You cannot do that; you have to use the PDI SDK API and compile and run the transformation from your application.
http://wiki.pentaho.com/display/EAI/The+PDI+SDK
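The wiki above shows the full pattern. As a minimal sketch of executing a .ktr from your own application (the file path and the MY_VAR variable name are placeholders, not taken from your transformation):

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {
    public static void main(String[] args) throws Exception {
        // Initialize the Kettle environment; this registers the core step
        // plugins (including UserDefinedJavaClass) and is easy to miss
        // when embedding Kettle in a web application.
        KettleEnvironment.init();

        // Load the transformation definition (placeholder path).
        TransMeta transMeta = new TransMeta("/path/to/transformation.ktr");
        Trans trans = new Trans(transMeta);

        // Set any variables the Table Input step expects to have replaced;
        // "MY_VAR" is a hypothetical name.
        trans.setVariable("MY_VAR", "some-value");

        trans.execute(null); // no command-line arguments
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            throw new RuntimeException("Transformation finished with errors");
        }
    }
}
```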
Related
In Pentaho I get an error when I read a BigQuery table with a "Table Entry". I have these considerations:
This table was created from a Google Drive sheet with the service account
I can read this table with "Google Sheet Plugins"
Screenshot 1: https://i.stack.imgur.com/q9dl0.png
Screenshot 2: https://i.stack.imgur.com/gCxQK.png
2022/07/29 21:20:36 - Select.0 - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : An error occurred, processing will be stopped:
2022/07/29 21:20:36 - Select.0 - An error occurred executing SQL:
2022/07/29 21:20:36 - Select.0 - select 1 from `ms-data-warehouse.ms_Dev_Staging.ET_ods_hour`
2022/07/29 21:20:36 - Select.0 - [Simba][BigQueryJDBCDriver](100032) Error executing query job. Message: BIGQUERY_API_ERR
2022/07/29 21:20:36 - Select.0 - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Error initializing step [Select]
2022/07/29 21:20:36 - insert drive - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Step [Select.0] failed to initialize!
2022/07/29 21:20:36 - Select.0 - Finished reading query, closing connection.
2022/07/29 21:20:36 - Spoon - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : insert drive: preparing transformation execution failed
2022/07/29 21:20:36 - Spoon - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : org.pentaho.di.core.exception.KettleException:
2022/07/29 21:20:36 - Spoon - We failed to initialize at least one step. Execution can not begin!
Your second screenshot shows that it doesn't have Drive access.
BigQuery doesn't store the credential used to access Google Drive; instead, BigQuery uses the "current user" credential when trying to access Google Drive.
Apparently the service account has Google Drive access (it was able to create that table), but either your account or the account used to set up the Simba BigQueryJDBCDriver doesn't have access to the Google Drive file.
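The fix is therefore to connect with a credential that has Drive access and to request the Drive OAuth scope. Here is a hedged sketch of a plain-JDBC test using service-account authentication; the property names (in particular RequestGoogleDriveScope) come from my reading of the Simba BigQuery JDBC documentation and should be verified against your driver version, and the account email and key path are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class BigQueryDriveCheck {
    public static void main(String[] args) throws Exception {
        // OAuthType=0 selects service-account authentication in the Simba driver.
        // RequestGoogleDriveScope=1 asks the driver to request the Google Drive
        // OAuth scope, which Drive-backed external tables need (assumption:
        // confirm this property name in your Simba driver's documentation).
        String url = "jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;"
                + "ProjectId=ms-data-warehouse;"
                + "OAuthType=0;"
                + "OAuthServiceAcctEmail=my-sa@my-project.iam.gserviceaccount.com;"
                + "OAuthPvtKeyPath=/path/to/service-account-key.json;"
                + "RequestGoogleDriveScope=1;";

        try (Connection con = DriverManager.getConnection(url);
             ResultSet rs = con.createStatement().executeQuery(
                     "select 1 from `ms-data-warehouse.ms_Dev_Staging.ET_ods_hour`")) {
            System.out.println("Drive-backed table is readable: " + rs.next());
        }
    }
}
```

Also make sure the Drive file is shared with whichever account the driver actually authenticates as.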
I'm running a Kettle job which calls another job and then a transformation. The job runs smoothly in PDI Spoon, but when I call the job via Kitchen it throws:
2016/03/07 20:51:50 - Purifier_success - Executing command : /home/ubuntu/dataintegration/data-integration/null/kettle_6a3db0e2-e4a6-11e5-b201-d7ec3590f1c5shell
2016/03/07 20:51:50 - Purifier_success - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : (stderr) /home/ubuntu/dataintegration/data-integration/null/kettle_6a3db0e2-e4a6-11e5-b201-d7ec3590f1c5shell: 1: /home/ubuntu/dataintegration/data-integration/null/kettle_6a3db0e2-e4a6-11e5-b201-d7ec3590f1c5shell:
2016/03/07 20:51:50 - Purifier_success - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : (stderr) : not found
The job itself runs successfully, but it's awkward to see the shell screen reporting an error.
Is there any way to solve this?
I am trying to do Hadoop MapReduce in Pentaho. I have a Hadoop Copy Files step in a job to specify the input path of the file. All works fine if my input file location has root access, i.e. the files were already created in the root folder. But if I give my local file location as the source, it gives the following error in the Pentaho log:
2016/01/12 11:44:57 - Spoon - Starting job...
2016/01/12 11:44:57 - samplemapjob1 - Start of job execution
2016/01/12 11:44:57 - samplemapjob1 - Starting entry [Hadoop Copy Files]
2016/01/12 11:44:57 - Hadoop Copy Files - Starting ...
2016/01/12 11:44:57 - Hadoop Copy Files - Processing row source File/folder source : [file:///home/vasanth/Desktop/my.txt] ... destination file/folder : [hdfs://WEB2131:9000/new1/]... wildcard : [null]
2016/01/12 11:45:03 - Hadoop Copy Files - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : File System Exception: Could not find files in "file:///home/vasanth/Desktop".
2016/01/12 11:45:03 - Hadoop Copy Files - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : Caused by: Invalid descendent file name "hdfs:".
I have tried giving
sudo chmod 777 /home/vasanth/Desktop/my.txt
but the error is still there. How can I solve this problem?
I am trying to access an Excel sheet and perform actions on it. When I run it, I get the error below. It is not reading the input file.
2015/04/29 13:11:42 - INPUT.0 - Opening openFile #0 : /var/opt/UTM/V1.2/data/DAILY_inputs/working/Input_29-04-2015.xlsx
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : Unexpected error :
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : java.lang.NoClassDefFoundError: jxl/WorkbookSettings
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.steps.excelinput.ExcelInput.getRowFromWorkbooks(ExcelInput.java:501)
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.steps.excelinput.ExcelInput.processRow(ExcelInput.java:405)
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.step.BaseStep.runStepThread(BaseStep.java:2664)
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.steps.excelinput.ExcelInput.run(ExcelInput.java)
From your errors, it looks like a jar (Java library) file is missing.
Do a quick check in your Pentaho folder and look for this file:
.../data-integration/lib/jxl-2.6.12.jar
If it's missing, download it and put it into the 'lib' folder.
Download from here:
http://mirrors.ibiblio.org/pub/mirrors/maven/net.sourceforge.jexcelapi/jars/jxl-2.6.jar
Then restart Pentaho and try again.
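If you want to confirm from code whether the jxl classes are visible on the classpath your job actually runs with, here is a minimal sketch (the class name JxlCheck is made up for illustration):

```java
public class JxlCheck {
    public static void main(String[] args) {
        try {
            // jxl.WorkbookSettings is the class named in the NoClassDefFoundError.
            Class.forName("jxl.WorkbookSettings");
            System.out.println("jxl is on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println("jxl is missing - add jxl-2.6.12.jar to data-integration/lib");
        }
    }
}
```

Run it with the same classpath as Kettle to see whether the jar is actually being picked up.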
I am trying to make a Cartesian product in Pentaho by using the Join Rows (cartesian product) step. I am using two input streams and both have data, but I am getting this error:
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : Unexpected error
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : java.lang.NullPointerException
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.trans.steps.joinrows.JoinRows.getRowData(JoinRows.java:213)
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.trans.steps.joinrows.JoinRows.outputRow(JoinRows.java:301)
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.trans.steps.joinrows.JoinRows.processRow(JoinRows.java:287)
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
2013/11/22 13:57:31 - Join Rows (cartesian product).0 - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at java.lang.Thread.run(Thread.java:744)
How can I debug this?
From your post, I see you are using version 4.4.0-stable. I have often managed to get more helpful error messages from this version using the following trick:
add a "select fields" step after the buggy step,
click get fields to select.
If you don't get a more meaningful error message, we would need more details about the two input streams you're trying to join.