Pentaho CSV input error, don't know how to solve it - pentaho

We failed to initialize at least one step. Execution can not begin!
2021/01/18 15:31:45 - Carte - Installing timer to purge stale objects after 1440 minutes.
2021/01/18 15:33:05 - /Transformation 1 - Dispatching started for transformation [/Transformation 1]
2021/01/18 15:33:05 - CSV file input.0 - ERROR (version 9.1.0.0-324, build 9.1.0.0-324 from 2020-09-07 05.09.05 by buildguy) : Error initializing step [CSV file input]
2021/01/18 15:33:05 - CSV file input.0 - ERROR (version 9.1.0.0-324, build 9.1.0.0-324 from 2020-09-07 05.09.05 by buildguy) : java.lang.NumberFormatException: For input string: ""
2021/01/18 15:33:05 - CSV file input.0 - at java.lang.NumberFormatException.forInputString(Unknown Source)
2021/01/18 15:33:05 - CSV file input.0 - at java.lang.Integer.parseInt(Unknown Source)
2021/01/18 15:33:05 - CSV file input.0 - at java.lang.Integer.parseInt(Unknown Source)
2021/01/18 15:33:05 - CSV file input.0 - at org.pentaho.di.trans.steps.csvinput.CsvInput.init(CsvInput.java:875)
2021/01/18 15:33:05 - CSV file input.0 - at org.pentaho.di.trans.step.StepInitThread.run(StepInitThread.java:69)
2021/01/18 15:33:05 - CSV file input.0 - at java.lang.Thread.run(Unknown Source)
2021/01/18 15:33:05 - /Transformation 1 - ERROR (version 9.1.0.0-324, build 9.1.0.0-324 from 2020-09-07 05.09.05 by buildguy) : Step [CSV file input.0] failed to initialize!
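The trace shows Integer.parseInt being called with an empty string inside CsvInput.init, which suggests that a numeric setting of the CSV file input step was left blank. Below is a minimal sketch for spotting such a blank value in the saved transformation. It assumes the step type is stored as `CsvInput` and the numeric setting under a `buffer_size` tag; both names are assumptions, and `find_blank_numeric_settings` is a hypothetical helper, so adjust them to what your .ktr file actually contains.

```python
import xml.etree.ElementTree as ET

# Assumption: numeric settings of the CSV file input step live under these
# tag names in the .ktr XML. Adjust the tuple to match your own file.
NUMERIC_TAGS = ("buffer_size",)


def find_blank_numeric_settings(ktr_xml, step_type="CsvInput", tags=NUMERIC_TAGS):
    """Return (step name, tag) pairs whose value is not a plain integer."""
    root = ET.fromstring(ktr_xml)
    problems = []
    for step in root.iter("step"):
        if step.findtext("type") != step_type:
            continue
        name = step.findtext("name", default="?")
        for tag in tags:
            value = (step.findtext(tag) or "").strip()
            # Variables such as ${BUFFER} are resolved at run time, so skip them.
            if value.startswith("${"):
                continue
            # A missing or empty value is what would reach Integer.parseInt as "".
            if not value.isdigit():
                problems.append((name, tag))
    return problems
```

If the function reports a step, open that step in Spoon, fill in the blank numeric field, and run the transformation again.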

Related

Pentaho - Big Query [Simba][BigQueryJDBCDriver](100032) Error executing query job. Message: BIGQUERY_API_ERR

In Pentaho I get an error when I read a BigQuery table with a "Table Entry". Some background:
This table was created from a Google Drive sheet with the service account.
I can read this table with the "Google Sheet Plugins".
[1]: https://i.stack.imgur.com/q9dl0.png
[2]: https://i.stack.imgur.com/gCxQK.png
2022/07/29 21:20:36 - Select.0 - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : An error occurred, processing will be stopped:
2022/07/29 21:20:36 - Select.0 - An error occurred executing SQL:
2022/07/29 21:20:36 - Select.0 - select 1 from `ms-data-warehouse.ms_Dev_Staging.ET_ods_hour`
2022/07/29 21:20:36 - Select.0 - [Simba][BigQueryJDBCDriver](100032) Error executing query job. Message: BIGQUERY_API_ERR
2022/07/29 21:20:36 - Select.0 - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Error initializing step [Select]
2022/07/29 21:20:36 - insert drive - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Step [Select.0] failed to initialize!
2022/07/29 21:20:36 - Select.0 - Finished reading query, closing connection.
2022/07/29 21:20:36 - Spoon - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : insert drive: preparing transformation execution failed
2022/07/29 21:20:36 - Spoon - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : org.pentaho.di.core.exception.KettleException:
2022/07/29 21:20:36 - Spoon - We failed to initialize at least one step. Execution can not begin!
Your second screenshot says that the account doesn't have Drive access.
BigQuery doesn't store a credential for accessing Google Drive; instead, it uses the "current user" credential when it tries to access Google Drive.
Apparently the "service account" has Google Drive access (it was needed to create that table), but either your account or the account used to set up the Simba BigQueryJDBCDriver doesn't have access to the Google Drive file.
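Besides sharing the Drive file with the account that the JDBC connection uses, the connection itself may need to request the Drive scope. A minimal sketch of how the connection URL could be assembled, assuming the Simba driver's `RequestGoogleDriveScope` property and service-account authentication (`OAuthType=0`). The helper name `bigquery_jdbc_url` and every value passed to it are hypothetical placeholders; check the property names against the guide for your driver version.

```python
def bigquery_jdbc_url(project_id, service_account_email, key_path, drive_scope=True):
    """Build a Simba BigQuery JDBC URL, optionally requesting the Drive scope."""
    props = {
        "ProjectId": project_id,
        "OAuthType": "0",  # 0 = service account authentication
        "OAuthServiceAcctEmail": service_account_email,
        "OAuthPvtKeyPath": key_path,
    }
    if drive_scope:
        # Assumption: this property asks the driver to request Drive access,
        # which tables backed by a Google Sheet need.
        props["RequestGoogleDriveScope"] = "1"
    base = "jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443"
    return base + ";" + ";".join(f"{key}={value}" for key, value in props.items()) + ";"


# Placeholder values, not taken from the question.
print(bigquery_jdbc_url("my-project",
                        "etl@my-project.iam.gserviceaccount.com",
                        "/path/to/key.json"))
```

The resulting string would go into the custom connection URL of the Pentaho database connection. The service account named in it still has to be granted access to the Drive file itself.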

Pentaho Data Integration 8.3 cannot output to Parquet file

I send the rows from a Table input step to the Big Data Parquet output step. I set Location=Local and Folder/File name="file:///G:/temp/feng". When I run the transformation, I get the following errors:
2022/05/08 20:30:09 - Spoon - Using legacy execution engine
2022/05/08 20:30:09 - Spoon - Transformation opened.
2022/05/08 20:30:09 - Spoon - Launching transformation [v_bi_test_to_parquet]...
2022/05/08 20:30:09 - Spoon - Started the transformation execution.
2022/05/08 20:30:09 - v_bi_test_to_parquet - Dispatching started for transformation [v_bi_test_to_parquet]
2022/05/08 20:30:10 - Parquet output.0 - ERROR (version 8.3.0.0-371, build 8.3.0.0-371 from 2019-06-11 11.09.08 by buildguy) : Unexpected error
2022/05/08 20:30:10 - Parquet output.0 - ERROR (version 8.3.0.0-371, build 8.3.0.0-371 from 2019-06-11 11.09.08 by buildguy) : org.pentaho.di.core.exception.KettleException:
2022/05/08 20:30:10 - Parquet output.0 - can't get service format shim
2022/05/08 20:30:10 - Parquet output.0 - org.pentaho.hadoop.shim.ConfigurationException: Unable to load Hadoop configuration java.util.ServiceConfigurationError: org.pentaho.hadoop.shim.common.authorization.NoOpHadoopAuthorizationService: Provider org.pentaho.hadoop.shim.mapr60.authorization.ShimNoOpHadoopAuthorizationService could not be instantiated
2022/05/08 20:30:10 - Parquet output.0 -
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.big.data.kettle.plugins.formats.impl.parquet.output.ParquetOutput.init(ParquetOutput.java:101)
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.big.data.kettle.plugins.formats.impl.parquet.output.ParquetOutput.processRow(ParquetOutput.java:64)
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2022/05/08 20:30:10 - Parquet output.0 - at java.lang.Thread.run(Thread.java:748)
2022/05/08 20:30:10 - Parquet output.0 - Caused by: org.pentaho.big.data.api.initializer.ClusterInitializationException: org.pentaho.hadoop.shim.ConfigurationException: Unable to load Hadoop configuration java.util.ServiceConfigurationError: org.pentaho.hadoop.shim.common.authorization.NoOpHadoopAuthorizationService: Provider org.pentaho.hadoop.shim.mapr60.authorization.ShimNoOpHadoopAuthorizationService could not be instantiated
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.big.data.impl.shim.initializer.ClusterInitializerProviderImpl.initialize(ClusterInitializerProviderImpl.java:53)
2022/05/08 20:30:10 - Parquet output.0 - at Proxy7007af9d_3635_4f9d_bf71_525036c201f0.initialize(Unknown Source)
2022/05/08 20:30:10 - Parquet output.0 - at Proxy3114331f_fba1_4e42_aec2_2c5fd16bcea1.initialize(Unknown Source)
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.big.data.api.initializer.impl.ClusterInitializerImpl.initialize(ClusterInitializerImpl.java:45)
2022/05/08 20:30:10 - Parquet output.0 - at Proxy1066287c_6d43_4647_b5db_91dd44f553e0.initialize(Unknown Source)
2022/05/08 20:30:10 - Parquet output.0 - at Proxy1b43f5f0_5092_4eeb_8b68_1741c28de27c.initialize(Unknown Source)
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.big.data.api.cluster.service.locator.impl.NamedClusterServiceLocatorImpl.getService(NamedClusterServiceLocatorImpl.java:110)
2022/05/08 20:30:10 - Parquet output.0 - at Proxyf7550ee3_4ecd_4735_96a6_fb538ea631df.getService(Unknown Source)
2022/05/08 20:30:10 - Parquet output.0 - at Proxy435b60e6_4566_40ed_a8a6_42846dbeb2bc.getService(Unknown Source)
2022/05/08 20:30:10 - Parquet output.0 - at org.pentaho.big.data.kettle.plugins.formats.impl.parquet.output.ParquetOutput.init(ParquetOutput.java:99)
2022/05/08 20:30:10 - Parquet output.0 - ... 3 more
What is the problem? Does it have something to do with the shim service?

Pentaho DI 6.1, Error using mbox in Email Message Input Step

This time I need help with this software. I'm trying to create a transformation that reads an mbox and returns certain parts of the emails. But when I use the preview function of the Email Message Input step, Pentaho returns this:
2016/09/09 14:52:53 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/09/09 14:54:58 - DBCache - Loading database cache from file: [C:\Users\fangonzalez.kettle\db.cache-6.1.0.1-196]
2016/09/09 14:54:58 - DBCache - We read 0 cached rows from the database cache!
2016/09/09 14:54:59 - Spoon - Trying to open the last file used.
2016/09/09 15:03:37 - C:\Users\fangonzalez\Desktop\Pentaho\trans.ktr : trans - Dispatching started for transformation [C:\Users\fangonzalez\Desktop\Pentaho\trans.ktr : trans]
2016/09/09 15:03:37 - Email messages input.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Error opening folder 1 : java.lang.NullPointerException
2016/09/09 15:03:37 - Email messages input.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : java.lang.NullPointerException
2016/09/09 15:03:37 - Email messages input.0 - at org.pentaho.di.trans.steps.mailinput.MailInput.openNextFolder(MailInput.java:347)
2016/09/09 15:03:37 - Email messages input.0 - at org.pentaho.di.trans.steps.mailinput.MailInput.getOneRow(MailInput.java:214)
2016/09/09 15:03:37 - Email messages input.0 - at org.pentaho.di.trans.steps.mailinput.MailInput.processRow(MailInput.java:75)
2016/09/09 15:03:37 - Email messages input.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2016/09/09 15:03:37 - Email messages input.0 - at java.lang.Thread.run(Unknown Source)
2016/09/09 15:03:37 - Email messages input.0 - Finished processing (I=0, O=0, R=0, W=0, U=0, E=1)
2016/09/09 15:03:37 - C:\Users\fangonzalez\Desktop\Pentaho\trans.ktr : trans - Transformation detected one or more steps with errors.
2016/09/09 15:03:37 - C:\Users\fangonzalez\Desktop\Pentaho\trans.ktr : trans - Transformation is killing the other steps!
There is the pic of the step config screen
The "Fetch in batches" option cannot be enabled if you use only one mbox.

File System Exception: Could not find files. Caused by: Invalid descendent file name hdfs

I am trying to run Hadoop MapReduce in Pentaho. I have a Hadoop Copy Files entry in a job to specify the input path of the file. Everything works fine if the input file is in a location with root access (i.e. files already created in the root folder). But if I give a local file location as the source, I get the following error in the Pentaho log:
2016/01/12 11:44:57 - Spoon - Starting job...
2016/01/12 11:44:57 - samplemapjob1 - Start of job execution
2016/01/12 11:44:57 - samplemapjob1 - Starting entry [Hadoop Copy Files]
2016/01/12 11:44:57 - Hadoop Copy Files - Starting ...
2016/01/12 11:44:57 - Hadoop Copy Files - Processing row source File/folder source : [file:///home/vasanth/Desktop/my.txt] ... destination file/folder : [hdfs://WEB2131:9000/new1/]... wildcard : [null]
2016/01/12 11:45:03 - Hadoop Copy Files - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : File System Exception: Could not find files in "file:///home/vasanth/Desktop".
2016/01/12 11:45:03 - Hadoop Copy Files - ERROR (version 6.0.0.0-353, build 1 from 2015-10-07 13.27.43 by buildguy) : Caused by: Invalid descendent file name "hdfs:".
I have tried running
sudo chmod 777 /home/vasanth/Desktop/my.txt
but the error is still there. How can I solve this problem?

Pentaho Excel sheet reading error

I am trying to access an Excel sheet and perform actions on it. When I run it, I get the error below. It is not reading the input file.
2015/04/29 13:11:42 - INPUT.0 - Opening openFile #0 : /var/opt/UTM/V1.2/data/DAILY_inputs/working/Input_29-04-2015.xlsx
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : Unexpected error :
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : java.lang.NoClassDefFoundError: jxl/WorkbookSettings
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.steps.excelinput.ExcelInput.getRowFromWorkbooks(ExcelInput.java:501)
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.steps.excelinput.ExcelInput.processRow(ExcelInput.java:405)
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.step.BaseStep.runStepThread(BaseStep.java:2664)
2015/04/29 13:11:42 - INPUT.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:30:46) : at org.pentaho.di.trans.steps.excelinput.ExcelInput.run(ExcelInput.java*
Judging from your errors, a JAR (Java library) file is missing.
Could you check your Pentaho folder for this file:
.../data-integration/lib/jxl-2.6.12.jar
If it's missing, you could download it and put it into the 'lib' folder.
Download from here:
http://mirrors.ibiblio.org/pub/mirrors/maven/net.sourceforge.jexcelapi/jars/jxl-2.6.jar
Then restart Pentaho and try again.
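To confirm whether any JAR in the 'lib' folder actually provides the class named in the NoClassDefFoundError, a small script can scan the archives. This is only a sketch: `jars_containing` is a hypothetical helper, and the folder you pass to it is whatever your own installation uses.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name="jxl/WorkbookSettings"):
    """Return the names of the JAR files in lib_dir that contain class_name."""
    # Accept both jxl/WorkbookSettings and jxl.WorkbookSettings spellings.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip corrupt or partially downloaded archives.
            continue
    return hits
```

An empty result for your data-integration 'lib' folder would mean that no JAR there supplies jxl/WorkbookSettings, which matches the error in the log.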