Pentaho not retaining the log and temp files

I am running a Pentaho Kettle ETL transformation (.ktr) to load data from a source DB2 database into a destination Netezza database.
When I run the transformation, I specify the directory in which to store the log files and temporary .txt files. But after the transformation finishes, these files are no longer there, so I guess Pentaho is cleaning them up. Is there a way to retain these files?
The other problem is that I am getting a SQL exception while the transformation step is inserting into Netezza, like this:
2013/10/30 14:13:17 - Load XXX_TABLE_NAME - ERROR (version 4.4.0-stable, build 17588 from 2012-11-21 16.02.21 by buildguy) : at org.netezza.internal.QueryExecutor.getNextResult(QueryExecutor.java:279)
There are no further details. How can I troubleshoot this?

That seems like an issue with Pentaho. Is there no way to generate a trace of what it's doing in the transformation? Are you sure it's reading data? What happens if the target is not Netezza?
If you've got access to the Netezza appliance, there are a few options, all in the documentation. Off the top of my head:
look in the current queries view while it's running
enable query history logging (requires admin access + restarting the instance)
check the pg.log file in /nz/kit/log/postgres/ (logs all queries by default)
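On the Pentaho side, to get a fuller trace you could also run the transformation from the command line with an explicit log file and a raised log level. Something along these lines should work with Kettle 4.x's Pan (the paths below are placeholders):
# run the transformation with Pan, writing a detailed trace to a log file you control
# -level accepts Basic, Detailed, Debug or Rowlevel
sh pan.sh -file=/path/to/transformation.ktr -level=Debug -logfile=/path/to/kettle_trace.log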

Related

Committing Hudi files manually

I am using Spark 3.x with Apache Hudi 0.8.0.
While I am trying to create a Presto table using the hudi-hive-sync tool, I am getting the error below.
Got runtime exception when hive syncing
java.lang.IllegalArgumentException: Could not find any data file written for commit [20220116033425__commit__COMPLETED], could not get schema for table
But I checked all the data for the partition keys using a Zeppelin notebook, and I see all the data present.
I understand that I need to commit the file manually. How do I do that?
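For reference, I can also inspect the commit timeline with hudi-cli (the base path below is a placeholder, and the exact command names may differ slightly between CLI versions):
# start the Hudi CLI, point it at the table's base path, then list the timeline
hudi-cli
connect --path s3://my-bucket/path/to/hudi_table
commits show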

Why does my SSIS package not pull all the data from a specific table in Oracle?

Situation: I have created an SSIS package that has an OLE DB Source, which is an Oracle database, and an OLE DB Destination, which is SQL Server. I use the [Oracle Provider for OLE DB] connection.
Problem: When I execute the package, it runs short and only returns 220,000 out of 4 million records. The package runs with no errors and no warnings; it just completes successfully but will not go past 220,000 records. I found one similar problem on this site; however, it pointed to a date format issue, and this table has no date data types in its structure.
Troubleshooting so far:
I extracted the table as a flat file and ran the package to the same destination table; this runs fine. All 4 million records load from the flat file to the destination OK.
I have tried running the package as a fast load and as a normal load - no change
I have tried different buffer combinations and Auto Adjust Buffer Size - no change
I have uninstalled and reinstalled VS, Oracle 12c, and SSDT - no change
I thought maybe it could be a memory or size issue, but no luck; I load many other tables that are larger in memory size.
Environment Specs:
VS V 15.9.14
Oracle Developer Tools for Visual Studio 12.2.0.1.0
SSDT 15.1.61906.3120
SSIS 15.0.1301.433
SQL Server 2016 13.0.4 - SP1
Has anyone dealt with something like this? What are some things I could try or look into?
Thanks!

Variable SQLLOGDIR not found

I am using the Ola Hallengren script for my maintenance solution. When I run just the database backup job for user databases, I get the following error: Unable to start execution of step 1 (reason: Variable SQLLOGDIR not found). The step failed.
I have checked the directory permissions and there is no issue there. The script creates the job with no problem; I get the error message when I try to run the job.
I had this same issue just the other day. I run a number of 2017 servers, but the issue happened when I started running on a 2012 server.
I've dropped Ola a mail to confirm, but as best I can make out, the SQLLOGDIR parameter specified in the 'Advanced' tab for the step (for logging outputs) is not compatible with 2012, and maybe versions below 2017, though I have not tested these.
HTH,
Adam.
You need to replace the job name token in the Advanced tab. For example, replace $(ESCAPE_SQUOTE(JOBNAME)) with CommandLogCleanup_$(ESCAPE_SQUOTE(JOBID)), so that it looks like this:
$(ESCAPE_SQUOTE(SQLLOGDIR))\CommandLogCleanup_$(ESCAPE_SQUOTE(JOBID))_$(ESCAPE_SQUOTE(STEPID))_$(ESCAPE_SQUOTE(DATE))_$(ESCAPE_SQUOTE(TIME)).txt
instead of this:
$(ESCAPE_SQUOTE(SQLLOGDIR))\$(ESCAPE_SQUOTE(JOBNAME))_$(ESCAPE_SQUOTE(STEPID))_$(ESCAPE_SQUOTE(DATE))_$(ESCAPE_SQUOTE(TIME)).txt
Do this for all the other jobs if you don't want to recreate them.
I had the same issue on my SQL Server 2012 instance; the error occurred during the database backup using Ola's scripts. As mentioned above, the issue is with the output file. I changed the location and the output file in the SQL job and reran the job successfully.
The error is related to the job output file.
When you create a maintenance job using the Ola script, it will automatically assign an output file to the step. Sometimes that location does not exist on the server.
I faced the same issue. I then ran the integrity script manually on the server and it completed without error, so I concluded that the error was in the job configuration.
I changed the job output file location and now the job runs fine.
The trick is to build the string for the @output_file_name parameter element by element before calling the stored procedure. If you look into Ola's code, you will see that this is exactly what he does.
I have tried to describe this in more detail in the post Add SQL Agent job step with tokens in output file name.
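As a rough sketch of that approach (the directory, job name, and command below are made-up placeholders, not Ola's actual generated values), you assemble the output file string first and then pass it to sp_add_jobstep; the job-level tokens are still resolved by SQL Agent at run time:
-- Sketch only: build the output file path up front instead of relying on the
-- SQLLOGDIR token, which (per the answers above) SQL Server 2012's Agent does not resolve.
DECLARE @LogDirectory nvarchar(max) = N'C:\SQLLogs';   -- assumed log directory
DECLARE @OutputFile nvarchar(max) =
      @LogDirectory
    + N'\DatabaseBackup_$(ESCAPE_SQUOTE(JOBID))'
    + N'_$(ESCAPE_SQUOTE(STEPID))'
    + N'_$(ESCAPE_SQUOTE(DATE))'
    + N'_$(ESCAPE_SQUOTE(TIME)).txt';
-- The job itself must already exist; the command shown is only illustrative.
EXEC msdb.dbo.sp_add_jobstep
    @job_name = N'DatabaseBackup - USER_DATABASES - FULL',
    @step_name = N'DatabaseBackup - USER_DATABASES - FULL',
    @subsystem = N'CmdExec',
    @command = N'sqlcmd -E -S $(ESCAPE_SQUOTE(SRVR)) -d master -Q "EXECUTE dbo.DatabaseBackup @Databases = ''USER_DATABASES'', @BackupType = ''FULL''" -b',
    @output_file_name = @OutputFile;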

Talend Open Studio: Load input files into database

I have an empty SQLite database. Next to that, I have 6 input files (delimited, Excel, JSON, XML).
Now, all I want to do is load the input files into the empty database.
I tried to connect one input file to the DB and just run it. That didn't work (the DB doesn't have anything in it; I suspect that is the problem).
Then I tried to connect an input file to a tMap, define the table there, define the schema, and connect the tMap to the DB (tSQLiteOutput).
When I try to run it, I receive the following error:
Starting job ProductDemo_Load at 16:46 15/11/2015.
[statistics] connecting to socket on port 3843
[statistics] connected
Exception in component tSQLiteOutput_1
java.sql.SQLException: no such table:
at org.sqlite.DB.throwex(DB.java:288)
at org.sqlite.NativeDB.prepare(Native Method)
at org.sqlite.DB.prepare(DB.java:114)
at org.sqlite.PrepStmt.<init>(PrepStmt.java:37)
at org.sqlite.Conn.prepareStatement(Conn.java:231)
at org.sqlite.Conn.prepareStatement(Conn.java:224)
at org.sqlite.Conn.prepareStatement(Conn.java:213)
at workshop_test.productdemo_load_0_1.ProductDemo_Load.tFileInputExcel_1Process(ProductDemo_Load.java:751)
at workshop_test.productdemo_load_0_1.ProductDemo_Load.runJobInTOS(ProductDemo_Load.java:1672)
at workshop_test.productdemo_load_0_1.ProductDemo_Load.main(ProductDemo_Load.java:1529)
[statistics] disconnected
Job ProductDemo_Load ended at 16:46 15/11/2015. [exit code=1]
I see there's something wrong with the import, but what exactly?
What should I do in order to successfully load the data from the input files into the database?
I followed the exact steps from this little tutorial:
Talend Job: load data into database.
Most Talend output components have a 'Create table if not exists' option. Did you check this in your tSQLiteOutput? The error suggests that when Talend inserts data into the empty database, it cannot find your table because it does not exist. So you need to tell Talend to create the table first.
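If you prefer to create the table up front instead, a minimal piece of SQLite DDL like the following would do; the table and column names here are made up for illustration and should match the schema you defined in the tMap:
-- hypothetical target table so tSQLiteOutput has something to insert into
CREATE TABLE IF NOT EXISTS product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    price        REAL
);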

Pentaho Kettle is not working for Vertica DB

I need to parse a CSV file and write the data to a Vertica database. The issue is that I get an error when I create a Vertica database connection in Spoon. The error is shown at the end of the post.
I tried copying the following two JAR files and adding them to libext/jdbc:
vertica-jdbc-4.1.14.jar and vertica-jdk5-6.1.2-0.jar
But the above didn't help. I am looking for pointers!
Error:
Error connecting to database [Vertica Dev] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database
Exception while loading class
com.vertica.jdbc.Driver
org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database
Exception while loading class
com.vertica.jdbc.Driver
at org.pentaho.di.core.database.Database.normalConnect(Database.java:366)
The two JAR files you copied are from two different versions of Vertica and do not expose the same driver class.
vertica-jdk5-6.1.2-0.jar will expose com.vertica.jdbc.Driver, whereas version 4 will expose com.vertica.Driver.
The error message thus makes it obvious that Pentaho is looking for com.vertica.jdbc.Driver (so, the version 5 JAR). If it fails, it is probably because the version 4 JAR is loaded first.
Try deleting only the version 4 JAR from libext/jdbc, keep version 5, and restart Pentaho.
On a side note, this class name is hardcoded in Pentaho, so if you do need to use the version 4 JAR and feel adventurous, you just need to get the Pentaho source, update VerticaDatabaseMeta.java, and recompile.
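If you want to double-check which driver class each JAR actually contains before deleting anything, listing the archive contents is a quick way to do it (this assumes the JDK's jar tool and grep are available on the machine):
# list each JAR's entries and look for the Driver classes
jar tf vertica-jdk5-6.1.2-0.jar | grep -i driver
jar tf vertica-jdbc-4.1.14.jar | grep -i driver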