How to access pagination values in Kettle from Pentaho CDE?

I have a table component whose data source is attached to a Kettle transformation.
Now I want to implement server-side pagination and access the parameters "pageStart" and "pageSize" set by Pentaho inside the Kettle transformation using a Get Variables step. I followed https://forums.pentaho.com/threads/143504-how-to-use-CDE-table-component-of-Paginate-server-side/ to set up server-side pagination.
I do see the pageSize and pageStart parameters among the POST parameters of the doQuery call in the network console, but I cannot access them in the Kettle transformation. It turns out I can only access the parameters that carry the "param" prefix, not the ones without it; see the screenshot.
I can access paramDateStart and paramDateEnd but not pageSize and pageStart in the transformation.
How can I achieve this, so that I only load the relevant slice of data based on the pageStart and pageSize parameters on the server side?
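For reference, once pageStart and pageSize are available as named parameters of the transformation (Edit -> Settings... -> Parameters, with sensible defaults), a Table Input step can use them to page on the database side. A minimal sketch, assuming a LIMIT/OFFSET-capable database and a hypothetical table my_table; "Replace variables in script?" must be enabled in the step:
SELECT *
FROM my_table
ORDER BY id
LIMIT ${pageSize}
OFFSET ${pageStart}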

Related

Get result data from pentaho data integration metadata injection

In Pentaho Data Integration I am using metadata injection within a stream of a transformation. How can I get the result of the metadata injection back into my stream, so that I can continue transforming the data outside of the metadata injection? Copy rows to result does not seem to work here the way it does with a transformation within a transformation.
Found it myself: in the Options tab of the ETL Metadata Injection step you can select the step within the template transformation to read the data from, and below that you can define the fields.
(screenshot: metadata injection options)

Define Database connection for two different datasets using variables in Pentaho

I have a requirement to connect to two different datasets, using a variable, in order to compare these datasets. I'm using two different Table Input steps in which the database connection names and hostnames are hard-coded.
Instead of hard-coding them, I want to use a variable that defines these connections, so that I can connect to either of them.
You can define variables in the kettle.properties file, located in the .kettle directory. Then you can use these variables in your database connection settings.
You can also define variables in your own .properties files and read them in using the Set Variables job entry.
Set the variables like this:
db_name.host=localhost
db_name.db=databasename
db_name.user=username
Then access those variables in your job/transformation by using the format ${db_name.host} etc.
Use JNDI to set up all your connection parameters:
(1) Edit the file data-integration/simple-jndi/jdbc.properties and add your DB connection strings, for example:
db1/type=javax.sql.DataSource
db1/driver=com.mysql.jdbc.Driver
db1/url=jdbc:mysql://127.0.0.1:3305/mydb
db1/user=user1
db1/password=password1
db2/type=javax.sql.DataSource
db2/driver=com.mysql.jdbc.Driver
db2/url=jdbc:mysql://mydbserver:3306/mydb
db2/user=user2
db2/password=password2
Here we created two JNDI names, db1 and db2, which we can use in PDI jobs/transformations.
(2) In your PDI job/transformation, add a parameter, e.g. mydb, through the menu Edit -> Settings... -> Parameters tab. You can add more such DB parameters if more than one is needed.
(3) In the Table Input step, click the Edit... or New... button and, in the dialog that opens, switch the Access: box to JNDI, then enter ${mydb} as the JNDI name in the upper-right corner. You can also use the plain names db1 and db2 defined in (1) to identify the DB connection.
Using JNDI to manage DB connections, we were able to switch between staging and production DBs simply by changing the parameter. You can do something similar in PRD.
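This also makes it easy to choose the connection at run time. A minimal sketch, assuming a hypothetical transformation file load_sales.ktr that defines the mydb parameter as in (2); Pan's -param option sets named parameters from the command line:
# run against the first JNDI connection defined in jdbc.properties
./pan.sh -file=/path/to/load_sales.ktr -param:mydb=db1
# run the same transformation against the second connection
./pan.sh -file=/path/to/load_sales.ktr -param:mydb=db2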

JMeter for database load testing

I'm trying to use Apache JMeter to perform a database load test using a list of pre-generated SQL statements from a file. The pre-generated SQL statements are various stored procedure calls that were captured from a trace, so they already contain the needed parameter values as part of the execution statement.
I'm following the same design as a load test for an HTTP Request driven from an external file: there, the variable from the CSV Data Set Config is used as the Path value of the HTTP Request. Here I replace the HTTP Request with a JDBC Request and use the variable from the CSV Data Set Config as the SQL statement. Every example I've seen only feeds values from the file into a predefined SQL statement; nothing takes each line of the file as a complete statement to execute.
In addition to the CSV Data Set Config, I've tried another route: reading lines from the file with the CSVRead function, putting the statement in the parameter values of the JDBC Request and using ? to fill in the SQL statement at runtime. However, it seems to cut off the line after the first period in the three-part name. For example, for a line such as exec {database}.{owner}.{procedure}, the request would only send exec {database}.
Can you use JMeter in this manner with the JDBC Request sampler?
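For what it's worth, the usual way to wire this up (a sketch only, not a verified answer; the file name statements.csv and the variable name sqlStatement are assumptions) is to let the CSV Data Set Config hand each whole line to the JDBC Request:
CSV Data Set Config
  Filename:       statements.csv   (one complete exec statement per line)
  Variable Names: sqlStatement
  Delimiter:      \t               (a character that never occurs in the SQL, so commas inside the statement do not split the line)
JDBC Request
  Query Type:     Callable Statement   (or Update Statement, depending on the procedures)
  Query:          ${sqlStatement}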

Using variable names for a database connection in Pentaho Kettle

I am working with PDI (Kettle). Can we define a variable and use it in a database connection name, so that if I need to change the connections in multiple transformations in the future, I only have to change the variable value in the kettle.properties file?
Just use variables in the Database Connection.
For instance ${DB_HostName}, ${DB_Name}, etc.
Then just put it in your kettle.properties:
DB_HostName=localhost
You can see which fields support variables by the S in the blue diamond.
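A slightly fuller kettle.properties sketch (the variable names are just examples and must match the ${...} placeholders you type into the connection dialog):
DB_HostName=localhost
DB_Port=5432
DB_Name=sales_dwh
DB_User=etl_user
DB_Password=secret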

Connecting DB Connection from properties file outside the Kettle is not working

I am trying to remove the DB connection from the .ktr file and instead connect to the database using a properties file that contains the connection information. I used this link as a reference:
Pass DB Connection parameters to a Kettle a.k.a PDI table Input step dynamically from Excel.
I followed all the steps but I am not able to get the required output.
I want to connect to the database using the properties file, execute the SQL against the DB defined in that properties file, and transfer the result to an output (Excel, CSV, output table, etc.).
Try something like this:
1- An index job that starts everything (that's just the way I do it). This job calls a transformation whose only task is to load the connection data for the database.
2- The transformation that loads the connection data passes these values on as parameters.
3- The middle job is only there to repeat the process if necessary; it just works as a bridge and passes the parameters through.
4- This last transformation does all the DB work.
5- The datasource looks like this.
PS: sorry for my poor English :(
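A minimal sketch of the pieces involved (the file name and variable names below are assumptions):
# db_connection.properties, read by the transformation in steps 1-2
DB_HOST=dbserver.example.com
DB_PORT=3306
DB_NAME=reporting
DB_USER=etl_user
DB_PASS=secret
The transformation in step 2 reads this file (for example with a Property Input step) and feeds the values into a Set Variables step with a scope visible to the parent job; the transformation in step 4 then uses ${DB_HOST}, ${DB_PORT}, ${DB_NAME}, ${DB_USER} and ${DB_PASS} in the fields of its database connection, just as in the answers above.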