I am creating a transformation in Pentaho DI to extract data from Google Analytics. I need to set the start date and end date in "Query Definition" to yesterday and today. I understand this can be done by creating two variables, e.g. ${today} and ${yesterday}. However, I don't know how to make these change values dynamically at every run. Any idea on how to do this?
Thanks,
I can think of an easy way to do this. The first thing to know is that you can't declare and use variables in the same transformation. I would suggest approaching this problem in the following way:
Create a transformation before this one, say "set variables transformation". In this transformation you will set the variables.
You can use the Get System Info step to get today's and yesterday's dates and set them as variables. Use a Copy rows to result step to pass these rows to the next transformation.
In the next transformation, which will be the one you have attached the screenshot of, use the Get Variables step and use these variables in your input step. Or you can use the Get rows from result step as well.
You don't need to worry about the dates anymore, because they will be generated at every run and your variables will get their values dynamically.
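If you prefer scripting over the Get System Info step, a Modified Java Script Value step in the first transformation can compute and set the same variables; a minimal sketch (the variable names today/yesterday and the yyyy-MM-dd format are my own assumptions):

// format a Date as yyyy-MM-dd
function fmt(d) {
  var m = d.getMonth() + 1;
  var day = d.getDate();
  return d.getFullYear() + "-" + (m < 10 ? "0" + m : m) + "-" + (day < 10 ? "0" + day : day);
}
var now = new Date();
// setVariable(name, value, scope) is built into the PDI JavaScript step;
// scope "r" makes the variable valid in the root job
setVariable("today", fmt(now), "r");
setVariable("yesterday", fmt(new Date(now.getTime() - 24*60*60*1000)), "r");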
You can check this article if you want to learn more about how to pass the values from one transformation to another:
https://anotherreeshu.wordpress.com/2014/12/23/using-copy-rows-to-result-in-pentaho-data-integration/
Hope it helps!
For that, you have to use a job. Add the first transformation and, inside it, use a Get System Info step to get today's and yesterday's dates as fields, then link it to a Set Variables step. Set the scope of the variables to valid in the parent job.
In the second transformation, use Get Variables.
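Once set with parent-job scope, the variables can be referenced in the Google Analytics step's Query Definition; a sketch, assuming the variables are named today and yesterday:

Start Date: ${yesterday}
End Date:   ${today}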
It took me a while to solve this myself and the way I ended up doing it is as follows:
I created a transformation (called 'set formatted_today variable') that contains two objects:
The first is a 'Table input' step with a query like:
select to_char(current_timestamp, 'YYYY-MM-DD-HH-MI') as formatted_today
The output of my 'Table input' goes to a 'Set variables' object. You can use the 'Get Fields' button to wire the fields you've named in your query to the variables you want to set. In this case, my field is called 'formatted_today' and so is my variable.
In my main job, I have a 'set session variables' object that creates my 'formatted_today' variable.
Immediately after it, I call my 'set formatted_today variable' transformation.
Anywhere I need this variable, I insert ${formatted_today} in the text.
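For example, a Text file output step's filename could be set to (the name is illustrative):

report_${formatted_today}.txt

PDI substitutes the variable at run time wherever ${formatted_today} appears.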
I have one table named QueryTable that stores 4 SQL queries, each with different metadata.
I want to store the results of these four queries in an Excel sheet.
1) First I added an Execute SQL Task and configured the connection and the query statement, with the Result Set set to Full Result Set.
Then, in the Result Set tab, I mapped the result to Query_variable, of type Object.
2) I dragged in a Foreach Loop Container, set the Foreach ADO Enumerator in the Collection part, and assigned Query_variable to it.
In the Variable Mappings part I created a new variable of type String to hold each of the four queries in turn.
3) Finally I added one Data Flow Task with an OLE DB Source configured with that same variable (the one I mapped in the Foreach Loop Container).
Right now it shows the default value I have given in User::Variable.
I can iterate over queries with the same number of columns (the same metadata) and store the results in the Excel destination.
But the problem is when the variable moves on to a query that holds fewer or more columns. Here the package fails; it can't handle tables with different metadata.
Please assist me: can we iterate over queries with different metadata in the same loop and still get proper output?
I hope I have explained the problem I am facing.
Set the default value of User::Variable to one of the queries, so that BIDS can validate it at design time.
You can also try setting "DelayValidation" to true, but that might not be enough in this case.
Set DelayValidation to true for both the Data Flow Task and the Foreach Loop Container.
I have an Excel file with 300 rows. I need to use each of these rows as a field name in a transformation.
I was thinking of creating a job that for each row of a table sets a variable that I use afterwards on my transformation.
I tried defining a variable as the value I have in one row and the transformation works. Now I need a loop that gets value after value, redefines the variable I created, and then executes the transformation.
I tried to define a Job that has the following:
Start -> Transformation(ExcelFileCopyRowsToResult) -> SetVariables -> Transformation(The transf that executes using whatever the variable name is at the moment).
The problem is that the variable I defined never changes and the transformation result is always the same because of that.
Executing a transformation for each row in a result set is a standard way of doing things in PDI. You have most of it correct, but instead of setting a variable (which only happens once in the job flow), use the result rows directly.
First, configure the second transformation to Execute for each row in the Edit window.
You can then use one of two ways to pass the fields into the transformation, depending on which is easier for you:
Start the transformation with a Get rows from result step. This should get you one row each time. The fields will be in the stream directly and can be used as such.
Pass the fields as parameters, so they can be used like variables. I use this one more often, but it takes a bit more setup (see the sketch after these steps):
Inside the second transformation, go to the properties and enter variable names you want in the Parameters tab.
Save the transformation.
In the job, open the transformation edit window and go to Parameters.
Click Get Parameters.
Type the field name from the first transformation under Stream Column Name for each parameter.
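Once defined, the parameter behaves like a variable inside the second transformation. For example, a Table input step with "Replace variables in script" checked could use it like this (the parameter name CUSTOMER_ID and the table are illustrative):

SELECT * FROM orders WHERE customer_id = ${CUSTOMER_ID}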
I am creating Pentaho jobs.
In the first Set Variables box I am passing the value sysdate, and the first dfp job works perfectly.
In the second Set Variables box I am passing the value sysdate+1, so the sysdate+1 file is picked up correctly for processing, but the second dfp job keeps failing with an error.
Is this logic possible in Pentaho jobs?
I have numerous examples of that kind that work perfectly every night, and I assume your Set Variables steps have the appropriate scope (valid in the parent job).
So the bug is probably in the value you give to the variable in Set variable 2. The value sysdate+1 is taken literally (the string "sysdate+1"), not as tomorrow's date.
You must first compute that value, which is done in a transformation replacing Set variable 2 that would do something like this:
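A minimal sketch of what such a transformation could contain, assuming a Modified Java Script Value step whose output feeds a Set Variables step (field and variable names are illustrative):

// compute tomorrow's date (sysdate+1) as a yyyy-MM-dd string
var t = new Date(new Date().getTime() + 24*60*60*1000);
var m = t.getMonth() + 1;
var d = t.getDate();
var tomorrow_str = t.getFullYear() + "-" + (m < 10 ? "0" + m : m) + "-" + (d < 10 ? "0" + d : d);
// declare tomorrow_str as an output field of this step, then map it to a
// variable in the following Set Variables step (scope: valid in the parent job)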
Fairly straightforward question I think, I just haven't been able to find a clear example. I have a very complex transformation that I'm breaking down into a job. Having never created a job before, I'm struggling to send the data from one transformation to another. I used Copy Rows to Result in the first one and Get Rows From Result in the second one, but I feel like I'm still missing something. When I used Get Rows, I had to specify the row names - there was no sort of Get Fields button. I also can't preview the data in the transformation without running the job and having it save to an Excel file. When I did that, ALL of the fields were in the output file -- instead of just the ones I'd specified in the second transformation.
I've searched through the documentation and tried Googling but I can't find a clear walkthrough just on how to smoothly move data from one transformation to another. Any responses would be appreciated even if it's just pointing me towards something I've overlooked.
Thanks!
The most common way is to use a Copy rows to result step at the end of one KTR and a Get rows from result step as the starting point of the next one. Though you really can't "see" the result while working in the next KTR, what you can do to ease the reading is set up a preview window and leave it open to see all the column names and data.
However, if you only want to pass a few values through to the next KTR, you can use a Set Variables step as the ending step of the first KTR and capture those variables at any time in the second using Get Variables steps. Don't forget that if you do so, you need to define the variables in the parent KJB (the job that calls the first KTR) with no default value, and the variable scope type of the Set Variables step has to be set to Valid in the parent job.
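A sketch of the overall layout (transformation names are illustrative):

KTR 1: ... -> Copy rows to result (or Set Variables as the last step)
KJB:   Start -> KTR 1 -> KTR 2
KTR 2: Get rows from result (or Get Variables) -> ...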
The best way is to create the KTRs and run/test each one. This way you can examine the resulting data and then integrate all the individual transformations into the final job.
I am passing a value to a sub-transformation, and the sub-transformation receives the value fine, as I have used a JavaScript step to Alert it.
But I have a Table input step in the sub-transformation, where I need to use the parent transformation's value as a parameter to run a query against it. It's not working, as the Table input step does not understand the field. How can I achieve this behavior?
I am stuck at this point and can't go further.
The only option left is to use Pentaho jobs, but is this possible using a Mapping inside a transformation?
I tried the setVariable function from JavaScript in the sub-transformation, but nothing works.
I expect that your sub-transformation is similar to the one in the figure below. Are you sure you are passing the parameters correctly? The important things are to:
have the same number of parameters in the Mapping input specification as parameters used in the Table input step
have Replace variables in script checked
have Insert data from step filled in
use the ? parameter in the SQL query
If you need to pass more parameters to the Table input, the number of fields coming from the previous step (the Mapping input specification in my example) needs to match the number of parameters you use in the Table input. Then you use ? more times in your query. E.g. for 3 params you could have:
WHERE name = ? AND surname = ? AND age = ?
Also, you need to respect the order of the fields coming from the previous step:
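For the query above, the fields in the Mapping input specification would be listed in exactly this order (illustrative), so each ? is filled left to right:

name    -> first ?
surname -> second ?
age     -> third ?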