How to Loop inside Pentaho Data Integration Transformation - pentaho

I have the following PDI job structure:
START ---> TR1 ---> TR2 ---> TR3
where:
TR1 returns 3 rows,
TR2 has "Execute for every input row" enabled and returns 5 rows,
TR3 has "Execute for every input row" enabled and was supposed to run 15 times (5 times for each of the 3 TR2 runs).
My expectation was:
TR1 runs exactly once, TR2 runs exactly 3 times in parallel (since TR1 returns 3 rows),
and TR3 runs exactly 15 times (since each TR2 run returns 5 rows).
But in reality:
TR2 was executed 3 times, as expected,
but TR3 was executed only once, not as expected.
My questions are:
Why is this happening?
How can I make TR3 loop over the rows returned by each TR2 run?

For that you need to nest jobs. Try this sequence:
Start-tr1-(job1(tr2-subjob2(tr3)))-End
Both job1 and subjob2 should have "Execute for every input row" checked.
Description:
Create job1, connect it to tr1, and put tr2 inside it. Then create subjob2, connect it to tr2, and put tr3 inside subjob2.
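
The nesting described above behaves like two nested loops. The following is a minimal Python sketch of that idea, not PDI code: the functions tr1, tr2, tr3, job1 and subjob2 are hypothetical stand-ins for the job entries, the row values are invented, and the runs happen sequentially rather than in parallel.

```python
# Hypothetical stand-ins for the transformations; row values are invented.
def tr1():
    return ["a", "b", "c"]  # TR1 returns 3 rows


def tr2(row):
    return [f"{row}{i}" for i in range(5)]  # each TR2 run returns 5 rows


tr3_inputs = []


def tr3(row):
    tr3_inputs.append(row)  # record every TR3 run


def subjob2(row):
    tr3(row)  # subjob2 only wraps TR3


def job1(row):
    # subjob2 is executed once for every row produced by this TR2 run
    for result_row in tr2(row):
        subjob2(result_row)


# job1 is executed once for every row produced by TR1
for row in tr1():
    job1(row)

print(len(tr3_inputs))  # 15
```

Because the inner loop lives inside job1, TR3 sees the 5 rows of each TR2 run separately, which is what gives 3 x 5 = 15 runs.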

I assume TR stands for transformation and that all the TRs are part of one job. If that's the case, then in TR1 send the results to TR2 by connecting the last step of TR1 to a "Copy rows to result" step. Then double-click TR2 within the job, go to "Advanced", and check "Copy previous results to parameters" and "Execute for every input row".
Next, add the column names from TR1 under the Parameters tab of TR2, in the same sequence and with the same names. Then, in the TR2 properties, add those as parameters with a null default value so that you can use the values generated by the previous transformation as variables in TR2. Finally, include TR3 within TR2 so that it also executes for each row generated by TR1. I hope that isn't confusing. Let me know if it doesn't make sense.

Related

Limit number of script triggers by field

I am trying to figure out how to limit the number of times an ID can be selected.
I have a list of mentors, some who can be selected 1 time and others who can be selected 2 times. I am using a button that performs a Set Field script. When the button is clicked the ID value is copied to another list. It remains in the original list but also is shown in another. I want to say something like:
If field "mentor count" = 2 then You can select 2 times, else you can select 1 time.
I have no idea how to go about it.
Do you have any suggestions please?
There are a total of 3 lists. One is mentors, 1 is students and the other is both. The user selects a mentor and student to match up. Each row in this table is a new match.
I tried conditional action which failed. I am thinking I will need a script.

Sequential Transformation with fix Chunk Sizes in SSIS [duplicate]

I have a table of 811 records. I want to get five records at a time and assign them to a variable. The next time the Foreach Loop task in SSIS runs, it should fetch another five records and overwrite the variable. I have tried doing this with a cursor but couldn't find a solution. Any help will be highly appreciated. I have a table like this, for example:
ServerId ServerName
1 Abc11
2 Cde22
3 Fgh33
4 Ijk44
5 Lmn55
6 Opq66
7 Rst77
. .
. .
. .
I want the query to take the first five names as follows and assign them to the variable:
ServerId ServerName
1 Abc11
2 Cde22
3 Fgh33
4 Ijk44
5 Lmn55
Then the next iteration takes another five names and overwrites the variable value, and so on until the last record is consumed.
Taking ltn's answer into consideration this is how you can achieve limiting the rows in SSIS.
The Design will look like
Step 1 : Create the variables
Name DataType
Count int
Initial int
Final int
Step 2 : For the 1st Execute SQL Task write the sql to store the count
Select count(*) from YourTable
In the General tab of this task Select the ResultSet as Single Row.
In the ResultSet tab map the result to the variable
ResultName VariableName
0 User::Count
Step 3 : In the For Loop container enter the expression as shown below
Step 4 : Inside the For Loop drag an Execute SQL Task and write the expression
In Parameter Mapping map the initial variable
VariableName Direction DataType ParameterName ParameterSize
User::Initial Input NUMERIC 0 -1
Result Set tab
Result Name Variable Name
0 User::Final
Inside the Data Flow Task (DFT) you can write the SQL to get the particular rows.
Click on Parameters and select the variables Initial and Final.
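If your data is not updated between paging cycles and the sort order is always the same, then you could try an approach similar to:
CREATE PROCEDURE TEST
(
    @StartNumber INT, -- first row number of the page (1-based)
    @TakeNumber INT   -- number of rows per page
)
AS
SELECT TOP(@TakeNumber)
    *
FROM(
    -- number the rows in a fixed order so every call sees the same numbering
    SELECT
        RowNumber=ROW_NUMBER() OVER(ORDER BY IDField DESC),
        NameField
    FROM
        TableName
)AS X
WHERE RowNumber>=@StartNumber
ORDER BY RowNumber -- without this, TOP may return arbitrary rows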
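
To make the paging logic concrete, here is a minimal Python sketch of how a caller might walk through 811 rows five at a time using a start number and a take number. It is an in-memory stand-in under stated assumptions, not the stored procedure itself: take_page and the generated server names are hypothetical, and the rows are ordered ascending by ServerId (the procedure above orders descending) so that the first page matches the sample in the question.

```python
def take_page(rows, start_number, take_number):
    """Return up to take_number rows whose 1-based row number is >= start_number."""
    ordered = sorted(rows, key=lambda r: r["ServerId"])  # fixed sort order
    first = start_number - 1
    return ordered[first:first + take_number]


# Invented sample data: 811 rows, as in the question.
servers = [{"ServerId": i, "ServerName": f"Server{i}"} for i in range(1, 812)]

batches = []
start = 1
while True:
    batch = take_page(servers, start, 5)
    if not batch:
        break  # no rows left
    batches.append(batch)  # in SSIS, this is where the variable would be overwritten
    start += 5

print(len(batches))      # 163
print(len(batches[-1]))  # 1
```

The last page is shorter than the others because 811 is not a multiple of 5, so whatever consumes the variable has to cope with a partial batch.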

Loop 10 records at a time and assign it to variable

I have a table of 900 records.
I want to get 10 records at a time and assign it to variable.
Next time when I run the for each loop task in SSIS,
it will loop another 10 records and overwrite the variable.
Any help will be highly appreciated.
I have table like this for e.g
EMPID
0001
00045
00067
00556
00078
00345
00002
00004
00005
00006
00007
00008
This is what I have tried: an Execute SQL Task pulls the 900 records into a variable, the Execute SQL Task is connected to a Foreach Loop, and inside the Foreach Loop there is a Data Flow Task whose source is a SQL query and whose destination is a table.
select * from Dbo.JPKGD0__STP
where EMPID in ?
But this passes one EMPID per iteration, and I want to pass 10 EMPIDs each time.
Please let me know if I need to use a different approach or other tasks to achieve this.
Step (1) - Create variables
You have to create three variables, two of type int and one of type string:
@[User::RowCount] >> type int
@[User::Counter] >> type int
@[User::strQuery] >> type string
Assign the following expression to @[User::strQuery]:
"SELECT EMPID
FROM Dbo.JPKGD0__STP
ORDER BY EMPID ASC
OFFSET " + (DT_WSTR,50)@[User::Counter] + " ROWS
FETCH NEXT 10 ROWS ONLY "
Step (2) - Get Row Count
First, add an Execute SQL Task with the following command:
SELECT Count(*) FROM Dbo.JPKGD0__STP;
And store the result in the @[User::RowCount] variable (see the references below for more information).
Step (3) - For Loop Container
Now, add a For Loop Container with the following expressions:
InitExpression: @[User::Counter] = 0
EvalExpression: @[User::Counter] < @[User::RowCount]
AssignExpression: @[User::Counter] = @[User::Counter] + 10
Inside the For Loop Container, add a Data Flow Task with an OLE DB Source and a destination. In the OLE DB Source, select the Access Mode "SQL command from variable" and select @[User::strQuery] as the source.
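
The three For Loop expressions and the query expression can be traced with a short Python sketch. It only mimics the loop arithmetic and the string concatenation; build_query is a hypothetical helper, and the row count of 900 is taken from the question rather than read from a database.

```python
def build_query(counter):
    # Mirrors the strQuery expression: the counter becomes the OFFSET value.
    return (
        "SELECT EMPID FROM Dbo.JPKGD0__STP "
        "ORDER BY EMPID ASC "
        f"OFFSET {counter} ROWS FETCH NEXT 10 ROWS ONLY"
    )


row_count = 900  # would come from the Execute SQL Task in Step (2)
queries = []

counter = 0                    # InitExpression
while counter < row_count:     # EvalExpression
    queries.append(build_query(counter))
    counter += 10              # AssignExpression

print(len(queries))  # 90
print(queries[-1])
```

Each iteration re-evaluates the query string with the current counter, so the Data Flow Task reads rows 1 to 10, then 11 to 20, and so on, with the final iteration starting at offset 890.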
References
Row Offset in SQL Server
SQL Server OFFSET FETCH
SSIS Basics: Using the Execute SQL Task to Generate Result Sets
ORDER BY Clause (Transact-SQL)

Change random assignment to cyclical assignment SQL

EXAMPLE:
The issue is that I have, for example, 5 people to solve 100 cases, and the assignment has to be fair. I think SQL should be able to use a loop to assign the first cases to the first 5 people, but then it has to go back, count again, and continue assigning whenever a new case arrives.
I have two tables with the following fields:
Technicians
ID_TEC-----NOM_TEC-----LINEA_TEC
and another with cases:
ID_CASE----DESCRIPTION_CASE
The problem arises because I have to assign cases to each technician. The assignment must be cyclic, that is:
CASE1 TECH1
CASE2 TECH2
CASE3 TECH3
CASE4 TECH1 ...
When new data is loaded into the table and the stored procedure or the job that assigns cases is rerun, it should go back to the table, re-count the values and continue assigning from the last assigned TECn. I hope the description is clearer!
You can assign the numbers 1-5 randomly to the tasks by doing:
select t.*,
(1 + row_number() over (order by newid()) % 5) as user_assignment
from t
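
The query above spreads the cases randomly. For the cyclic order the question asks for, including resuming after the last assigned technician, the logic can be sketched in Python as below. This is a sketch of the idea only, not a SQL solution: assign_cyclically and the CASE/TECH identifiers are hypothetical, and it assumes the caller knows which technician received the most recent case.

```python
def assign_cyclically(case_ids, tech_ids, last_tech=None):
    """Assign cases to technicians round-robin, resuming after last_tech."""
    # Start with the technician after the last one used, or with the first one.
    start = tech_ids.index(last_tech) + 1 if last_tech in tech_ids else 0
    return {
        case: tech_ids[(start + i) % len(tech_ids)]
        for i, case in enumerate(case_ids)
    }


techs = ["TECH1", "TECH2", "TECH3"]

first_run = assign_cyclically(["CASE1", "CASE2", "CASE3", "CASE4"], techs)
print(first_run)
# {'CASE1': 'TECH1', 'CASE2': 'TECH2', 'CASE3': 'TECH3', 'CASE4': 'TECH1'}

# New cases arrive later; continue after the technician who got CASE4.
second_run = assign_cyclically(["CASE5", "CASE6"], techs, last_tech=first_run["CASE4"])
print(second_run)
# {'CASE5': 'TECH2', 'CASE6': 'TECH3'}
```

In SQL the same effect would typically come from a deterministic ordering of the cases plus a modulo over the technician count, with the offset taken from the last assignment.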

What is the best way to reassign ordinal number of a move operation

I have a column in SQL Server called "Ordinal" that indicates the display order of the rows. It starts at 0 and increases by 10 for each following row, so we have something like this:
Id Ordinal
1 0
2 20
3 10
It skips 10 because we wanted to be able to move an item between two other items (based on ordinal) without having to reassign the ordinal numbers for the entire table.
As you can imagine, the ordinal numbers will eventually have to be reassigned for such a move, either on the surrounding rows or for the entire table, once the unused ordinal numbers between the target items are used up.
Is there an algorithm I can use to reorder the ordinal numbers effectively for the move operation, taking into consideration the long-term maintainability of the table and minimizing the number of update operations?
You can re-number the sequences using a somewhat complicated UPDATE statement:
UPDATE u
SET u.sequence = 10 * (c.num_below-1)
FROM test u
JOIN (
SELECT t.id, count(*) AS num_below
FROM test t
JOIN test tr ON tr.sequence <= t.sequence
GROUP BY t.id
) c ON c.id=u.id
The idea is to count the items whose sequence is lower than that of the current row, multiply the count by ten, and assign the result as the new sequence value.
The content of test before the UPDATE:
ID Sequence
__ ________
1 0
2 20
3 10
4 12
The content of test after the UPDATE:
ID Sequence
__ ________
1 0
2 30
3 10
4 20
Now the sequence numbers are evenly spread again, so you can continue inserting in the middle until you run out of new sequence numbers; then you can re-number again.
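
The renumbering idea can also be expressed as a short Python sketch: rank the rows by their current sequence and give each one ten times its rank. The function renumber and the sample values are illustrative assumptions, and the sketch assumes that sequence values are unique.

```python
def renumber(sequences, step=10):
    """Map each id to step * (number of rows with a lower sequence)."""
    ordered = sorted(sequences, key=lambda row_id: sequences[row_id])
    return {row_id: step * position for position, row_id in enumerate(ordered)}


# Invented sample: row 4 was squeezed in between rows 3 and 2.
before = {1: 0, 2: 20, 3: 10, 4: 12}
after = renumber(before)

print(after)  # {1: 0, 3: 10, 4: 20, 2: 30}
```

The relative order of the rows is unchanged; only the gaps between them are reset to the full step.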
These won't answer your question directly--I just thought I might suggest some other approaches:
One possibility--don't try to do it by hand. Have your software manage the numbers. If they need re-writing, just save them with new numbers.
A second--use a "Linked List" instead. In each record store the index of the next record you want displayed, then have your code load that directly into a linked list.
Yet another simple approach: let's say you're inserting a new record with an ordinal equal to x.
First, check whether there's a row with an ordinal value equal to x. If there is one, update all the records whose ordinal value is equal to or greater than x, increasing them by y. Then you are safe to insert the new record.
This way you don't run an update on every insert, and of course you keep the order.
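
A minimal Python sketch of this shift-on-collision approach follows. The function insert_with_shift, the dictionary layout (id mapped to ordinal) and the default y of 10 are assumptions made for illustration.

```python
def insert_with_shift(ordinals, new_id, x, y=10):
    """Insert new_id at ordinal x, shifting rows at or above x by y on a collision."""
    taken = x in ordinals.values()
    # Only shift when x is already in use; otherwise existing rows stay untouched.
    result = {
        row_id: (ordinal + y if taken and ordinal >= x else ordinal)
        for row_id, ordinal in ordinals.items()
    }
    result[new_id] = x
    return result


rows = {1: 0, 2: 20, 3: 10}

# Ordinal 10 is taken, so rows 3 and 2 move up by 10 before row 4 is inserted.
print(insert_with_shift(rows, new_id=4, x=10))  # {1: 0, 2: 30, 3: 20, 4: 10}

# Ordinal 15 is free, so nothing else changes.
print(insert_with_shift(rows, new_id=5, x=15))  # {1: 0, 2: 20, 3: 10, 5: 15}
```

The shift preserves the relative order of the existing rows because every row at or above x moves by the same amount.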