Calculate Rank Pattern without any order value - sql

My Data is like this -
You can check 3 columns, jil_equipment_id,req_group,operand.
Based on these 3 columns i have to generate a new "Patern" Column.
The patern column is a patern and starts from 2 and increases by 1 for each repeated combination of jil_equipment_id,req_group,operand.
The final data will look like this.
Please suggest me any possible approach. I am not able to use the RANK()/DENSE_RANK() Function on this.

You can use row_number(). You want to use the partition by as well:
select t.*,
(1 + row_number() over (partition by jil_equipment_id, req_group, operand
order by content_id
)
) as pattern
from t;

select *,Row_Number() over(partition by jil_equipment_id,req_group,operand order by jil_equipment_id,req_group,operand) + 1 as pattern
from tab
you can use row_number() function for this.

Related

ROW_NUMBER function does not start from 1

I would like to ask about strange behaviour in SQL Server whilst using ROW_NUMBER() Function. Typically it should start from 1 and Order values by the selected column in Order By clause, which for the most scenarios works for me just as it is supposed to, but I have a particular case when I use a basic Select Statement:
SELECT
ROW_NUMBER() OVER (ORDER BY VIN) AS RN,
*
FROM dbo.RawData
and I get such result:
RN VIN
6301 JTEBR3FJ00K096082
6302 JTEBR3FJ00K096132
6303 JTEBR3FJ00K096146
6304 JTEBR3FJ00K096163
6305 JTEBR3FJ00K096180
6306 JTEBR3FJ00K096275
1801 5TDDZRFHX0S820530
1802 5TDDZRFHX0S824111
1803 5TDDZRFHX0S824500
1804 5TDDZRFHX0S825971
1805 5TDDZRFHX0S826456
and those are the first columns in the return table. The whole ROW_NUMBER function works randomly, after chain from 6301 to 6306, the chain from 1801 to 1940 starts etc.
The VIN column (the one I sort data based on) is set to nvarchar(17)
could you please help with solving the issue which might occur in this case?
I would be grateful for any tips what might be wrong
You can use ORDER BY to order the rows in a desired way:
SELECT ROW_NUMBER() OVER (ORDER BY VIN) AS RN
,*
FROM dbo.RawData
ORDER BY RN;
As the row_number is calculated in the SELECTE, you can use its value in the ORDER BY clause without the need of nested query.

Sequence generation based on a conditon using Informatica

I need to achieve following data transformation using Informatica,
The first picture is the sample input data
The data after transformations should look like as below,
Here for the different type I have the sequence staring from 200 in this case bfd should be considered as one group and (klm,kln) together as other group. The new id is id+the sequence number
Should I do this using UDF or by creating a procedure.. or using some sets of transformations..
I am new to informatica and am confused about what approach should I follow..
Thanks for the help in advance!
you can do this using two separate sequence transformation.
first one will start from 0 and increment by 1.
second one will start from 200 and increment by 1.
Then use an IIF condition to generate new_id column. if you have few different type of sequence, you can do it with IIF. But if you have hundreds, probably you need to use some UDF etc.
seq_1 = attach NEXTVAL from sequence generator 1
seq_2 = attach NEXTVAL from sequence generator 2
new_id = TO_INTEGER ( IIF( group ='bfd', id || seq_1,
IIF( group = 'klm' or group ='kln', id || seq_2)
)
)
You can write a query to do this:
select t.*,
(case when group = 'bfd'
then id * 10 + row_number() over (partition by group order by id)
when group in ('klm', 'kln')
then id * 100 + row_number() over (partition by case when group in ('klm', 'kln') then 1 else 2 end order by id)
end) as new_id
from t;

Count half of rest of a partition by from position

I'm trying to achieve the following results:
now, the group comes from
SUM(CASE WHEN seqnum <= (0.5 * seqnum_rev) THEN i.[P&L] END) OVER(PARTITION BY i.bracket_label ORDER BY i.event_id) AS [P&L 50%],
I need that in each iteration it counts the total of rows from the end till position (seq_inv) and sum the amounts in P&L only for the half of it from that position.
for example, when
seq = 2
seq_inv will be = 13, half of it is 6 so I need to sum the following 6 positions from seq = 2.
when seq = 4 there are 11 positions till the end (seq_inv = 11), so half is 5, so I want to count 5 positions from seq = 4.
I hope this makes sense, I'm trying to come up with a rule that will be able to adapt to the case I have, since the partition by is what gives me the numbers that need to be summed.
I was also thinking if there was something to do with a partition by top 50% or something like that, but I guess that doesn't exist.
I have the advantage that I've helped him before and have a little extra context.
That context is that this is just the later stage of a very long chain of common table expressions. That means self-joins and/or correlated sub-queries are unfortunately expensive.
Preferably, this should be answerable using window functions, as the data set is already available in the appropriate ordering and partitioning.
My reading is this...
The SUM(5:9) (meaning the sum of rows 5 to row 9, inclusive) is equal to SUM(5:end) - SUM(10:end)
That leads me to this...
WITH
cumulative AS
(
SELECT
*,
SUM([P&L]) OVER (PARTITION BY bracket_label ORDER BY event_id DESC) AS cumulative_p_and_l
FROM
data
)
SELECT
*,
cum_val - LEAD(cumulative_p_and_l, seq_inv/2, 0) OVER (PARTITION BY bracket_label ORDER BY event_id) AS p_and_l_50_perc,
cum_val - LEAD(cumulative_p_and_l, seq_inv/4, 0) OVER (PARTITION BY bracket_label ORDER BY event_id) AS p_and_l_25_perc,
FROM
cumulative
NOTE: Using , &, % in column names is horrendous, don't do it ;)
EDIT: Corrected the ORDER BY in the cumulative sum.
I don't think that window functions can do what you want. You could use a correlated subquery instead, with the following logic:
select
t.*,
(
select sum(t1.P&L]
from mytable t1
where t1.seq - t.seq between 0 and t.seq_inv/2
) [P&L 50%]
from mytable t

Sql Islands and Gaps Merge Contiguous records if relevant fields hold same values

I have created a test case here for my problem https://rextester.com/ZRXSQ14415
Its must each easier to show the problem to explain what I am trying to achieve.
I have a list of records across time and I wish to merge contiguous records into a single record.
Each record has a period Date, Risk Levels and a couple of flags. When these risks and flags are the same the records should be merged when they are different then they should be a separate row.
On the Rextester example, i have almost achieved my goal, however look at rows 3 + 4 of the result.
What I want to achieve is that rows 3 + 4 would be combined such that row 3
StartDate End Date Name ... ...
17.03.2019 20.03.2019 CPWJ40-A ... ...
As all flags and risk levels are the same.
Change the SEQ expression to
..
ROW_NUMBER() OVER (ORDER BY PeriodDate) - ROW_NUMBER() OVER (Partition BY ImplicitRisk,QCReadyRisk,IsQualityControlReady, ActivePeriod ORDER BY PeriodDate) AS SEQ
..
This way you'll get the proper grouping of islands of ImplicitRisk,QCReadyRisk,IsQualityControlReady, ActivePeriod.
This answer is purely to complement Serg answer with the full query.
SELECT MIN(d.PeriodDate) AS StartDate,
MAX(d.PeriodDate) AS EndDate,
ImplicitRisk,
QcReadyRisk,
IsQualityControlReady,
ActivePeriod,
LocationEventName
FROM
(
SELECT c.*,
ROW_NUMBER() OVER (ORDER BY PeriodDate) - ROW_NUMBER() OVER (Partition BY LocationEventId, ImplicitRisk, QCReadyRisk, IsQualityControlReady, ActivePeriod ORDER BY PeriodDate) AS grp
FROM tab c
--order by PeriodDate
) d
group by ImplicitRisk, QcReadyRisk, IsQualityControlReady, ActivePeriod, LocationEventName, grp
order by 1

How to implement lag function in teradata.

Input :
Output :
I want the output as shown in the image below.
In the output image, 4 in 'behind' is evaluated as tot_cnt-tot and the subsequent numbers in 'behind', for eg: 2 is evaluated as lag(behind)-tot & as long as the 'rank' remains same, even 'behind' should remain same.
Can anyone please help me implement this in teradata?
You appears to want :
select *, (select count(*)
from table t1
where t1.rank > t.rank
) as behind
from table t;
I would summarize the data and do:
select id, max(tot_cnt), max(tot),
(max(tot_cnt) -
sum(max(tot)) over (order by id rows between unbounded preceding and current row)
) as diff
from t
group by id;
This provides one row per id, which makes a lot more sense to me. If you want the original data rows (which are all duplicates anyway), you can join this back to your table.