Finding Latest First x among consecutive x from table - sql

I am trying to write a query that finds, for each group, the first date of the latest run of consecutive 1s, as shown below. For example, for Group 1 the answer shouldn't be 1/2/2022, because a later run of 1s starts on 1/6/2022. It shouldn't be 1/7/2022 either, because that is not the first date of that run.
Please let me know if you have any idea.
Thanks!
Table x (AsOfDate, Group_Id, Value)
AsOfDate Group_Id Value
1/1/2022 1 0
1/1/2022 2 1
1/2/2022 1 1
1/2/2022 2 1
1/3/2022 1 0
1/3/2022 2 0
1/4/2022 1 0
1/4/2022 2 0
1/5/2022 1 0
1/5/2022 2 1
1/6/2022 1 1
1/6/2022 2 0
1/7/2022 1 1
1/7/2022 2 0
Output
AsOfDate Group_Id
1/6/2022 1
1/5/2022 2

What you want is the earliest date of the last run of consecutive rows with Value = 1.
Use the LAG() window function to flag each row where Value changes, and a running sum() of those flags (grp) to number the runs of consecutive Value.
Use dense_rank() ordered by grp descending to find the latest run (r = 1).
Use min() to get the "first" AsOfDate of that run.
select  AsOfDate = min(AsOfDate),
        Group_Id
from
(
    select *, r = dense_rank() over (partition by Group_Id, Value
                                         order by grp desc)
    from
    (
        select *, grp = sum(g) over (partition by Group_Id order by AsOfDate)
        from
        (
            select *, g = case when Value <> lag(Value) over (partition by Group_Id
                                                                  order by AsOfDate)
                               then 1
                               else 0
                          end
            from x
        ) x
    ) x
) x
where Value = 1
and   r = 1
group by Group_Id
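The steps above can be checked against the sample data with a small script. This is only a sketch: it assumes Python's built-in sqlite3 module with an SQLite build that supports window functions (3.25 or later), stores the dates as ISO strings so that they sort correctly, and rewrites the T-SQL `alias = expr` syntax as `expr as alias`. The CTE names (flagged, runs, ranked) and the function name are made up for illustration.

```python
import sqlite3

# Portable rewrite of the query above: flag changes, number the runs,
# rank the runs per (Group_Id, Value), keep the latest run of 1s.
LATEST_RUN_SQL = """
with flagged as (
    select AsOfDate, Group_Id, Value,
           case when Value <> lag(Value) over (partition by Group_Id order by AsOfDate)
                then 1 else 0 end as g
    from x
),
runs as (
    select *, sum(g) over (partition by Group_Id order by AsOfDate) as grp
    from flagged
),
ranked as (
    select *, dense_rank() over (partition by Group_Id, Value order by grp desc) as r
    from runs
)
select min(AsOfDate) as AsOfDate, Group_Id
from ranked
where Value = 1 and r = 1
group by Group_Id
order by Group_Id
"""

def latest_run_start():
    """Load the sample rows from the question and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table x (AsOfDate text, Group_Id integer, Value integer)")
    group1 = [0, 1, 0, 0, 0, 1, 1]  # Value for Group 1 on 1/1 .. 1/7
    group2 = [1, 1, 0, 0, 1, 0, 0]  # Value for Group 2 on 1/1 .. 1/7
    rows = []
    for day, (v1, v2) in enumerate(zip(group1, group2), start=1):
        date = f"2022-01-{day:02d}"
        rows += [(date, 1, v1), (date, 2, v2)]
    conn.executemany("insert into x values (?, ?, ?)", rows)
    return conn.execute(LATEST_RUN_SQL).fetchall()

print(latest_run_start())
```

On the sample data this returns 2022-01-06 for Group 1 and 2022-01-05 for Group 2, matching the expected output in the question.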

Related

return row where column value changed from last change

I have a table, and I want to know the minimum date since the last change, grouped by two columns.
In the data below, I want the latest PartNumberID for each Location, with the minimum date since that value last changed.
*The ExpectedRow column is not part of the table.
DATA:
Location RecordAddedDate PartNumberID ExpectedRow
7 2022-06-23 1 I want this row
8 2022-06-23 1 I want this row
8 2022-06-24 1
8 2022-06-25 1
9 2022-06-23 1 I want this row
15 2022-06-23 1
15 2022-06-24 1
15 2022-06-25 2
15 2022-06-26 1 I want this row
15 2022-06-27 1
Expected output:
Location RecordAddedDate PartNumberID
7 2022-06-23 1
8 2022-06-23 1
9 2022-06-23 1
15 2022-06-26 1
I'm on SQL.
I have tried the query below, but I don't know how to stop when the value changes:
with cte as (
select t.LocationID, t.RecordAddedDate, t.PartNumberID
FROM mytable t
INNER JOIN (select PL.LocationID, PL.RecordAddedDate, PL.PartNumberID
FROM mytable PL INNER JOIN
(SELECT PSCc.LocationID, MAX(PSCc.RecordAddedDate) AS DateSetup
FROM mytable PSCc
WHERE PSCc.RecordDeleted = 0
GROUP BY PSCc.LocationID) AS PSCc ON PSCc.LocationID = PL.LocationID AND PSCc.DateSetup = RecordAddedDate) as tt on t.RecordAddedDate<=tt.RecordAddedDate and t.LocationID= tt.LocationID and t.PartNumberID= tt.PartNumberID
)
select *
from cte c
where not exists(
select 1 from cte
where cte.LocationID = c.LocationID
and cte.PartNumberID=c.PartNumberID
and cte.RecordAddedDate<c.RecordAddedDate
)
order by LocationID,RecordAddedDate
Thank you
Use lag() (ordered by RecordAddedDate desc) to find the last change in PartNumberID.
A cumulative sum, sum(isChange), puts related rows under the same group number; grp = 0 will be the rows since the last change.
To get the row with the minimum RecordAddedDate, use row_number().
with
cte1 as
(
    select *,
           isChange = case when PartNumberID
                              = isnull(lag(PartNumberID) over (partition by Location
                                                                   order by RecordAddedDate desc),
                                       PartNumberID)
                           then 0
                           else 1
                      end
    from mytable
),
cte2 as
(
    select *, grp = sum(isChange) over (partition by Location order by RecordAddedDate desc)
    from cte1
),
cte3 as
(
    select *, rn = row_number() over (partition by Location order by RecordAddedDate)
    from cte2 t
    where t.grp = 0
)
select *
from cte3 t
where t.rn = 1
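To see the three steps working on the question's data, here is a rough, runnable sketch. It assumes Python's sqlite3 module (SQLite 3.25 or later for window functions) rather than SQL Server, so isnull() becomes coalesce() and `alias = expr` becomes `expr as alias`; the function name is invented for the example.

```python
import sqlite3

# Walk backwards in time per Location: rows before the first change seen
# from the end get grp = 0, i.e. they are the rows since the last change.
LAST_CHANGE_SQL = """
with cte1 as (
    select *,
           case when PartNumberID = coalesce(
                         lag(PartNumberID) over (partition by Location
                                                 order by RecordAddedDate desc),
                         PartNumberID)
                then 0 else 1 end as isChange
    from mytable
),
cte2 as (
    select *, sum(isChange) over (partition by Location
                                  order by RecordAddedDate desc) as grp
    from cte1
),
cte3 as (
    select *, row_number() over (partition by Location
                                 order by RecordAddedDate) as rn
    from cte2
    where grp = 0
)
select Location, RecordAddedDate, PartNumberID
from cte3
where rn = 1
order by Location
"""

SAMPLE_ROWS = [
    (7, "2022-06-23", 1),
    (8, "2022-06-23", 1), (8, "2022-06-24", 1), (8, "2022-06-25", 1),
    (9, "2022-06-23", 1),
    (15, "2022-06-23", 1), (15, "2022-06-24", 1), (15, "2022-06-25", 2),
    (15, "2022-06-26", 1), (15, "2022-06-27", 1),
]

def first_row_since_last_change():
    """Load the question's sample rows and return one row per Location."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable (Location integer, RecordAddedDate text, PartNumberID integer)"
    )
    conn.executemany("insert into mytable values (?, ?, ?)", SAMPLE_ROWS)
    return conn.execute(LAST_CHANGE_SQL).fetchall()

print(first_row_since_last_change())
```

For Location 15 the rows dated 2022-06-27 and 2022-06-26 share grp = 0, so the earlier of the two (2022-06-26) is returned, as the question expects.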

Selecting top most row in Bigquery based on conditions

I have a huge table where one product ID sometimes has multiple specifications. I want to select the newest, but unfortunately I don't have date information. Please consider this example dataset:
Row ID Type Sn Sn_Ind
1 3 SLN SL20 20
2 1 SL SL 0
3 2 SL SL 0
4 1 M SL21 10
5 3 M SL21 10
6 1 SLN SL20 20
I used the query below to group the products and give them row numbers:
with cleanedMasterData as(
SELECT *
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Sn DESC, Sn_Ind DESC) AS rn
FROM `project.dataset.table`
)
-- where rn = 1
)
select * from cleanedMasterData
Please find below the example table after cleaning
Row ID Type Sn Sn_Ind rn
1 1 SL SL 0 1
2 1 M SL21 10 2
3 1 SLN SL20 20 3
4 2 SL SL 0 1
5 3 M SL21 10 1
6 3 SLN SL20 20 2
For IDs 2 and 3, I can easily select the top row with where rn = 1,
but for ID 1 my preferred row would be row 2, because that is the newest.
My question is: how do I prioritise a value in a column so that I get the desired result, like this:
Row ID Type Sn Sn_Ind rn
1 1 M SL21 10 1
2 2 SL SL 0 1
3 3 M SL21 10 1
The values in the Sn column are fixed (for example SL, SL20, SL19, SL21, etc.). Could I somehow give a weight to each of these values, create a new temporary column with that weight, and sort on it?
Thank you for your support in advance!!
Consider below
SELECT *
FROM `project.dataset.table`
WHERE TRUE
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY IF(Sn = 'SL', 0, 1) DESC, Sn DESC) = 1
If applied to the sample data in your question, the output is one row per ID: M, SL21, 10 for IDs 1 and 3, and SL, SL, 0 for ID 2.
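QUALIFY is not available in every engine, so the same prioritised ordering can also be sketched with a plain subquery. The example below is an assumption-laden illustration, not the BigQuery answer itself: it runs on Python's sqlite3 module (SQLite 3.25 or later), uses a hypothetical table name `products`, and replaces IF(...) with a CASE expression.

```python
import sqlite3

# Rows whose Sn is plain 'SL' get priority 0, everything else priority 1,
# so 'SL' is only picked when an ID has no other Sn value.
PRIORITY_SQL = """
select ID, Type, Sn, Sn_Ind
from (
    select *,
           row_number() over (
               partition by ID
               order by case when Sn = 'SL' then 0 else 1 end desc, Sn desc
           ) as rn
    from products
)
where rn = 1
order by ID
"""

SAMPLE_ROWS = [
    (3, "SLN", "SL20", 20),
    (1, "SL", "SL", 0),
    (2, "SL", "SL", 0),
    (1, "M", "SL21", 10),
    (3, "M", "SL21", 10),
    (1, "SLN", "SL20", 20),
]

def newest_per_id():
    """Load the question's sample rows and return the preferred row per ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table products (ID integer, Type text, Sn text, Sn_Ind integer)")
    conn.executemany("insert into products values (?, ?, ?, ?)", SAMPLE_ROWS)
    return conn.execute(PRIORITY_SQL).fetchall()

print(newest_per_id())
```

Sorting on the CASE expression first and on Sn second is the "weightage" idea from the question without a separate temporary column.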
It wasn't difficult, I tried a few things and it worked out. If anyone can optimize the below solution even more that would be awesome.
first the dataset
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 ID, 'SLN' Type, 'SL20' Sn, 20 Sn_Ind UNION ALL
SELECT 1 , 'SL' , 'SL' , 0 UNION ALL
SELECT 2 , 'SL' , 'SL' , 0 UNION ALL
SELECT 1 , 'M' , 'SL21' , 10 UNION ALL
SELECT 3 , 'M' , 'SL21' , 10 UNION ALL
SELECT 1 , 'SLN' , 'SL20' , 20
),
weightage as(
SELECT
*,
MAX(CASE Sn WHEN 'SL' THEN 0 ELSE 1 END) OVER (PARTITION BY Sn) AS weightt,
FROM
`project.dataset.table`
ORDER BY
weightt DESC, Sn DESC
), main as (
select * EXCEPT(rn, weightt)
from (
select * ,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY weightt DESC, Sn DESC) AS rn
from weightage )
where rn = 1
)
select * from main
After this, I get the desired result:
Row ID Type Sn Sn_Ind
1 1 M SL21 10
2 2 SL SL 0
3 3 M SL21 10

Is there a way to get first row of a group in postgres based on Max(date)

Input :
id name value1 value2 date
1 A 1 1 2019-01-01
1 A 2 2 2019-02-15
1 A 3 3 2019-01-15
1 A 1 1 2019-07-13
2 B 1 2 2019-01-01
2 B 1 3 2019-02-15
2 B 2 1 2019-07-13
3 C 2 4 2019-02-15
3 C 1 2 2019-01-01
3 C 1 9 2019-07-13
3 C 3 1 2019-02-15
Expected Output :
id name value1 value2 date
1 A 1 Avg(value2) 2019-07-13
2 B 2 Avg(value2) 2019-07-13
3 C 1 Avg(value2) 2019-07-13
You can use window functions. rank() over() can be used to identify the first record in each group, and avg() over() will give you a window average of value2 in each group:
select id, name, value1, avg_value2 value2, date
from (
select
t.*,
avg(value2) over(partition by id, name) avg_value2,
rank() over(partition by id, name order by date desc) rn
from mytable t
) t
where rn = 1
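As a quick check of the rank() plus avg() over() approach, the sketch below loads the question's input and runs essentially the same query. It assumes Python's sqlite3 module (SQLite 3.25 or later) instead of Postgres; the function name is invented, and the table name mytable is taken from the answer.

```python
import sqlite3

# rank() picks the latest row per (id, name); avg() over the same partition
# attaches the group-wide average of value2 to every row.
FIRST_ROW_SQL = """
select id, name, value1, avg_value2 as value2, date
from (
    select t.*,
           avg(value2) over (partition by id, name) as avg_value2,
           rank() over (partition by id, name order by date desc) as rn
    from mytable t
) t
where rn = 1
order by id
"""

SAMPLE_ROWS = [
    (1, "A", 1, 1, "2019-01-01"), (1, "A", 2, 2, "2019-02-15"),
    (1, "A", 3, 3, "2019-01-15"), (1, "A", 1, 1, "2019-07-13"),
    (2, "B", 1, 2, "2019-01-01"), (2, "B", 1, 3, "2019-02-15"),
    (2, "B", 2, 1, "2019-07-13"),
    (3, "C", 2, 4, "2019-02-15"), (3, "C", 1, 2, "2019-01-01"),
    (3, "C", 1, 9, "2019-07-13"), (3, "C", 3, 1, "2019-02-15"),
]

def latest_row_with_average():
    """Return the latest row per id with value2 replaced by its group average."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable (id integer, name text, value1 integer, value2 integer, date text)"
    )
    conn.executemany("insert into mytable values (?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn.execute(FIRST_ROW_SQL).fetchall()

print(latest_row_with_average())
```

Note that rank() returns more than one row per group if two rows share the latest date; row_number() would keep exactly one in that case.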
Sort your data in the right way, use the window function row_number() as an identifier, and select the first entry of every partition.
with temp_data as
(
select
row_number() over (partition by debug.tbl_data.id order by debug.tbl_data.date desc) as index,
*,
avg(debug.tbl_data.value2)over (partition by debug.tbl_data.id) as data_avg
from debug.tbl_data
order by id asc, debug.tbl_data.date desc
)
select
*
from temp_data
where index = 1
You seem to want the most common value of value1. In statistics, this is called the "mode". You can do this as:
select id, name,
mode() within group (order by value1) as value1_mode,
avg(value2),
max(date)
from t
group by id, name;

How to Generate Row number Partition by two column match in sql

Tbl1
---------------------------------------------------------
Id Date Qty ReOrder
---------------------------------------------------------
1 1-1-18 1 3
2 2-1-18 0 3
3 3-1-18 2 3
4 4-1-18 3 3
5 5-1-18 2 3
6 6-1-18 0 3
7 7-1-18 1 3
8 8-1-18 0 3
---------------------------------------------------------
I want the result like below
---------------------------------------------------------
Id Date Qty ReOrder
---------------------------------------------------------
1 1-1-18 1 3
5 5-1-18 2 3
---------------------------------------------------------
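Rows where ReOrder is not the same as Qty belong to the same group as the rows before them, up to and including the row where ReOrder = Qty; I want the first row of each group.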
You can use a cumulative approach with the row_number() function:
select top (1) with ties *
from (select *, max(case when qty = reorder then 'v' end) over (order by id desc) grp
      from Tbl1
     ) t
order by row_number() over(partition by grp order by id);
Unfortunately, top (1) with ties requires SQL Server, but you can also do:
select *
from (select *, row_number() over(partition by grp order by id) seq
      from (select *, max(case when qty = reorder then 'v' end) over (order by id desc) grp
            from Tbl1
           ) t
     ) t
where seq = 1;
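The second query can be tried on the sample table with the sketch below, which assumes Python's sqlite3 module (SQLite 3.25 or later) and a made-up function name. The cumulative max(), taken in descending Id order, marks every row up to and including the row where Qty = ReOrder with 'v' and leaves the later rows NULL, which gives the two groups.

```python
import sqlite3

# Rows 1-4 get grp = 'v' (the marker carries down from Id 4, where Qty = ReOrder),
# rows 5-8 get grp = NULL; row_number() then picks the first row of each group.
FIRST_OF_GROUP_SQL = """
select Id, Date, Qty, ReOrder
from (
    select *, row_number() over (partition by grp order by Id) as seq
    from (
        select *,
               max(case when Qty = ReOrder then 'v' end) over (order by Id desc) as grp
        from Tbl1
    ) t
) t
where seq = 1
order by Id
"""

def first_row_of_each_group():
    """Load Tbl1 from the question and return the first row of each group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Tbl1 (Id integer, Date text, Qty integer, ReOrder integer)")
    quantities = [1, 0, 2, 3, 2, 0, 1, 0]  # Qty for Id 1 .. 8; ReOrder is always 3
    rows = [(i, f"{i}-1-18", qty, 3) for i, qty in enumerate(quantities, start=1)]
    conn.executemany("insert into Tbl1 values (?, ?, ?, ?)", rows)
    return conn.execute(FIRST_OF_GROUP_SQL).fetchall()

print(first_row_of_each_group())
```

On the sample data this returns the rows with Id 1 and Id 5, as in the desired result.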

Select and aggregate last records base on order

I have different versions of the charges in a table. I want to grab and sum the last charge grouped by Type.
So I want to add 9.87, 9.63, 1.65.
I want the Parent ID , sum(9.87 + 9.63 + 1.65) as the results of this query.
We use MSSQL
ID ORDER CHARGES TYPE PARENT ID
1 1 6.45 1 1
2 2 1.25 1 1
3 3 9.87 1 1
4 1 6.54 2 1
5 2 5.64 2 1
6 3 0.84 2 1
7 4 9.63 2 1
8 1 7.33 3 1
9 2 5.65 3 1
10 3 8.65 3 1
11 4 5.14 3 1
12 5 1.65 3 1
WITH recordsList
AS
(
SELECT Type, Charges,
ROW_NUMBER() OVER (PArtition BY TYPE
ORDER BY [ORDER] DESC) rn
FROM tableName
)
SELECT SUM(Charges) totalCharge
FROM recordsList
WHERE rn = 1
Use row_number() to identify the rows to be summed, and then sum them:
select SUM(charges)
from (select t.*,
ROW_NUMBER() over (PARTITION by type order by id desc) as seqnum
from t
) t
where seqnum = 1
Alternatively you could use a window aggregate MAX():
SELECT SUM(Charges)
FROM (
SELECT
[ORDER],
Charges,
MaxOrder = MAX([ORDER]) OVER (PARTITION BY [TYPE])
FROM atable
) s
WHERE [ORDER] = MaxOrder
;
SELECT t.PARENT_ID, SUM(t.CHARGES)
FROM dbo.test73 t
WHERE EXISTS (
SELECT 1
FROM dbo.test73
WHERE [TYPE] = t.[TYPE]
HAVING MAX([ORDER]) = t.[ORDER]
)
GROUP BY t.PARENT_ID
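Since the question asks for the Parent ID together with the sum, here is a hedged sketch of the row_number() approach that also groups by the parent. It assumes Python's sqlite3 module (SQLite 3.25 or later), not MSSQL, and the table and column names (charges_tbl, ord, parent_id) are invented so that the reserved word ORDER does not need quoting.

```python
import sqlite3

# Keep the row with the highest ord per (parent_id, type), then sum per parent.
LAST_CHARGE_SQL = """
select parent_id, sum(charges) as total
from (
    select parent_id, charges,
           row_number() over (partition by parent_id, type order by ord desc) as rn
    from charges_tbl
)
where rn = 1
group by parent_id
"""

SAMPLE_ROWS = [
    (1, 1, 6.45, 1, 1), (2, 2, 1.25, 1, 1), (3, 3, 9.87, 1, 1),
    (4, 1, 6.54, 2, 1), (5, 2, 5.64, 2, 1), (6, 3, 0.84, 2, 1), (7, 4, 9.63, 2, 1),
    (8, 1, 7.33, 3, 1), (9, 2, 5.65, 3, 1), (10, 3, 8.65, 3, 1),
    (11, 4, 5.14, 3, 1), (12, 5, 1.65, 3, 1),
]

def last_charges_per_parent():
    """Return (parent_id, total of the last charge of each type)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table charges_tbl (id integer, ord integer, charges real, type integer, parent_id integer)"
    )
    conn.executemany("insert into charges_tbl values (?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn.execute(LAST_CHARGE_SQL).fetchall()

for parent_id, total in last_charges_per_parent():
    print(parent_id, round(total, 2))
```

Partitioning by parent_id as well as type keeps the last charges of different parents apart if the table holds more than one parent.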