Decode maximum number in rows for sql - sql

I am using the #standardsql in bigquery and trying to code the maksimum ranking of each customer_id as 1, and the rest of it are 0
This is the query result so far
The query for ranking is this
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY booking_date Asc) as ranking
What i need is to create another column like this where it decode the maximum ranking of each customerid as 1, and the number below it as 0 just like the below table
Thanks

Based on your sample data, your ranking is unstable, because you have multiple rows with the same key values. In any case, you can still do what you want without subqueries, just using case:
select t.*,
row_number() over (partition by customer_id order by booking_date asc) as ranking,
(case when row_number() over (partition by customer_id order by booking_date asc) =
count(*) over (partition by customer_id)
then 1 else 0
end) as custom_coded
from t;
A more traditional way of doing essentially the same thing would be to use a descending sort:
select t.*,
row_number() over (partition by customer_id order by booking_date asc) as ranking,
(case when row_number() over (partition by customer_id order by booking_date desc) = 1
then 1 else 0
end) as custom_coded
from t;

We can wrap your current query, and then use MAX as an analytic function with a partition by customer to compare each ranking value against the max ranking for each customer. When the ranking value equals the maximum value for a customer, then we assign 1 for the custom_coded, otherwise we assign 0.
SELECT
customer_id, item_bought, booking_date, ranking,
CASE WHEN ranking = MAX(ranking) OVER (PARTITION BY customer_id)
THEN 1 ELSE 0 END AS custom_coded
FROM
(
SELECT customer_id, item_bought, booking_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date) ranking
FROM yourTable
) t;

Related

Complex Ranking in SQL (Teradata)

I have a peculiar problem at hand. I need to rank in the following manner:
Each ID gets a new rank.
rank #1 is assigned to the ID with the lowest date. However, the subsequent dates for that particular ID can be higher but they will get the incremental rank w.r.t other IDs.
(E.g. ADF32 series will be considered to be ranked first as it had the lowest date, although it ends with dates 09-Nov, and RT659 starts with 13-Aug it will be ranked subsequently)
For a particular ID, if the days are consecutive then ranks are same, else they add by 1.
For a particular ID, ranks are given in date ASC.
How to formulate a query?
You need two steps:
select
id_col
,dt_col
,dense_rank()
over (order by min_dt, id_col, dt_col - rnk) as part_col
from
(
select
id_col
,dt_col
,min(dt_col)
over (partition by id_col) as min_dt
,rank()
over (partition by id_col
order by dt_col) as rnk
from tab
) as dt
dt_col - rnk caluclates the same result for consecutives dates -> same rank
Try datediff on lead/lag and then perform partitioned ranking
select t.ID_COL,t.dt_col,
rank() over(partition by t.ID_COL, t.date_diff order by t.dt_col desc) as rankk
from ( SELECT ID_COL,dt_col,
DATEDIFF(day, Lag(dt_col, 1) OVER(ORDER BY dt_col),dt_col) as date_diff FROM table1 ) t
One way to think about this problem is "when to add 1 to the rank". Well, that occurs when the previous value on a row with the same id_col differs by more than one day. Or when the row is the earliest day for an id.
This turns the problem into a cumulative sum:
select t.*,
sum(case when prev_dt_col = dt_col - 1 then 0 else 1
end) over
(order by min_dt_col, id_col, dt_col) as ranking
from (select t.*,
lag(dt_col) over (partition by id_col order by dt_col) as prev_dt_col,
min(dt_col) over (partition by id_col) as min_dt_col
from t
) t;

How to get records from table based on conditions

Select only latest amount, if null then before that.
table a
customer|amount|date
001|2 |20201101
001|null|20201102
001|3 |20201103
002|8.9 |20201101
002|7 |20201008
002|null|20201106
Result
001|null|20201101
001|null|20201102
001|3 |20201103
002|null|20201101
002|null|20201008
002|7 |20201106
amount data should be taken latest as per date , other record will be null, if amount is null for the latest date it should take the previous not null value.
My current attempt:
select top 1 [amount]
from table
where [amount] is not null
order by date desc
If you want to set all but the most recent value to NULL:
select customer_code, date,
(case when seqnum = 1 then amount end) as amount
from (select t.*,
row_number() over (partition by customer_code order by (amount is not null) desc, date desc) as seqnum
from table t
) t
where customer_code = '001'
order by date desc
Probably what you are looking for is a window function:
SELECT *
FROM (SELECT *,
row_number() over
(partition by customer
order by amount desc, date desc) as rn
FROM your_table
WHERE amount is not null)
WHERE rn = 1
You can use row_number or dense_rank depending on your needs
Create a view that returns all inserted values in descending order. Then select the first or second row according to the condition.

Gaps and Islands - How to Sum Each Group of Consecutive Rows by ID

Below is my current SQL code and output. I only need to get the sum of EFF_DAYS for consecutive (or single) rows where CD is equal to STG (highlighted in yellow).
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT) RN,
Z2.*
FROM (
SELECT CASE WHEN (LAG_CD IS NULL OR LAG_CD NOT IN ('STG')) AND CD IN ('STG')
THEN RANK() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT)
WHEN CD = LAG_CD AND CD IN ('STG')
THEN RANK() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT)
WHEN CD = LAG_CD AND CD != LEAD_CD
THEN RANK() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT)
END AS CASES,
Z.* FROM (
SELECT ID,
LAG(CD) OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT) AS LAG_CD,
LEAD(CD) OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT) AS LEAD_CD,
CD,
TMSP,
EFF_DT,
END_EFF_DT,
DATEDIFF(day, EFF_DT, END_EFF_DT) AS EFF_DAYS
FROM #POSTCHG_ROWS
WHERE ID IN ('ABC123', 'XYZ789')
) Z
) Z2 ORDER BY TMSP, EFF_DT
I've tried all kinds of row number and rank stuff, but I can't seem to get the CASES column correct. I've spent hours looking at other gap-island sql solutions, but haven't come across the exact scenario below.
Ideally my CASES column would be output like below so I can GROUP BY CASES, ID, the starting TMSP of the consecutive row block and then calculate: SUM(EFF_DAYS).
Below is my goal output:
You are only interested with series of adjacent "CTG" rows. I think that the simplest approach is a window count of non-"STG" values to define the groups, then filtering and aggregation:
select
id,
min(tmsp) tmsp,
min(eff_dt) eff_dt,
sum(datediff(day, eff_dt, end_eff_dt)) sum_eff_days
from (
select
p.*
sum(case when cd = 'STG' then 0 else 1 end)
over(partition by id order by tmsp) grp
from #postchg_rows p
) p
where cd = 'STG'
group by id, grp

Organizing SQL data based on date

I am trying to organize my SQL data based off of the dates from which the orders were made.
My data:
SELECT DISTINCT ORDER_NO, ITEM, VERSION_NO,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY ORDER_NO ORDER BY NOT_BEFORE_DATE
ASC) = 1
THEN 'what-if'
ELSE 'wh'
END) AS VERSION_NEW
,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY ORDER_NO ORDER BY
NOT_BEFORE_DATE ASC) = 2
THEN 'initial'
ELSE 'other'
END) AS VERSION
FROM FDT_MAPTOOL
WHERE ITEM IN (1032711)
;
My results:
I want my data to be ordered by PO# and the date it was created.
As you can see in my picture the First two line have the same ITEM and same PO (Order_No). I need the first two to say Initial on the side because they are the first two based on the dates. They were created first. Everything after should say other.
I am not sure if PL/SQL is needed for this?
Thank you!
Use a different analytic function so that more than one row can have the value of 1 e.g.
SELECT DISTINCT ORDER_NO, ITEM, VERSION_NO,
(CASE WHEN DENSE_RANK() OVER (PARTITION BY ORDER_NO ORDER BY NOT_BEFORE_DATE
ASC) = 1
THEN 'what-if'
ELSE 'wh'
END) AS VERSION_NEW
,
(CASE WHEN DENSE_RANK() OVER (PARTITION BY ORDER_NO ORDER BY
NOT_BEFORE_DATE ASC) = 1
THEN 'initial'
ELSE 'other'
END) AS VERSION
FROM FDT_MAPTOOL
WHERE ITEM IN (1032711)
;
Either rank() OR dense_rank() should work here instead of row_number()
nb: note sure if you really need "select distinct"

SQL Finding five largest numbers instead of one Max in a table

I have a table and I need to run a query that contains some aggregation Functions like Maximum , Average , Standard Deviation , ...
but instead of one Maximum I should return 5 largest number.
the simplified query is something like this:
SELECT OSI_KEY , MAX(VALUE) , AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
and I need some Magical ;) Query like this:
SELECT OSI_KEY , MAX1(VALUE) ,MAX2(VALUE) ,MAX3(VALUE) ,MAX4(VALUE) , MAX5(VALUE) ,
AVG(VALUE) , STDDEV(VALUE), variance(VALUE)
FROM DATA_VALUES_5MIN_6_2013
GROUP BY OSI_KEY
ORDER BY OSI_KEY
I appreciate your considerations.
Oracle has an NTH_VALUE() function. Unfortunately, it is only an analytic function and not a window function. This leads to the strange construct of SELECT DISTINCT with a bunch of analytic functions:
SELECT DISTINCT OSI_KEY,
MAX(VALUE) OVER (PARTITION BY OSI_KEY),
NTH_VALUE(VALUE, 2) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC) as MAX_2,
NTH_VALUE(VALUE, 3) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC) as MAX_3,
NTH_VALUE(VALUE, 4) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC) as MAX_4,
NTH_VALUE(VALUE, 5) OVER (PARTITION BY OSI_KEY ORDER BY VALUE DESC) as MAX_5,
AVG(VALUE) OVER (PARTITION BY OSI_KEY),
STDDEV(VALUE) OVER (PARTITION BY OSI_KEY),
variance(VALUE) OVER (PARTITION BY OSI_KEY)
FROM DATA_VALUES_5MIN_6_2013
ORDER BY OSI_KEY;
You can also do this using conditional aggregation, with a row_number() or dense_rank() in a subquery.
SELECT OSI_KEY, MaxValue FROM (
SELECT OSI_KEY, MAX(value) AS MaxValue FROM table GROUP BY OSI_KEY
)
ORDER BY MaxValue DESC
FETCH FIRST 5 ROWS ONLY;