How to increment the dense rank based on condition - sql

Here is my requirement.
I want the dense rank to increase after every 5 line items within each seller_state and warehouse_id partition. I have attached sample data to clarify the requirement; kindly help me with this.
The query below is what I have tried so far.
CASE
WHEN icta_amount < 0 THEN (DENSE_RANK() OVER (PARTITION BY seller_state ORDER BY seller_state,warehouse_id)) % 5
WHEN icta_amount >= 0 THEN (DENSE_RANK() OVER (PARTITION BY seller_state ORDER BY seller_state,warehouse_id))% 5
END AS DENSE_RANK,
If I add warehouse_id to the PARTITION BY clause in both places, I get only 1 for every row, and I don't understand why.
Thank you in advance.

I'd start with a row_number partitioned by the seller_state and warehouse_id, floor that into groups of five, and then dense_rank over it:
SELECT seller_state, warehouse_id,
DENSE_RANK() OVER (PARTITION BY seller_state, warehouse_id
ORDER BY seller_state, warehouse_id, FLOOR((rn - 1) / 5.0))
FROM (SELECT seller_state, warehouse_id,
ROW_NUMBER() OVER (PARTITION BY seller_state, warehouse_id) AS RN
FROM mytable) t

If you prefer to use dense_rank() then you'd want to use integer division to mark off the blocks:
with data as (
select seller_state, warehouse_id,
row_number() over (partition by seller_state, warehouse_id
order by seller_state, warehouse_id) as rn
from T
)
select seller_state, warehouse_id,
dense_rank() over (order by seller_state, warehouse_id, (rn - 1) / 5) as rnk
from data;
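As a rough way to check the integer-division approach above, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. The sample rows are invented for illustration, and the sketch assumes an SQLite build with window-function support (3.25 or later); other engines may treat `/` differently, so `(rn - 1) / 5` may need a floor or cast there.

```python
import sqlite3

# Hypothetical sample data: 7 rows for (KA, W1), 3 for (KA, W2), 6 for (MH, W1).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (seller_state TEXT, warehouse_id TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [("KA", "W1")] * 7 + [("KA", "W2")] * 3 + [("MH", "W1")] * 6,
)

query = """
WITH data AS (
    SELECT seller_state, warehouse_id,
           ROW_NUMBER() OVER (PARTITION BY seller_state, warehouse_id
                              ORDER BY seller_state, warehouse_id) AS rn
    FROM T
)
SELECT seller_state, warehouse_id, rn,
       -- integer division: 0 for rows 1-5 of a group, 1 for rows 6-10, ...
       DENSE_RANK() OVER (ORDER BY seller_state, warehouse_id, (rn - 1) / 5) AS rnk
FROM data
ORDER BY seller_state, warehouse_id, rn
"""
ranks = [row[3] for row in conn.execute(query)]
print(ranks)
```

With this data the rank steps after the fifth row of a group and again whenever the state or warehouse changes.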
You could also mark the spots where the counter should increase and then accumulate them. Number the rows within each state and warehouse, and wherever the row number mod 5 equals 1, mark that row as a step for the ranking counter. Because the row number restarts at 1 on every change of state or warehouse, the counter also steps there, as desired:
sum(case when rn % 5 = 1 then 1 end)
over (order by seller_state, warehouse_id, rn) as rnk
Some platforms let you omit the order by in row_number() when it would simply repeat the partition by columns, while others require it.
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d32dba8fccbfb8b85a7eb26f4c8f4849
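The running-sum variant can be sketched the same way. This is only an illustration on invented rows, run in SQLite via Python's sqlite3 (window functions assumed available, SQLite 3.25+), and it assumes the ordering columns are seller_state, warehouse_id and rn.

```python
import sqlite3

# Hypothetical sample data: 7 rows for (KA, W1), 3 for (KA, W2), 6 for (MH, W1).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (seller_state TEXT, warehouse_id TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [("KA", "W1")] * 7 + [("KA", "W2")] * 3 + [("MH", "W1")] * 6,
)

query = """
WITH data AS (
    SELECT seller_state, warehouse_id,
           ROW_NUMBER() OVER (PARTITION BY seller_state, warehouse_id
                              ORDER BY seller_state, warehouse_id) AS rn
    FROM T
)
SELECT seller_state, warehouse_id, rn,
       -- rows 1, 6, 11, ... of each group contribute 1; the rest contribute NULL
       SUM(CASE WHEN rn % 5 = 1 THEN 1 END)
           OVER (ORDER BY seller_state, warehouse_id, rn) AS rnk
FROM data
ORDER BY seller_state, warehouse_id, rn
"""
step_ranks = [row[3] for row in conn.execute(query)]
print(step_ranks)
```

Since (seller_state, warehouse_id, rn) is unique per row, the default window frame gives a plain running total here.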

Related

Complex Ranking in SQL (Teradata)

I have a peculiar problem at hand. I need to rank in the following manner:
Each ID gets a new rank.
rank #1 is assigned to the ID with the lowest date. However, the subsequent dates for that particular ID can be higher but they will get the incremental rank w.r.t other IDs.
(E.g. ADF32 series will be considered to be ranked first as it had the lowest date, although it ends with dates 09-Nov, and RT659 starts with 13-Aug it will be ranked subsequently)
For a particular ID, if the days are consecutive then ranks are same, else they add by 1.
For a particular ID, ranks are given in date ASC.
How to formulate a query?
You need two steps:
select
id_col
,dt_col
,dense_rank()
over (order by min_dt, id_col, dt_col - rnk) as part_col
from
(
select
id_col
,dt_col
,min(dt_col)
over (partition by id_col) as min_dt
,rank()
over (partition by id_col
order by dt_col) as rnk
from tab
) as dt
dt_col - rnk calculates the same result for consecutive dates, so they get the same rank.
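To see why subtracting the rank from the date groups consecutive days, here is a small sketch in SQLite via Python's sqlite3. It is not Teradata: the dates are invented (only the IDs ADF32 and RT659 come from the question), and because SQLite has no native date subtraction the sketch assumes an integer day number derived with julianday() stands in for dt_col.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id_col TEXT, dt_col TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [
        ("ADF32", "2020-08-01"), ("ADF32", "2020-08-02"), ("ADF32", "2020-11-09"),
        ("RT659", "2020-08-13"), ("RT659", "2020-08-14"), ("RT659", "2020-08-16"),
    ],
)

query = """
SELECT id_col, dt_col,
       -- day_no - rnk stays constant across a run of consecutive days
       DENSE_RANK() OVER (ORDER BY min_dt, id_col, day_no - rnk) AS part_col
FROM (
    SELECT id_col, dt_col,
           CAST(julianday(dt_col) AS INTEGER) AS day_no,
           MIN(dt_col) OVER (PARTITION BY id_col) AS min_dt,
           RANK() OVER (PARTITION BY id_col ORDER BY dt_col) AS rnk
    FROM tab
) AS dt
ORDER BY part_col, dt_col
"""
islands = [(row[0], row[1], row[2]) for row in conn.execute(query)]
for row in islands:
    print(row)
```

All ADF32 rows are ranked before RT659 because ADF32 has the earlier minimum date, even though its last date is later.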
Try datediff on lead/lag and then perform a partitioned ranking:
select t.ID_COL,t.dt_col,
rank() over(partition by t.ID_COL, t.date_diff order by t.dt_col desc) as rankk
from ( SELECT ID_COL,dt_col,
DATEDIFF(day, Lag(dt_col, 1) OVER(ORDER BY dt_col),dt_col) as date_diff FROM table1 ) t
One way to think about this problem is "when to add 1 to the rank". Well, that occurs when the previous value on a row with the same id_col differs by more than one day. Or when the row is the earliest day for an id.
This turns the problem into a cumulative sum:
select t.*,
sum(case when prev_dt_col = dt_col - 1 then 0 else 1
end) over
(order by min_dt_col, id_col, dt_col) as ranking
from (select t.*,
lag(dt_col) over (partition by id_col order by dt_col) as prev_dt_col,
min(dt_col) over (partition by id_col) as min_dt_col
from t
) t;
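The cumulative-sum formulation can be tried out on a toy table as well. This sketch again uses SQLite through Python's sqlite3 with invented dates, and assumes an integer day number (via julianday()) in place of native date arithmetic, so the comparison becomes prev_day = day_no - 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id_col TEXT, dt_col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("ADF32", "2020-08-01"), ("ADF32", "2020-08-02"), ("ADF32", "2020-11-09"),
        ("RT659", "2020-08-13"), ("RT659", "2020-08-14"), ("RT659", "2020-08-16"),
    ],
)

query = """
SELECT id_col, dt_col,
       -- add 1 unless the previous row of the same id is exactly one day earlier;
       -- the first row of an id has a NULL prev_day and therefore also adds 1
       SUM(CASE WHEN prev_day = day_no - 1 THEN 0 ELSE 1 END)
           OVER (ORDER BY min_dt_col, id_col, dt_col) AS ranking
FROM (
    SELECT id_col, dt_col,
           CAST(julianday(dt_col) AS INTEGER) AS day_no,
           LAG(CAST(julianday(dt_col) AS INTEGER))
               OVER (PARTITION BY id_col ORDER BY dt_col) AS prev_day,
           MIN(dt_col) OVER (PARTITION BY id_col) AS min_dt_col
    FROM t
) AS x
ORDER BY ranking, dt_col
"""
rankings = [row[2] for row in conn.execute(query)]
print(rankings)
```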

Hive/Spark Repeat Dense Rank N Times

I have a table and I need to repeat the rank/dense rank value n times. I've seen some posts where the numbering restarts by some partition, but in my case I do not have a column to partition by.
I am looking for something like this
This is how I have my code currently
WITH d_rank_tbl AS(
SELECT
id,
1+ (dense_rank() over (order by id) - 1) % 10 as d_rank
FROM id_bucket)
SELECT
id,
dense_rank() over (partition by d_rank order by rand())
FROM d_rank_tbl
How about arithmetic instead?
select t.*,
floor((row_number() over (order by id) + 2) / 3) as d_rank
from id_bucket;
The + 2 is so the numbering starts at 1 instead of 0.
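The same arithmetic generalizes from 3 to any repeat count n by adding n - 1 before dividing. The sketch below is only an illustration: it runs in SQLite via Python's sqlite3 rather than Hive or Spark, the ids are invented, and it relies on SQLite's integer division instead of floor().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_bucket (id INTEGER)")
conn.executemany("INSERT INTO id_bucket VALUES (?)", [(i,) for i in range(101, 109)])

n = 3  # how many consecutive rows share one rank value
query = """
SELECT id,
       (ROW_NUMBER() OVER (ORDER BY id) + ? - 1) / ? AS d_rank
FROM id_bucket
ORDER BY id
"""
d_ranks = [row[1] for row in conn.execute(query, (n, n))]
print(d_ranks)
```

With n = 3 the offset is the + 2 from the answer; row numbers 1 to 3 map to 1, 4 to 6 map to 2, and so on.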

I want to generate continuously number by 2 column and batch wise

I want to generate a continuous number for each combination of 2 columns, in batches of 5. Can anybody help me solve this?
An adaptation of @GordonLinoff's answer...
SELECT
name,
rank,
DENSE_RANK() OVER (ORDER BY name DESC, Rank, ((seqnum - 1) / 5)) AS rno
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY name, rank ORDER BY (SELECT null)) AS seqnum
FROM
yourTable
)
sequenced
ORDER BY
3
You can use row_number() and arithmetic:
select name, rank,
((seqnum - 1) / 5) + 1 as rno
from (select t.*,
row_number() over (partition by name, rank order by (select null)) as seqnum
from t
) t
order by seqnum;
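The two answers number the batches differently: the plain arithmetic restarts at 1 for every (name, rank) pair, while the DENSE_RANK version keeps counting across pairs. A minimal sketch of the difference, assuming SQLite via Python's sqlite3, invented rows, an ascending sort on name, and a window with no ORDER BY in place of ORDER BY (SELECT null):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourTable (name TEXT, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [("A", 1)] * 6 + [("A", 2)] * 2 + [("B", 1)] * 5,
)

query = """
SELECT name, "rank", seqnum,
       ((seqnum - 1) / 5) + 1 AS per_group_rno,
       DENSE_RANK() OVER (ORDER BY name, "rank", (seqnum - 1) / 5) AS continuous_rno
FROM (
    SELECT name, "rank",
           ROW_NUMBER() OVER (PARTITION BY name, "rank") AS seqnum
    FROM yourTable
) AS sequenced
ORDER BY name, "rank", seqnum
"""
rows = conn.execute(query).fetchall()
per_group = [r[3] for r in rows]
continuous = [r[4] for r in rows]
print(per_group)
print(continuous)
```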

Decode maximum number in rows for sql

I am using #standardsql in BigQuery and trying to code the maximum ranking of each customer_id as 1 and all the other rankings as 0.
This is the query result so far
The query for ranking is this
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY booking_date Asc) as ranking
What I need is another column that codes the maximum ranking of each customer_id as 1 and the rankings below it as 0, as in the table below.
Thanks
Based on your sample data, your ranking is unstable, because you have multiple rows with the same key values. In any case, you can still do what you want without subqueries, just using case:
select t.*,
row_number() over (partition by customer_id order by booking_date asc) as ranking,
(case when row_number() over (partition by customer_id order by booking_date asc) =
count(*) over (partition by customer_id)
then 1 else 0
end) as custom_coded
from t;
A more traditional way of doing essentially the same thing would be to use a descending sort:
select t.*,
row_number() over (partition by customer_id order by booking_date asc) as ranking,
(case when row_number() over (partition by customer_id order by booking_date desc) = 1
then 1 else 0
end) as custom_coded
from t;
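Both variants can be compared side by side on a toy table. The sketch below is an assumption-laden illustration: it uses SQLite via Python's sqlite3 instead of BigQuery, and the rows are invented with distinct booking dates per customer so the ranking is stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_id INTEGER, booking_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "2020-01-01"), (1, "2020-01-05"), (1, "2020-02-01"), (2, "2020-03-01")],
)

query = """
SELECT customer_id, booking_date,
       -- variant 1: ascending row number equals the row count of the customer
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date ASC)
                 = COUNT(*) OVER (PARTITION BY customer_id)
            THEN 1 ELSE 0 END AS by_count,
       -- variant 2: descending row number equals 1
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date DESC) = 1
            THEN 1 ELSE 0 END AS by_desc
FROM t
ORDER BY customer_id, booking_date
"""
rows = conn.execute(query).fetchall()
by_count = [r[2] for r in rows]
by_desc = [r[3] for r in rows]
print(by_count, by_desc)
```

If several rows of a customer share the same booking_date, the two variants may flag different rows, which is the instability mentioned above.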
We can wrap your current query, and then use MAX as an analytic function with a partition by customer to compare each ranking value against the max ranking for each customer. When the ranking value equals the maximum value for a customer, then we assign 1 for the custom_coded, otherwise we assign 0.
SELECT
customer_id, item_bought, booking_date, ranking,
CASE WHEN ranking = MAX(ranking) OVER (PARTITION BY customer_id)
THEN 1 ELSE 0 END AS custom_coded
FROM
(
SELECT customer_id, item_bought, booking_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date) ranking
FROM yourTable
) t;

How to group numbered rows into groups

In SQL Server I need to take repeating sets of row numbers and group those into segments or subgroups. I'm trying to achieve Column B using SQL. I've achieved Column A using the row_number() function, but I'm not sure how to get to Column B.
Here is the logic for row_number()
1 + ((row_number() over (order by TimeStamp, FileName, OrderID) - 1) % 5) AS [Row_Number]
Your row_number() is of the form:
row_number() over (partition by colA order by colB)
What you seem to want is:
dense_rank() over (order by colA)
That is, the partition key(s) used for the row_number() should be the order by keys for the dense_rank().
EDIT:
Your code is:
1 + ((row_number() over (order by TimeStamp, FileName, OrderID) - 1) % 5) AS [Row_Number]
In this case, there is no partition by. What you really want is simply integer division. This is easy:
1 + ((row_number() over (order by TimeStamp, FileName, OrderID) - 1) / 5) AS [Row_Number]
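The difference between the two expressions is easiest to see on plain integers. This short Python sketch (no database involved; the range of 12 row numbers is arbitrary) computes both for the same sequence, with `//` playing the role of SQL Server's integer `/`:

```python
# Row numbers as row_number() would produce them: 1, 2, ..., 12.
row_numbers = list(range(1, 13))

# Modulo gives the position inside each block of 5 (the repeating 1..5 column).
position_in_group = [1 + (rn - 1) % 5 for rn in row_numbers]

# Integer division gives the block number itself (the segment column).
group_number = [1 + (rn - 1) // 5 for rn in row_numbers]

for rn, pos, grp in zip(row_numbers, position_in_group, group_number):
    print(rn, pos, grp)
```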
I would go with a simple solution:
SELECT [Row_Number], GroupNumber
FROM (
SELECT [Row_Number]
, row_number() over (partition by [Row_Number] order by [Row_Number]) as GroupNumber
--Note: the order by clause above should be replaced with however you are ordering the groups of row numbers
FROM YourTableOrInlineView
) z
ORDER BY GroupNumber, [Row_Number]