Assign Ranks to records based on condition - sql

I have a table with following records:
=======================
| ID | device_num |
=======================
| 1 | 11 |
| 2 | 11 |
| 3 | NULL |
| 4 | 11 |
| 5 | 11 |
| 6 | NULL |
| 7 | 11 |
| 8 | 12 |
| 9 | 12 |
| 10 | 13 |
| 11 | NULL |
| 12 | 13 |
| 13 | 13 |
| 14 | 13 |
| 15 | 14 |
| 16 | 14 |
=======================
I want to assign a rank to each device number based on following cases:
1- Rank should be started with 1.
2- Each record should be compared with its previous record if both has the same device number then assign a same rank to records.
3- Do not assign rank to NULL records. In case if we get same device number after the Null record then rank should be increased by one.
4- If device number not matches with previous record then increase rank by 1.
Desired output:
===============================
| ID | RANK | device_num |
===============================
| 1 | 1 | 11 |
| 2 | 1 | 11 |
| 3 | | NULL |
| 4 | 2 | 11 |
| 5 | 2 | 11 |
| 6 | | NULL |
| 7 | 3 | 11 |
| 8 | 4 | 12 |
| 9 | 4 | 12 |
| 10 | 5 | 13 |
| 11 | | NULL |
| 12 | 6 | 13 |
| 13 | 6 | 13 |
| 14 | 6 | 13 |
| 15 | 7 | 14 |
| 16 | 7 | 14 |
===============================
I have tried with PostgreSQL Rank functions dense,rank etc but not able to assign rank in this order.

Try this:
SELECT id, device_num,
CASE
WHEN device_num IS NOT NULL
THEN DENSE_RANK() OVER (ORDER BY CASE
WHEN device_num IS NOT NULL
THEN min_id
END)
END AS RANK
FROM (
SELECT id, device_num,
MIN(id) OVER (PARTITION BY device_num, grp) AS min_id
FROM (
SELECT id, device_num,
ROW_NUMBER() OVER (ORDER BY id) -
ROW_NUMBER() OVER (PARTITION BY CASE
WHEN device_num IS NULL THEN -1
ELSE device_num
END
ORDER BY id) AS grp
FROM mytable) AS t) AS s
ORDER BY id
I've made the assumptions that:
there is an auto increment id field in your table that specifies row order.
device_num field is of integer type and -1 is not a valid value for the field.
Demo here

I write a query like this:
;WITH t1 AS (
SELECT *
, ROW_NUMBER() OVER (ORDER BY (SELECT null)) AS rn
FROM yourTable
) -- To add a row-number to yourTable
, t2 AS (
SELECT *
, CASE
WHEN device_num IS NULL THEN 0
WHEN ISNULL(LAG(device_num) OVER (ORDER BY rn), -1) <> ISNULL(device_num, -1) THEN 1
ELSE 0
END AS willChange
FROM t1
)
SELECT device_num
, CASE
WHEN device_num IS NULL THEN null
ELSE SUM(willChange) OVER (ORDER BY rn)
END [RANK]
FROM t2;
EDIT : By using ID: it can be like this:
;WITH t AS (
SELECT *
, CASE
WHEN device_num IS NULL THEN 0
WHEN ISNULL(LAG(device_num) OVER (ORDER BY ID), -1) <> ISNULL(device_num, -1) THEN 1
ELSE 0
END AS willChange
FROM yourTable
)
SELECT device_num
, CASE
WHEN device_num IS NULL THEN null
ELSE SUM(willChange) OVER (ORDER BY ID)
END [RANK]
FROM t;

Related

Compute a Decreasing Cumulative Sum in SQL Server

What I'm trying to achieve is a cumulative sum on a non-negative column where it decreases by 1 on every row, however the result must also be non-negative.
For example, for the following table, summing over the "VALUE" column ordered by the "ID" column:
| ID | VALUE |
-----------------
| 1 | 0 |
| 2 | 0 |
| 3 | 2 |
| 4 | 0 |
| 5 | 0 |
| 6 | 3 |
| 7 | 0 |
| 8 | 2 |
| 9 | 0 |
| 10 | 0 |
| 11 | 0 |
| 12 | 0 |
I expect:
| ID | VALUE | SUM |
-------------------------
| 1 | 0 | 0 |
| 2 | 0 | 0 |
| 3 | 2 | 2 |
| 4 | 0 | 1 |
| 5 | 0 | 0 |
| 6 | 3 | 3 |
| 7 | 0 | 2 |
| 8 | 2 | 3 |
| 9 | 0 | 2 |
| 10 | 0 | 1 |
| 11 | 0 | 0 |
| 12 | 0 | 0 |
Your question is not very well described. My best interpretation is that you want to count down from positive value, starting over when you hit the next one.
You can define the groups with a cumulative sum of the non-zero values. Then use a cumulative sum on the groups:
select t.*,
(case when max(value) over (partition by grp) < row_number() over (partition by grp order by id) - 1
then 0
else (max(value) over (partition by grp) -
(row_number() over (partition by grp order by id) - 1)
)
end) as my_value
from (select t.*,
sum(case when value <> 0 then 1 else 0 end) over (order by id) as grp
from t
) t
Here is a db<>fiddle.
EDIT:
It strikes me that you might want to keep all the "positive" values and count down -- remembering if they don't go down to zero. Alas, I think the simplest method is a recursive CTE in this case:
with tn as (
select t.id, t.value, row_number() over (order by id) as seqnum
from t
),
cte as (
select tn.id, tn.value, tn.seqnum, tn.value as s
from tn
where id = 1
union all
select tn.id, tn.value, tn.seqnum,
(case when cte.s = 0
then tn.value
when tn.value = 0 and cte.s > 0
then cte.s - 1
-- when tn.value > 0 and cte.value > 0
else tn.value + cte.s - 1
end)
from cte join
tn
on tn.seqnum = cte.seqnum + 1
)
select *
from cte;
The db<>fiddle has both solutions.

SQL group by changing column

Suppose I have a table sorted by date as so:
+-------------+--------+
| DATE | VALUE |
+-------------+--------+
| 01-09-2020 | 5 |
| 01-15-2020 | 5 |
| 01-17-2020 | 5 |
| 02-03-2020 | 8 |
| 02-13-2020 | 8 |
| 02-20-2020 | 8 |
| 02-23-2020 | 5 |
| 02-25-2020 | 5 |
| 02-28-2020 | 3 |
| 03-13-2020 | 3 |
| 03-18-2020 | 3 |
+-------------+--------+
I want to group by changes in value within that given date range, and add a value that increments each time as an added column to denote that.
I have tried a number of different things, such as using the lag function:
SELECT value, value - lag(value) over (order by date) as count
GROUP BY value
In short, I want to take the table above and have it look like:
+-------------+--------+-------+
| DATE | VALUE | COUNT |
+-------------+--------+-------+
| 01-09-2020 | 5 | 1 |
| 01-15-2020 | 5 | 1 |
| 01-17-2020 | 5 | 1 |
| 02-03-2020 | 8 | 2 |
| 02-13-2020 | 8 | 2 |
| 02-20-2020 | 8 | 2 |
| 02-23-2020 | 5 | 3 |
| 02-25-2020 | 5 | 3 |
| 02-28-2020 | 3 | 4 |
| 03-13-2020 | 3 | 4 |
| 03-18-2020 | 3 | 4 |
+-------------+--------+-------+
I want to eventually have it all in one small table with the earliest date for each.
+-------------+--------+-------+
| DATE | VALUE | COUNT |
+-------------+--------+-------+
| 01-09-2020 | 5 | 1 |
| 02-03-2020 | 8 | 2 |
| 02-23-2020 | 5 | 3 |
| 02-28-2020 | 3 | 4 |
+-------------+--------+-------+
Any help would be very appreciated
you can use a combination of Row_number and Dense_rank functions to get the required results like below:
;with cte
as
(
select t.DATE,t.VALUE
,Dense_rank() over(partition by t.VALUE order by t.DATE) as d_rank
,Row_number() over(partition by t.VALUE order by t.DATE) as r_num
from table t
)
Select t.Date,t.Value,d_rank as count
from cte
where r_num = 1
You can use a lag and cumulative sum and a subquery:
SELECT value,
SUM(CASE WHEN prev_value = value THEN 0 ELSE 1 END) OVER (ORDER BY date)
FROM (SELECT t.*, LAG(value) OVER (ORDER BY date) as prev_value
FROM t
) t
Here is a db<>fiddle.
You can recursively use lag() and then row_number() analytic functions :
WITH t2 AS
(
SELECT LAG(value,1,value-1) OVER (ORDER BY date) as lg,
t.*
FROM t
)
SELECT t2.date,t2.value, ROW_NUMBER() OVER (ORDER BY t2.date) as count
FROM t2
WHERE value - lg != 0
Demo
and filter through inequalities among the returned values from those functions.

How to move all non-null values to the top of my column?

I have the following data in my table:
| Id | lIST_1 |
----------------------
| 1 | NULL |
| 2 | JASON |
| 3 | NULL |
| 4 | BANDORAN |
| 5 | NULL |
| 6 | NULL |
| 7 | SMITH |
| 8 | NULL |
How can I write a query to get the output below?
| Id | lIST_1
-----------------------
| 1 | JASON |
| 2 | BANDORAN |
| 3 | SMITH |
| 4 | NULL |
| 5 | NULL |
| 6 | NULL |
| 7 | NULL |
| 8 | NULL |
You can use order by:
select row_number() over (order by (select null)) as id, t.list_1
from t
order by (case when list_1 is not null then 1 else 2 end)
It is unclear why you would want id to change values, but you can use row_number() for that.
EDIT:
If you want to change the id, then you can do:
with toupdate as (
select row_number() over (order by (case when list_id is not null then 1 else 2 end), id
) as new_id,
t.*
from t
)
update toupdate
set id = new_id
where id <> new_id; -- no need to update if the value remains the same

How to sum rows before a condition is met in SQL

I have a table which has multiple records for the same id. Looks like this, and the rows are sorted by sequence number.
+----+--------+----------+----------+
| id | result | duration | sequence |
+----+--------+----------+----------+
| 1 | 12 | 7254 | 1 |
+----+--------+----------+----------+
| 1 | 12 | 2333 | 2 |
+----+--------+----------+----------+
| 1 | 11 | 1000 | 3 |
+----+--------+----------+----------+
| 1 | 6 | 5 | 4 |
+----+--------+----------+----------+
| 1 | 3 | 20 | 5 |
+----+--------+----------+----------+
| 2 | 1 | 230 | 1 |
+----+--------+----------+----------+
| 2 | 9 | 10 | 2 |
+----+--------+----------+----------+
| 2 | 6 | 0 | 3 |
+----+--------+----------+----------+
| 2 | 1 | 5 | 4 |
+----+--------+----------+----------+
| 2 | 12 | 3 | 5 |
+----+--------+----------+----------+
E.g. for id=1, i would like to sum the duration for all the rows before and include result=6, which is 7254+2333+1000+5. Same for id =2, it would be 230+10+0. Anything after the row where result=6 will be left out.
My expected output:
+----+----------+
| id | duration |
+----+----------+
| 1 | 10592 |
+----+----------+
| 2 | 240 |
+----+----------+
The sequence has to be in ascending order.
I'm not sure how I can do this in sql.
Thank you in advance!
I think you want:
select t2.id, sum(t2.duration)
from t
where t.sequence <= (select t2.sequence
from t t2
where t2.id = t.id and t2.result = 6
);
In PrestoDB, I would recommend window functions:
select id, sum(duration)
from (select t.*,
min(case when result = 6 then sequence end) over (partition by id) as sequence_6
from t
) t
where sequence <= sequence_6;
You can use a simple aggregate query with a condition that uses a subquery to recover the sequence corresponding to the record whose sequence is 6 :
SELECT t.id, SUM(t.duration) total_duration
FROM mytable t
WHERE t.sequence <= (
SELECT sequence
FROM mytable
WHERE id = t.id AND result = 6
)
GROUP BY t.id
This demo on DB Fiddle with your test data returns :
| id | total_duration |
| --- | -------------- |
| 1 | 10592 |
| 2 | 240 |
Basic group by query should solve your issue
select
id,
sum(duration) duration
from t
group by id
for the certain rows:
select
id,
sum(duration) duration
from t
where id = 1
group by id
if you want to include it in your result set
select id, duration, sequence from t
union all
select
id,
sum(duration) duration
null sequence
from t
group by id

Sum all sub group last value by group

Consider the following table:
ID | ITEM | GROUP_ID | VAL | COST
---+------+----------+-----------+-------
1 | A | 1 | 1 | 12
2 | B | 1 | 2 | 12
3 | C | 1 | 3 | 12
4 | D | 1 | 4 | 13
5 | D | 1 | 5 | 12
6 | E | 2 | 1 | 17
7 | E | 2 | 2 | 10
8 | E | 2 | 3 | 11
9 | E | 2 | 4 | 12
10 | F | 2 | 5 | 15
11 | F | 2 | 6 | 13
12 | F | 2 | 7 | 11
13 | F | 2 | 8 | 12
how to get the result as follow:
GROUP_ID | VAL | COST
----------+-----------+-------
1 | 15 | 48
2 | 36 | 24
The val is the sum by group id.
The cost is the sum of last value by item.
Use analytic function ROW_NUMBER() on postgres, oracle or sql server
SqlFiddleDemo
WITH last_item as (
SELECT group_id, sum(cost) as sum_cost
FROM (
SELECT t.*,
ROW_NUMBER() over (partition by item order by id desc) as rn
FROM Table1 t
) as t
WHERE rn = 1
GROUP BY t.group_id
),
val_sum as (
SELECT t.group_id, SUM(val) as sum_val
FROM Table1 t
GROUP BY t.group_id
)
SELECT v.group_id, v.sum_val, l.sum_cost
FROM val_sum v
INNER JOIN last_item l
ON v.group_id = l.group_id
OUTPUT
| group_id | sum_val | sum_cost |
|----------|---------|----------|
| 1 | 15 | 48 |
| 2 | 36 | 24 |
Try this
WITH LastRow (id)
AS (
SELECT MAX(id)
FROM TheTable
GROUP BY item, group_id
)
SELECT group_Id, SUM(val), SUM(CASE WHEN B.id IS NULL THEN 0 ELSE cost END)
FROM TheTable A
LEFT OUTER JOIN LastRow B ON A.id = B.id
GROUP BY group_id
EDIT:
SQL Fiddle Demo
Thanks #Juan Carlos Oropeza for creating the SQL Fiddle test data