Oracle SQL conditional ranking

In my query I compute several kinds of ranking. For one of them, I want to rank a row only if a certain column is not null; otherwise the row should get no rank at all.
For example, here is a sample table with the desired ranks:
+------+------------+------------+--------+--------+
| col1 | col2 | col3 | rank 1 | rank 2 |
+------+------------+------------+--------+--------+
| a | 2018-01-20 | 2018-03-04 | 2 | 2 |
| a | 2018-01-24 | 2018-04-04 | 1 | 1 |
| b | 2018-01-02 | 2018-05-03 | 1 | 1 |
| c | 2017-01-02 | 2017-05-08 | 3 | 2 |
| d | 2016-05-24 | null | 1 | null |
| c | 2018-02-05 | 2018-05-03 | 2 | 1 |
| c | 2018-07-28 | null | 1 | null |
+------+------------+------------+--------+--------+
rank 1 is calculated correctly with partition by col1 order by col2 desc.
rank 2 should be calculated the same way, but only for rows where col3 is not null; otherwise it should be null.
How can I compute both ranks in a single query? I tried a CASE expression for rank 2, but the ranking then skips a value wherever col3 is null.

If I understand correctly, you can use CASE WHEN together with a SUM window function.
The inner CASE counts only the rows where col3 is not null, which gives a running count; the outer CASE returns NULL for the rows where col3 is null.
CREATE TABLE T(
col1 VARCHAR(5),
col2 DATE,
col3 DATE
);
INSERT INTO T VALUES ( 'a' , to_date('2018-01-20','YYYY-MM-DD') , to_date('2018-03-04','YYYY-MM-DD'));
INSERT INTO T VALUES ( 'a' , to_date('2018-01-24','YYYY-MM-DD') , to_date('2018-04-04','YYYY-MM-DD'));
INSERT INTO T VALUES ( 'b' , to_date('2018-01-02','YYYY-MM-DD') , to_date('2018-05-03','YYYY-MM-DD'));
INSERT INTO T VALUES ( 'c' , to_date('2017-01-02','YYYY-MM-DD') , to_date('2017-05-08','YYYY-MM-DD'));
INSERT INTO T VALUES ( 'd' , TO_DATE('2016-05-24','YYYY-MM-DD') , null);
INSERT INTO T VALUES ( 'c' , TO_DATE('2018-02-05','YYYY-MM-DD') , to_date('2018-05-03','YYYY-MM-DD'));
INSERT INTO T VALUES ( 'c' , TO_DATE('2018-07-28','YYYY-MM-DD') , null);
Query 1:
select t1.*,
rank() OVER(partition by col1 order by col2 desc) rank1,
(CASE WHEN COL3 IS NOT NULL THEN
SUM(CASE WHEN COL3 IS NOT NULL THEN 1 ELSE 0 END) OVER(partition by col1 order by col2 desc)
ELSE
NULL
END) rank2
FROM T t1
Results:
| COL1 | COL2 | COL3 | RANK1 | RANK2 |
|------|----------------------|----------------------|-------|--------|
| a | 2018-01-24T00:00:00Z | 2018-04-04T00:00:00Z | 1 | 1 |
| a | 2018-01-20T00:00:00Z | 2018-03-04T00:00:00Z | 2 | 2 |
| b | 2018-01-02T00:00:00Z | 2018-05-03T00:00:00Z | 1 | 1 |
| c | 2018-07-28T00:00:00Z | (null) | 1 | (null) |
| c | 2018-02-05T00:00:00Z | 2018-05-03T00:00:00Z | 2 | 1 |
| c | 2017-01-02T00:00:00Z | 2017-05-08T00:00:00Z | 3 | 2 |
| d | 2016-05-24T00:00:00Z | (null) | 1 | (null) |

I think you might want:
select count(col3) over (partition by col1 order by col2 desc)
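Note that this numbers rows like row_number() rather than rank(); for your data the two give the same result. For rows where col3 is null it returns the running count so far rather than NULL, so wrap it in a CASE if those rows must show NULL.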
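The two answers above can be combined and checked against the sample data. The following is a minimal sketch, not the answerers' exact code: it assumes SQLite (through Python's built-in sqlite3 module, which needs SQLite 3.25 or later for window functions) as a stand-in for Oracle, and it stores the dates as ISO text so that string order matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("a", "2018-01-20", "2018-03-04"),
    ("a", "2018-01-24", "2018-04-04"),
    ("b", "2018-01-02", "2018-05-03"),
    ("c", "2017-01-02", "2017-05-08"),
    ("d", "2016-05-24", None),
    ("c", "2018-02-05", "2018-05-03"),
    ("c", "2018-07-28", None),
])

# rank1: plain rank within each col1 group.
# rank2: running count of non-null col3 values, shown only on rows
#        where col3 itself is not null (CASE without ELSE yields NULL).
rows = conn.execute("""
    SELECT col1, col2,
           RANK() OVER (PARTITION BY col1 ORDER BY col2 DESC) AS rank1,
           CASE WHEN col3 IS NOT NULL
                THEN COUNT(col3) OVER (PARTITION BY col1 ORDER BY col2 DESC)
           END AS rank2
    FROM t
    ORDER BY col1, col2 DESC
""").fetchall()

for row in rows:
    print(row)
```

COUNT(col3) ignores nulls, so it does the same job as the SUM(CASE ...) form with less text; the outer CASE is what turns the value into NULL on the rows that should not be ranked.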

Related

Multiple "level" conditions on partition by SQL

I have to populate a Teradata table from another source, which can be simplified like this:
+------+------+------------+------------+
| Col1 | Col2 | Col3 | Col4 |
+------+------+------------+------------+
| 1234 | 0 | 01/01/2009 | 01/04/2019 |
| 1234 | 3 | 01/01/2010 | 01/05/2020 |
| 2345 | 1 | 20/02/2013 | 01/04/2019 |
| 2345 | 0 | 20/02/2013 | 01/04/2018 |
| 2345 | 2 | 31/01/2009 | 01/04/2017 |
| 3456 | 0 | 01/01/2009 | 01/04/2019 |
| 3456 | 1 | 01/01/2015 | 01/04/2019 |
| 3456 | 1 | 01/01/2015 | 01/05/2017 |
| 3456 | 3 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+
Col1 is duplicated in the source, so we have rules to select the right row (Col1 must be unique in the final result).
For each value in Col1:
If the value is duplicated, select the row with the most recent date in Col3.
If (and only if) it is still duplicated, select the row with Col2 = 1.
If it is still duplicated, select the row with the most recent date in Col4.
Given the previous table, we should get the following result:
+------+------+------------+------------+
| Col1 | Col2 | Col3 | Col4 |
+------+------+------------+------------+
| 1234 | 3 | 01/01/2010 | 01/05/2020 |
| 2345 | 1 | 20/02/2013 | 01/04/2019 |
| 3456 | 1 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+
I started using partition by to group the occurrences of each value in col 3, but I have no good idea how to apply these conditions to each partition in a SQL query.
Thank you for your help
You can use QUALIFY in Teradata to simplify the syntax:
SELECT col1, col2, col3, col4
FROM mytable
QUALIFY ROW_NUMBER() OVER(
PARTITION BY col1 -- Group rows by "col1" values
ORDER BY col3 DESC, CASE WHEN col2 = 1 THEN 1 ELSE 2 END, col4 DESC -- Order rows
) = 1 -- Get "first" row in each group
Otherwise, the logic is the same as in the row_number() answer.
You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by col1
order by col3 desc,
(case when col2 = 1 then 1 else 2 end),
col4 desc
) as seqnum
from t
) t
where seqnum = 1;
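To check the tie-breaking order against the sample data, here is a small sketch. It assumes SQLite through Python's sqlite3 module (3.25 or later) rather than Teradata, so it uses the subquery form instead of QUALIFY, and it rewrites the dates as ISO text (YYYY-MM-DD) so that string order matches date order. The table name src is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (col1 INTEGER, col2 INTEGER, col3 TEXT, col4 TEXT)")
conn.executemany("INSERT INTO src VALUES (?, ?, ?, ?)", [
    (1234, 0, "2009-01-01", "2019-04-01"),
    (1234, 3, "2010-01-01", "2020-05-01"),
    (2345, 1, "2013-02-20", "2019-04-01"),
    (2345, 0, "2013-02-20", "2018-04-01"),
    (2345, 2, "2009-01-31", "2017-04-01"),
    (3456, 0, "2009-01-01", "2019-04-01"),
    (3456, 1, "2015-01-01", "2019-04-01"),
    (3456, 1, "2015-01-01", "2017-05-01"),
    (3456, 3, "2015-01-01", "2019-04-01"),
])

# Each ORDER BY key is only consulted when the previous keys tie:
# newest col3 first, then rows with col2 = 1, then newest col4.
rows = conn.execute("""
    SELECT col1, col2, col3, col4
    FROM (SELECT s.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY col1
                     ORDER BY col3 DESC,
                              CASE WHEN col2 = 1 THEN 1 ELSE 2 END,
                              col4 DESC
                 ) AS seqnum
          FROM src s)
    WHERE seqnum = 1
    ORDER BY col1
""").fetchall()

for row in rows:
    print(row)
```

The "if and only if still duplicated" rules map directly onto the order of the sort keys, which is why no explicit branching per partition is needed.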

How to move all non-null values to the top of my column?

I have the following data in my table:
| Id | LIST_1 |
----------------------
| 1 | NULL |
| 2 | JASON |
| 3 | NULL |
| 4 | BANDORAN |
| 5 | NULL |
| 6 | NULL |
| 7 | SMITH |
| 8 | NULL |
How can I write a query to get the output below?
| Id | LIST_1 |
-----------------------
| 1 | JASON |
| 2 | BANDORAN |
| 3 | SMITH |
| 4 | NULL |
| 5 | NULL |
| 6 | NULL |
| 7 | NULL |
| 8 | NULL |
You can use order by:
select row_number() over (order by (case when list_1 is not null then 1 else 2 end), id) as id, t.list_1
from t
order by (case when list_1 is not null then 1 else 2 end), t.id
It is unclear why you would want id to change values, but you can use row_number() for that.
EDIT:
If you want to change the id, then you can do:
with toupdate as (
select row_number() over (order by (case when list_1 is not null then 1 else 2 end), id
) as new_id,
t.*
from t
)
update toupdate
set id = new_id
where id <> new_id; -- no need to update if the value remains the same
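The renumbering idea can be tried out on the sample data. This is a minimal sketch that assumes SQLite through Python's sqlite3 module (3.25 or later) instead of SQL Server, and it only selects the new numbering rather than updating the table; the table name people is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, list_1 TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [
    (1, None), (2, "JASON"), (3, None), (4, "BANDORAN"),
    (5, None), (6, None), (7, "SMITH"), (8, None),
])

# Non-null names sort first (key 1), nulls last (key 2);
# the original id breaks ties so the existing order is preserved.
rows = conn.execute("""
    SELECT ROW_NUMBER() OVER (
               ORDER BY CASE WHEN list_1 IS NOT NULL THEN 1 ELSE 2 END, id
           ) AS new_id,
           list_1
    FROM people
    ORDER BY new_id
""").fetchall()

for row in rows:
    print(row)
```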

Remove null values and merge rows (SQL Server 2008 R2)

I have a table (TestTable) as follows
PK | COL1 | COL2 | COL3
1 | 3 | NULL | NULL
2 | 3 | 43 | 1.5
3 | 4 | NULL | NULL
4 | 4 | NULL | NULL
5 | 4 | 48 | 10.5
6 | NULL | NULL | NULL
7 | NULL | NULL | NULL
8 | NULL | NULL | NULL
9 | 5 | NULL | NULL
10 | 5 | NULL | NULL
11 | 5 | 55 | 95
I would like a result as follows
PK | COL1 | COL2 | COL3
1 | 3 | 43 | 1.5
2 | 4 | 48 | 10.5
3 | 5 | 55 | 95
You can do this, but it won't give you a sequential number for the PK:
SELECT
PK,
MAX(Col1) AS Col1,
MAX(Col2) AS Col2,
MAX(Col3) AS Col3
FROM TestTable
WHERE Col1 IS NOT NULL
AND Col2 IS NOT NULL
AND COL3 IS NOT NULL
GROUP BY PK;
| PK | COL1 | COL2 | COL3 |
|----|------|------|------|
| 2 | 3 | 43 | 1.5 |
| 5 | 4 | 48 | 10.5 |
| 11 | 5 | 55 | 95 |
If you want to generate a rownumber for the column pk, you can do this:
WITH CTE
AS
(
SELECT
PK,
MAX(Col1) AS Col1,
MAX(Col2) AS Col2,
MAX(Col3) AS Col3
FROM TestTable
WHERE Col1 IS NOT NULL
AND Col2 IS NOT NULL
AND COL3 IS NOT NULL
GROUP BY PK
), Ranked
AS
(
SELECT *, ROW_NUMBER() OVER(ORDER BY PK) AS RN
FROM CTE
)
SELECT RN AS PK, Col1, COL2, COL3 FROM Ranked
SQL Fiddle Demo
This will give you:
| PK | COL1 | COL2 | COL3 |
|----|------|------|------|
| 1 | 3 | 43 | 1.5 |
| 2 | 4 | 48 | 10.5 |
| 3 | 5 | 55 | 95 |
This can be obtained in two steps like so:
1st step: Get rid of unnecessary rows:
delete from testTable
where Col1 is null
or Col2 is null
or Col3 is null
2nd step: Set the correct PK values using a CTE (update the test table):
;with sanitizeCTE
as(
select ROW_NUMBER() over (order by PK) as PK,
Col1, Col2, Col3
from testTable
)
update t
set t.PK = CTE.PK
from testTable t
join sanitizeCTE cte
on t.Col1 = cte.Col1
and t.Col2 = cte.Col2
and t.Col3 = cte.Col3
Tested here: http://sqlfiddle.com/#!3/91e86/1
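If the table does not need to be modified, the filtering and renumbering can also be done in one read-only query. The sketch below assumes SQLite through Python's sqlite3 module (3.25 or later) rather than SQL Server 2008 R2, and it assumes PK is unique, in which case the WHERE clause already leaves one complete row per PK and no GROUP BY is required.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (pk INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER, col3 REAL)"
)
conn.executemany("INSERT INTO test_table VALUES (?, ?, ?, ?)", [
    (1, 3, None, None), (2, 3, 43, 1.5),
    (3, 4, None, None), (4, 4, None, None), (5, 4, 48, 10.5),
    (6, None, None, None), (7, None, None, None), (8, None, None, None),
    (9, 5, None, None), (10, 5, None, None), (11, 5, 55, 95.0),
])

# Keep only fully populated rows, then number them 1..n in PK order.
rows = conn.execute("""
    SELECT ROW_NUMBER() OVER (ORDER BY pk) AS new_pk, col1, col2, col3
    FROM test_table
    WHERE col1 IS NOT NULL
      AND col2 IS NOT NULL
      AND col3 IS NOT NULL
    ORDER BY new_pk
""").fetchall()

for row in rows:
    print(row)
```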

SQL Select for unique value in a column

I have a table like
| ID | COL1 | COL2 |
| 1 | 1 | w |
| 1 | 2 | x |
| 2 | 1 | y |
| 2 | 2 | z |
When I query it, I'd like to get
| ID | COL2:1 | COL2:2 | <--- (when COL1=1 and COL1 =2)
| 1 | w | x |
| 2 | y | z |
I've tried GROUP BY and JOIN for the same table but I get duplicates and not grouped data. I need some pointers for how to get the results I'm expecting.
You can use MAX() with a CASE expression for this:
SELECT ID
,MAX(CASE WHEN Col1 = 1 THEN Col2 END) AS Col2_1
,MAX(CASE WHEN Col1 = 2 THEN Col2 END) AS Col2_2
FROM YourTable
GROUP BY ID
Demo: SQL Fiddle
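The conditional aggregation above can be run on the sample rows as follows. This is a small sketch that assumes SQLite through Python's sqlite3 module; the table name your_table mirrors the placeholder used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, col1 INTEGER, col2 TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", [
    (1, 1, "w"), (1, 2, "x"), (2, 1, "y"), (2, 2, "z"),
])

# Each CASE yields col2 for the matching col1 and NULL otherwise;
# MAX() ignores the NULLs, so one value per id survives in each column.
rows = conn.execute("""
    SELECT id,
           MAX(CASE WHEN col1 = 1 THEN col2 END) AS col2_1,
           MAX(CASE WHEN col1 = 2 THEN col2 END) AS col2_2
    FROM your_table
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

This only works cleanly when each (ID, COL1) pair occurs at most once; with duplicates, MAX() would silently pick the largest COL2 value.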

Add top row value to current row based on condition

My current tabular output is:
---------------------------------------------------
| id col1 col2 |
---------------------------------------------------
| 1 | test1 | 1 |
| 2 | test11 | 0 |
| 3 | test12 | 0 |
| 4 | test13 | 0 |
| 5 | test14 | 0 |
| 6 | test2 | 2 |
| 7 | test21 | 0 |
| 8 | test22 | 0 |
| 9 | test23 | 0 |
| 10 | test24 | 0 |
---------------------------------------------------
Expected output is
---------------------------------------------------
| id col1 col2 |
---------------------------------------------------
| 1 | test1 | 1 |
| 2 | test11 | 1 |
| 3 | test12 | 1 |
| 4 | test13 | 1 |
| 5 | test14 | 1 |
| 6 | test2 | 2 |
| 7 | test21 | 2 |
| 8 | test22 | 2 |
| 9 | test23 | 2 |
| 10 | test24 | 2 |
---------------------------------------------------
Is this possible without a cursor? Is there a way to copy the value from the nearest row above into the current row when the current row's value is 0?
You could find the last non-zero value of col2 like:
select id
, col1
, (
select top 1 col2
from YourTable yt2
where yt2.id <= yt1.id
and yt2.col2 <> 0
order by
yt2.id desc
) as col2
from YourTable yt1
Example at SQL Fiddle.
It doesn't look like you're after a running sum. The query itself should work in SQL Server 2005 or later (the multi-row INSERT used for the test data needs 2008 or later):
CREATE TABLE #tmp
(
id INT PRIMARY KEY
, col1 VARCHAR(20)
, col2 INT
);
INSERT #tmp
VALUES
(1, 'test1', 1)
, (2, 'test11', 0)
, (3, 'test12', 0)
, (4, 'test13', 0)
, (5, 'test14', 0)
, (6, 'test2', 2)
, (7, 'test21', 0)
, (8, 'test22', 0)
, (9, 'test23', 0)
, (10, 'test24', 0);
SELECT
t1.id
, t1.col1
, CASE t1.col2
WHEN 0
THEN t2.col2
ELSE
t1.col2
END col2
FROM
#tmp t1
OUTER APPLY
(
SELECT
TOP 1 col2
FROM
#tmp t3
WHERE
t3.id <= t1.id
AND
t3.col2 > 0
ORDER BY
t3.id DESC
) t2
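Both answers look up the last non-zero col2 at or before the current row. The sketch below reproduces that lookup on the sample data; it assumes SQLite through Python's sqlite3 module instead of SQL Server, so LIMIT 1 stands in for TOP 1, and the table name t is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT, col2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, "test1", 1), (2, "test11", 0), (3, "test12", 0),
    (4, "test13", 0), (5, "test14", 0),
    (6, "test2", 2), (7, "test21", 0), (8, "test22", 0),
    (9, "test23", 0), (10, "test24", 0),
])

# For every row, take col2 from the closest row at or above it
# (highest id <= current id) whose col2 is non-zero.
rows = conn.execute("""
    SELECT t1.id,
           t1.col1,
           (SELECT t2.col2
            FROM t t2
            WHERE t2.id <= t1.id
              AND t2.col2 <> 0
            ORDER BY t2.id DESC
            LIMIT 1) AS col2
    FROM t t1
    ORDER BY t1.id
""").fetchall()

for row in rows:
    print(row)
```

A row with a non-zero col2 matches itself in the subquery, so no separate CASE is needed for that situation.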