How to not count NULL values in DENSE_RANK()?

How to not count NULL values in DENSE_RANK()? - sql

Say I have the following table:
col
NULL
1
1
2
Then I select:
SELECT col, DENSE_RANK() OVER(ORDER BY col) as rnk from table
Then I get:
col rnk
NULL 1
1 2
1 2
2 3
What I want to get is this:
col rnk
NULL NULL
1 1
1 1
2 2
But if I query:
SELECT col, CASE WHEN col IS NOT NULL THEN DENSE_RANK() OVER(ORDER BY col) END as rnk from table
Then I get:
col rnk
NULL NULL
1 2
1 2
2 3
Is there a way to disregard NULLs when ranking, other than using a WHERE clause? I have some other columns whose rows cannot be omitted.

Use partition by:
SELECT col,
(CASE WHEN col IS NOT NULL
THEN DENSE_RANK() OVER (PARTITION BY (CASE WHEN col IS NOT NULL THEN 1 ELSE 2 END)
ORDER BY col
)
END) as rnk
FROM table;

Below is for BigQuery Legacy SQL
SELECT col, CASE WHEN col IS NOT NULL THEN rnk END AS rnk
FROM (
SELECT
col, (col IS NULL) AS tmp,
DENSE_RANK() OVER(PARTITION BY tmp ORDER BY col) AS rnk
FROM table
)

Related

With some WHERE clause criteria, have SQL output go to next line

I am trying to write a query where I have some criteria where I pivot the results. However, due to output file constraints I am looking for the output to create a new line after the pivot exceeds X, even if the ID and such is otherwise the same.
What I am trying to do:
|--ID--|-Value-|
| 1 | val1 |
| 1 | val2 |
| 1 | val3 |
| 2 | val1 |
|--ID--|-Col1-|-Col2-|
| 1 | Val1| Val2|
| 1 | Val3| |
| 2 | Val1| |
SELECT *
FROM table
PIVOT(max(value) for field1 in (t1,t2)
as pvt
ORDER BY UNIQUE_ID
This is just a pivot example to pivot this particular column. However the output has a very strict number of column requirement so I'd be looking for any pivot beyond the 5th to "overflow" to the next row while retaining the unique id. I am looking at PIVOT but I dont think it will work here.
Is this even possible within the Snowflake platform or do I need to explore other options?

This requirement is purely presentation matter and in my opinion should not be performed at the database level. With that being said it is possible to achieve it by numbering rows in group and performing modulo division:
Samle data:
CREATE OR REPLACE TABLE tab
AS
SELECT 1 AS id, 'val1' AS value UNION
SELECT 1 AS id, 'val2' AS value UNION
SELECT 1 AS id, 'val3' AS value UNION
SELECT 2 AS id, 'val1' AS value;
Query:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) - 1 AS rn
FROM tab
)
SELECT
id
,MAX(CASE WHEN rn % 2 = 0 THEN value END) AS col1
,MAX(CASE WHEN rn % 2 = 1 THEN value END) AS col2
FROM cte
GROUP BY id, FLOOR(rn / 2)
ORDER BY id, FLOOR(rn / 2);
Intermediate result:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) - 1 AS rn
FROM tab
)
SELECT id,value, rn, FLOOR(rn / 2) AS row_index, rn % 2 AS column_index
FROM cte
ORDER BY ID, rn;
Generalized:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) - 1 AS rn
FROM tab
)
SELECT
id
,MAX(CASE WHEN rn % N = 0 THEN value END) AS col1
,MAX(CASE WHEN rn % N = 1 THEN value END) AS col2
-- ....
,MAX(CASE WHEN rn % N = N-1 THEN value END) AS colN
FROM cte
GROUP BY id, FLOOR(rn / N)
ORDER BY id, FLOOR(rn / N);

count rows which have max value less than specified parameter

I want to find in my table, max value which is less than specified in parameter and get count of rows that have the same value as max value. For example in my table I have values: (4,1,3,1,4,4,10), and it is list of parameters in string "2,9,10,4". I have to split string to separate parameters. Base on this sample values I want to get something like that:
param | max value | count
2 | 1 | 2
9 | 4 | 3
10 | 4 | 3
4 | 3 | 1
And it is my sample query:
select
[param]
, max([val]) [max_value_by_param]
, max(count) [count]
from(
select
n.value as [param]
,a.val
, count(*) as [count]
from (--mock of table
select 1 as val union all
select 3 as val union all
select 4 as val union all
select 1 as val union all
select 3 as val union all
select 4 as val union all
select 4 as val union all
select 10 as val
) a
join (select [value] from string_split('2,9,10,4', ',')) n--list of params
on a.val < n.[value]
group by n.value, a.val
) tmp
group by [param]
Is it possible to do it better/easier ?

Here is a way to express this using apply:
select s.value as param, a.val, a.cnt
from string_split('2,9,10,4', ',') s outer apply
(select top (1) a.val, count(*) as cnt
from a
group by a.val
having a.val < s.value
order by a.val desc
) a;
Here is a db<>fiddle.
But the fastest method is probably going to be:
with av as (
select a.val, count(*) as cnt
from a
group by a.val
union all
select s.value, null as cnt
from string_split('2,9,10,4', ',') s
)
select val, a_val, a_cnt
from (select av.*,
max(case when cnt is not null then val end) over (order by val, (case when cnt is null then 1 else 2 end)) as a_val,
max(case when cnt is not null then cnt end) over (order by val, (case when cnt is null then 1 else 2 end)) as a_cnt
from av
) av
where cnt is null;
This only aggregates the data once and should return all parameters, even those with no preceding values in a.

ROW_Number with Custom Group

I am trying to have row_number based on custom grouping but I am not able to produce it.
Below is my Query
CREATE TABLE mytbl (wid INT, id INT)
INSERT INTO mytbl Values(1,1),(2,1),(3,0),(4,2),(5,3)
Current Output
wid id
1 1
2 1
3 0
4 2
5 3
Query
SELECT *, RANK() OVER(PARTITION BY wid, CASE WHEN id = 0 THEN 0 ELSE 1 END ORDER BY ID)
FROM mytbl
I would like to rank the rows based on custom condition like if ID is 0 then I have start new group until I have non 0 ID.
Expected Output
wid id RN
1 1 1
2 1 1
3 0 1
4 2 2
5 3 2

Guessing here, as we don't have much clarification, but perhaps this:
SELECT wid,
id,
COUNT(CASE id WHEN 0 THEN 1 END) OVER (ORDER BY wid ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) +1 AS [Rank]
FROM mytbl ;

If I understand you correctly, you may use the next approach. Note, that you need to have an ordering column (I assume this is wid column):
Statement:
;WITH ChangesCTE AS (
SELECT
*,
CASE WHEN LAG(id) OVER (ORDER BY wid) = 0 THEN 1 ELSE 0 END AS ChangeIndex
FROM mytbl
), GroupsCTE AS (
SELECT
*,
SUM(ChangeIndex) OVER (ORDER BY wid) AS GroupIndex
FROM ChangesCTE
)
SELECT
wid,
id,
DENSE_RANK() OVER (ORDER BY GroupIndex) AS Rank
FROM GroupsCTE
Result:
wid id Rank
1 1 1
2 1 1
3 0 1
4 2 2
5 3 2

without much clarification on the logic required, my understanding is you want to increase the Rank by 1 whenever id = 0
select wid, id,
[Rank] = sum(case when id = 0 then 1 else 0 end) over(order by wid)
+ case when id <> 0 then 1 else 0 end
from mytbl

Try this,
CREATE TABLE #mytbl (wid INT, id INT)
INSERT INTO #mytbl Values(1,1),(2,1),(3,0)
,(4,2),(5,3),(6,0),(7,4),(8,5),(9,6)
;with CTE as
(
select *,ROW_NUMBER()over(order by wid)rn
from #mytbl where id=0
)
,CTE1 as
(
select max(rn)+1 ExtraRN from CTE
)
select a.* ,isnull(ca.rn,ca1.ExtraRN) from #mytbl a
outer apply(select top 1 * from CTE b
where a.wid<=b.wid )ca
cross apply(select ExtraRN from CTE1)ca1
drop table #mytbl
Here both OUTER APPLY and CROSS APPLY will not increase cardianility estimate.It will always return only one rows.

Case statements: SQL Server

id date value
------------------
1 1 null
1 2 a
1 3 b
1 4 null
2 1 null
2 2 null
2 3 null
2 4 null
2 5 null
If value is null in all id's then max of date of that id and if we have value then max of date with value id.
Required output is:
id date value
-----------------
1 3 b
2 5 null

Typical method for this type of problem is row_number(). You can create a CASE expression to define a priority:
select id,
date,
value
from (
select id,
date,
value,
row_number() over (partition by id order by case when value is not null then 1 else 2 end asc, date desc) rn
from UnnamedTable
) t1
where t1.rn = 1

Sql Fiddle Demo
WITH cte as (
SELECT id,
[date],
[value],
ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [value] DESC, [date] DESC) as rn
FROM Table1
)
SELECT *
FROM cte
WHERE rn = 1

Count consecutive duplicate values in SQL

I have a table like so
ID OrdID Value
1 1 0
2 2 0
3 1 1
4 2 1
5 1 1
6 2 0
7 1 0
8 2 0
9 2 1
10 1 0
11 2 0
I want to get the count of consecutive value where the value is 0. Using the example above the result will be 3 (Rows 6, 7 and 8). I am using sql server 2008 r2.

I am going to presume that id is unique and increasing. You can get counts of consecutive values by using the different of row numbers. The following counts all sequences:
select grp, value, min(id), max(id), count(*) as cnt
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by value order by id)
) as grp
from table t
) t
group by grp, value;
If you want the longest sequence of 0s:
select top 1 grp, value, min(id), max(id), count(*) as cnt
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by value order by id)
) as grp
from table t
) t
group by grp, value
having value = 0
order by count(*) desc

A query using not exists to find consecutive 0s
select top 1 min(t2.id), max(t2.id), count(*)
from mytable t
join mytable t2 on t2.id <= t.id
where not exists (
select 1 from mytable t3
where t3.id between t2.id and t.id
and t3.value <> 0
)
group by t.id
order by count(*) desc
http://sqlfiddle.com/#!3/52989/3

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to not count NULL values in DENSE_RANK()? - sql

Use partition by: SELECT col, (CASE WHEN col IS NOT NULL THEN DENSE_RANK() OVER (PARTITION BY (CASE WHEN col IS NOT NULL THEN 1 ELSE 2 END) ORDER BY col ) END) as rnk FROM table;

Below is for BigQuery Legacy SQL SELECT col, CASE WHEN col IS NOT NULL THEN rnk END AS rnk FROM ( SELECT col, (col IS NULL) AS tmp, DENSE_RANK() OVER(PARTITION BY tmp ORDER BY col) AS rnk FROM table )

Related

With some WHERE clause criteria, have SQL output go to next line

count rows which have max value less than specified parameter

ROW_Number with Custom Group

Case statements: SQL Server

Count consecutive duplicate values in SQL

Categories

Resources