Setting rank to NULL using RANK() OVER in SQL

In a SQL Server DB, I have a table of values that I am interested in ranking.
When I perform a RANK() OVER (ORDER BY VALUE DESC) as RANK, I get the following results (in a hypothetical table):
RANK | USER_ID | VALUE
------------------------
1 | 33 | 30000
2 | 10 | 20000
3 | 45 | 10000
4 | 12 | 5000
5 | 43 | 2000
6 | 32 | NULL
6 | 13 | NULL
6 | 19 | NULL
6 | 28 | NULL
The problem is, I do not want the rows which have NULL for a VALUE to get a rank - I need some way to set the rank for these to NULL. So far, searching the web has brought me no answers on how I might be able to do this.
Thanks for any help you can provide.

You can try a CASE expression:
SELECT
    CASE WHEN Value IS NULL THEN NULL
         ELSE RANK() OVER (ORDER BY VALUE DESC)
    END AS RANK,
    USER_ID,
    VALUE
FROM yourtable

The CASE expression provided earlier would still count the NULL rows toward the rank if the ORDER BY were ascending rather than descending, because SQL Server sorts NULLs first in ascending order. With the sample data, the four NULL rows would take rank 1, so the ranking of the real values would start at 5 rather than 1, which is probably not what is desired.
To ensure that the NULLs do not get counted in the rank, you can force them to the bottom by adding an initial sort criterion on whether the value IS NULL or not, like so:
SELECT
    CASE WHEN Value IS NULL THEN NULL
         ELSE RANK() OVER
              (ORDER BY CASE WHEN Value IS NULL THEN 1 ELSE 0 END, VALUE DESC)
    END AS RANK,
    USER_ID,
    VALUE
FROM yourtable
*** credit to Hugo Kornelis: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/deb8a0aa-aaab-442b-a667-11220333a4e0/rank-without-counting-null-values?forum=transactsql
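To see the difference between the two queries with an ascending sort, here is a minimal sketch that runs both against an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption, since the question is about SQL Server); it happens to sort NULLs first in ascending order as well, so the effect is the same. The column alias rnk and the sample rows are illustrative.

```python
import sqlite3

# Sample data mirroring the question's table (None becomes SQL NULL).
rows = [(33, 30000), (10, 20000), (45, 10000), (12, 5000), (43, 2000),
        (32, None), (13, None), (19, None), (28, None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (user_id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?)", rows)

# Ascending order with only the outer CASE: the NULL rows sort first and
# still occupy rank 1, even though their displayed rank is NULL.
naive_sql = """
SELECT user_id,
       CASE WHEN value IS NULL THEN NULL
            ELSE RANK() OVER (ORDER BY value ASC)
       END AS rnk
FROM yourtable
"""

# Same query with the extra leading sort key that pushes NULLs to the end.
fixed_sql = """
SELECT user_id,
       CASE WHEN value IS NULL THEN NULL
            ELSE RANK() OVER (ORDER BY CASE WHEN value IS NULL THEN 1 ELSE 0 END,
                                       value ASC)
       END AS rnk
FROM yourtable
"""

naive = dict(conn.execute(naive_sql).fetchall())
fixed = dict(conn.execute(fixed_sql).fetchall())

# Rank of the smallest non-NULL value (user 43) under each query.
print(naive[43], fixed[43])
# NULL rows keep a NULL rank in the fixed query.
print(fixed[32])
```

With this data, user 43 gets rank 5 from the first query and rank 1 from the second, while the NULL rows show a NULL rank in both.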

Related

ORACLE SELECT DISTINCT VALUE ONLY IN SOME COLUMNS

+----+------+-------+---------+---------+
| id | order| value | type | account |
+----+------+-------+---------+---------+
| 1 | 1 | a | 2 | 1 |
| 1 | 2 | b | 1 | 1 |
| 1 | 3 | c | 4 | 1 |
| 1 | 4 | d | 2 | 1 |
| 1 | 5 | e | 1 | 1 |
| 1 | 5 | f | 6 | 1 |
| 2 | 6 | g | 1 | 1 |
+----+------+-------+---------+---------+
I need to select all fields of this table but get only one row for each combination of id + type (I don't care about the value of the type). I have tried several approaches without success.
As soon as I use DISTINCT, I can't include the rest of the fields to make them available in a subquery. If I add ROWNUM in the subquery, all rows become different, so that does not work either.
Any ideas?
My best query so far is this:
SELECT ID, TYPE, VALUE, ACCOUNT
FROM MYTABLE
WHERE ROWID IN (SELECT DISTINCT MAX(ROWID)
FROM MYTABLE
GROUP BY ID, TYPE);

It seems you need to select one (random) row for each distinct combination of id and type. If so, you could do that efficiently using the row_number analytic function. Something like this:
select id, type, value, account
from (
select id, type, value, account,
row_number() over (partition by id, type order by null) as rn
from your_table
)
where rn = 1
;
order by null means the rows within each group (partition) by (id, type) are ordered arbitrarily; as a result, the ordering step, which is usually time-consuming, is trivial in this case. Also, Oracle optimizes such queries (for the filter rn = 1).
Or, in versions 12.1 and higher, you can get the same with the match_recognize clause:
select id, type, value, account
from my_table
match_recognize (
partition by id, type
all rows per match
pattern (^r)
define r as null is null
);
This partitions the rows by id and type, doesn't order them (which means arbitrary ordering), and selects just the "first" row from each partition. Note that some analytic functions, including row_number(), require an order by clause (even when we don't care about the ordering); order by null is customary, but it can't be left out completely. By contrast, in match_recognize you can leave out the order by clause (the default is arbitrary order). On the other hand, you can't leave out the define clause, even if it imposes no conditions whatsoever. Why Oracle doesn't use a default for that clause too, only Oracle knows.
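The row_number() approach can be tried out with a small, hedged sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption; match_recognize has no SQLite equivalent, so only the row_number() variant is shown). Which row is returned for each (id, type) pair is arbitrary, so the sketch only relies on getting exactly one row per pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "order" is quoted because ORDER is a keyword; the rows mirror the question.
conn.execute('CREATE TABLE my_table '
             '(id INTEGER, "order" INTEGER, value TEXT, type INTEGER, account INTEGER)')
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "a", 2, 1), (1, 2, "b", 1, 1), (1, 3, "c", 4, 1), (1, 4, "d", 2, 1),
    (1, 5, "e", 1, 1), (1, 5, "f", 6, 1), (2, 6, "g", 1, 1),
])

# Number the rows inside each (id, type) partition in arbitrary order,
# then keep only the first row of every partition.
sql = """
SELECT id, type, value, account
FROM (
    SELECT id, type, value, account,
           ROW_NUMBER() OVER (PARTITION BY id, type ORDER BY NULL) AS rn
    FROM my_table
)
WHERE rn = 1
ORDER BY id, type
"""
picked = conn.execute(sql).fetchall()
for row in picked:
    print(row)
```

The seven input rows collapse to five, one for each distinct (id, type) combination. If a specific row is wanted instead of an arbitrary one, replacing ORDER BY NULL with a real sort key is the usual adjustment.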

How to use SQL to filter off rows that are in excess (but have not just met) threshold limit?

I have a table like the following:
ID|PROMOTION|USER_ID|LIMIT|CUMULATIVE_USAGE|TDATE
01|111111111|AAAAAAA| 2 | 1 |07-21-2020
02|111111111|AAAAAAA| 2 | 3 |07-22-2020
03|111111111|AAAAAAA| 2 | 5 |07-23-2020 <-- remove
04|222222222|AAAAAAA| 4 | 1 |08-21-2020
05|222222222|AAAAAAA| 4 | 3 |08-22-2020
06|222222222|AAAAAAA| 4 | 5 |08-23-2020
07|333333333|AAAAAAA| 5 | 1 |09-21-2020
08|333333333|AAAAAAA| 5 | 3 |09-22-2020
09|333333333|AAAAAAA| 5 | 5 |09-23-2020
For each user/promotion ID, I want to filter out rows that were already in excess of the limit before that row, as opposed to rows that have only just crossed it, i.e. the row where ID = 03 in this case.
What SQL logic could I use to do this?

You can use lag():
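select t.*
from (select t.*,
             lag(cumulative_usage) over (partition by promotion, user_id order by id) as prev_cumulative_usage
      from t
     ) t
where prev_cumulative_usage < limit or
      prev_cumulative_usage is null;
A row is kept when the previous cumulative usage was still below the limit, or when there is no previous row. Note that LIMIT is a reserved word in some databases, so the column name may need to be quoted.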
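One way to read the lag() filter is "keep a row if the previous cumulative usage was still below the limit, or if there is no previous row". The sketch below checks that reading against the sample data, using Python's sqlite3 module as a stand-in database (an assumption). The limit column is renamed to usage_limit here, a hypothetical name chosen because LIMIT is a keyword, and TDATE is left out since the filter does not use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE t (id INTEGER, promotion TEXT, user_id TEXT,
                                usage_limit INTEGER, cumulative_usage INTEGER)""")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", [
    (1, "111111111", "AAAAAAA", 2, 1),
    (2, "111111111", "AAAAAAA", 2, 3),
    (3, "111111111", "AAAAAAA", 2, 5),
    (4, "222222222", "AAAAAAA", 4, 1),
    (5, "222222222", "AAAAAAA", 4, 3),
    (6, "222222222", "AAAAAAA", 4, 5),
    (7, "333333333", "AAAAAAA", 5, 1),
    (8, "333333333", "AAAAAAA", 5, 3),
    (9, "333333333", "AAAAAAA", 5, 5),
])

# LAG() fetches the previous row's cumulative usage within each
# promotion/user pair; the first row of a pair has no previous value (NULL).
sql = """
SELECT id
FROM (SELECT t.*,
             LAG(cumulative_usage) OVER (PARTITION BY promotion, user_id
                                         ORDER BY id) AS prev_cumulative_usage
      FROM t) AS x
WHERE prev_cumulative_usage < usage_limit
   OR prev_cumulative_usage IS NULL
ORDER BY id
"""
kept_ids = [r[0] for r in conn.execute(sql)]
print(kept_ids)
```

Only the row with ID 3 is dropped: its previous cumulative usage (3) had already passed the limit (2). Whether a previous value exactly equal to the limit should count as "already exceeded" depends on the business rule; switching < to <= changes that edge case.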

How to check whether numbers are continuous or not in PostgreSQL?

Hi, I have a database table like this:
id | lownumber | highnumber | content
---------------------------------------
1 | 10 | 13 | text
2 | 14 | 19 | book
3 | 6 | 9 | table
...
I want to check whether lownumber and highnumber are continuous, that is, whether the previous row's highnumber + 1 equals the next row's lownumber. How can I do that in PostgreSQL?

You can get the exceptions using lag():
select t.*
from (select t.*, lag(highnumber) over (order by id) as prev_highnumber
from t
) t
where lownumber <> prev_highnumber + 1;
Note: "previous" is ambiguous. I don't know from the question if it refers to the previous row based on id or lownumber. If the latter, then change the order by.
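The effect of the ordering choice is easy to see on the three rows shown in the question. This is a minimal sketch using Python's sqlite3 module as a stand-in for PostgreSQL (an assumption); the helper name breaks is hypothetical, and the elided rows from the question are not invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, lownumber INTEGER, highnumber INTEGER, content TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)",
                 [(1, 10, 13, "text"), (2, 14, 19, "book"), (3, 6, 9, "table")])

def breaks(order_col):
    """Return ids of rows whose lownumber does not follow the previous highnumber.

    order_col is interpolated into the SQL text, so only pass trusted column names.
    """
    sql = f"""
    SELECT id
    FROM (SELECT t.*, LAG(highnumber) OVER (ORDER BY {order_col}) AS prev_highnumber
          FROM t) AS x
    WHERE lownumber <> prev_highnumber + 1
    ORDER BY id
    """
    return [r[0] for r in conn.execute(sql)]

# "Previous" meaning the previous id.
print(breaks("id"))
# "Previous" meaning the previous lownumber.
print(breaks("lownumber"))
```

Ordered by id, row 3 is reported as a break (6 does not follow 19). Ordered by lownumber, the ranges 6-9, 10-13 and 14-19 line up and nothing is reported. The first row in either ordering is never reported, because its prev_highnumber is NULL and the comparison is not true.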

How do I calculate gaps between rows

I have now added some parallelism to my app ProcessXX, so I'm not sure the data will be processed in the right order. I'm therefore working on a query that returns the lower and upper bounds to pass to ProcessZZ.
My table avl_pool has avl_id, has_link and some other fields, and receives a steady flow of data. New rows arrive with has_link = null; when ProcessXX finishes with a row, has_link holds the link value (xxxx is some number).
In the next step I have to process only the rows with links, but I can't skip rows, because order is very important.
In this case I need ProcessZZ(23561211, 23561219)
rn | avl_id | has_link
1 | 23561211 | xxxx -- start
2 | 23561212 | xxxx
3 | 23561213 | xxxx
4 | 23561214 | xxxx
5 | 23561215 | xxxx
6 | 23561216 | xxxx
7 | 23561217 | xxxx
8 | 23561218 | xxxx
9 | 23561219 | xxxx -- end
10 | 23561220 | null
11 | 23561221 | xxxx
12 | 23561222 | xxxx
13 | 23561223 | xxxx
Currently I have:
-- starting avl_id that needs to be sent to ProcessZZ
SELECT MIN(avl_id) as min_avl_id
FROM avl_db.avl_pool
WHERE NOT has_link IS NULL
-- first avl_id still in the hands of ProcessXX (but can be null)
SELECT MIN(avl_id) as max_avl_id -- a LAG needs to be added here
FROM avl_db.avl_pool
WHERE has_link IS NULL
AND avl_id > (SELECT MIN(avl_id)
              FROM avl_db.avl_pool
              WHERE NOT has_link IS NULL)
-- In case every row already has a link, the upper limit is the last one in the table.
SELECT MAX(avl_id) as max_avl_id
FROM avl_db.avl_pool
I can put everything in multiple CTEs and return both results, but I think this can be handled like an islands problem, though I'm not sure how.
So the query should look like
SELECT min_avl_id, max_avl_id
FROM cte
min_avl_id | max_avl_id
23561211 | 23561219

If I understand correctly, you want to assign a sequential number to each block. This number is demarcated by the NULL values in has_link.
If this is the problem, then a cumulative sum solves the problem:
select p.*,
sum(case when has_link is null then 1 else 0 end) over (order by rn) as grp
from avl_db.avl_pool p;
This still includes the NULL rows in the output. The simplest way to exclude them is probably a subquery:
select p.*
from (select p.*,
sum(case when has_link is null then 1 else 0 end) over (order by rn) as grp
from avl_db.avl_pool p
) p
where has_link is not null;
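To get from the grp column to the two bounds the question asks for, one possible next step is to aggregate per group and take the first group. The sketch below is an assumption-laden illustration rather than the answerer's method: it uses Python's sqlite3 module as a stand-in database, orders by avl_id instead of rn (the question suggests rn is just a display row number), and uses 1 in place of the unspecified link value xxxx.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE avl_pool (avl_id INTEGER, has_link INTEGER)")
# avl_id 23561220 is still unlinked (NULL); every other row has a link.
conn.executemany("INSERT INTO avl_pool VALUES (?, ?)",
                 [(i, None if i == 23561220 else 1)
                  for i in range(23561211, 23561224)])

# The running count of NULL rows acts as a block number: it increases by one
# at every unlinked row, so each run of linked rows shares one grp value.
sql = """
SELECT MIN(avl_id) AS min_avl_id, MAX(avl_id) AS max_avl_id
FROM (SELECT p.*,
             SUM(CASE WHEN has_link IS NULL THEN 1 ELSE 0 END)
                 OVER (ORDER BY avl_id) AS grp
      FROM avl_pool AS p) AS x
WHERE has_link IS NOT NULL
GROUP BY grp
ORDER BY grp
LIMIT 1
"""
bounds = conn.execute(sql).fetchone()
print(bounds)
```

For the sample rows this yields (23561211, 23561219), the range that should go to ProcessZZ. Dropping the LIMIT 1 would list every block of consecutive linked rows instead of only the first.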

Filtering data with from statement

So let's say I have this table with these rows in it
Table name: MYTABLE
ID | NUMBER | FK_ID
1 | 0 | 26
2 | 0 | 26
3 | 1 | 26
4 | 0 | 27
5 | 1 | 27
Now I want to keep only the rows that fall under the same FK_ID and have two or more NUMBER 0s among them.
So, for instance, if I applied this filter here, I would only see one row, which corresponds to FK_ID 26, because it has two NUMBER 0s in its MYTABLE data.
Is this even possible, or should I just handle the whole data set in my programming language instead of filtering it in the DB?

SELECT FK_ID ,
COUNT(DECODE(NUMBER ,0,1))
FROM TEST_DATA
GROUP BY FK_ID
HAVING COUNT(DECODE(NUMBER ,0,1)) >= 2
Fiddle here : http://sqlfiddle.com/#!4/44d70/4

Does this query work for you?
SELECT
FK_ID
FROM MYTABLE
WHERE NUMBER = 0
GROUP BY FK_ID
HAVING COUNT(*) >= 2;
Also, consider renaming the NUMBER column, as NUMBER is a reserved word in Oracle.
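Both answers come down to counting the zero rows per FK_ID and filtering on that count. Here is a minimal sketch of the conditional-count form, run through Python's sqlite3 module as a stand-in for Oracle (an assumption). DECODE is Oracle-specific, so a CASE expression plays the same role here, and the NUMBER column is renamed to num, a hypothetical name that follows the renaming advice above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, num INTEGER, fk_id INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)",
                 [(1, 0, 26), (2, 0, 26), (3, 1, 26), (4, 0, 27), (5, 1, 27)])

# COUNT ignores NULLs, and the CASE yields NULL for non-zero rows,
# so only rows with num = 0 are counted in each group.
sql = """
SELECT fk_id, COUNT(CASE WHEN num = 0 THEN 1 END) AS zero_count
FROM mytable
GROUP BY fk_id
HAVING COUNT(CASE WHEN num = 0 THEN 1 END) >= 2
"""
result = conn.execute(sql).fetchall()
print(result)
```

Only FK_ID 26 qualifies, with two zero rows; FK_ID 27 has just one and is filtered out by the HAVING clause.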
