Rank Function inside case statement - sql

I am trying to use Rank Function inside a case statement and give where rank_number = 1 , it's throwing error as unexpected where Condition. Can some one help me how to assign rank in where clause inside case statement

You can't use the RANK() analytic function (or any other one, for that matter) in the WHERE clause of a query. The results of the rank computation are not yet available. But they are available in the SELECT clause or the ORDER BY clause. One workaround would be to subquery:
SELECT *
FROM
(
SELECT t.*, RANK() OVER (ORDER BY blah) rnk
FROM yourTable t
) s
WHERE rnk = 1;
Some databases support a QUALIFY clause, where it is possible to use analytic functions. Assuming you are using something like Teradata or BigQuery, you could use:
SELECT *
FROM yourTable
WHERE 1 = 1
QUALIFY RANK() OVER (ORDER BY blah) = 1;

Related

Select second higher value with Rank() function in PosgreSql

I tried this :
SELECT
code_nuance, nb_voix,
RANK () OVER ( ORDER BY nb_voix DESC) AS rank
FROM election_2015.resultat_nuance_departement
WHERE rank = 2;
And same with alias :
SELECT
rnd.code_nuance, rnd.nb_voix,
RANK () OVER ( ORDER BY rnd.nb_voix DESC) AS rank
FROM election_2015.resultat_nuance_departement rnd
WHERE rank = 2;
Rank is not recognized in the WHERE close.
It says "Rank doesn't exist"
Any one?
Any suggestions welcomed, ty !
As with all other column aliases, you cannot use the alias in the where clause. For this purpose a subquery is handy:
SELECT x.*
FROM (SELECT rnd.code_nuance, rnd.nb_voix,
RANK() OVER (ORDER BY rnd.nb_voix DESC) AS rank
FROM election_2015.resultat_nuance_departement rnd
) x
WHERE rank = 2;
If the values are all unique, you can also use FETCH:
select rnd.*
from election_2015.resultat_nuance_departement rnd
order by rnd.db_voix desc
offset 1 fetch first 1 row only;
from the postgresql docs:
An output column's name can be used to refer to the column's value in ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses; there you must write out the expression instead.
This is because WHERE clause is resolved before column aliases are considered.
Another solution could be using a subquery.

How to get sample data from a table or a view in Aster Teradata without using order by?

I am trying to get sample data from a table in Aster Teradata using order by using the following code:
SELECT "col"
FROM (SELECT "col",
Row_number()
OVER (
ORDER BY 1) AS RANK
FROM "nisha_test"."test_table") a
WHERE rank <= 10000
I want to get random 10000 rows without using order by.
If you want a sample you should use the built-in sample feature.
For Aster (or Vantage MLE, but with a slightly different syntax) there's a RandomSample operator, e.g.
SELECT * FROM RandomSample (
ON (SELECT 1) PARTITION BY 1 -- dummy data, but needed
InputTable ('nisha_test.test_table')
NumSample ('10000')
)
For Teradata there's the SAMPLE clause, e.g.
select *
from nisha_test.test_table
SAMPLE 10000
You can also use the QUALIFY clause in Teradata to remove the outer SELECT:
SELECT col
FROM nisha_test.test_table
QUALIFY ROW_NUMBER() OVER (ORDER BY NULL) <= 10000
In Teradata, I think you can use a constant value in the ORDER BY. You may even be able to exclude the ORDER BY altogether: ROW_NUMBER() OVER()
We can use the LIMIT keyword to get random values from a table or a view in Aster DB.
select * from "nisha_test"."test_table" limit 10000;

Group function is not allowed here

When I run the following query, I get
ORA-00934: group function is not allowed here
what is the problem ?
select c.Numcom,c.Nompr,c.salaire_fix
from commercialv c,comercialv c1
where c.salaire_fix=(max(c1.salaire_fix) );
You cannot use an aggregate function in a WHERE clause.
Given your use case, you probably want a subquery:
select c.Numcom,c.Nompr,c.salaire_fix
from commercialv c
where c.salaire_fix=(select max(salaire_fix) from comercialv);
The rational is that aggregate functions works on a set. The WHERE clause on the other hand, has only access to the data of one row.
You can do what you want with analytic functions:
select Numcom, Nompr, salair_fix
from (select c.Numcom, c.Nompr, c.salaire_fix,
max(c.salaire_fix) over () as maxs
from commercialv c
) c
where c.salaire_fix = c.maxs;
As for your query, aggregation functions are not permitted in the where clause.
You could also do this query using MAX() as a window function (or analytic function if you prefer the Oracle lingo):
SELECT numcom, nompr, salaire_fix FROM (
SELECT numcom, nompr, salaire_fix, MAX(salaire_fix) OVER ( ) AS max_salaire_fix
FROM commercialv
) WHERE salaire_fix = max_salaire_fix;
You could also use RANK():
SELECT numcom, nompr, salaire_fix FROM (
SELECT numcom, nompr, salaire_fix, RANK() OVER ( ORDER BY salaire_fix DESC ) AS salaire_fix_rank
FROM commercialv
) WHERE salaire_fix_rank = 1;
Or even ROWNUM:
SELECT * FROM (
SELECT numcom, nompr, salaire_fix
FROM commercialv
ORDER BY salaire_fix DESC
) WHERE rownum = 1;
The only difficulty with the last is that it will get only one row even if there are additional rows with the maximum value of salaire_fix. The first two queries will get more than one row in that case.
You can't use group function in where clause so you can use having clause. Example:
SELECT DEPTNO,COUNT(*)
FROM EMP
GROUP BY DEPTNO
HAVING COUNT(*) >= 2;

Postgres Window Function Syntax

Why does the following query:
select ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY time DESC) as rownum FROM users where rownum < 20;
produce the following error?
ERROR: column "rownum" does not exist
LINE 1: ...d ORDER BY time DESC) as rownum FROM users where rownum < 2...
How can I structure this query so that I get the first 20 items, as defined by my window function?
user_id and time are both defined columns on users.
It would work like this:
SELECT *
FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY time DESC) AS rownum
FROM users
) x
WHERE rownum < 20;
The point here is the sequence of events. Window functions are applied after the WHERE clause. Therefore rownum is not visible, yet. You have to put it into a subquery or CTE and apply the WHERE clause on rownum in the next query level.
Per documentation:
Window functions are permitted only in the SELECT list and the ORDER BY
clause of the query. They are forbidden elsewhere, such as in GROUP BY,
HAVING and WHERE clauses. This is because they logically execute
after the processing of those clauses. Also, window functions execute
after regular aggregate functions. This means it is valid to include
an aggregate function call in the arguments of a window function, but
not vice versa.
Because the where clause executes before the select so it does not know about that alias yet. Do it like this:
select *
from (
select ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY time DESC) as rownum
FROM users
) s
where rownum < 20;

Why no windowed functions in where clauses?

Title says it all, why can't I use a windowed function in a where clause in SQL Server?
This query makes perfect sense:
select id, sales_person_id, product_type, product_id, sale_amount
from Sales_Log
where 1 = row_number() over(partition by sales_person_id, product_type, product_id order by sale_amount desc)
But it doesn't work. Is there a better way than a CTE/Subquery?
EDIT
For what its worth this is the query with a CTE:
with Best_Sales as (
select id, sales_person_id, product_type, product_id, sale_amount, row_number() over (partition by sales_person_id, product_type, product_id order by sales_amount desc) rank
from Sales_log
)
select id, sales_person_id, product_type, product_id, sale_amount
from Best_Sales
where rank = 1
EDIT
+1 for the answers showing with a subquery, but really I'm looking for the reasoning behind not being able to use windowing functions in where clauses.
why can't I use a windowed function in a where clause in SQL Server?
One answer, though not particularly informative, is because the spec says that you can't.
See the article by Itzik Ben Gan - Logical Query Processing: What It Is And What It Means to You and in particular the image here. Window functions are evaluated at the time of the SELECT on the result set remaining after all the WHERE/JOIN/GROUP BY/HAVING clauses have been dealt with (step 5.1).
really I'm looking for the reasoning behind not being able to use
windowing functions in where clauses.
The reason that they are not allowed in the WHERE clause is that it would create ambiguity. Stealing Itzik Ben Gan's example from High-Performance T-SQL Using Window Functions (p.25)
Suppose your table was
CREATE TABLE T1
(
col1 CHAR(1) PRIMARY KEY
)
INSERT INTO T1 VALUES('A'),('B'),('C'),('D'),('E'),('F')
And your query
SELECT col1
FROM T1
WHERE ROW_NUMBER() OVER (ORDER BY col1) <= 3
AND col1 > 'B'
What would be the right result? Would you expect that the col1 > 'B' predicate ran before or after the row numbering?
There is no need for CTE, just use the windowing function in a subquery:
select id, sales_person_id, product_type, product_id, sale_amount
from
(
select id, sales_person_id, product_type, product_id, sale_amount,
row_number() over(partition by sales_person_id, product_type, product_id order by sale_amount desc) rn
from Sales_Log
) sl
where rn = 1
Edit, moving my comment to the answer.
Windowing functions are not performed until the data is actually selected which is after the WHERE clause. So if you try to use a row_number in a WHERE clause the value is not yet assigned.
"All-at-once operation" means that all expressions in the same
logical query process phase are evaluated logically at the same time.
And great chapter Impact on Window Functions:
Suppose you have:
CREATE TABLE #Test ( Id INT) ;
INSERT INTO #Test VALUES ( 1001 ), ( 1002 ) ;
SELECT Id
FROM #Test
WHERE Id = 1002
AND ROW_NUMBER() OVER(ORDER BY Id) = 1;
All-at-Once operations tell us these two conditions evaluated logically at the same point of time. Therefore, SQL Server can
evaluate conditions in WHERE clause in arbitrary order, based on
estimated execution plan. So the main question here is which condition
evaluates first.
Case 1:
If ( Id = 1002 ) is first, then if ( ROW_NUMBER() OVER(ORDER BY Id) = 1 )
Result: 1002
Case 2:
If ( ROW_NUMBER() OVER(ORDER BY Id) = 1 ), then check if ( Id = 1002 )
Result: empty
So we have a paradox.
This example shows why we cannot use Window Functions in WHERE clause.
You can think more about this and find why Window Functions are
allowed to be used just in SELECT and ORDER BY clauses!
Addendum
Teradata supports QUALIFY clause:
Filters results of a previously computed ordered analytical function according to user‑specified search conditions.
SELECT Id
FROM #Test
WHERE Id = 1002
QUALIFY ROW_NUMBER() OVER(ORDER BY Id) = 1;
Snowflake - Qualify
QUALIFY does with window functions what HAVING does with aggregate functions and GROUP BY clauses.
In the execution order of a query, QUALIFY is therefore evaluated after window functions are computed. Typically, a SELECT statement’s clauses are evaluated in the order shown below:
From
Where
Group by
Having
Window
QUALIFY
Distinct
Order by
Limit
Databricks - QUALIFY clasue
Filters the results of window functions. To use QUALIFY, at least one window function is required to be present in the SELECT list or the QUALIFY clause.
You don't necessarily need to use a CTE, you can query the result set after using row_number()
select row, id, sales_person_id, product_type, product_id, sale_amount
from (
select
row_number() over(partition by sales_person_id,
product_type, product_id order by sale_amount desc) AS row,
id, sales_person_id, product_type, product_id, sale_amount
from Sales_Log
) a
where row = 1
It's an old thread, but I'll try to answer specifically the question expressed in the topic.
Why no windowed functions in where clauses?
SELECT statement has following main clauses specified in keyed-in order:
SELECT DISTINCT TOP list
FROM JOIN ON / APPLY / PIVOT / UNPIVOT
WHERE
GROUP BY WITH CUBE / WITH ROLLUP
HAVING
ORDER BY
OFFSET-FETCH
Logical Query Processing Order, or Binding Order, is conceptual interpretation order, it defines the correctness of the query. This order determines when the objects defined in one step are made available to the clauses in subsequent steps.
----- Relational result
1. FROM
1.1. ON JOIN / APPLY / PIVOT / UNPIVOT
2. WHERE
3. GROUP BY
3.1. WITH CUBE / WITH ROLLUP
4. HAVING
---- After the HAVING step the Underlying Query Result is ready
5. SELECT
5.1. SELECT list
5.2. DISTINCT
----- Relational result
----- Non-relational result (a cursor)
6. ORDER BY
7. TOP / OFFSET-FETCH
----- Non-relational result (a cursor)
For example, if the query processor can bind to (access) the tables or views defined in the FROM clause, these objects and their columns are made available to all subsequent steps.
Conversely, all clauses preceding the SELECT clause cannot reference any column aliases or derived columns defined in SELECT clause. However, those columns can be referenced by subsequent clauses such as the ORDER BY clause.
OVER clause determines the partitioning and ordering of a row set before the associated window function is applied. That is, the OVER clause defines a window or user-specified set of rows within an Underlying Query Result set and window function computes result against that window.
Msg 4108, Level 15, State 1, …
Windowed functions can only appear in the SELECT or ORDER BY clauses.
The reason behind is because the way how Logical Query Processing works in T-SQL. Since the underlying query result is established only when logical query processing reaches the SELECT step 5.1. (that is, after processing the FROM, WHERE, GROUP BY and HAVING steps), window functions are allowed only in the SELECT and ORDER BY clauses of the query.
Note to mention, window functions are still part of relational layer even Relational Model doesn't deal with ordered data. The result after the SELECT step 5.1. with any window function is still relational.
Also, speaking strictly, the reason why window function are not allowed in the WHERE clause is not because it would create ambiguity, but because the order how Logical Query Processing processes SELECT statement in T-SQL.
Links: here, here and here
Finally, there's the old-fashioned, pre-SQL Server 2005 way, with a correlated subquery:
select *
from Sales_Log sl
where sl.id = (
Select Top 1 id
from Sales_Log sl2
where sales_person_id = sl.sales_person_id
and product_type = sl.product_type
and product_id = sl.product_id
order by sale_amount desc
)
I give you this for completeness, merely.
Basically first "WHERE" clause condition is read by sql and the same column/value id looked into the table but in table row_num=1 is not there still. Hence it will not work.
Thats the reason we will use parentheses first and after that we will write the WHERE clause.
Yes unfortunately when you do a windowed function SQL gets mad at you even if your where predicate is legitimate. You make a cte or nested select having the value in your select statement, then reference your CTE or nested select with that value later. Simple example that should be self explanatory. If you really HATE cte's for some performance issue on doing a large data set you can always drop to temp table or table variable.
declare #Person table ( PersonID int identity, PersonName varchar(8));
insert into #Person values ('Brett'),('John');
declare #Orders table ( OrderID int identity, PersonID int, OrderName varchar(8));
insert into #Orders values (1, 'Hat'),(1,'Shirt'),(1, 'Shoes'),(2,'Shirt'),(2, 'Shoes');
--Select
-- p.PersonName
--, o.OrderName
--, row_number() over(partition by o.PersonID order by o.OrderID)
--from #Person p
-- join #Orders o on p.PersonID = o.PersonID
--where row_number() over(partition by o.PersonID order by o.orderID) = 2
-- yields:
--Msg 4108, Level 15, State 1, Line 15
--Windowed functions can only appear in the SELECT or ORDER BY clauses.
;
with a as
(
Select
p.PersonName
, o.OrderName
, row_number() over(partition by o.PersonID order by o.OrderID) as rnk
from #Person p
join #Orders o on p.PersonID = o.PersonID
)
select *
from a
where rnk >= 2 -- only orders after the first one.