Select second-highest value with the RANK() function in PostgreSQL - sql

I tried this:
SELECT
code_nuance, nb_voix,
RANK () OVER ( ORDER BY nb_voix DESC) AS rank
FROM election_2015.resultat_nuance_departement
WHERE rank = 2;
And the same with a table alias:
SELECT
rnd.code_nuance, rnd.nb_voix,
RANK () OVER ( ORDER BY rnd.nb_voix DESC) AS rank
FROM election_2015.resultat_nuance_departement rnd
WHERE rank = 2;
Rank is not recognized in the WHERE clause.
It says "Rank doesn't exist".
Any suggestions are welcome, thanks!

As with all other column aliases, you cannot use the alias in the where clause. For this purpose a subquery is handy:
SELECT x.*
FROM (SELECT rnd.code_nuance, rnd.nb_voix,
RANK() OVER (ORDER BY rnd.nb_voix DESC) AS rank
FROM election_2015.resultat_nuance_departement rnd
) x
WHERE rank = 2;
If the values are all unique, you can also use FETCH:
select rnd.*
from election_2015.resultat_nuance_departement rnd
order by rnd.nb_voix desc
offset 1 fetch first 1 row only;
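To see why the FETCH variant needs unique values, here is a minimal sketch that runs both approaches against tied data. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL (an assumption; it needs SQLite 3.25 or later for window functions), a made-up table named resultat with invented rows, and LIMIT/OFFSET because SQLite does not accept the FETCH FIRST syntax.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (assumption); needs SQLite >= 3.25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resultat (code_nuance TEXT, nb_voix INTEGER)")
conn.executemany(
    "INSERT INTO resultat VALUES (?, ?)",
    [("A", 500), ("B", 300), ("C", 300), ("D", 100)],  # hypothetical data; B and C tie
)

# Rank in a subquery, filter in the outer query: returns every row ranked 2.
by_rank = conn.execute(
    """
    SELECT code_nuance, nb_voix
    FROM (SELECT code_nuance, nb_voix,
                 RANK() OVER (ORDER BY nb_voix DESC) AS rnk
          FROM resultat) x
    WHERE rnk = 2
    ORDER BY code_nuance
    """
).fetchall()

# Skip one row, take one row: returns exactly one row, even when there is a tie.
by_offset = conn.execute(
    """
    SELECT code_nuance, nb_voix
    FROM resultat
    ORDER BY nb_voix DESC, code_nuance
    LIMIT 1 OFFSET 1
    """
).fetchall()

print(by_rank)    # both tied rows
print(by_offset)  # a single row
```

With ties, the RANK() version returns both second-place rows, while the offset version silently picks one of them.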

From the PostgreSQL docs:
An output column's name can be used to refer to the column's value in ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses; there you must write out the expression instead.
This is because the WHERE clause is evaluated before column aliases are considered.
Another solution is to use a subquery.

Related

BigQuery - Extract last entry of each group

I have one table where multiple records are inserted for each group of products. Now I want to extract (SELECT) only the last entry of each group. For more, see the screenshot: the yellow-highlighted records should be returned by the SELECT query.
The HAVING MAX and HAVING MIN clauses for the ANY_VALUE function are now in preview.
HAVING MAX and HAVING MIN were just introduced for some aggregate functions - https://cloud.google.com/bigquery/docs/release-notes#February_06_2023
With them the query can be very simple; consider the approach below:
select any_value(t having max datetime).*
from your_table t
group by t.id, t.product
if applied to sample data in your question - output is
You might consider the query below as well:
SELECT *
FROM sample_table
QUALIFY DateTime = MAX(DateTime) OVER (PARTITION BY ID, Product);
If you're more familiar with aggregate functions than with window functions, the query below might be another option.
SELECT ARRAY_AGG(t ORDER BY DateTime DESC LIMIT 1)[SAFE_OFFSET(0)].*
FROM sample_table t
GROUP BY t.ID, t.Product
You can use a window function to partition the rows by the grouping key, order each partition by the chosen field, and keep only the first row of each partition.
For example:
select * from (
select *,
rank() over (partition by product order by DateTime desc) as rank
from `project.dataset.table`)
where rank = 1
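You can also get the latest DateTime of each group with a plain aggregate (BigQuery has no TOP, and SELECT * cannot be combined with GROUP BY ID):
SELECT ID, Product, MAX(DateTime) AS DateTime FROM Tablename GROUP BY ID, Product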
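For anyone who wants to try the window-function approach locally, here is a small sketch of the "number the rows per group, keep row 1" pattern. It runs on Python's built-in sqlite3 rather than BigQuery (an assumption; SQLite 3.25 or later is needed), and the column name dt and the sample rows are made up for illustration.

```python
import sqlite3

# SQLite in memory stands in for BigQuery (assumption); data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_table (id INTEGER, product TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?)",
    [
        (1, "apple", "2023-01-01 10:00:00"),
        (1, "apple", "2023-01-02 09:00:00"),
        (2, "pear", "2023-01-01 08:00:00"),
        (2, "pear", "2023-01-03 12:00:00"),
        (2, "pear", "2023-01-02 12:00:00"),
    ],
)

# Number the rows of each (id, product) group from newest to oldest,
# then keep only the newest row of each group.
latest = conn.execute(
    """
    SELECT id, product, dt
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY id, product
                                    ORDER BY dt DESC) AS rn
          FROM sample_table) t
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(latest)
```

ROW_NUMBER() is used here instead of RANK() so that each group yields exactly one row even if two entries share the same timestamp.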

Rank Function inside case statement

I am trying to use the RANK function inside a CASE statement and filter with WHERE rank_number = 1, but it throws an "unexpected WHERE condition" error. Can someone help me with how to use the rank in a WHERE clause inside a CASE statement?
You can't use the RANK() analytic function (or any other one, for that matter) in the WHERE clause of a query. The results of the rank computation are not yet available. But they are available in the SELECT clause or the ORDER BY clause. One workaround would be to subquery:
SELECT *
FROM
(
SELECT t.*, RANK() OVER (ORDER BY blah) rnk
FROM yourTable t
) s
WHERE rnk = 1;
Some databases support a QUALIFY clause, where it is possible to use analytic functions. Assuming you are using something like Teradata or BigQuery, you could use:
SELECT *
FROM yourTable
WHERE 1 = 1
QUALIFY RANK() OVER (ORDER BY blah) = 1;
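On databases without QUALIFY, the same idea can be written with a CTE, and the CASE expression then goes in the outer query where the rank is already computed. This is only a sketch of one reading of the question: it runs on Python's built-in sqlite3 (an assumption; SQLite 3.25 or later), and the table name your_table, the columns name and score, and the labels are hypothetical.

```python
import sqlite3

# SQLite in memory as a stand-in database (assumption); schema and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("a", 10), ("b", 30), ("c", 20)],
)

# Step 1 (CTE): compute the rank.
# Step 2 (outer query): the rank is now an ordinary column, so CASE can use it.
rows = conn.execute(
    """
    WITH ranked AS (
        SELECT name, score,
               RANK() OVER (ORDER BY score DESC) AS rnk
        FROM your_table
    )
    SELECT name,
           CASE WHEN rnk = 1 THEN 'top' ELSE 'other' END AS label
    FROM ranked
    ORDER BY name
    """
).fetchall()

print(rows)
```

The same outer query could also carry a plain WHERE rnk = 1 filter, since rnk is an ordinary column at that level.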

Use Rank() over a pseudo Column Name

I have a table with columns:
StudentName
Marks1
Marks2
from which I need to perform a query that will calculate the average of two marks and rank the rows from highest average to least.
I executed the following query:
SELECT
*,
(SELECT AVG(c) FROM (VALUES(Marks1),(Marks2)) T (c)) AS Average,
RANK() OVER (ORDER BY Average DESC) AS Position
from Marks;
But that gives an error:
Average is an Invalid Column Name.
How do I fix this? How do I give a query to perform Rank() over Average.
You can't reference a column by its alias in the SELECT; the only place you can reference its alias is in the ORDER BY clause.
What you can do, however, is move the subquery to the FROM, and then you can reference the column returned in your (outer) SELECT:
SELECT M.*,--List your columns here, don't use *
A.Average,
RANK() OVER (ORDER BY A.Average DESC) AS Position
FROM Marks M
CROSS APPLY(SELECT AVG(Mark) AS Average FROM (VALUES(Marks1),(Marks2)) V(Mark) ) A;
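Alternatively, you can inline the average of the two marks in the outer query. If the marks are stored as integers, divide by 2.0 rather than 2 so the average is not truncated by integer division:
SELECT *, RANK() OVER (ORDER BY (Marks1 + Marks2) / 2.0 DESC) AS Position
FROM Marks
ORDER BY (Marks1 + Marks2) / 2.0 DESC;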
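If CROSS APPLY is not available, a CTE gives the same effect: compute the average once, then rank over it by name. Below is a minimal sketch on Python's built-in sqlite3 rather than SQL Server (an assumption; SQLite 3.25 or later), with invented student rows and a shortened alias pos for the position column.

```python
import sqlite3

# SQLite in memory stands in for SQL Server (assumption); rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Marks (StudentName TEXT, Marks1 INTEGER, Marks2 INTEGER)")
conn.executemany(
    "INSERT INTO Marks VALUES (?, ?, ?)",
    [("Ann", 90, 80), ("Bob", 70, 75), ("Cy", 85, 85)],
)

# The CTE names the computed average, so the outer query can rank over it.
ranked = conn.execute(
    """
    WITH averaged AS (
        SELECT StudentName, (Marks1 + Marks2) / 2.0 AS average
        FROM Marks
    )
    SELECT StudentName, average,
           RANK() OVER (ORDER BY average DESC) AS pos
    FROM averaged
    ORDER BY pos, StudentName
    """
).fetchall()

print(ranked)
```

Ann and Cy have the same average here, so RANK() gives both position 1 and the next student position 3; DENSE_RANK() would give 2 instead.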

How to get the row number when using alias in orderby

I have a query in which I use an alias in the ORDER BY of row_number, and I get this error:
[42703] ERROR: column "total_comments" does not exist. Position: 335
How can I fix this?
select
cr_seller_history_id,
c.created_at,
company_name,
business_name,
brand,
kep_mail,
address,
phone,
mail,
slug,
name,
point,
contact_positive,
contact_negative,
product_number,
(product_positive + product_negative) as total_comments,
ROW_NUMBER() OVER(ORDER BY total_comments) as rank
from cr_companies a
INNER JOIN cr_sellers b ON a.cr_company_id = b.cr_company_id
INNER JOIN cr_seller_histories c ON b.cr_seller_id = c.cr_seller_id
WHERE DATE(c.created_at) = DATE 'yesterday'
ORDER BY total_comments DESC NULLS LAST
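You cannot reference a column alias in the same SELECT list that defines it. Besides repeating the expression, the other solutions are a subquery, a CTE, or a lateral join. With a lateral join, you can write: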
select . . .
v.total_comments,
row_number() over (order by v.total_comments) as rank
from cr_companies c join
cr_sellers s
on c.cr_company_id = s.cr_company_id join
cr_seller_histories sh
on s.cr_seller_id = sh.cr_seller_id, lateral
(values (product_positive + product_negative)) v(total_comments)
where DATE(sh.created_at) = date 'yesterday'
order by v.total_comments desc nulls last;
Notice that I also changed the table aliases to be abbreviations for the table names. This is a best practice and makes it much easier to write, read, and modify queries.
The problem is in:
ROW_NUMBER() OVER(ORDER BY total_comments) as rank
You can't use the alias like this: the query's own ORDER BY clause accepts a SELECT-list alias, but the ORDER BY inside a window function does not:
https://www.postgresql.org/docs/current/static/sql-select.html#SQL-SELECT-LIST
An output column's name can be used to refer to the column's value in
ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses;
there you must write out the expression instead.
Instead, try:
ROW_NUMBER() OVER(ORDER BY (product_positive + product_negative)) as rank
Or use a subquery; the alias can then be used in the window function of the outer query.
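Both options give the same numbering, which is easy to check on a toy table. The sketch below uses Python's built-in sqlite3 in place of PostgreSQL (an assumption; SQLite 3.25 or later), and a made-up sellers table that keeps only the two comment columns from the question.

```python
import sqlite3

# SQLite in memory stands in for PostgreSQL (assumption); table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sellers (name TEXT, product_positive INTEGER, product_negative INTEGER)"
)
conn.executemany(
    "INSERT INTO sellers VALUES (?, ?, ?)",
    [("s1", 5, 1), ("s2", 2, 0), ("s3", 7, 4)],
)

# Option 1: repeat the expression inside the window's ORDER BY.
repeated = conn.execute(
    """
    SELECT name,
           (product_positive + product_negative) AS total_comments,
           ROW_NUMBER() OVER (ORDER BY (product_positive + product_negative)) AS rn
    FROM sellers
    ORDER BY rn
    """
).fetchall()

# Option 2: define the alias in a subquery, then use it by name one level up.
nested = conn.execute(
    """
    SELECT name, total_comments,
           ROW_NUMBER() OVER (ORDER BY total_comments) AS rn
    FROM (SELECT name,
                 (product_positive + product_negative) AS total_comments
          FROM sellers) t
    ORDER BY rn
    """
).fetchall()

print(repeated)
print(nested)
```

Repeating the expression is shorter for a one-off; the subquery pays off once the same computed value is needed in several places.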

Postgres Window Function Syntax

Why does the following query:
select ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY time DESC) as rownum FROM users where rownum < 20;
produce the following error?
ERROR: column "rownum" does not exist
LINE 1: ...d ORDER BY time DESC) as rownum FROM users where rownum < 2...
How can I structure this query so that I get the first 20 items, as defined by my window function?
user_id and time are both defined columns on users.
It would work like this:
SELECT *
FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY time DESC) AS rownum
FROM users
) x
WHERE rownum < 20;
The point here is the sequence of events. Window functions are applied after the WHERE clause. Therefore rownum is not visible, yet. You have to put it into a subquery or CTE and apply the WHERE clause on rownum in the next query level.
Per documentation:
Window functions are permitted only in the SELECT list and the ORDER BY
clause of the query. They are forbidden elsewhere, such as in GROUP BY,
HAVING and WHERE clauses. This is because they logically execute
after the processing of those clauses. Also, window functions execute
after regular aggregate functions. This means it is valid to include
an aggregate function call in the arguments of a window function, but
not vice versa.
The WHERE clause is evaluated before the SELECT list, so it does not know about that alias yet. Do it like this:
select *
from (
select ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY time DESC) as rownum
FROM users
) s
where rownum < 20;
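One thing worth checking on real data: because the window has PARTITION BY user_id, the numbering restarts for every user, so the filter keeps the first rows per user rather than the first rows overall. A small sketch, using Python's built-in sqlite3 in place of PostgreSQL (an assumption; SQLite 3.25 or later), invented rows, and a limit of 2 instead of 20 to keep the output short:

```python
import sqlite3

# SQLite in memory stands in for PostgreSQL (assumption); rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, time INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, 10), (1, 30), (1, 20), (2, 5), (2, 7)],
)

# The window function runs in the inner query; the outer WHERE can then
# filter on rownum because it is an ordinary column at that level.
top2 = conn.execute(
    """
    SELECT user_id, time, rownum
    FROM (SELECT user_id, time,
                 ROW_NUMBER() OVER (PARTITION BY user_id
                                    ORDER BY time DESC) AS rownum
          FROM users) x
    WHERE rownum < 3
    ORDER BY user_id, rownum
    """
).fetchall()

print(top2)
```

Each user contributes up to two rows here. Dropping the PARTITION BY would number all rows together and give the first rows of the whole table instead.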