How to rank users and get a subset from this rank with my user and the above and below user by rank position - sql

I am working on a query right now to get a ranking of my users. I have two tables, one for users and another for profits, where I save the amount and the id of the user it relates to. From each user's total profit I need to build a rank with three users: the user in the next higher ranked position, my user, and the user in the next lower ranked position. For example:
id | name | total_profit | rank
-------+-----------------------------+--------------+------
10312 | John Doe | 7000.0 | 1
10329 | Michael Jordan | 5000.0 | 2
10333 | Kobe Bryant | 4000.0 | 3
10327 | Mike Bibby | 4000.0 | 3
10331 | Phil Jackson | 1000.0 | 4
In this example, if my user is Kobe Bryant I would need to get a rank with Michael Jordan, Kobe Bryant and Phil Jackson.
If my user is Mike Bibby I would need to get a rank with Michael Jordan, Mike Bibby and Phil Jackson.
So far I have a query that returns a full rank with all the users, but I do not know a good way to get just the three users that I want. I have tried to do this with Ruby, but I think it would be better to do all this processing in the DB.
SELECT users.id, users.name, total_profit, rank() OVER(ORDER BY total_profit DESC)
FROM users
INNER JOIN (SELECT sum(profits.amount) AS total_profit, investor_id
FROM profits GROUP BY profits.investor_id) profits ON profits.investor_id = users.id
ORDER BY total_profit DESC;
I am using PostgreSQL 9.1.4

with s as (
select
users.id, users.name, total_profit,
dense_rank() over(order by total_profit desc) as r -- dense_rank leaves no gap after ties, so r + 1 always exists
from
users
inner join
(
select sum(profits.amount) as total_profit,
investor_id
from profits
group by profits.investor_id
) profits on profits.investor_id = users.id
), u as (
select r from s where name = 'Kobe Bryant'
)
select distinct on (r) id, name, total_profit, r
from s
where
name = 'Kobe Bryant'
or r in (
(select r from u) - 1, (select r from u) + 1
)
order by r;

with cte_profits as (
select
sum(p.amount) as total_profit, p.investor_id
from profits as p
group by p.investor_id
), cte_users_profits as (
select
u.id, u.name, p.total_profit,
dense_rank() over(order by p.total_profit desc) as rnk,
row_number() over(partition by p.total_profit order by u.id) as rn
from users as u
inner join cte_profits as p on p.investor_id = u.id
)
select c2.*
from cte_users_profits as c
left outer join cte_users_profits as c2 on
c2.id = c.id or
c2.rnk = c.rnk + 1 and c2.rn = 1 or
c2.rnk = c.rnk - 1 and c2.rn = 1
where c.name = 'Kobe Bryant'
order by c2.rnk
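The neighbour lookup can be tried end to end with a small harness. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later), the individual profit rows are made up so that the totals match the example table, and the `neighbours` helper is a hypothetical name.

```python
import sqlite3

# Hypothetical in-memory data; totals mirror the example table in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE profits (investor_id INTEGER, amount REAL);
    INSERT INTO users VALUES
        (10312, 'John Doe'), (10329, 'Michael Jordan'), (10333, 'Kobe Bryant'),
        (10327, 'Mike Bibby'), (10331, 'Phil Jackson');
    INSERT INTO profits VALUES
        (10312, 7000), (10329, 3000), (10329, 2000),
        (10333, 4000), (10327, 4000), (10331, 1000);
""")

QUERY = """
WITH totals AS (
    SELECT investor_id, SUM(amount) AS total_profit
    FROM profits
    GROUP BY investor_id
),
ranked AS (
    SELECT u.id, u.name, t.total_profit,
           DENSE_RANK() OVER (ORDER BY t.total_profit DESC) AS rnk
    FROM users AS u
    JOIN totals AS t ON t.investor_id = u.id
),
me AS (SELECT rnk FROM ranked WHERE id = :user_id),
adjacent AS (
    -- one representative per neighbouring rank (lowest id wins a tie)
    SELECT r.name, r.total_profit, r.rnk,
           ROW_NUMBER() OVER (PARTITION BY r.rnk ORDER BY r.id) AS rn
    FROM ranked AS r
    WHERE r.rnk IN ((SELECT rnk FROM me) - 1, (SELECT rnk FROM me) + 1)
)
SELECT name, total_profit, rnk FROM ranked WHERE id = :user_id
UNION ALL
SELECT name, total_profit, rnk FROM adjacent WHERE rn = 1
ORDER BY rnk
"""

def neighbours(user_id):
    """Return the names of the user above, the user, and the user below."""
    return [row[0] for row in conn.execute(QUERY, {"user_id": user_id})]

print(neighbours(10333))
```

Looking the user up by id rather than by name avoids surprises when two users share a name; a user at the top or bottom of the ranking simply gets one neighbour.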

SQL MAX aggregate function not bringing the latest date

Purpose: I am trying to find the most recent purchase date for each teacher, along with the order type.
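Orders table
ID | Ordertype | Status    | TeacherID | PurchaseDate | SchoolID | TeacherassistantID
---+-----------+-----------+-----------+--------------+----------+-------------------
1  | Pencils   | Completed | 1         | 1/1/2021     | 1        | 1
2  | Paper     | Completed | 1         | 3/5/2021     | 1        | 1
3  | Notebooks | Completed | 1         | 4/1/2021     | 1        | 1
4  | Erasers   | Completed | 2         | 2/1/2021     | 2        | 2
Teachers table
TeacherID | Teachername
----------+------------
1         | Mary Smith
2         | Jason Crane
School table
ID | schoolname
---+-----------
1  | ABC school
2  | PS1
3  | PS2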
Here is my attempted code:
SELECT o.ordertype, o.status, t.Teachername, s.schoolname
,MAX(o.Purchasedate) OVER (PARTITION by t.ID) last_purchase
FROM orders o
INNER JOIN teachers t ON t.ID=o.TeacherID
INNER JOIN schools s ON s.ID=o.schoolID
WHERE o.status in ('Completed','In-progress')
AND o.ordertype not like 'notebook'
It should look like this:
Ordertype | Status    | teachername | last_purchase | schoolname
----------+-----------+-------------+---------------+-----------
Paper     | Completed | Mary Smith  | 3/5/2021      | ABC School
Erasers   | Completed | PS1         | 2/1/2021      | ABC school
It is returning multiple rows instead of just the latest purchase date and its associated row. I think I need a subquery.
Aggregation functions are not appropriate for what you are trying to do. Their purpose is to summarize values in multiple rows, not to choose a particular row.
A window function on its own does not filter out any rows.
You want to use window functions with filtering:
SELECT ordertype, status, Teachername, schoolname, Purchasedate
FROM (SELECT o.ordertype, o.status, t.Teachername, s.schoolname,
o.Purchasedate,
ROW_NUMBER() OVER (PARTITION by t.ID ORDER BY o.PurchaseDate DESC) as seqnum
FROM orders o JOIN
teachers t
ON t.ID = o.TeacherID JOIN
schools s
ON s.ID = o.schoolID
WHERE o.status in ('Completed', 'In-progress') AND
o.ordertype not like 'notebook'
) o
WHERE seqnum = 1;
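The difference between a bare window aggregate and a window function combined with a filter can be seen on the sample orders. The following is a minimal sketch, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of the asker's database; the simplified column names and ISO-formatted dates are assumptions made so that text ordering matches date ordering.

```python
import sqlite3

# Hypothetical, simplified copy of the Orders table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, ordertype TEXT,
                         teacher_id INTEGER, purchase_date TEXT);
    INSERT INTO orders VALUES
        (1, 'Pencils',   1, '2021-01-01'),
        (2, 'Paper',     1, '2021-03-05'),
        (3, 'Notebooks', 1, '2021-04-01'),
        (4, 'Erasers',   2, '2021-02-01');
""")

# MAX() OVER keeps every input row and only attaches the per-teacher maximum.
all_rows = conn.execute("""
    SELECT ordertype,
           MAX(purchase_date) OVER (PARTITION BY teacher_id) AS last_purchase
    FROM orders
    WHERE ordertype <> 'Notebooks'
""").fetchall()

# ROW_NUMBER() plus an outer filter keeps only the newest row per teacher.
latest = conn.execute("""
    SELECT teacher_id, ordertype, purchase_date
    FROM (SELECT o.*,
                 ROW_NUMBER() OVER (PARTITION BY teacher_id
                                    ORDER BY purchase_date DESC) AS seqnum
          FROM orders AS o
          WHERE ordertype <> 'Notebooks') AS numbered
    WHERE seqnum = 1
    ORDER BY teacher_id
""").fetchall()

print(len(all_rows), latest)
```

The first query returns one row per remaining order, while the second returns one row per teacher, which is the shape the question asks for.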
You can also do it a different way: group by the other columns and then order all the records by purchase date, as below. Note that TOP 1 returns only the single most recent row overall, not one row per teacher.
SELECT top 1 o.ordertype, o.status, t.Teachername, s.schoolname
,o.Purchasedate
FROM orders o
INNER JOIN teachers t ON t.ID=o.TeacherID
INNER JOIN schools s ON s.ID=o.schoolID
WHERE o.status in ('Completed','In-progress')
AND o.ordertype not like 'notebook'
group by o.ordertype, o.status, t.Teachername, s.schoolname, o.Purchasedate
order by o.Purchasedate Desc

Top 10 of total amount paid aggregated by provider, partitioned by state - PostgreSQL

I have a database of medicare data with three tables: provider metadata (doctor's unique number, name, city, state, credentials, etc); hcpcs metadata (code, description, if it's for drugs or not); provider_services (doctor's unique number, hcpcs code, number of services completed by that doctor, average cost)
I'm trying to get the top 10 payments by state, aggregated by provider. However I'm running into an issue where 1) I can't figure out how to rank by the total payment and 2) I can't figure out how to aggregate the providers. Here's the best query I've gotten so far:
SELECT *
FROM (
SELECT p.npi,
p.nppes_provider_last_org_name AS last_name,
p.nppes_provider_first_name AS first_name,
p.nppes_provider_city AS city,
p.nppes_provider_state AS state,
(ps.average_medicare_payment_amt * ps.line_srvc_cnt) AS total_amount,
RANK() OVER (PARTITION BY p.nppes_provider_state ORDER BY ps.average_medicare_payment_amt desc) AS rank
FROM provider_services ps
JOIN provider p ON ps.npi = p.npi
) t
WHERE rank <= 10
GROUP BY t.last_name, t.npi, t.first_name, t.city, t.state, t.total_amount, t.rank
ORDER BY state ASC;
This results in something like:
| LAST | FIRST| STATE | TOTAL | RANK |
|-------|------|----|---------|---|
| DOE | JANE | AK | 3000.41 | 10|
| SMITH | JOHN | AK | 6000.98 | 7 |
| COLE | ANN | AK | 1000 | 4 |
| SMITH | JOHN | AK | 1560.32 | 1 |
So my issues are 1. the providers aren't aggregating (John Smith with the same unique number showing up multiple times) and 2. I can only get it to compile with that average_payment_amt and not total_amt so the rankings are really screwed up.
Consider following adjustments:
Avoid using SELECT * in aggregate queries with GROUP BY. This query runs in PostgreSQL without error only because every column of the subquery happens to be listed in GROUP BY, so SELECT * is effectively shorthand for the grouping columns.
Use calculated expression for total_amount in the window function's ORDER BY clause.
Apply an aggregation function like SUM on your total_amount and do not include it as grouping column. In fact, you do not mention how you want to aggregate by provider.
Rank based on state throws off aggregation based on different column: provider. Right now it appears you want to use rank only for filtering records and not display.
Below achieves the following:
Sums total payment amounts by provider for the top 10 payment amounts per state.
SELECT t.npi, t.last_name, t.first_name, t.city, t.state,
SUM(t.total_amount) AS total_amount
FROM (
SELECT p.npi,
p.nppes_provider_last_org_name AS last_name,
p.nppes_provider_first_name AS first_name,
p.nppes_provider_city AS city,
p.nppes_provider_state AS state,
(ps.average_medicare_payment_amt * ps.line_srvc_cnt) AS total_amount,
RANK() OVER (PARTITION BY p.nppes_provider_state
ORDER BY ps.average_medicare_payment_amt * ps.line_srvc_cnt DESC) AS rank
FROM provider_services ps
JOIN provider p ON ps.npi = p.npi
) t
WHERE rank <= 10
GROUP BY t.npi, t.last_name, t.first_name, t.city, t.state
ORDER BY t.state ASC;
Now, below achieves the following if this is your intention:
Displays records of top 10 payments per state in state and rank order (where providers can repeat if they ranked multiple times within or between states).
SELECT t.*
FROM (
SELECT p.npi,
p.nppes_provider_last_org_name AS last_name,
p.nppes_provider_first_name AS first_name,
p.nppes_provider_city AS city,
p.nppes_provider_state AS state,
(ps.average_medicare_payment_amt * ps.line_srvc_cnt) AS total_amount,
RANK() OVER (PARTITION BY p.nppes_provider_state
ORDER BY ps.average_medicare_payment_amt * ps.line_srvc_cnt DESC) AS rank
FROM provider_services ps
JOIN provider p ON ps.npi = p.npi
) t
WHERE rank <= 10
ORDER BY t.state, t.rank;
I am guessing that you actually want to aggregate in the subquery and rank by the total amount:
SELECT t.*
FROM (SELECT p.npi,
p.nppes_provider_last_org_name AS last_name,
p.nppes_provider_first_name AS first_name,
p.nppes_provider_state AS state,
SUM(ps.average_medicare_payment_amt * ps.line_srvc_cnt) AS total_amount,
RANK() OVER (PARTITION BY p.nppes_provider_state ORDER BY SUM(ps.average_medicare_payment_amt * ps.line_srvc_cnt) DESC) as rnk
FROM provider_services ps JOIN
provider p
ON ps.npi = p.npi
GROUP BY p.npi, p.nppes_provider_last_org_name, p.nppes_provider_first_name, p.nppes_provider_state
) t
WHERE rnk <= 10
ORDER BY state ASC, total_amount DESC;
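To check the aggregate-then-rank idea on a small scale, here is a hedged sketch using Python's sqlite3 module (SQLite 3.25 or later assumed). The tables are cut down to a few columns, all rows are invented for illustration, and the limit is 2 per state instead of 10 so the cut-off is visible.

```python
import sqlite3

# Hypothetical miniature versions of the provider and provider_services tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (npi INTEGER PRIMARY KEY, last_name TEXT, state TEXT);
    CREATE TABLE provider_services (npi INTEGER, hcpcs_code TEXT,
                                    line_srvc_cnt INTEGER,
                                    average_medicare_payment_amt REAL);
    INSERT INTO provider VALUES
        (1, 'SMITH', 'AK'), (2, 'DOE', 'AK'), (3, 'COLE', 'AK'), (4, 'LEE', 'AL');
    INSERT INTO provider_services VALUES
        (1, 'A', 10, 100.0), (1, 'B', 5, 200.0),
        (2, 'A', 3, 500.0), (3, 'A', 1, 900.0), (4, 'A', 2, 50.0);
""")

top_per_state = conn.execute("""
    WITH totals AS (
        -- step 1: one row per provider with the summed payment
        SELECT p.npi, p.last_name, p.state,
               SUM(ps.average_medicare_payment_amt * ps.line_srvc_cnt) AS total_amount
        FROM provider_services AS ps
        JOIN provider AS p ON p.npi = ps.npi
        GROUP BY p.npi, p.last_name, p.state
    ),
    ranked AS (
        -- step 2: rank the per-provider totals within each state
        SELECT t.*,
               RANK() OVER (PARTITION BY state ORDER BY total_amount DESC) AS rnk
        FROM totals AS t
    )
    SELECT state, last_name, total_amount, rnk
    FROM ranked
    WHERE rnk <= 2
    ORDER BY state, rnk
""").fetchall()

print(top_per_state)
```

Splitting the aggregation and the ranking into two CTEs keeps each step easy to inspect; a provider now appears once per state because the ranking runs over totals, not over individual service rows.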

SQL Max Value for a Specified Limit

I'm trying to return a list of years when certain conditions are met but I am having trouble with the MAX function and having it work with the rest of my logic.
For the following two tables:
coach
coach | team | wins | year
------+------+------+------
nk01 | a | 4 | 2000
vx92 | b | 1 | 2000
nk01 | b | 5 | 2003
vx92 | a | 2 | 2003
team
team | worldcupwin | year
-----+-------------+------
a | Y | 2000
b | N | 2000
a | Y | 2003
b | N | 2003
I want to get the following output:
years
-----
2000
The years printed should be those in which the team whose coach had the most wins that year also won the world cup.
I decided to use the MAX function but quickly ran into the problem of not knowing how to use it to only be looking for max values for a certain year. This is what I've got so far:
SELECT y.year
FROM (SELECT c.year, MAX(c.wins), c.team
FROM coach AS c
WHERE c.year >= 1999
GROUP BY c.year, c.team) AS y, teams AS t
WHERE y.year = t.year AND t.worldcupwin = 'Y' AND y.team = t.team;
This query outputs all years greater than 1999 for me, rather than just those where a coach with the most wins also won the world cup.
(Using postgresql)
Any help is appreciated!
You can use a correlated subquery:
SELECT c.year, c.team
FROM coach AS c inner join team t on c.team = t.team and c.year=t.year
WHERE c.year >= 1999 and exists (select 1 from coach c1 where c.year=c1.year
having max(c1.wins)=c.wins)
and t.worldcupwin = 'Y'
OUTPUT:
year team
2000 a
The following query uses DISTINCT ON:
SELECT DISTINCT ON (year) c.year, wins, worldcupwin, c.team
FROM coach AS c
INNER JOIN team AS t ON c.team = t.team AND c.year = t.year
WHERE c.year > 1999
ORDER BY year, wins DESC
in order to return the records having the biggest number of wins per year
year wins worldcupwin team
---------------------------------
2000 4 Y a
2003 5 N b
Filtering out teams that didn't win the world cup:
SELECT year, team
FROM (
SELECT DISTINCT ON (year) c.year, wins, worldcupwin, c.team
FROM coach AS c
INNER JOIN team AS t ON c.team = t.team AND c.year = t.year
WHERE c.year > 1999
ORDER BY year, wins DESC) AS t
WHERE t.worldcupwin = 'Y'
ORDER BY year, wins DESC
gives the expected result:
year team
-------------
2000 a
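You can use the query below to get the desired result for the sample data. It uses LIMIT 1 because PostgreSQL does not support TOP. Note that it simply returns the world-cup-winning row with the most wins overall; it does not check that the team had the most wins in its year.
EASY METHOD
SELECT c.year
FROM coach AS c INNER JOIN team AS t ON c.team = t.team AND c.year = t.year
WHERE t.worldcupwin = 'Y'
ORDER BY c.wins DESC
LIMIT 1;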
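Use the row_number() window function, numbering the rows within each year:
select a.coach,a.team,a.wins,a.year from
(select c.coach,c.team,c.wins,c.year,t.worldcupwin,
row_number()over(partition by c.year order by c.wins desc) rn
from coach c join team t on c.team=t.team and c.year=t.year
) a where a.rn=1 and a.worldcupwin='Y'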
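The per-year logic can be verified against the sample tables from the question. This is a sketch that assumes Python's sqlite3 module (SQLite 3.25 or later) instead of PostgreSQL, and it swaps in RANK() rather than DISTINCT ON or row_number() so that teams tied for the most wins in a year are all kept.

```python
import sqlite3

# Sample data copied from the question's coach and team tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coach (coach TEXT, team TEXT, wins INTEGER, year INTEGER);
    CREATE TABLE team (team TEXT, worldcupwin TEXT, year INTEGER);
    INSERT INTO coach VALUES
        ('nk01', 'a', 4, 2000), ('vx92', 'b', 1, 2000),
        ('nk01', 'b', 5, 2003), ('vx92', 'a', 2, 2003);
    INSERT INTO team VALUES
        ('a', 'Y', 2000), ('b', 'N', 2000), ('a', 'Y', 2003), ('b', 'N', 2003);
""")

years = [row[0] for row in conn.execute("""
    SELECT r.year
    FROM (SELECT c.team, c.year,
                 -- rank 1 marks the team(s) with the most wins in that year
                 RANK() OVER (PARTITION BY c.year ORDER BY c.wins DESC) AS rnk
          FROM coach AS c
          WHERE c.year >= 1999) AS r
    JOIN team AS t ON t.team = r.team AND t.year = r.year
    WHERE r.rnk = 1 AND t.worldcupwin = 'Y'
    ORDER BY r.year
""")]

print(years)
```

The ranking has to happen before the world cup filter; filtering first would rank only the winners against each other and report 2003 as well.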

Max function returning multiple values [SQL]

I have 3 tables: money, student, faculty. This query returns each faculty and highest stipend in each one of them.
select
f.name as "FACULTY_NAME",
max(stipend) as "MAX_STIPEND"
from
money m, student s
inner join
faculty f on f.id_faculty = s.faculty_id
where
m.student_id = s.id_student
group by
f.id_faculty, f.name;
Query works fine:
FACULTY_NAME | MAX_STIPEND
-----------------+---------------
IT Faculty | 50
Architecture | 60
Journalism | 40
However when I add s.name to original query to also show the name of the student who received max_stipend, query is not working like it used to - it returns all of the students
select
f.name as "FACULTY_NAME",s.name,
max(stipend) as "MAX_STIPEND"
from
money m, student s
inner join
faculty f on f.id_faculty = s.faculty_id
where
m.student_id = s.id_student
group by
f.id_faculty, f.name, s.name;
Query result:
FACULTY_NAME | s.name | MAX_STIPEND
----------------+-----------+---------------
IT Faculty | Joe | 50
IT Faculty | Lisa | 10
Architecture | Bob | 60
Journalism | Fred | 5
Architecture | Susan | 5
Journalism | Tom | 40
It does the same thing using right, left and inner joins. Can someone tell where the problem is?
First, you should be using proper JOIN syntax for all your joins.
Second, you can use Oracle's keep syntax:
select f.name as FACULTY_NAME,
max(stipend) as MAX_STIPEND,
max(s.name) keep (dense_rank first order by stipend desc)
from money m join
student s
on m.student_id = s.id_student join
faculty f
on f.id_faculty = s.faculty_id
group by f.id_faculty, f.name;
You wrote: "However when I add s.name to original query to also show the name of the student who received max_stipend, query is not working like it used to - it returns all of the students"
When you add s.name to the GROUP BY, each student becomes its own group, so you get the maximum stipend for each student rather than for each faculty.
If you want the name of the student who has the MAX_STIPEND, you should move to window functions, for example DENSE_RANK in MS SQL Server:
with cte as
(select
f.name as "FACULTY_NAME",
s.name as "STUDENT_NAME",
m.stipend as "MAX_STIPEND",
DENSE_RANK() OVER
(PARTITION BY f.id_faculty ORDER BY m.stipend DESC) AS rnk
from
money m
inner join student s on m.student_id = s.id_student
inner join
faculty f on f.id_faculty = s.faculty_id
)
select "FACULTY_NAME", "STUDENT_NAME", "MAX_STIPEND"
from cte
where rnk = 1
Not all SQL dialects have window functions, so check your database's documentation. DENSE_RANK is available in Oracle and in MySQL 8.0 and later.
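The effect of partitioning by faculty only can be checked with the rows from the question's second result set. This is a minimal sketch, assuming Python's sqlite3 module (SQLite 3.25 or later) as the engine; the numeric ids are invented, since the question does not show them.

```python
import sqlite3

# Hypothetical ids; names and stipends follow the question's result table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (id_faculty INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (id_student INTEGER PRIMARY KEY, name TEXT,
                          faculty_id INTEGER);
    CREATE TABLE money (student_id INTEGER, stipend INTEGER);
    INSERT INTO faculty VALUES (1, 'IT Faculty'), (2, 'Architecture'), (3, 'Journalism');
    INSERT INTO student VALUES
        (1, 'Joe', 1), (2, 'Lisa', 1), (3, 'Bob', 2),
        (4, 'Susan', 2), (5, 'Fred', 3), (6, 'Tom', 3);
    INSERT INTO money VALUES (1, 50), (2, 10), (3, 60), (4, 5), (5, 5), (6, 40);
""")

top_students = conn.execute("""
    SELECT faculty_name, student_name, stipend
    FROM (SELECT f.name AS faculty_name, s.name AS student_name, m.stipend,
                 -- partition by faculty only, so students compete within it
                 DENSE_RANK() OVER (PARTITION BY f.id_faculty
                                    ORDER BY m.stipend DESC) AS rnk
          FROM money AS m
          JOIN student AS s ON s.id_student = m.student_id
          JOIN faculty AS f ON f.id_faculty = s.faculty_id) AS ranked
    WHERE rnk = 1
    ORDER BY faculty_name
""").fetchall()

print(top_students)
```

If the student name were part of the PARTITION BY, every student would be alone in a partition and would get rank 1, which reproduces the "returns all of the students" symptom.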

SQL Server - select min date and id from foreign key

These are my tables:
USER:
id_user name email last_access id_company
1 jhonatan abc#abc.com 2014-12-15 1
2 cesar cef#cef.com 2014-12-31 1
3 john 123#123.com 2015-01-09 2
4 steven 897#asdd.cpom 2015-01-02 2
5 greg sd#touch.com 2014-12-07 1
6 kyle fb#fb.com 2014-11-20 1
COMPANY:
id_company company
1 Facebook
2 Appslovers
I need to know which user has the MIN last_access per company (just one per company). The result could look like this:
id_user name last_access company
6 kyle 2014-11-20 Facebook
4 steven 2015-01-02 Appslovers
Is it possible?
Use a window function:
SELECT id_user,
NAME,
last_access,
company
FROM (SELECT id_user,
NAME,
last_access,
company,
Row_number()OVER(partition BY company ORDER BY last_access) rn
FROM users u
JOIN company c
ON u.id_company = c.id_company) a
WHERE rn = 1
Or join both tables, find the min last_access date per company, then join that result back to the users table:
SELECT id_user,
NAME,
a.last_access,
a.company
FROM users u
JOIN(SELECT u.id_company,
Min(last_access) last_access,
company
FROM users u
JOIN company c
ON u.id_company = c.id_company
GROUP BY u.id_company,
company) a
ON a.id_company = u.id_company
AND u.last_access = a.last_access
This can be done in many ways, for example by using a window function like row_number to partition the data and then selecting the top rows from each group like this:
;with cte (id_user, name, last_access, company, seq) as (
select
id_user,
name,
last_access,
company,
seq = row_number() over (partition by u.id_company order by last_access)
from [user] u
inner join [company] c on u.id_company = c.id_company
)
select id_user, name, last_access, company
from cte where seq = 1
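The join-back-to-MIN approach can be run against the sample rows as follows. This is a sketch only: it assumes Python's sqlite3 module as the engine rather than SQL Server, names the table `users` to sidestep the reserved word, and omits the email column.

```python
import sqlite3

# Sample rows from the question (email column left out for brevity).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id_company INTEGER PRIMARY KEY, company TEXT);
    CREATE TABLE users (id_user INTEGER PRIMARY KEY, name TEXT,
                        last_access TEXT, id_company INTEGER);
    INSERT INTO company VALUES (1, 'Facebook'), (2, 'Appslovers');
    INSERT INTO users VALUES
        (1, 'jhonatan', '2014-12-15', 1), (2, 'cesar',  '2014-12-31', 1),
        (3, 'john',     '2015-01-09', 2), (4, 'steven', '2015-01-02', 2),
        (5, 'greg',     '2014-12-07', 1), (6, 'kyle',   '2014-11-20', 1);
""")

oldest = conn.execute("""
    SELECT u.id_user, u.name, u.last_access, c.company
    FROM users AS u
    JOIN company AS c ON c.id_company = u.id_company
    -- derived table: earliest access date per company
    JOIN (SELECT id_company, MIN(last_access) AS first_access
          FROM users
          GROUP BY id_company) AS m
      ON m.id_company = u.id_company
     AND m.first_access = u.last_access
    ORDER BY u.id_user
""").fetchall()

print(oldest)
```

Unlike the row_number variants, this form returns every user tied for the earliest date in a company, so it only guarantees "just one" when the minimum is unique.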