I'm trying to retrieve the top 5 marks from a school database in a Databricks Delta table using a SQL query. I wrote the following query:
select
rs.State, rs.Year, rs.CW, rs.Country, rs.SchoolName,
rs.EducationSystem, rs.MarksS1, rs.MarksS2, rs.MarksS3, rs.MarksS4,
rs.TotalMarks, rs.group_rank
from
(select
State, Year, CW, Country, SchoolName, EducationSystem,
MarksS1, MarksS2, MarksS3, MarksS4, TotalMarks,
row_number() over (partition by State, Year, Country, SchoolName, EducationSystem
order by TotalMarks DESC, MarksS1 ASC) as group_rank
from
nm_combined_historical_data_distinct) rs
where
group_rank < 6
I'm trying to get the top 5 students based on marks. If the top 5 students have the same marks, I want my output to look like this:
State year Country School Name Education System MarksS1 MarksS2 MarksS3 MarksS4 Total
AZ 2020 US XYZ ABC 95 91 92 95 373
AZ 2020 US XYZ ABC 95 91 92 95 373
AZ 2020 US XYZ ABC 95 91 92 95 373
AZ 2020 US XYZ ABC 95 91 92 95 373
AZ 2020 US XYZ ABC 95 91 92 95 373
But my output comes out like this:
State year Country School Name Education System MarksS1 MarksS2 MarksS3 MarksS4 Total
AZ 2020 US XYZ ABC 95 91 92 95 373
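Can you suggest how I can get my desired output?
Instead of row_number(), use rank(). Unlike row_number(), it gives tied rows the same rank, so every row that ties within the top 5 is returned: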
rank() over (partition by State, Year, Country, SchoolName, EducationSystem
order by TotalMarks DESC, MarksS1 ASC
) as group_rank
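To see how the ranking functions treat ties, here is a minimal sketch that runs the same window over a made-up table using Python's sqlite3 module. The table name marks, the reduced column list, and the sample rows are assumptions for illustration only, and it assumes the bundled SQLite is 3.25 or later (window function support). It was not run on Databricks.

```python
import sqlite3

# Hypothetical sample: six students tied on TotalMarks and MarksS1, one below them.
rows = [("AZ", 2020, "XYZ", 95, 373)] * 6 + [("AZ", 2020, "XYZ", 90, 360)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE marks (State TEXT, Year INT, SchoolName TEXT, MarksS1 INT, TotalMarks INT)"
)
conn.executemany("INSERT INTO marks VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT TotalMarks,
       ROW_NUMBER() OVER w AS rn,    -- always unique, ties broken arbitrarily
       RANK()       OVER w AS rnk,   -- ties share a rank, gaps afterwards
       DENSE_RANK() OVER w AS drnk   -- ties share a rank, no gaps
FROM marks
WINDOW w AS (PARTITION BY State, Year, SchoolName
             ORDER BY TotalMarks DESC, MarksS1 ASC)
ORDER BY rn
"""
result = conn.execute(query).fetchall()

kept_by_row_number = [r for r in result if r[1] < 6]  # cuts off one tied student
kept_by_rank = [r for r in result if r[2] < 6]        # keeps all six tied students
print(len(kept_by_row_number), len(kept_by_rank))
```

With row_number() the filter group_rank < 6 keeps exactly five rows per partition, while rank() keeps every row tied with them. dense_rank() would instead keep the rows holding the five highest distinct totals.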
Table A
shop   amount  count  sameShopCount
shop5  100     1      1
shop2  99      2      1
shop3  98      3      1
shop4  97      4      1
shop1  96      5      1
shop2  95      6      2
shop4  94      7      2
shop5  93      8      2
shop5  92      9      3
shop1  91      10     2
shop5  90      11     4
shop3  89      12     2
Expected Result (order by amount desc):
shop   amount  expected result
shop5  100     1
shop2  99      2
shop3  98      3
shop4  97      4
shop1  96      5
shop2  95      2
shop4  94      4
shop5  93      1
shop5  92      1
shop1  91      5
shop5  90      1
shop3  89      3
I want to number the shop column the way the count column does in Table A, but if a shop appears more than once, every later row should reuse the number from its first appearance.
How can I achieve this, with or without a temp table, in SQL Server? (SQL Server 2014 - build v12.0.6108.1)
I have tried something like:
ROW_NUMBER() OVER (ORDER BY amount DESC)
DENSE_RANK() OVER (PARTITION BY shop ORDER BY amount DESC)
Try using the max and dense_rank window functions, as follows:
with max_shop_amount as
(
select *,
max(amount) over (partition by shop) as mx
from table_name
)
select shop, amount,
dense_rank() over (order by mx desc) expected
from max_shop_amount
order by amount desc
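As a quick check, the query above can be run against the rows of Table A. This is a sketch using Python's sqlite3 module rather than SQL Server, assuming SQLite 3.25 or later. The table name table_name comes from the answer; everything else about the setup is illustrative.

```python
import sqlite3

# Rows of Table A (shop, amount), in the order given in the question.
shops = [
    ("shop5", 100), ("shop2", 99), ("shop3", 98), ("shop4", 97),
    ("shop1", 96), ("shop2", 95), ("shop4", 94), ("shop5", 93),
    ("shop5", 92), ("shop1", 91), ("shop5", 90), ("shop3", 89),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (shop TEXT, amount INT)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", shops)

query = """
WITH max_shop_amount AS (
    -- every row of a shop carries that shop's highest amount
    SELECT shop, amount, MAX(amount) OVER (PARTITION BY shop) AS mx
    FROM table_name
)
SELECT shop, amount,
       DENSE_RANK() OVER (ORDER BY mx DESC) AS expected
FROM max_shop_amount
ORDER BY amount DESC
"""
ranked = conn.execute(query).fetchall()
expected = [row[2] for row in ranked]
print(expected)
```

Note that two different shops sharing the same maximum amount would receive the same number with this approach. The sample data has no such tie.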
I'm trying to return the top 3 spending customers per country for a table like this:
customer_id  country  spend
159          China    45
152          China    8
159          China    21
160          China    6
161          China    9
162          China    93
152          China    3
168          Germany  91
169          Germany  101
170          Germany  38
171          Germany  17
154          Germany  11
154          Germany  50
167          Germany  63
168          Germany  1
153          Japan    7
163          Japan    58
164          Japan    44
153          Japan    19
164          Japan    10
165          Japan    15
166          Japan    24
153          Japan    105
I've tried the code below, but it's not returning the correct results.
SELECT customer_id, country, spend FROM (SELECT customer_id, country, spend,
@country_rank := IF(@current_country = country, @country_rank + 1, 1)
AS country_rank,
@current_country := country
FROM table1
ORDER BY country ASC, spend DESC) ranked_rows
WHERE country_rank<=3;
Since some customers are also repeat customers, I want to make sure that it's the sum of spend per customer that's being taken into account.
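You appear to be using MySQL. If you're running version 8 or later, sum the spend per customer first, then use ROW_NUMBER() on the totals:
WITH totals AS (
    SELECT customer_id, country, SUM(spend) AS total_spend
    FROM table1
    GROUP BY customer_id, country
),
cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY country ORDER BY total_spend DESC) rn
    FROM totals
)
SELECT customer_id, country, total_spend
FROM cte
WHERE rn <= 3;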
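Because the question asks for the sum of spend per customer, the ranking has to run over per-customer totals rather than individual rows. Here is a minimal sketch of that aggregate-then-rank approach on the sample data, run through Python's sqlite3 module instead of MySQL and assuming SQLite 3.25 or later. The names totals, ranked, and total_spend are my own.

```python
import sqlite3

# Sample rows from the question: (customer_id, country, spend).
spend_rows = [
    (159, "China", 45), (152, "China", 8), (159, "China", 21), (160, "China", 6),
    (161, "China", 9), (162, "China", 93), (152, "China", 3),
    (168, "Germany", 91), (169, "Germany", 101), (170, "Germany", 38),
    (171, "Germany", 17), (154, "Germany", 11), (154, "Germany", 50),
    (167, "Germany", 63), (168, "Germany", 1),
    (153, "Japan", 7), (163, "Japan", 58), (164, "Japan", 44), (153, "Japan", 19),
    (164, "Japan", 10), (165, "Japan", 15), (166, "Japan", 24), (153, "Japan", 105),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (customer_id INT, country TEXT, spend INT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", spend_rows)

query = """
WITH totals AS (
    -- one row per customer and country, with the summed spend
    SELECT customer_id, country, SUM(spend) AS total_spend
    FROM table1
    GROUP BY customer_id, country
),
ranked AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY country
                                 ORDER BY total_spend DESC) AS rn
    FROM totals
)
SELECT country, customer_id, total_spend
FROM ranked
WHERE rn <= 3
ORDER BY country, rn
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

If tied customers should all be returned, RANK() can be swapped in for ROW_NUMBER(), at the cost of possibly getting more than three rows per country.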
Here is our table
name math physics chemistry hindi english
pk 85 65 45 54 40
ashis 87 44 87 78 74
rohit 77 47 68 63 59
mayank 91 81 78 47 84
komal 47 51 73 61 55
We want the result to show as follows (essentially summing the grades):
rank name total
1 mayank 381
2 ashis 370
3 rohit 314
4 pk 289
5 komal 287
SET @rank=0;
SELECT @rank:=@rank+1 AS rank,name,(math+physics+chemistry+hindi+english) as total
FROM tablename ORDER BY total DESC
This will produce your desired result:
rank | name | total
--------------------
1 | mayank | 381
2 | ashis | 370
For more details, take a look at MySQL ranking results.
Try this:
SELECT @curRank := @curRank + 1 AS rank, name, (math + physics + chemistry + hindi + english) AS total FROM tablename, (SELECT @curRank := 0) r ORDER BY total DESC;
This sums all the subject columns, sorts the totals in descending order, and adds a rank.
By doing SELECT @curRank := 0 in the FROM clause, you can keep it all in one SQL statement without having to do a SET first.
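On engines that support window functions, the same ranking can be written without user variables. The sketch below is an alternative to the answers above, not a restatement of them. It runs through Python's sqlite3 module (assuming SQLite 3.25 or later), uses the table name tablename from the first answer, and uses an alias rnk that I chose.

```python
import sqlite3

# Sample rows from the question: name, math, physics, chemistry, hindi, english.
grades = [
    ("pk", 85, 65, 45, 54, 40),
    ("ashis", 87, 44, 87, 78, 74),
    ("rohit", 77, 47, 68, 63, 59),
    ("mayank", 91, 81, 78, 47, 84),
    ("komal", 47, 51, 73, 61, 55),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (name TEXT, math INT, physics INT, "
    "chemistry INT, hindi INT, english INT)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?, ?, ?)", grades)

query = """
SELECT RANK() OVER (ORDER BY total DESC) AS rnk, name, total
FROM (
    -- compute the total once so the window can order by it
    SELECT name, math + physics + chemistry + hindi + english AS total
    FROM tablename
) t
ORDER BY rnk
"""
ranking = conn.execute(query).fetchall()
for row in ranking:
    print(row)
```

Unlike the counter approach, RANK() gives students with equal totals the same rank.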
What is the difference between PERCENTILE_DISC and PERCENTILE_CONT?
I have a table:
select * from childstat
FIRSTNAME GENDER BIRTHDATE HEIGHT WEIGHT
-------------------------------------------------- ------ --------- ---------- ----------
lauren f 10-JUN-00 54 876
rosemary f 08-MAY-00 35 123
Albert m 02-AUG-00 15 923
buddy m 02-OCT-00 15 150
furkar m 05-JAN-00 76 198
simon m 03-JAN-00 87 256
tommy m 11-DEC-00 78 167
And I am trying to differentiate between the two percentile functions:
select firstname,height,
percentile_cont(.50) within group (order by height) over() as pctcont_50_ht,
percentile_cont(.72) within group (order by height) over() as pctcont_72_ht,
percentile_disc(.50) within group (order by height) over () as pctdisc_50_ht,
percentile_disc(.72) within group (order by height) over () as pctdisc_72_ht
from childstat order by height
FIRSTNAME HEIGHT PCTCONT_50_HT PCTCONT_72_HT PCTDISC_50_HT PCTDISC_72_HT
-------------------------------------------------- ---------- ------------- ------------- ------------- -------------
buddy 15 54 76.64 54 78
Albert 15 54 76.64 54 78
rosemary 35 54 76.64 54 78
lauren 54 54 76.64 54 78
furkar 76 54 76.64 54 78
tommy 78 54 76.64 54 78
simon 87 54 76.64 54 78
But I still can't understand how these two differ or what each function is used for.
PERCENTILE_DISC returns a value that actually exists in your set/window, whereas PERCENTILE_CONT interpolates between neighbouring values.
In your query, with .72, PERCENTILE_CONT interpolates between 76 and 78 because the 72% point falls between them, giving 76.64. PERCENTILE_DISC returns 78, the first value in sorted order whose cumulative distribution is at least 0.72 (76 only reaches 5/7, about 0.714).
I found this explanation very helpful
http://mfzahirdba.blogspot.com/2012/09/difference-between-percentilecont-and.html
ITEM REGION WK FORECASTQTY
---- ---------- ---------- -----------
TEST E 3 137
TEST E 2 190
TEST E 1 232
TEST E 4 400
SELECT
t.* ,
PERCENTILE_CONT(0.5)
WITHIN GROUP ( ORDER BY forecastqty)
OVER (PARTITION BY ITEM , region ) AS PERCENTILE_CONT ,
MEDIAN(forecastqty)
OVER (PARTITION BY ITEM , region ) AS MEDIAN ,
PERCENTILE_DISC(0.5)
WITHIN GROUP ( ORDER BY forecastqty)
OVER (PARTITION BY ITEM , region ) AS PERCENTILE_DISC
FROM
t ;
ITEM REGION WK FORECASTQTY PERCENTILE_CONT MEDIAN PERCENTILE_DISC
---- ---------- ---------- ----------- --------------- ---------- ---------------
TEST E 3 137 211 211 190
TEST E 2 190 211 211 190
TEST E 1 232 211 211 190
TEST E 4 400 211 211 190
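The two definitions can be reproduced by hand. Below is a minimal Python sketch, assuming the usual Oracle definitions: PERCENTILE_CONT interpolates linearly at row number 1 + p * (n - 1), and PERCENTILE_DISC returns the first sorted value whose cumulative distribution is at least p. The function names are mine, and NULL handling is left out.

```python
import math

def percentile_cont(values, p):
    """Interpolate linearly at the 1-based row number 1 + p * (n - 1)."""
    ordered = sorted(values)
    rn = 1 + p * (len(ordered) - 1)           # may be fractional
    lo, hi = math.floor(rn), math.ceil(rn)    # neighbouring row numbers
    low_val, high_val = ordered[lo - 1], ordered[hi - 1]
    return low_val + (rn - lo) * (high_val - low_val)

def percentile_disc(values, p):
    """Return the first sorted value whose cumulative distribution is >= p."""
    ordered = sorted(values)
    n = len(ordered)
    for i, value in enumerate(ordered, start=1):
        if i / n >= p:
            return value
    return ordered[-1]

heights = [54, 35, 15, 15, 76, 87, 78]   # childstat.height
forecast = [137, 190, 232, 400]          # forecastqty for item TEST, region E

print(percentile_cont(heights, 0.50), percentile_disc(heights, 0.50))
print(round(percentile_cont(heights, 0.72), 2), percentile_disc(heights, 0.72))
print(percentile_cont(forecast, 0.5), percentile_disc(forecast, 0.5))
```

For the heights, the 0.72 row number is 5.32, so the continuous result lies 32% of the way from the 5th value (76) to the 6th (78). The discrete result has to be an actual row, so it jumps to 78.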
This is my table, student_numbers:
ROLL_NO NAME CLASS HINDI MATHS SCIENCE
2 amit 11 91 91 81
3 anirudh 11 88 87 81
4 akash 11 82 81 85
5 pratik 10 81 99 98
7 rekha 10 79 97 82
6 neha 10 89 91 90
8 kamal 10 66 68 69
1 ankit 11 97 98 87
I want to add the last three columns and rank on that total, partitioned by class.
This is what I tried:
select roll_no,name,class,total,
rank() over (partition by class order by total desc) as rank
from student_numbers,(select hindi+maths+science total from student_numbers)
;
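But this returns a very large table, with duplicate student names having different totals.
I'm not exactly sure what you are trying to accomplish. Ranking the highest totals within each class? The comma in your FROM clause cross-joins every student with every total, which is why the result is so large. If you join the totals back on Roll_No instead, something like this should work: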
SELECT SN.Roll_No,
SN.Class,
SN2.Total,
RANK() OVER (PARTITION BY SN.Class ORDER BY SN2.Total DESC) as rank
FROM Student_Numbers SN
JOIN (
SELECT
Roll_no, hindi+maths+science as Total
FROM Student_Numbers
) SN2 ON SN.Roll_No = SN2.Roll_No
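The self-join is not strictly needed, since the total can be computed once and ranked directly. Here is a minimal sketch of that variant on the sample data, run through Python's sqlite3 module (assuming SQLite 3.25 or later, not the asker's database). The CTE name totals and the alias rnk are my own.

```python
import sqlite3

# Sample rows: roll_no, name, class, hindi, maths, science.
students = [
    (2, "amit", 11, 91, 91, 81), (3, "anirudh", 11, 88, 87, 81),
    (4, "akash", 11, 82, 81, 85), (5, "pratik", 10, 81, 99, 98),
    (7, "rekha", 10, 79, 97, 82), (6, "neha", 10, 89, 91, 90),
    (8, "kamal", 10, 66, 68, 69), (1, "ankit", 11, 97, 98, 87),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_numbers (roll_no INT, name TEXT, class INT, "
    "hindi INT, maths INT, science INT)"
)
conn.executemany("INSERT INTO student_numbers VALUES (?, ?, ?, ?, ?, ?)", students)

query = """
WITH totals AS (
    -- one row per student, so no join back to the table is needed
    SELECT roll_no, name, class, hindi + maths + science AS total
    FROM student_numbers
)
SELECT roll_no, name, class, total,
       RANK() OVER (PARTITION BY class ORDER BY total DESC) AS rnk
FROM totals
ORDER BY class, rnk
"""
class_ranks = conn.execute(query).fetchall()
for row in class_ranks:
    print(row)
```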
Good luck.