Return Rank In SQL Query Based On Multiple Columns - sql

I have a large SQL table called 'allscores' similar to the following:
user score quiz_id high_score
Bob 90 math 1
John 80 math 0
John 85 math 1
Steve 100 math 1
Bob 95 reading 0
Bob 100 reading 1
John 80 reading 1
The 'high_score' field is in the table to begin with and is always set to '1' for the row where a user's score is the highest for them for that quiz.
What I want is a SQL query that I can run on an individual user to pull their highest score from each of the two quizzes ('math' and 'reading') along with their overall rank among scores for that quiz. What I have so far is the following:
SELECT `user`, `score`, `quiz_id` FROM `allscores` WHERE `user`="Bob" AND `high_score`="1"
Which will output the following:
user score quiz_id
Bob 90 math
Bob 100 reading
This query is simply pulling the highest score for Bob from each quiz - what I want to add is the ranking of the score among the scores in that particular quiz - so an output like the following:
user score quiz_id rank
Bob 90 math 2
Bob 100 reading 1
Bob's rank is '2' for the math quiz as Steve has a higher score, but he is ranked '1' for reading as he has the highest score.
How would I add this ranking column to my existing query?

This uses MS T-SQL syntax, but if your flavor of SQL uses window functions, it should be similar.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t (
[user] varchar(10)
, score int
, quiz_id varchar(10)
, high_score bit
) ;
INSERT INTO t ([user], score, quiz_id, high_score)
VALUES
( 'Bob',90,'math',1 )
, ( 'John',80,'math',0 )
, ( 'Steve',100,'math',1 )
, ( 'Bob',95,'reading',0 )
, ( 'Bob',100,'reading',1 )
, ( 'John',85,'math',1 )
, ( 'John',80,'reading',1 )
;
MAIN QUERY:
SELECT s1.[user]
, s1.score
, s1.quiz_id
, s1.high_score
--, s1.isUserHighScore
, s1.ranking
FROM (
SELECT
t.[user]
, t.score
, t.quiz_id
, t.high_score
--, ROW_NUMBER() OVER (PARTITION BY t.[user],t.quiz_id ORDER BY t.score DESC) AS isUserHighScore
, DENSE_RANK() OVER (PARTITION BY t.quiz_id ORDER BY t.score DESC ) AS ranking
FROM t
) s1
WHERE s1.[user]='Bob'
--AND s1.isUserHighScore = 1
AND s1.high_score = 1
Results:
| user | score | quiz_id | high_score | ranking |
|------|-------|---------|------------|---------|
| Bob | 90 | math | true | 2 |
| Bob | 100 | reading | true | 1 |
The query relies on the existing high_score flag to pick a user's best score for each quiz. If you would rather not trust that flag, uncomment the ROW_NUMBER() lines, which compute the same thing (isUserHighScore = 1 marks a user's top score in a quiz). DENSE_RANK() then ranks each score against all other scores in the same quiz. The difference between DENSE_RANK() and RANK() is essentially that DENSE_RANK() won't leave any gaps in the ranking. Example: if two people scored 90 and one scored 80, DENSE_RANK() gives both 90s rank 1 and the 80 rank 2. With RANK(), the 90s would be rank 1 and the 80 would be rank 3.
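To try the ranking without a SQL Server instance, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. It assumes a SQLite build of 3.25 or later (needed for window functions); the variable names bob_rows and tie_rows are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE allscores (user TEXT, score INTEGER, quiz_id TEXT, high_score INTEGER)"
)
conn.executemany(
    "INSERT INTO allscores VALUES (?, ?, ?, ?)",
    [
        ("Bob", 90, "math", 1),
        ("John", 80, "math", 0),
        ("John", 85, "math", 1),
        ("Steve", 100, "math", 1),
        ("Bob", 95, "reading", 0),
        ("Bob", 100, "reading", 1),
        ("John", 80, "reading", 1),
    ],
)

# Rank every score within its quiz first, then filter down to Bob's best rows.
# Filtering before ranking would rank Bob only against himself.
bob_rows = conn.execute(
    """
    SELECT user, score, quiz_id, ranking
    FROM (
        SELECT user, score, quiz_id, high_score,
               DENSE_RANK() OVER (PARTITION BY quiz_id ORDER BY score DESC) AS ranking
        FROM allscores
    ) s1
    WHERE user = 'Bob' AND high_score = 1
    ORDER BY quiz_id
    """
).fetchall()
print(bob_rows)

# RANK() versus DENSE_RANK() on the tie example: two 90s and one 80.
tie_rows = conn.execute(
    """
    SELECT score,
           RANK() OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM (SELECT 90 AS score UNION ALL SELECT 90 UNION ALL SELECT 80)
    ORDER BY score DESC
    """
).fetchall()
print(tie_rows)
```

Note that the rank is computed over every row in the quiz, including a user's lower attempts, exactly as in the T-SQL query above.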

Related

Printing Average Score for Unique ID's

I have a table
id | playerID | score
1 | 1 | 100
2 | 1 | 155
3 | 5 | 132
etc..
What I want to do is get the Average 'score' for each unique playerID. I can get the DISTINCT playerIDs
SELECT DISTINCT
playerID
FROM
dbo.scores
and I can get the average of all scores,
SELECT
AVG(score)
FROM
dbo.scores
but I can't seem to figure out how to combine the two.
Use group by!
SELECT playerID, AVG(score) AS avg_score
FROM dbo.scores
GROUP BY playerID
The use of dbo suggests that you are using SQL Server. SQL Server does an integer average of integer values, so the average of 0 and 1 is 0, not 0.5.
I would recommend:
SELECT s.playerID, AVG(1.0 * s.score) AS avg_score
FROM dbo.scores s
GROUP BY s.playerID
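A small sketch of the grouping and of the integer-truncation pitfall, using Python's sqlite3 module and the three sample rows from the question. One caveat: SQLite's AVG() always returns a floating-point value, so the truncation SQL Server would apply to an integer column is imitated here with integer division (SUM / COUNT). That substitution is an assumption made for illustration, not SQL Server itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER, playerID INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 155), (3, 5, 132)],
)

# One output row per player: GROUP BY replaces the separate DISTINCT query.
avg_rows = conn.execute(
    "SELECT playerID, AVG(score) FROM scores GROUP BY playerID ORDER BY playerID"
).fetchall()

# Integer arithmetic drops the fraction, which is the effect the answer warns about.
truncated_rows = conn.execute(
    "SELECT playerID, SUM(score) / COUNT(*) FROM scores GROUP BY playerID ORDER BY playerID"
).fetchall()

# Multiplying by 1.0 forces decimal arithmetic, as in the recommended query.
decimal_rows = conn.execute(
    "SELECT playerID, SUM(1.0 * score) / COUNT(*) FROM scores GROUP BY playerID ORDER BY playerID"
).fetchall()

print(avg_rows, truncated_rows, decimal_rows)
```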

Vertica SQL for running count distinct and running conditional count

I'm trying to build a department level score table based on a deeper product url level score table.
Date is not consecutive
Not all urls got score updates at same day (independent to each other)
dist_url should be running count distinct (cumulative count distinct)
dist urls and urls score >=30 are both count distinct
What I have now is:
Date url Store Dept Page Score
10/1 a US A X 10
10/1 b US A X 30
10/1 c US A X 60
10/4 a US A X 20
10/4 d US A X 60
10/6 b US A X 22
10/9 a US A X 40
10/9 e US A X 10
What I want is:
Date Store Dept Page dist urls urls score >=30
10/1 US A X 3 2
10/4 US A X 4 3
10/6 US A X 4 2
10/9 US A X 5 2
I think dist_url can be computed with a window function, but I'm not sure how to write the query.
My current query is below, but it is wrong because the distinct count is not cumulative:
SELECT
bm.AnalysisDate,
su.SoID AS Store,
su.DptCaID AS DTID,
su.PageTypeID AS PTID,
COUNT(DISTINCT bm.SeoURLID) AS NumURLsWithDupScore,
SUM(CASE WHEN bm.DuplicationScore > 30 THEN 1 ELSE 0 END) AS Over30Count
FROM csn_seo.tblBotifyMetrics bm
INNER JOIN csn_seo.tblSEOURLs su
ON bm.SeoURLID = su.ID
WHERE su.DptCaID IS NOT NULL
AND su.DptCaID <> 0
AND su.PageTypeID IS NOT NULL
AND su.PageTypeID <> -1
AND bm.iscompliant = 1
GROUP BY bm.AnalysisDate, su.SoID, su.DptCaID, su.PageTypeID;
Please let me know if anyone has any idea.
Based on your question, you seem to want two levels of logic:
select date, store, dept,
sum(sum(start)) over (partition by dept, page order by date) as distinct_urls,
sum(sum(start_30)) over (partition by dept, page order by date) as distinct_urls_30
from ((select store, dept, page, url, min(date) as date, 1 as start, 0 as start_30
from t
group by store, dept, page, url
) union all
(select store, dept, page, url, min(date) as date, 0, 1
from t
where score >= 30
group by store, dept, page, url
)
) t
group by date, store, dept, page;
I don't understand how your query is related to your question.
Try as I might, I don't get your output either:
But I think you can avoid UNION SELECTs - Does this do what you expect?
NULLS don't figure in COUNT DISTINCTs - and here you can combine an aggregate expression with an OLAP one ...
And Vertica has named windows to increase readability ....
WITH
input(Date,url,Store,Dept,Page,Score) AS (
SELECT DATE '2019-10-01','a','US','A','X',10
UNION ALL SELECT DATE '2019-10-01','b','US','A','X',30
UNION ALL SELECT DATE '2019-10-01','c','US','A','X',60
UNION ALL SELECT DATE '2019-10-04','a','US','A','X',20
UNION ALL SELECT DATE '2019-10-04','d','US','A','X',60
UNION ALL SELECT DATE '2019-10-06','b','US','A','X',22
UNION ALL SELECT DATE '2019-10-09','a','US','A','X',40
UNION ALL SELECT DATE '2019-10-09','e','US','A','X',10
)
SELECT
date
, store
, dept
, page
, SUM(COUNT(DISTINCT url) ) OVER(w) AS dist_urls
, SUM(COUNT(DISTINCT CASE WHEN score >=30 THEN url END)) OVER(w) AS dist_urls_gt_30
FROM input
GROUP BY
date
, store
, dept
, page
WINDOW w AS (PARTITION BY store,dept,page ORDER BY date)
;
-- out date | store | dept | page | dist_urls | dist_urls_gt_30
-- out ------------+-------+------+------+-----------+-----------------
-- out 2019-10-01 | US | A | X | 3 | 2
-- out 2019-10-04 | US | A | X | 5 | 3
-- out 2019-10-06 | US | A | X | 6 | 3
-- out 2019-10-09 | US | A | X | 8 | 4
-- out (4 rows)
-- out
-- out Time: First fetch (4 rows): 45.321 ms. All rows formatted: 45.364 ms
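For the running distinct count alone, one way to get the 3, 4, 4, 5 sequence from the question is to flag the first date on which each url appears and then take a running sum of those flags. The sketch below does that with Python's sqlite3 module rather than Vertica (an assumption made so it can run anywhere; it needs SQLite 3.25 or later). The table name url_scores and the column names is_first and new_urls are hypothetical, and the "score >= 30" column is left out because its expected values in the question are ambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url_scores (date TEXT, url TEXT, store TEXT, dept TEXT, page TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO url_scores VALUES (?, ?, 'US', 'A', 'X', ?)",
    [
        ("2019-10-01", "a", 10), ("2019-10-01", "b", 30), ("2019-10-01", "c", 60),
        ("2019-10-04", "a", 20), ("2019-10-04", "d", 60),
        ("2019-10-06", "b", 22),
        ("2019-10-09", "a", 40), ("2019-10-09", "e", 10),
    ],
)

running_rows = conn.execute(
    """
    SELECT date, store, dept, page,
           -- running total of urls seen for the first time on or before this date
           SUM(new_urls) OVER (PARTITION BY store, dept, page ORDER BY date) AS dist_urls
    FROM (
        SELECT date, store, dept, page, SUM(is_first) AS new_urls
        FROM (
            SELECT date, store, dept, page, url,
                   -- 1 only on the earliest row of each url within its group
                   CASE WHEN ROW_NUMBER() OVER (
                            PARTITION BY store, dept, page, url ORDER BY date
                        ) = 1 THEN 1 ELSE 0 END AS is_first
            FROM url_scores
        )
        GROUP BY date, store, dept, page
    )
    ORDER BY date
    """
).fetchall()
print(running_rows)
```

Because every date keeps a row even when it contributes no new url, 10/6 still appears in the output with the count carried forward.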

SQL find and group consecutive number in rows without duplicate

So I have a table like this:
Taxi Client Time
Tom A 1
Tom A 2
Tom B 3
Tom A 4
Tom A 5
Tom A 6
Tom B 7
Tom B 8
Bob A 1
Bob A 2
Bob A 3
and the expected result will be like this:
Tom 3
Bob 1
I used a window function with PARTITION BY to count the consecutive values, but the result came out like this:
Tom A 2
Tom A 3
Tom B 2
Bob A 2
Please help. English is not my first language. Thanks!
This is a variation of a gaps-and-islands problem. You can solve it using window functions:
select taxi, count(*)
from (select t.taxi, t.client, count(*) as num_times
from (select t.*,
row_number() over (partition by taxi order by time) as seqnum,
row_number() over (partition by taxi, client order by time) as seqnum_c
from t
) t
group by t.taxi, t.client, (seqnum - seqnum_c)
having count(*) >= 2
) islands
group by taxi;
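The row-number difference trick can be checked against the sample data with a short sketch in Python's sqlite3 module (SQLite 3.25 or later assumed). The table name trips and the aliases numbered and islands are hypothetical. Within one taxi, seqnum - seqnum_c stays constant for as long as the same client repeats, so grouping on it isolates each consecutive run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (taxi TEXT, client TEXT, time INTEGER)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        ("Tom", "A", 1), ("Tom", "A", 2), ("Tom", "B", 3), ("Tom", "A", 4),
        ("Tom", "A", 5), ("Tom", "A", 6), ("Tom", "B", 7), ("Tom", "B", 8),
        ("Bob", "A", 1), ("Bob", "A", 2), ("Bob", "A", 3),
    ],
)

run_rows = conn.execute(
    """
    SELECT taxi, COUNT(*) AS num_runs
    FROM (
        SELECT taxi, client, COUNT(*) AS num_times
        FROM (
            SELECT taxi, client,
                   ROW_NUMBER() OVER (PARTITION BY taxi ORDER BY time) AS seqnum,
                   ROW_NUMBER() OVER (PARTITION BY taxi, client ORDER BY time) AS seqnum_c
            FROM trips
        ) numbered
        -- each (client, difference) pair is one unbroken run of that client
        GROUP BY taxi, client, seqnum - seqnum_c
        -- keep only runs of two or more consecutive rows
        HAVING COUNT(*) >= 2
    ) islands
    GROUP BY taxi
    ORDER BY taxi
    """
).fetchall()
print(run_rows)
```

Tom has the runs A(1,2), A(4,5,6) and B(7,8), while the single B at time 3 is dropped by the HAVING clause, which is how the expected 3 arises.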
Use a distinct count:
select taxi, count(distinct client)
from table_name
group by taxi
It seems your expected output is wrong
I don't see where you get the number 3 from. If you're trying to do what your question says and group by client in consecutive order only and then get the number of different groups, I can help you out with the following query. Bob has 1 group and Tom has 4.
Partition by taxi, ORDER BY taxi, time and check if this client matches the previous client for this taxi. If yes, do not count this row. If no, count this row, this is a new group.
SELECT FEE.taxi,
SUM(FEE.clientNotSameAsPreviousInSequence)
FROM
(
SELECT taxi,
CASE
WHEN PreviousClient IS NULL THEN
1
WHEN PreviousClient <> client THEN
1
ELSE
0
END AS clientNotSameAsPreviousInSequence
FROM
(
SELECT *,
LAG(client) OVER (PARTITION BY taxi ORDER BY taxi, time) AS PreviousClient
FROM table
) taxisWithPreviousClient
) FEE
GROUP BY FEE.taxi;

select the highest record between two table

I have two tables. One contains graduation records and the other contains post graduation records. Every candidate must have a graduation record, but a post graduation record is optional.
I want to select the post graduation record if the candidate has one, and otherwise the graduation record.
table 1 graduation_table
rollno | degree | division
--------------------------
001 | B.tech | 1st
002 | B.sc | 1st
003 | BA | 1st
table 2 postgraduation_table
rollno | degree | division
--------------------------
002 | M.sc | 1st
the result must be
rollno | degree | division
--------------------------
001 | B.tech | 1st
002 | M.sc | 1st
003 | BA | 1st
You want all rows from graduation_table which do not have a row in postgraduation_table plus those in postgraduation_table. This can be expressed with a not exists and union query:
select gt.rollno, gt.degree, gt.division
from graduation_table gt
where not exists (select *
from postgraduation_table pg
where pg.rollno = gt.rollno)
union all
select rollno, degree, division
from postgraduation_table
order by rollno;
Online example: http://rextester.com/IFCQR67320
select
rollno,
case when p.degree is null then g.degree else p.degree end as degree,
case when p.division is null then g.division else p.division end as division
from
grad g
left join
post p using (rollno)
Or better as suggested in the comments:
select
rollno,
coalesce (p.degree, g.degree) as degree,
coalesce (p.division, g.division) as division
from
grad g
left join
post p using (rollno)
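A quick way to check the LEFT JOIN plus COALESCE approach against the sample rows is the following sketch, which uses Python's sqlite3 module and the table names from the question (graduation_table and postgraduation_table) instead of the shortened grad and post. An explicit ON clause replaces USING here purely for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE graduation_table (rollno TEXT, degree TEXT, division TEXT);
    CREATE TABLE postgraduation_table (rollno TEXT, degree TEXT, division TEXT);
    INSERT INTO graduation_table VALUES
        ('001', 'B.tech', '1st'), ('002', 'B.sc', '1st'), ('003', 'BA', '1st');
    INSERT INTO postgraduation_table VALUES ('002', 'M.sc', '1st');
    """
)

# The LEFT JOIN keeps every graduate; COALESCE prefers the postgraduate
# values and falls back to the graduate ones when the join found no match.
best_rows = conn.execute(
    """
    SELECT g.rollno,
           COALESCE(p.degree, g.degree) AS degree,
           COALESCE(p.division, g.division) AS division
    FROM graduation_table g
    LEFT JOIN postgraduation_table p ON p.rollno = g.rollno
    ORDER BY g.rollno
    """
).fetchall()
print(best_rows)
```

One caveat of this shape: if a postgraduate row exists but has a NULL degree or division, COALESCE silently mixes in the graduate value for that column.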
Take a union of both tables, and introduce a position column, to rank the relative importance of the two tables. The postgraduate table has a pos value of 1, and the graduate table has a value of 2. Then, apply ROW_NUMBER() over this union query and assign a row number to each rollno group of records (presumed to be either one or at most two records). Finally, perform one more outer subquery to retain the most important record, postgraduate first, graduate second.
SELECT rollno, degree, division
FROM
(
SELECT
rollno, degree, division,
ROW_NUMBER() OVER (PARTITION BY rollno ORDER BY pos) rn
FROM
(
SELECT p.*, 1 AS pos FROM postgraduation_table p
UNION ALL
SELECT p.*, 2 FROM graduation_table p
) t
) t
WHERE t.rn = 1;
This should meet your needs:
SELECT dg.rollno,
CASE WHEN pg.rollno IS NOT NULL THEN pg.degree ELSE dg.degree END AS degree,
CASE WHEN pg.rollno IS NOT NULL THEN pg.division ELSE dg.division END AS division
FROM graduation_table AS dg
LEFT OUTER JOIN postgraduation_table AS pg ON pg.rollno = dg.rollno;
Hope this helps.

How to query the three best players in Oracle?

I have the following table:
NAME | SCORE
ALICE | 100
BOB | 90
CHARLES| 90
DUKE | 80
EVE | 70
...
My question is the following:
How can I extract with one query the name of the three best players? In my example the query should return four rows (ALICE, BOB, CHARLES and DUKE) because there are two silver medalists (they both have 90 points).
Thank You in advance.
Oracle has the DENSE_RANK analytical function for that exact purpose:
select name, score from (
select name, score, dense_rank() over(order by score desc nulls last) rank
-- NULLS LAST sorts NULL scores to the end of the ranking
from t
) V
where rank < 4
order by rank, name
See http://sqlfiddle.com/#!4/88445/5
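The same DENSE_RANK idea can be tried outside Oracle with this sketch, which uses Python's sqlite3 module (SQLite 3.25 or later assumed). The table name players and the alias rnk are hypothetical. NULLS LAST is omitted because the sample data has no NULL scores and SQLite already sorts NULLs last in descending order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [("ALICE", 100), ("BOB", 90), ("CHARLES", 90), ("DUKE", 80), ("EVE", 70)],
)

# DENSE_RANK gives tied scores the same rank and leaves no gap afterwards,
# so "rank < 4" means "the three best distinct scores", ties included.
top_rows = conn.execute(
    """
    SELECT name, score
    FROM (
        SELECT name, score, DENSE_RANK() OVER (ORDER BY score DESC) AS rnk
        FROM players
    )
    WHERE rnk < 4
    ORDER BY rnk, name
    """
).fetchall()
print(top_rows)
```

With RANK() instead, DUKE would get rank 4 after the two tied 90s and drop out of the result, which is why DENSE_RANK fits the medal-style requirement here.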
How about the following
select *
from table1
where score >=
(select score from (
select score, rownum r from (
select distinct score from table1 order by score desc
) where rownum <= 3
) where r = 3)
order by score desc
See also this SQLFiddle: http://sqlfiddle.com/#!4/23e68/1