SQL Server FETCH NEXT X ONLY based on a specific column - sql

I have a request to get some online courses.
This is an example of results I can get :
course_code | course_date | participant_name
------------+-------------+------------------
21175A | 2021-12-01 | Jon Doe
21175A | 2021-12-01 | Lisa Tia
21175A | 2021-13-01 | Jon Doe
21175A | 2021-13-01 | Lisa Tia
As you can see, I get multiple rows for the same course depending on the dates and the participants, so I have to do some process in PHP to put everything in an array with keys (course_code)
I want to make a pagination to get the 25 first courses (and not rows)
SELECT course_code, course_date, participant_name
FROM courses
ORDER BY course_date
OFFSET 0
FETCH NEXT 25 ROWS ONLY
This won't work because a course can take multiple rows (4 here). So I wonder if I can write a query to get the 25 first courses (based on the column course_code) even if there are 100 rows.
Do you have an idea ? Thank you in advance
EDIT : This is what I did :
WITH cte AS
(
SELECT course_code
FROM courses
WHERE course_date <= '2021-12-31'
ORDER BY course_code
OFFSET 0 ROWS
FETCH NEXT 25 ROWS ONLY
)
SELECT *
FROM courses
INNER JOIN cte ON cte.course_code = courses.course_code
WHERE course_date <= '2021-12-31'
I added the WHERE clause in both to make the request slower

Use DENSE_RANK:
WITH cte AS (
SELECT *, DENSE_RANK() OVER (ORDER BY course_code) dr
FROM courses
)
SELECT course_code, course_date, participant_name
FROM cte
WHERE dr <= 25
ORDER BY course_date;
For the pagination part, just change the inequality in the WHERE clause. For example, to get the second page of 25 courses per page, you would use WHERE dr > 25 AND dr <= 50.

Related

SQL NOT IN Query on MS ACCESS/VBA

I have the following Course table with different extract dates.
ID | Course | ExtractDate
10 | 100000 | 2017-02-28
10 | 100001 | 2017-01-31
10 | 100002 | 2016-12-31
10 | 100003 | 2016-11-30
I need to perform the following SQL script to keep only records from the latest 3 months, and hence removing the record with Course code 10003.
Delete from [Course] where [ExtractDate] not in (select distinct top 3 [ExtractDate] from [Course] order by [ExtractDate] desc
Seems like VBA or MS Access does not support the function "not in", anybody has any workaround for this?
Thank you.
If you want records older than three months:
Delete from [Course]
where [ExtractDate] < dateadd("m", -3, date());
No set-based operation is needed.
NOT IN is the same as a left join where the target is null (thus there was no join)
SO...
from [Course]
where [ExtractDate] not in (
select distinct top 3 [ExtractDate]
from [Course]
order by [ExtractDate] desc
)
is the same as
from [Course]
left join (
select distinct top 3 [ExtractDate]
from [Course]
order by [ExtractDate] desc
) as x on [ExtractDate] = x.[ExtractDate]
where x.[ExtractDate] is null
However, I make no claim any of your code was correct -- just that those two are the same in SQL.

Row number in query result

I have query to get firms by theirs sales last year.
select
Name,
Sale
from Sales
order by
Sale DESC
and I get
Firm 2 | 200 000
Firm 1 | 190 000
Firm 3 | 100 000
And I would like to get index of row in result. For Firm 2 I would like to get 0 (or 1), for Firm 3 1 (or 2) and etc. Is this possible? Or at least create some sort of autoincrement column. I can use even stored procedure if it is needed.
Firebird 3.0 supports row_number() which is the better way to do this.
However for Firebird 2.5, you can get what you want with a correlated subquery:
select s.Name, s.Sale,
(select count(*) from Sales s2 where s2.sale >= s.sale) as seqnum
from Sales s
order by s.Sale DESC;

How to query the three best players in Oracle?

I have the following table:
NAME | SCORE
ALICE | 100
BOB | 90
CHARLES| 90
DUKE | 80
EVE | 70
...
My question is the following:
How can I extract with one query the name of the three best players? In my example the query should return four rows (ALICE, BOB, CHARLES and DUKE) because there are two silver medalists (they both have 90 points).
Thank You in advance.
Oracle has the DENSE_RANK analytical function for that exact purpose:
select name, score from (
select name, score, dense_rank() over(order by score desc nulls last) rank
-- ^^^^^^^^^^
-- reject NULL score at the end
from t
) V
where rank < 4
order by rank, name
See http://sqlfiddle.com/#!4/88445/5
How about the following
select *
from table1
where score >=
(select score from (
select score, rownum r from (
select distinct score from table1 order by score desc
) where rownum <= 3
) where r = 3)
order by score desc
See also this SQLFiddle: http://sqlfiddle.com/#!4/23e68/1

selecting top N rows for each group in a table

I am facing a very common issue regarding "Selecting top N rows for each group in a table".
Consider a table with id, name, hair_colour, score columns.
I want a resultset such that, for each hair colour, get me top 3 scorer names.
To solve this i got exactly what i need on Rick Osborne's blogpost "sql-getting-top-n-rows-for-a-grouped-query"
That solution doesn't work as expected when my scores are equal.
In above example the result as follow.
id name hair score ranknum
---------------------------------
12 Kit Blonde 10 1
9 Becca Blonde 9 2
8 Katie Blonde 8 3
3 Sarah Brunette 10 1
4 Deborah Brunette 9 2 - ------- - - > if
1 Kim Brunette 8 3
Consider the row 4 Deborah Brunette 9 2. If this also has same score (10) same as Sarah, then ranknum will be 2,2,3 for "Brunette" type of hair.
What's the solution to this?
If you're using SQL Server 2005 or newer, you can use the ranking functions and a CTE to achieve this:
;WITH HairColors AS
(SELECT id, name, hair, score,
ROW_NUMBER() OVER(PARTITION BY hair ORDER BY score DESC) as 'RowNum'
)
SELECT id, name, hair, score
FROM HairColors
WHERE RowNum <= 3
This CTE will "partition" your data by the value of the hair column, and each partition is then order by score (descending) and gets a row number; the highest score for each partition is 1, then 2 etc.
So if you want to the TOP 3 of each group, select only those rows from the CTE that have a RowNum of 3 or less (1, 2, 3) --> there you go!
The way the algorithm comes up with the rank, is to count the number of rows in the cross-product with a score equal to or greater than the girl in question, in order to generate rank. Hence in the problem case you're talking about, Sarah's grid would look like
a.name | a.score | b.name | b.score
-------+---------+---------+--------
Sarah | 9 | Sarah | 9
Sarah | 9 | Deborah | 9
and similarly for Deborah, which is why both girls get a rank of 2 here.
The problem is that when there's a tie, all girls take the lowest value in the tied range due to this count, when you'd want them to take the highest value instead. I think a simple change can fix this:
Instead of a greater-than-or-equal comparison, use a strict greater-than comparison to count the number of girls who are strictly better. Then, add one to that and you have your rank (which will deal with ties as appropriate). So the inner select would be:
SELECT a.id, COUNT(*) + 1 AS ranknum
FROM girl AS a
INNER JOIN girl AS b ON (a.hair = b.hair) AND (a.score < b.score)
GROUP BY a.id
HAVING COUNT(*) <= 3
Can anyone see any problems with this approach that have escaped my notice?
Use this compound select which handles OP problem properly
SELECT g.* FROM girls as g
WHERE g.score > IFNULL( (SELECT g2.score FROM girls as g2
WHERE g.hair=g2.hair ORDER BY g2.score DESC LIMIT 3,1), 0)
Note that you need to use IFNULL here to handle case when table girls has less rows for some type of hair then we want to see in sql answer (in OP case it is 3 items).

Get top results for each group (in Oracle)

How would I be able to get N results for several groups in
an oracle query.
For example, given the following table:
|--------+------------+------------|
| emp_id | name | occupation |
|--------+------------+------------|
| 1 | John Smith | Accountant |
| 2 | Jane Doe | Engineer |
| 3 | Jack Black | Funnyman |
|--------+------------+------------|
There are many more rows with more occupations. I would like to get
three employees (lets say) from each occupation.
Is there a way to do this without using a subquery?
I don't have an oracle instance handy right now so I have not tested this:
select *
from (select emp_id, name, occupation,
rank() over ( partition by occupation order by emp_id) rank
from employee)
where rank <= 3
Here is a link on how rank works: http://www.psoug.org/reference/rank.html
This produces what you want, and it uses no vendor-specific SQL features like TOP N or RANK().
SELECT MAX(e.name) AS name, MAX(e.occupation) AS occupation
FROM emp e
LEFT OUTER JOIN emp e2
ON (e.occupation = e2.occupation AND e.emp_id <= e2.emp_id)
GROUP BY e.emp_id
HAVING COUNT(*) <= 3
ORDER BY occupation;
In this example it gives the three employees with the lowest emp_id values per occupation. You can change the attribute used in the inequality comparison, to make it give the top employees by name, or whatever.
Add RowNum to rank :
select * from
(select emp_id, name, occupation,rank() over ( partition by occupation order by emp_id,RowNum) rank
from employee)
where rank <= 3
tested this in SQL Server (and it uses subquery)
select emp_id, name, occupation
from employees t1
where emp_id IN (select top 3 emp_id from employees t2 where t2.occupation = t1.occupation)
just do an ORDER by in the subquery to suit your needs
I'm not sure this is very efficient, but maybe a starting place?
select *
from people p1
join people p2
on p1.occupation = p2.occupation
join people p3
on p1.occupation = p3.occupation
and p2.occupation = p3.occupation
where p1.emp_id != p2.emp_id
and p1.emp_id != p3.emp_id
This should give you rows that contain 3 distinct employees all in the same occupation. Unfortunately, it will give you ALL combinations of those.
Can anyone pare this down please?