How to get top values when there is a tie - sql

I am having difficulty figuring out this dang problem. From the data and queries given below, I am trying to find the email address that rented the most movies during the month of September.
There are only 4 relevant tables in my database; they have been anonymized and shortened:
Table "cust":
cust_id  f_name  l_name   email
1        Jack    Daniels  jack.daniels#google.com
2        Jose    Quervo   jose.quervo#yahoo.com
5        Jim     Beam     jim.beam#protonmail.com
Table "rent"
inv_id  cust_id  rent_date
10      1        9/1/2022 10:29
11      1        9/2/2022 18:16
12      1        9/2/2022 18:17
13      1        9/17/2022 17:34
14      1        9/19/2022 6:32
15      1        9/19/2022 6:33
16      3        9/1/2022 18:45
17      3        9/1/2022 18:46
18      3        9/2/2022 18:45
19      3        9/2/2022 18:46
20      3        9/17/2022 18:32
21      3        9/19/2022 22:12
10      2        9/19/2022 11:43
11      2        9/19/2022 11:42
Table "inv"
mov_id  inv_id
22      10
23      11
24      12
25      13
26      14
27      15
28      16
29      17
30      18
31      19
31      20
32      21
Table "mov":
mov_id  titl          rate
22      Anaconda      3.99
23      Exorcist      1.99
24      Philadelphia  3.99
25      Quest         1.99
26      Sweden        1.99
27      Speed         1.99
28      Nemo          1.99
29      Zoolander     5.99
30      Truman        5.99
31      Patient       1.99
32      Racer         3.99
and here is my current query progress:
SELECT cust.email,
COUNT(DISTINCT inv.mov_id) AS "Rented_Count"
FROM cust
JOIN rent ON rent.cust_id = cust.cust_id
JOIN inv ON inv.inv_id = rent.inv_id
JOIN mov ON mov.mov_id = inv.mov_id
WHERE rent.rent_date BETWEEN '2022-09-01' AND '2022-09-31'
GROUP BY cust.email
ORDER BY "Rented_Count" DESC;
and here is what it outputs:
email                    Rented_Count
jack.daniels#google.com  6
jim.beam#protonmail.com  6
jose.quervo#yahoo.com    2
and what I want it to be outputting:
email
jack.daniels#google.com
jim.beam#protonmail.com
In the results I am actually getting there is a tie for first place (Jim and Jack), and that is fine, but I would like the query to list both tying email addresses, not just Jack's. So I don't think I can simply limit the number of rows or use max.
I think it must have something to do with dense_rank, but I don't know how to use that in this scenario with the COUNT and GROUP BY.
Your creativity and help would be appreciated.

You're missing the FETCH FIRST ROWS WITH TIES clause. It will work together with the ORDER BY clause to get you the highest values (FIRST ROWS), including ties (WITH TIES).
SELECT cust.email
FROM cust
INNER JOIN rent
ON rent.cust_id = cust.cust_id
INNER JOIN inv
ON inv.inv_id = rent.inv_id
INNER JOIN mov
ON mov.mov_id = inv.mov_id
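-- September has 30 days; a half-open range also keeps rentals made during the day on Sept 30
WHERE rent.rent_date >= '2022-09-01'
AND rent.rent_date < '2022-10-01'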
GROUP BY cust.email
ORDER BY COUNT(DISTINCT inv.mov_id) DESC
FETCH FIRST 1 ROWS WITH TIES
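Since the question asks about dense_rank, here is a minimal sketch of that alternative for engines without FETCH FIRST ... WITH TIES. It runs on SQLite through Python's sqlite3 module (assuming a bundled SQLite new enough for window functions, 3.25 or later). The rows and the example.com addresses are made-up sample data, not the asker's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cust (cust_id INTEGER, email TEXT);
CREATE TABLE rent (inv_id INTEGER, cust_id INTEGER, rent_date TEXT);
CREATE TABLE inv  (mov_id INTEGER, inv_id INTEGER);

-- Hypothetical sample rows: jack and jim tie with 2 September rentals each.
INSERT INTO cust VALUES (1, 'jack@example.com'), (2, 'jose@example.com'),
                        (3, 'jim@example.com');
INSERT INTO rent VALUES
  (10, 1, '2022-09-01'), (11, 1, '2022-09-02'),
  (12, 3, '2022-09-01'), (13, 3, '2022-09-17'),
  (14, 2, '2022-09-19'),
  (15, 2, '2022-10-03');  -- outside September, so it is not counted
INSERT INTO inv VALUES (22, 10), (23, 11), (24, 12), (25, 13), (26, 14), (27, 15);
""")

query = """
WITH counts AS (
    -- one row per customer with the number of distinct movies rented
    SELECT cust.email, COUNT(DISTINCT inv.mov_id) AS rented_count
    FROM cust
    JOIN rent ON rent.cust_id = cust.cust_id
    JOIN inv  ON inv.inv_id = rent.inv_id
    WHERE rent.rent_date >= '2022-09-01' AND rent.rent_date < '2022-10-01'
    GROUP BY cust.email
), ranked AS (
    -- tied counts share the same rank, so every leader gets rank 1
    SELECT email, DENSE_RANK() OVER (ORDER BY rented_count DESC) AS rnk
    FROM counts
)
SELECT email FROM ranked WHERE rnk = 1 ORDER BY email
"""
top_emails = [row[0] for row in conn.execute(query)]
print(top_emails)
```

Filtering on rnk = 1 keeps every customer tied for the highest count, which is the same effect WITH TIES has on the first row.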

Related

SQL: how to average across groups, while taking a time constraint into account

I have a table named orders in a Postgres database that looks like this:
customer_id order_id order_date price product
1 2 2021-03-05 15 books
1 13 2022-03-07 3 music
1 14 2022-06-15 900 travel
1 11 2021-11-17 25 books
1 16 2022-08-03 32 books
2 4 2021-04-12 4 music
2 7 2021-06-29 9 music
2 20 2022-11-03 8 music
2 22 2022-11-07 575 travel
2 24 2022-11-20 95 food
3 3 2021-03-17 25 books
3 5 2021-06-01 650 travel
3 17 2022-08-17 1200 travel
3 19 2022-10-02 6 music
3 23 2022-11-08 70 food
4 9 2021-08-20 3200 travel
4 10 2021-10-29 2750 travel
4 15 2022-07-15 1820 travel
4 21 2022-11-05 8000 travel
4 25 2022-11-29 27 books
5 1 2021-01-04 3 music
5 6 2021-06-09 820 travel
5 8 2021-07-30 19 books
5 12 2021-12-10 22 music
5 18 2022-09-19 20 books
Here's a SQL Fiddle: http://sqlfiddle.com/#!17/262fc/1
I'd like to return the average money spent by customers per product, but only consider orders within the first 12 months of a given customer's first purchase within the given product group. (yes, this is challenging!)
For example, for customer 1, order ID 2 and order ID 11 would be factored into the average for books (because order ID 11 took place less than 12 months after customer 1's first order for books, which was order ID 2), but order ID 16 would not be factored into the average (because 8/3/22 is more than 12 months after customer 1's first purchase of books, which took place on 3/5/21).
Here is a matrix showing which orders would be included within a given product (denoted by "yes"):
The desired output would look as follows:
product  average_spent
books    22.20
music    7.83
travel   1530.71
food     82.50
How would I do this?
Thanks in advance for any assistance you can give!
You can use a subquery to check whether or not to include a product's price in the summation:
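select o.product, round(avg(o.price)::numeric, 2) as average_spent
from orders o
-- keep only orders placed within 12 months of this customer's first order for the product
where o.order_date < (
select min(o1.order_date)
from orders o1
where o1.product = o.product and o1.customer_id = o.customer_id
) + interval '12 months'
group by o.product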
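To check the correlated-subquery approach against the sample data, here is a small sketch that runs it on SQLite from Python. It assumes order_date is stored as ISO text and uses SQLite's date(..., '+12 months') in place of the Postgres interval arithmetic; the table and column names follow the question.

```python
import sqlite3

orders = [
    (1, 2, "2021-03-05", 15, "books"), (1, 13, "2022-03-07", 3, "music"),
    (1, 14, "2022-06-15", 900, "travel"), (1, 11, "2021-11-17", 25, "books"),
    (1, 16, "2022-08-03", 32, "books"), (2, 4, "2021-04-12", 4, "music"),
    (2, 7, "2021-06-29", 9, "music"), (2, 20, "2022-11-03", 8, "music"),
    (2, 22, "2022-11-07", 575, "travel"), (2, 24, "2022-11-20", 95, "food"),
    (3, 3, "2021-03-17", 25, "books"), (3, 5, "2021-06-01", 650, "travel"),
    (3, 17, "2022-08-17", 1200, "travel"), (3, 19, "2022-10-02", 6, "music"),
    (3, 23, "2022-11-08", 70, "food"), (4, 9, "2021-08-20", 3200, "travel"),
    (4, 10, "2021-10-29", 2750, "travel"), (4, 15, "2022-07-15", 1820, "travel"),
    (4, 21, "2022-11-05", 8000, "travel"), (4, 25, "2022-11-29", 27, "books"),
    (5, 1, "2021-01-04", 3, "music"), (5, 6, "2021-06-09", 820, "travel"),
    (5, 8, "2021-07-30", 19, "books"), (5, 12, "2021-12-10", 22, "music"),
    (5, 18, "2022-09-19", 20, "books"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    customer_id INTEGER, order_id INTEGER, order_date TEXT,
    price INTEGER, product TEXT)""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", orders)

query = """
SELECT o.product, ROUND(AVG(o.price), 2) AS average_spent
FROM orders o
-- the subquery finds this customer's first order date for the same product
WHERE o.order_date < (
    SELECT date(MIN(o1.order_date), '+12 months')
    FROM orders o1
    WHERE o1.product = o.product AND o1.customer_id = o.customer_id
)
GROUP BY o.product
"""
averages = dict(conn.execute(query).fetchall())
print(averages)
```

On this data the averages come out to the values in the desired output: 22.2 for books, 7.83 for music, 1530.71 for travel and 82.5 for food.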

Using a top x count query as a where clause to show all qualifying records

I have a query that counts the top 2 names.
My table has this data:
Name Age price visited size
Jon 34 53 2018-01-01 9
Don 22 70 2018-03-01 15
Pete 76 12 2018-11-09 7
Jon 34 55 2018-09-13 9
Paul 90 64 2018-07-08 6
Pete 76 31 2018-03-25 7
Jon 75 34 2018-06-06 8
select top 2
name,
count(name) as cnt
from
tbl1
group by name
order by cnt desc
Which returns my top 2 names
Jon 3
Pete 2
These names will change dynamically as the query is run, depending on who has made the most visits in total (this is very simplified; the actual table has 1000's of entries).
What I would like to do is then use the result of that query to get the following, all in a single query:
Name Age price visited size
Jon 34 53 2018-01-01 9
Jon 34 55 2018-09-13 9
Jon 75 34 2018-06-06 8
Pete 76 12 2018-11-09 7
Pete 76 31 2018-03-25 7
In summary, count who has visited the most and then display all the records under those names.
Thanks in advance
Here's one option using in:
select *
from yourtable
where name in (
select top 2 name
from yourtable
group by name
order by count(*) desc
)
order by name
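For anyone who wants to try the IN-subquery pattern without SQL Server, here is a rough equivalent on SQLite via Python. SQLite has no TOP, so this sketch uses LIMIT 2 instead; the table name tbl1 and the rows are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tbl1 (
    name TEXT, age INTEGER, price INTEGER, visited TEXT, size INTEGER)""")
conn.executemany("INSERT INTO tbl1 VALUES (?, ?, ?, ?, ?)", [
    ("Jon", 34, 53, "2018-01-01", 9),
    ("Don", 22, 70, "2018-03-01", 15),
    ("Pete", 76, 12, "2018-11-09", 7),
    ("Jon", 34, 55, "2018-09-13", 9),
    ("Paul", 90, 64, "2018-07-08", 6),
    ("Pete", 76, 31, "2018-03-25", 7),
    ("Jon", 75, 34, "2018-06-06", 8),
])

query = """
SELECT *
FROM tbl1
WHERE name IN (
    -- the two names with the most rows (LIMIT stands in for TOP here)
    SELECT name
    FROM tbl1
    GROUP BY name
    ORDER BY COUNT(*) DESC
    LIMIT 2
)
ORDER BY name, visited
"""
rows = conn.execute(query).fetchall()
names = [row[0] for row in rows]
print(names)
```

Note that if several names were tied for second place, both TOP 2 and LIMIT 2 would pick among them arbitrarily.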

Need assistance with a SQL query involving multiple sorts

I'm not sure if this is even possible, but my head starts to hurt when thinking about how to solve this. I've read on subqueries and PARTITION but I'm outside my knowledge. Here is a sample of my data:
TestID StudentID ComponentID Score
-------------------------------------
14919 3445 1 20
14919 3445 4 17
14919 3445 8 20
14919 3445 11 19
14919 3445 13 19
11339 3448 1 15
11339 3448 4 23
11339 3448 8 23
**11339 3448 11 22**
11339 3448 13 20
**14773 3448 1 20**
14773 3448 4 21
**14773 3448 8 23**
14773 3448 11 21
**14773 3448 13 21**
There can be multiple test attempts attached to the same StudentID. Attempts are noted by TestID.
I need to be able to query for the highest score per ComponentID over all attempts for each StudentID. There are only 5 component IDs. So for StudentID = 3448, between the two attempts at ComponentID 1, I just need the highest score. I would need the same for 4, 8, 11 and 13. I hope that makes sense. I highlighted the rows that would need to be returned. Any help is greatly appreciated.
Here is the query I've attempted. It just returns the same number of rows as the original.
SELECT DISTINCT
sts.StudentStandardizedTestID,
sts.StandardizedTestComponentID,
sts.StudentID,
MAX(sts.score) OVER (PARTITION BY sts.StudentID) HIGHSCORE
FROM
StandardizedTestScore sts
JOIN
StudentStandardizedTest sst ON sst.StudentStandardizedTestID = sts.StudentStandardizedTestID
AND sst.standardizedtestid = 1
WHERE
sst.TranscriptSchoolID = 10
AND sts.StandardizedTestComponentID = 1
OR sts.StandardizedTestComponentID = 4
OR sts.StandardizedTestComponentID = 8
OR sts.StandardizedTestComponentID = 11
OR sts.StandardizedTestComponentID = 13
ORDER BY
sts.studentid, sts.StandardizedTestComponentID
Below is the code to create your table and data.
CREATE TABLE StandardizedTestScore (`StudentStandardizedTestID` int(11) ,`studentid` int(11) ,`StandardizedTestComponentID` int(11),`score` int(11));
INSERT INTO StandardizedTestScore
(`StudentStandardizedTestID`, `studentid`, `StandardizedTestComponentID`, `score`)
VALUES
(14919,3445,1,20),
(14919,3445,4,17),
(14919,3445,8,20),
(14919,3445,11,19),
(14919,3445,13,19),
(11339,3448,1,15),
(11339,3448,4,23),
(11339,3448,8,23),
(11339,3448,11,22),
(11339,3448,13,20),
(14773,3448,1,20),
(14773,3448,4,21),
(14773,3448,8,23),
(14773,3448,11,21),
(14773,3448,13,21);
The query you are looking for is this..
SELECT studentid,StandardizedTestComponentID as componentID,MAX(score) AS score
FROM StandardizedTestScore
GROUP BY studentid,StandardizedTestComponentID
The results are this..
studentid ComponentID Score
3445 1 20
3445 4 17
3445 8 20
3445 11 19
3445 13 19
3448 1 20
3448 4 23
3448 8 23
3448 11 22
3448 13 21
It sounds to me like you need aggregation, not sorting. Leave testid out of the grouping, otherwise every attempt stays in its own group. Something like:
SELECT studentid,componentid,MAX(score) AS score
FROM yourtable
GROUP BY studentid,componentid
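The grouping idea can be verified against the sample rows with a short sketch, run here on SQLite from Python rather than on the asker's database. The table and column names follow the CREATE TABLE statement in the thread.

```python
import sqlite3

scores = [
    (14919, 3445, 1, 20), (14919, 3445, 4, 17), (14919, 3445, 8, 20),
    (14919, 3445, 11, 19), (14919, 3445, 13, 19),
    (11339, 3448, 1, 15), (11339, 3448, 4, 23), (11339, 3448, 8, 23),
    (11339, 3448, 11, 22), (11339, 3448, 13, 20),
    (14773, 3448, 1, 20), (14773, 3448, 4, 21), (14773, 3448, 8, 23),
    (14773, 3448, 11, 21), (14773, 3448, 13, 21),
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE StandardizedTestScore (
    StudentStandardizedTestID INTEGER, studentid INTEGER,
    StandardizedTestComponentID INTEGER, score INTEGER)""")
conn.executemany("INSERT INTO StandardizedTestScore VALUES (?, ?, ?, ?)", scores)

# One group per (student, component); the test attempt is deliberately
# left out of GROUP BY so that all attempts fall into the same group.
best = conn.execute("""
    SELECT studentid, StandardizedTestComponentID, MAX(score) AS score
    FROM StandardizedTestScore
    GROUP BY studentid, StandardizedTestComponentID
    ORDER BY studentid, StandardizedTestComponentID
""").fetchall()
for row in best:
    print(row)
```

This yields 10 rows, one per student and component, instead of the original 15.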

(SQL query) Set value for no data in same column compare to another row

Please help,
I am trying to write a query for MS Access over an ODBC connection from a PHP page.
I have this table (table1):
class quantity date
1 30 01/04/2014
2 23 01/04/2014
3 23 01/04/2014
4 14 01/04/2014
5 5 01/04/2014
1 41 01/05/2014
2 38 01/05/2014
3 36 01/05/2014
4 28 01/05/2014
5 25 01/05/2014
6 1 01/05/2014
I need a query that produces this output:
class quantity date
1 30 01/04/2014
2 23 01/04/2014
3 23 01/04/2014
4 14 01/04/2014
5 5 01/04/2014
6 0 0
1 41 01/05/2014
2 38 01/05/2014
3 36 01/05/2014
4 28 01/05/2014
5 25 01/05/2014
6 1 01/05/2014
The output should show 0 as the quantity for class 6 on 01/04/2014, because there is no record for class 6 on that date.
You could do this with union all. But a more general solution is this:
select c.class, d.date, nz(t1.quantity, 0)
from ((select distinct class from table1) as c cross join
(select distinct date from table1) as d
) left join
table1 as t1
on t1.class = c.class and t1.date = d.date
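The same cross join plus left join pattern can be tried outside Access. The sketch below runs it on SQLite from Python, with COALESCE standing in for Access's nz (an assumption made for portability, not what Access itself uses); the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (class INTEGER, quantity INTEGER, date TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    (1, 30, "01/04/2014"), (2, 23, "01/04/2014"), (3, 23, "01/04/2014"),
    (4, 14, "01/04/2014"), (5, 5, "01/04/2014"),
    (1, 41, "01/05/2014"), (2, 38, "01/05/2014"), (3, 36, "01/05/2014"),
    (4, 28, "01/05/2014"), (5, 25, "01/05/2014"), (6, 1, "01/05/2014"),
])

query = """
SELECT c.class, d.date, COALESCE(t1.quantity, 0) AS quantity
FROM (SELECT DISTINCT class FROM table1) AS c
-- every class paired with every date, whether or not a row exists
CROSS JOIN (SELECT DISTINCT date FROM table1) AS d
-- pairs with no matching row get NULL, which COALESCE turns into 0
LEFT JOIN table1 AS t1
       ON t1.class = c.class AND t1.date = d.date
ORDER BY d.date, c.class
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

The result has 12 rows (6 classes times 2 dates), and the missing combination appears as class 6 on 01/04/2014 with quantity 0.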

Using Sum() with multiple where clauses

I'm pretty new to this, so forgive if this has been posted (I had no idea what to even search on).
I have 2 tables, Accounts and Usage
AccountID AccountStartDate AccountEndDate
-------------------------------------------
1 12/1/2012 12/1/2013
2 1/1/2013 1/1/2014
UsageId AccountID EstimatedUsage StartDate EndDate
------------------------------------------------------
1 1 10 1/1 1/31
2 1 11 2/1 2/29
3 1 23 3/1 3/31
4 1 23 4/1 4/30
5 1 15 5/1 5/31
6 1 20 6/1 6/30
7 1 15 7/1 7/31
8 1 12 8/1 8/31
9 1 14 9/1 9/30
10 1 21 10/1 10/31
11 1 27 11/1 11/30
12 1 34 12/1 12/31
13 2 13 1/1 1/31
14 2 13 2/1 2/29
15 2 28 3/1 3/31
16 2 29 4/1 4/30
17 2 31 5/1 5/31
18 2 26 6/1 6/30
19 2 43 7/1 7/31
20 2 32 8/1 8/31
21 2 18 9/1 9/30
22 2 20 10/1 10/31
23 2 47 11/1 11/30
24 2 33 12/1 12/31
I'd like to write one query that gives me estimated usage for each month (starting now until the last month that we serve an account) for all accounts being served during that month.
The results would be as follows:
Month-Year Total Est Usage
------------------------------
Oct-12 0 (none being served)
Nov-12 0 (none being served)
Dec-12 34 (only accountid 1 being served)
Jan-13 23 (accountid 1 & 2 being served)
Feb-13 24 (accountid 1 & 2 being served)
Mar-13 51 (accountid 1 & 2 being served)
...
Dec-13 33 (only accountid 2 being served)
Jan-14 0 (none being served)
Feb-14 0 (none being served)
I'm assuming I need to sum and then do a Group By...but not really sure logically how I'd lay this out.
Revised Answer:
I've created a Months table with columns MonthID, Month with values like (201212, 12), (201301, 1), ...
I've also reorganised the usage table to have a month column rather than the start date and end date, as it makes the idea clearer.
See http://sqlfiddle.com/#!3/f57d84/6 for details
The query is now:
Select
m.MonthID,
Sum(u.EstimatedUsage) TotalEstimatedUsage
From
Accounts a
Inner Join
Usage u
On a.AccountID = u.AccountID
Inner Join
Months m
On m.MonthID Between
Year(a.AccountStartDate) * 100 + Month(a.AccountStartDate) And
Year(a.AccountEndDate) * 100 + Month(a.AccountEndDate) And
m.Month = u.Month
Group By
m.MonthID
Order By
1
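As a rough check of the Months-table join, here is a cut-down version on SQLite via Python. It assumes account dates are stored as ISO text and swaps Year(...) * 100 + Month(...) for strftime('%Y%m', ...), since SQLite has no Year or Month functions. Only usage for months 12, 1, 2 and 3 is loaded, so the data is a subset of the question's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Accounts (AccountID INTEGER, AccountStartDate TEXT, AccountEndDate TEXT);
CREATE TABLE Usage (AccountID INTEGER, Month INTEGER, EstimatedUsage INTEGER);
CREATE TABLE Months (MonthID INTEGER, Month INTEGER);

INSERT INTO Accounts VALUES (1, '2012-12-01', '2013-12-01'),
                            (2, '2013-01-01', '2014-01-01');
-- Subset of the question's usage rows (months 12, 1, 2, 3 only).
INSERT INTO Usage VALUES (1, 12, 34), (1, 1, 10), (1, 2, 11), (1, 3, 23),
                         (2, 12, 33), (2, 1, 13), (2, 2, 13), (2, 3, 28);
INSERT INTO Months VALUES (201212, 12), (201301, 1), (201302, 2), (201303, 3);
""")

query = """
SELECT m.MonthID, SUM(u.EstimatedUsage) AS TotalEstimatedUsage
FROM Accounts a
JOIN Usage u ON a.AccountID = u.AccountID
JOIN Months m
  -- keep a calendar month only while the account is being served ...
  ON m.MonthID BETWEEN CAST(strftime('%Y%m', a.AccountStartDate) AS INTEGER)
                   AND CAST(strftime('%Y%m', a.AccountEndDate) AS INTEGER)
  -- ... and pair it with the usage row for the same month of the year
 AND m.Month = u.Month
GROUP BY m.MonthID
ORDER BY m.MonthID
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Two things to keep in mind with this shape of query: BETWEEN makes the account's end month inclusive, and because the joins are inner joins, months in which no account is served do not appear at all rather than showing 0.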
Previous answer, kept for reference. It assumed the usage ranges were full dates rather than just months.
Select
Year(u.StartDate),
Month(u.StartDate),
Sum(Case When a.AccountStartDate <= u.StartDate And a.AccountEndDate >= u.EndDate Then u.EstimatedUsage Else 0 End) TotalEstimatedUsage
From
Accounts a
Inner Join
Usage u
On a.AccountID = u.AccountID
Group By
Year(u.StartDate),
Month(u.StartDate)
Order By
1, 2