Counting values from table results - sql

I have a table of data that looks like the following:
ArtistName TrackName TrackID
1 Pendulum Slam 6
2 N/A N/A 26
3 N/A N/A 26
4 N/A N/A 26
5 Snow Patrol Chasing Cars 17
6 Snow Patrol Chasing Cars 17
7 Rihanna Love The Way You Lie 4
8 N/A N/A 26
9 N/A N/A 26
10 Kanye West Stronger 10
11 Rihanna Love The Way You Lie 4
12 N/A N/A 26
13 N/A N/A 26
14 Tinie Tempah Written In The Stars 8
15 N/A N/A 26
16 N/A N/A 26
17 Nero Crush On You 18
etc...
Basically what I'd like to do is count the number of occurrences of each TrackID, and display that in a column. The previous table is created from this query which combines a few other tables:
SELECT Artist_Details.ArtistName, Track_Details.TrackName, Sales_Records.TrackID
FROM Track_Details
INNER JOIN Sales_Records ON Track_Details.TrackID = Sales_Records.TrackID
JOIN Artist_Details ON Track_Details.ArtistID = Artist_Details.ArtistID;
The output format I'd like is:
ArtistName TrackName Track ID TotalSales
1 Pendulum Slam 6 8
2 Tinie Tempah Written In The Stars 8 5
3 Rihanna Love The Way You Lie 4 2
Finally, I'd like TrackID 26 to be excluded entirely: not counted and not displayed in the results. The output should be sorted ascending by TotalSales and, if possible, limited to 10 rows.
Thanks in advance, Mark

That looks like a slam dunk for group by:
SELECT top 10 Artist_Details.ArtistName, Track_Details.TrackName,
Sales_Records.TrackID, count(Sales_Records.TrackID) as TotalSales
FROM Track_Details
INNER JOIN Sales_Records ON Track_Details.TrackID = Sales_Records.TrackID
JOIN Artist_Details ON Track_Details.ArtistID = Artist_Details.ArtistID
WHERE Sales_Records.TrackID <> 26
GROUP BY Artist_Details.ArtistName, Track_Details.TrackName, Sales_Records.TrackID
ORDER BY count(Sales_Records.TrackID) desc
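To see the GROUP BY approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the SaleID column are made up for illustration, and SQLite uses LIMIT 10 where SQL Server uses TOP 10; treat it as a sketch under those assumptions rather than the exact schema from the question.

```python
import sqlite3

# In-memory database with a tiny, hypothetical slice of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Artist_Details (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT);
CREATE TABLE Track_Details  (TrackID INTEGER PRIMARY KEY, TrackName TEXT, ArtistID INTEGER);
CREATE TABLE Sales_Records  (SaleID INTEGER PRIMARY KEY, TrackID INTEGER);  -- SaleID is assumed
INSERT INTO Artist_Details VALUES (1, 'Pendulum'), (2, 'Rihanna'), (3, 'N/A');
INSERT INTO Track_Details  VALUES (6, 'Slam', 1), (4, 'Love The Way You Lie', 2), (26, 'N/A', 3);
INSERT INTO Sales_Records (TrackID) VALUES (6), (26), (26), (4), (6), (4), (6), (26);
""")

# One output row per track, with the number of sales rows counted per group.
# TrackID 26 is filtered out before grouping, so it is never counted.
rows = conn.execute("""
SELECT a.ArtistName, t.TrackName, s.TrackID, COUNT(*) AS TotalSales
FROM Track_Details t
JOIN Sales_Records s  ON t.TrackID = s.TrackID
JOIN Artist_Details a ON t.ArtistID = a.ArtistID
WHERE s.TrackID <> 26
GROUP BY a.ArtistName, t.TrackName, s.TrackID
ORDER BY TotalSales DESC
LIMIT 10
""").fetchall()

for row in rows:
    print(row)
```

Switching DESC to ASC gives the ascending order mentioned in the question; the sample output in the question looks descending, which is presumably why the answer sorts that way.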

Related

How to get top values when there is a tie

I am having difficulty figuring out this dang problem. From the data and queries I have given below I am trying to see the email address that has rented the most movies during the month of September.
There are only 4 relevant tables in my database and they have been anonymized and shortened:
Table "cust":
cust_id  f_name  l_name   email
1        Jack    Daniels  jack.daniels#google.com
2        Jose    Quervo   jose.quervo#yahoo.com
5        Jim     Beam     jim.beam#protonmail.com
Table "rent":
inv_id  cust_id  rent_date
10      1        9/1/2022 10:29
11      1        9/2/2022 18:16
12      1        9/2/2022 18:17
13      1        9/17/2022 17:34
14      1        9/19/2022 6:32
15      1        9/19/2022 6:33
16      3        9/1/2022 18:45
17      3        9/1/2022 18:46
18      3        9/2/2022 18:45
19      3        9/2/2022 18:46
20      3        9/17/2022 18:32
21      3        9/19/2022 22:12
10      2        9/19/2022 11:43
11      2        9/19/2022 11:42
Table "inv":
mov_id  inv_id
22      10
23      11
24      12
25      13
26      14
27      15
28      16
29      17
30      18
31      19
31      20
32      21
Table "mov":
mov_id  titl          rate
22      Anaconda      3.99
23      Exorcist      1.99
24      Philadelphia  3.99
25      Quest         1.99
26      Sweden        1.99
27      Speed         1.99
28      Nemo          1.99
29      Zoolander     5.99
30      Truman        5.99
31      Patient       1.99
32      Racer         3.99
and here is my current query progress:
SELECT cust.email,
COUNT(DISTINCT inv.mov_id) AS "Rented_Count"
FROM cust
JOIN rent ON rent.cust_id = cust.cust_id
JOIN inv ON inv.inv_id = rent.inv_id
JOIN mov ON mov.mov_id = inv.mov_id
WHERE rent.rent_date BETWEEN '2022-09-01' AND '2022-09-31'
GROUP BY cust.email
ORDER BY "Rented_Count" DESC;
and here is what it outputs:
email                    Rented_Count
jack.daniels#google.com  6
jim.beam#protonmail.com  6
jose.quervo#yahoo.com    2
and what I want it to be outputting:
email
jack.daniels#google.com
jim.beam#protonmail.com
From the results I am actually getting, I have a tie for first place (Jim and Jack), and that is fine, but I would like it to list both tying email addresses, not just Jack's, so I don't think you can do anything with rows or max.
I think it must have something to do with dense_rank but I don't know how to use that specifically in this scenario with the count and Group By?
Your creativity and help would be appreciated.
You're missing the FETCH FIRST ... ROWS WITH TIES clause. It works together with the ORDER BY clause to return the rows with the highest value (FIRST 1 ROWS), including any rows that tie with it (WITH TIES). Also note that September has only 30 days, so '2022-09-31' is not a valid date; a half-open range that stops just before October 1 covers the whole month, including rentals late on the 30th.
SELECT cust.email
FROM cust
INNER JOIN rent
ON rent.cust_id = cust.cust_id
INNER JOIN inv
ON inv.inv_id = rent.inv_id
INNER JOIN mov
ON mov.mov_id = inv.mov_id
WHERE rent.rent_date >= '2022-09-01'
AND rent.rent_date < '2022-10-01'
GROUP BY cust.email
ORDER BY COUNT(DISTINCT inv.mov_id) DESC
FETCH FIRST 1 ROWS WITH TIES
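Not every engine supports FETCH FIRST ... WITH TIES. A portable alternative is to compute the per-customer counts once and keep every row whose count equals the maximum. The sketch below uses Python's sqlite3 module with a simplified, hypothetical schema (rent carries mov_id directly, dates are stored as ISO text, and the addresses are placeholders), so it illustrates the idea rather than the exact tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, assumed schema: rent references the movie directly.
CREATE TABLE cust (cust_id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE rent (cust_id INTEGER, mov_id INTEGER, rent_date TEXT);
INSERT INTO cust VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO rent VALUES
  (1, 22, '2022-09-01 10:29'), (1, 23, '2022-09-30 18:16'),
  (2, 24, '2022-09-19 11:43'),
  (3, 25, '2022-09-02 18:45'), (3, 26, '2022-09-17 18:32'),
  (3, 27, '2022-10-01 00:05');  -- October rental, must not be counted
""")

# Count distinct movies per email for September, then keep every email
# whose count equals the highest count, so ties are all returned.
top_emails = [r[0] for r in conn.execute("""
WITH counts AS (
    SELECT c.email, COUNT(DISTINCT r.mov_id) AS rented
    FROM cust c
    JOIN rent r ON r.cust_id = c.cust_id
    WHERE r.rent_date >= '2022-09-01' AND r.rent_date < '2022-10-01'
    GROUP BY c.email
)
SELECT email
FROM counts
WHERE rented = (SELECT MAX(rented) FROM counts)
ORDER BY email
""")]

print(top_emails)
```

On engines with window functions, DENSE_RANK() OVER (ORDER BY rented DESC) = 1 in an outer query should give the same rows; the MAX comparison is used here only because it needs nothing beyond a CTE.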

Replace Id of one column by a name from another table while using the count statement?

I am trying to get the count of patients by province for my school project, I have managed to get the count and the Id of the province in a table but since I am using the count statement it will not let me use join to show the ProvinceName instead of the Id (it says it's not numerical).
Here is the schema of the two tables I am talking about
The content of the Province table is as follow:
ProvinceId  ProvinceName               ProvinceShortName
1           Terre-Neuve-et-Labrador    NL
2           Île-du-Prince-Édouard      PE
3           Nouvelle-Écosse            NS
4           Nouveau-Brunswick          NB
5           Québec                     QC
6           Ontario                    ON
7           Manitoba                   MB
8           Saskatchewan               SK
9           Alberta                    AB
10          Colombie-Britannique       BC
11          Yukon                      YT
12          Territoires du Nord-Ouest  NT
13          Nunavut                    NU
And here is some sample data from the Patient table (don't worry, it's fake data!):
SS  FirstName  LastName  InsuranceNumber  InsuranceProvince  DateOfBirth  Sex  PhoneNumber
2   Doris      Patel     PATD778276       5                  1977-08-02   F    514-754-6488
3   Judith     Doe       DOEJ7712917      5                  1977-12-09   F    418-267-2263
4   Rosemary   Barrett   BARR05122566     6                  2005-12-25   F    905-638-5062
5   Cody       Kennedy   KENC047167       10                 2004-07-01   M    604-833-7712
I managed to get the patient count by province using the following statement:
select count(SS),InsuranceProvince
from Patient
full JOIN Province ON Patient.InsuranceProvince = Province.ProvinceId
group by InsuranceProvince
which gives me the following table:
PatientCount  InsuranceProvince
13            1
33            2
54            3
4             4
608           5
1778          6
25            7
209           8
547           9
649           10
6             11
35            12
24            13
How can I replace the id's with the correct ProvinceShortName to get the following final result?
ProvinceName  PatientCount
NL            13
PE            33
NS            54
NB            4
QC            608
ON            1778
MB            25
SK            209
AB            547
BC            649
YT            6
NT            35
NU            24
Thanks in advance!
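You can actually just specify that column in the select. It's best practice to include the thing you group by in the select as well, but since your question is so specific, the query below selects only the short name. Most engines require every selected non-aggregated column to appear in the group by, so it is listed there too:
SELECT ProvinceShortName, COUNT(SS) AS PatientsInProvince
FROM Patient
JOIN Province ON Patient.InsuranceProvince=Province.ProvinceId
GROUP BY InsuranceProvince, ProvinceShortName;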
I would suggest:
select pr.ProvinceShortName, count(*)
from Patient p join
Province pr
on p.InsuranceProvince = pr.ProvinceId
group by pr.ProvinceShortName
order by min(pr.ProvinceId);
Notes:
The key is including the columns you want in the select and group by.
You seem to want the results in province number order, so I included an order by.
There is no need to count the non-NULL values of SS. You might as well use count(*).
Table aliases make the query easier to write and to read.
I assume that you need to show the patient count by province.
SELECT
Province.ProvinceShortName AS [ProvinceName]
,COUNT(Patient.SS) AS [PatientCount] -- counts 0 for provinces with no patients
FROM Patient
RIGHT JOIN Province ON Patient.InsuranceProvince = Province.ProvinceId
GROUP BY Province.ProvinceShortName
Just alter your query to:
select ProvinceShortName As ProvinceName, count(InsuranceProvince) As PatientCount
from Patient
full JOIN Province ON Patient.InsuranceProvince = Province.ProvinceId
group by ProvinceShortName
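The answers above differ mainly in join direction and in what they count. As a rough check, here is a small sqlite3 sketch with made-up rows (only the columns needed for the count are created). It starts from Province and LEFT JOINs Patient, which is the same idea as the RIGHT JOIN answer written the other way round, and it counts Patient.SS so that a province with no patients shows 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed, hypothetical versions of the two tables.
CREATE TABLE Province (ProvinceId INTEGER PRIMARY KEY, ProvinceShortName TEXT);
CREATE TABLE Patient  (SS INTEGER PRIMARY KEY, InsuranceProvince INTEGER);
INSERT INTO Province VALUES (5, 'QC'), (6, 'ON'), (13, 'NU');
INSERT INTO Patient  VALUES (2, 5), (3, 5), (4, 6);
""")

# COUNT(p.SS) ignores the NULLs produced by the outer join, so provinces
# without patients get 0; COUNT(*) would report 1 for them instead.
counts = conn.execute("""
SELECT pr.ProvinceShortName AS ProvinceName, COUNT(p.SS) AS PatientCount
FROM Province pr
LEFT JOIN Patient p ON p.InsuranceProvince = pr.ProvinceId
GROUP BY pr.ProvinceId, pr.ProvinceShortName
ORDER BY pr.ProvinceId
""").fetchall()

print(counts)
```

If provinces without patients should not appear at all, an inner JOIN with count(*) is enough.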

Joining tables with different column name but same value in sqlite

I'm using SQLite to work with my database
I have two different tables, with key columns that have different names but the same value.
As such:
shoes
Identification | Name | Shoe size
1 Bob 10
2 John 12
payment
PaymentID | Price | Year
1 20 2013
2 38 2015
I need
Identification(or PaymentID, no matter) | Name | Shoe size | Price | Year
1 Bob 10 20 2013
2 John 12 38 2015
I've been searching and trying to understand the tutorials, to no avail. I guess I'm just too stupid.
select s.identification, s.name, s.`shoe size`, p.price, p.year
from shoes s
join payment p on p.paymentid = s.identification
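The column names on each side of a join do not have to match; only the values compared in the ON clause matter. A minimal sketch with Python's sqlite3 module, rebuilding the two tables from the question (the column types are assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shoes   (Identification INTEGER, Name TEXT, "Shoe size" INTEGER);
CREATE TABLE payment (PaymentID INTEGER, Price INTEGER, Year INTEGER);
INSERT INTO shoes   VALUES (1, 'Bob', 10), (2, 'John', 12);
INSERT INTO payment VALUES (1, 20, 2013), (2, 38, 2015);
""")

# The ON clause pairs rows by value: shoes.Identification with payment.PaymentID.
# A column name containing a space has to be quoted.
joined = conn.execute("""
SELECT s.Identification, s.Name, s."Shoe size", p.Price, p.Year
FROM shoes s
JOIN payment p ON p.PaymentID = s.Identification
ORDER BY s.Identification
""").fetchall()

for row in joined:
    print(row)
```

Double quotes are the standard SQL way to quote an identifier; SQLite also accepts the backticks used in the answer above.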

SQL : How to find number of occurrences without using HAVING or COUNT?

This is a trivial example, but I am trying to understand how to think creatively using SQL.
For example, I have the following tables below, and I want to query the names of folks who have three or more questions. How can I do this without using HAVING or COUNT? I wonder if this is possible using JOINS or something similar?
FOLKS
folkID name
---------- --------------
01 Bill
02 Joe
03 Amy
04 Mike
05 Chris
06 Elizabeth
07 James
08 Ashley
QUESTION
folkID questionRating questionDate
---------- ---------- ----------
01 2 2011-01-22
01 4 2011-01-27
02 4
03 2 2011-01-20
03 4 2011-01-12
03 2 2011-01-30
04 3 2011-01-09
05 3 2011-01-27
05 2 2011-01-22
05 4
06 3 2011-01-15
06 5 2011-01-19
07 5 2011-01-20
08 3 2011-01-02
Using SUM or CASE seems to be cheating to me!
I'm not sure if it's possible in your current formulation, but if you add a primary key to the question table (questionid) then the following seems to work:
SELECT DISTINCT Folks.folkid, Folks.name
FROM ((Folks
INNER JOIN Question AS Question_1 ON Folks.folkid = Question_1.folkid)
INNER JOIN Question AS Question_2 ON Folks.folkid = Question_2.folkid)
INNER JOIN Question AS Question_3 ON Folks.folkid = Question_3.folkid
WHERE (((Question_1.questionid) <> [Question_2].[questionid] And
(Question_1.questionid) <> [Question_3].[questionid]) AND
(Question_2.questionid) <> [Question_3].[questionid]);
Sorry, this is in MS Access SQL, but it should translate to any flavour of SQL.
Returns:
folkid name
3 Amy
5 Chris
Update: Just to explain why this works. Each join returns all the question ids asked by that person. The where clause then keeps only the rows in which all three question ids are different. If fewer than three questions were asked, there are no such rows.
For example, Bill:
folkid name Question_3.questionid Question_1.questionid Question_2.questionid
1 Bill 1 1 1
1 Bill 1 1 2
1 Bill 1 2 1
1 Bill 1 2 2
1 Bill 2 1 1
1 Bill 2 1 2
1 Bill 2 2 1
1 Bill 2 2 2
There are no rows where all the ids are different.
However, for Amy:
folkid name Question_3.questionid Question_1.questionid Question_2.questionid
3 Amy 4 4 5
3 Amy 4 4 4
3 Amy 4 4 6
3 Amy 4 5 4
3 Amy 4 5 5
3 Amy 4 5 6
3 Amy 4 6 4
3 Amy 4 6 5
3 Amy 4 6 6
3 Amy 5 4 4
3 Amy 5 4 5
3 Amy 5 4 6
3 Amy 5 5 4
3 Amy 5 5 5
3 Amy 5 5 6
3 Amy 5 6 4
3 Amy 5 6 5
3 Amy 5 6 6
3 Amy 6 4 4
3 Amy 6 4 5
3 Amy 6 4 6
3 Amy 6 5 4
3 Amy 6 5 5
3 Amy 6 5 6
3 Amy 6 6 4
3 Amy 6 6 5
3 Amy 6 6 6
There are several rows which have different ids and hence these get returned by the above query.
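The triple self-join can be tried out in SQLite through Python's sqlite3 module. This is only a sketch: questionID is the added primary key the answer calls for, the rows are a subset of the question's data, and the ids are compared with < instead of <>, a small variation that keeps one ordering of each triple instead of all six.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FOLKS (folkID INTEGER PRIMARY KEY, name TEXT);
-- questionID is the assumed surrogate key; it is not in the original table.
CREATE TABLE QUESTION (questionID INTEGER PRIMARY KEY, folkID INTEGER, questionRating INTEGER);
INSERT INTO FOLKS VALUES (1, 'Bill'), (3, 'Amy'), (4, 'Mike'), (5, 'Chris');
INSERT INTO QUESTION (folkID, questionRating) VALUES
  (1, 2), (1, 4),          -- Bill: two questions
  (3, 2), (3, 4), (3, 2),  -- Amy: three questions
  (4, 3),                  -- Mike: one question
  (5, 3), (5, 2), (5, 4);  -- Chris: three questions
""")

# A person survives only if three different question rows exist for them.
# No COUNT and no HAVING: the joins themselves do the counting.
three_plus = conn.execute("""
SELECT DISTINCT f.folkID, f.name
FROM FOLKS f
JOIN QUESTION q1 ON q1.folkID = f.folkID
JOIN QUESTION q2 ON q2.folkID = f.folkID AND q1.questionID < q2.questionID
JOIN QUESTION q3 ON q3.folkID = f.folkID AND q2.questionID < q3.questionID
ORDER BY f.folkID
""").fetchall()

print(three_plus)
```

DISTINCT is still needed, because someone with four or more questions produces several qualifying triples.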
You can try SUM to replace COUNT.
SELECT SUM(CASE WHEN field_name >= 3 THEN field_name ELSE 0 END)
FROM table_name
SELECT f.*
FROM (
SELECT DISTINCT
COUNT(*) OVER (PARTITION BY q.folkID) AS [Count] -- count questions per folk
,q.folkID
FROM QUESTION AS q
) AS p
INNER JOIN FOLKS AS f ON f.folkID = p.folkID
WHERE p.[Count] >= 3

Retrieve top 48 unique records from database based on a sorted Field

I have a database table that I need some SQL for (which is defeating me so far!).
Imagine there are 192 athletic clubs who all take part in 12 track meets per season.
So that is 2304 individual performances per season (for example, in the 100 metres).
I would like to find the top 48 (unique) individual performances from the table; these 48 athletes are then going to take part in the end-of-season World Championships.
So imagine the 2 fastest times are both set by "John Smith", but he can only be entered once in the World Championships. So I would then look for the next fastest time not set by "John Smith", and so on until I have 48 unique athletes.
Hope that makes sense.
Thanks in advance if anyone can help.
PS
I did have a nice screenshot that would explain it much better, but as a newish user I cannot post images.
I'll try a copy and paste version instead...
ID AthleteName AthleteID Time
1 Josh Lewis 3 11.99
2 Joe Dundee 4 11.31
3 Mark Danes 5 13.44
4 Josh Lewis 3 13.12
5 John Smith 1 11.12
6 John Smith 1 12.18
7 John Smith 1 11.22
8 Adam Bennett 6 11.33
9 Ronny Bower 7 12.88
10 John Smith 1 13.49
11 Adam Bennett 6 12.55
12 Mark Danes 5 12.12
13 Carl Tompkins 2 13.11
14 Joe Dundee 4 11.28
15 Ronny Bower 7 12.14
16 Carl Tompkins 2 11.88
17 Nigel Downs 8 14.14
18 Nigel Downs 8 12.19
Top 4 unique individual performances
1 John Smith 1 11.12
3 Joe Dundee 4 11.28
5 Adam Bennett 6 11.33
6 Carl Tompkins 2 11.88
Basically something like this:
select top 48 *
from (
select athleteId,min(time) as bestTime
from theRaces
where raceId = '123' -- e.g., 123=100 meters
group by athleteId
) x
order by bestTime
try this --
select x.ID, x.AthleteName, x.AthleteID, x.Time
from
(
select rownum tr_count, v.AthleteID AthleteID, v.AthleteName AthleteName, v.Time Time, v.id id
from
(
select
tr1.AthleteName AthleteName, tr1.Time time, min(tr1.id) id, tr1.AthleteID AthleteID
from theRaces tr1
where time =
(select min(time) from theRaces tr2 where tr2.athleteId = tr1.athleteId)
group by tr1.AthleteName, tr1.AthleteID, tr1.Time
having tr1.Time = ( select min(tr2.time) from theRaces tr2 where tr1.AthleteID = tr2.AthleteID)
order by tr1.time
) v
) x
where x.tr_count <= 48
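For engines without TOP or rownum, the same idea (best time per athlete first, then the N fastest athletes) can be written with LIMIT. Here is a minimal sqlite3 sketch using a few rows from the question's sample data; the table name theRaces comes from the answers, and the limit is 4 so the result can be compared with the "Top 4" list in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theRaces (ID INTEGER PRIMARY KEY, AthleteName TEXT, AthleteID INTEGER, Time REAL);
INSERT INTO theRaces VALUES
  (1,  'Josh Lewis',    3, 11.99),
  (2,  'Joe Dundee',    4, 11.31),
  (5,  'John Smith',    1, 11.12),
  (7,  'John Smith',    1, 11.22),
  (8,  'Adam Bennett',  6, 11.33),
  (13, 'Carl Tompkins', 2, 13.11),
  (14, 'Joe Dundee',    4, 11.28),
  (16, 'Carl Tompkins', 2, 11.88);
""")

def top_unique(limit):
    """Return (AthleteID, AthleteName, best time) for the fastest athletes."""
    # Grouping by athlete collapses repeat performances to one row each,
    # so an athlete can appear only once no matter how many fast times exist.
    return conn.execute("""
        SELECT AthleteID, AthleteName, MIN(Time) AS bestTime
        FROM theRaces
        GROUP BY AthleteID, AthleteName
        ORDER BY bestTime
        LIMIT ?
    """, (limit,)).fetchall()

top4 = top_unique(4)
for row in top4:
    print(row)
```

Calling top_unique(48) on the full table would give the championship list. Note that athletes tied on time at the cut-off are not handled specially here.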