Making a MAX() query from a subquery with COUNT() - sql

I need to show the name of the boat that made the most trips, so I wrote a query that counts the trips per boat:
SELECT B.IdBoat, COUNT(T.IdTrip)
FROM Trip T INNER JOIN Boat B ON T.IdBoat=B.IdBoat
GROUP BY B.IdBoat
Now I need to show the name of the one with the MAX trips, how do I use that query as a subquery, without using the ORDER BY DESC and TOP 1 but using MAX?
Currently got:
SELECT B.Name
FROM Trip T INNER JOIN Boat B ON T.IdBoat=B.IdBoat
WHERE B.IdBoat = MAX( the sub query above)
also tried
SELECT B.Name, T.IdTrip
FROM Boat B INNER JOIN Trip T ON B.IdBoat=T.IdBoat
WHERE B.IdBoat IN (
SELECT MAX(T.NTrips) FROM
(SELECT B.IdBoat AS [IdBoat], COUNT(T.IdTrip) AS [NTrips]
FROM Trip T INNER JOIN Boat B ON B.IdBoat=T.IdBoat
GROUP BY B.Boat) T
GROUP BY T.IdBoat)
The above returned the full count of 3 on the name of the boat instead of the correct 2.
I've searched Google, Stack Overflow, and other sites for this problem but can't adapt their solutions to my query. Any help is appreciated.
Thank you.
Edit 1: As asked, here is some sample data to help explain the problem.
Table Boat:
IdBoat | Name
1 | 'SS Sparrow'
2 | 'SS AndaNoMar'
Table Trip
IdTrip | IdBoat
1 | 1
2 | 1
3 | 2
Subquery 1 (COUNT)
IdBoat | NTrips
2 | 1
1 | 2

You can do:
with
x as (
select
b.idBoat,
b.Name,
count(*) as cnt
from trip t
join boat b on b.idBoat = t.idBoat
group by b.idBoat, b.Name
),
m as (
select max(cnt) as max_cnt from x
)
select
x.*
from x
join m on m.max_cnt = x.cnt
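
To check the CTE approach against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for the asker's database here, which is an assumption made only so the example is runnable; the table and column names are the ones from the question.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boat (IdBoat INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Trip (IdTrip INTEGER PRIMARY KEY, IdBoat INTEGER);
    INSERT INTO Boat VALUES (1, 'SS Sparrow'), (2, 'SS AndaNoMar');
    INSERT INTO Trip VALUES (1, 1), (2, 1), (3, 2);
""")

query = """
    WITH x AS (            -- trips per boat
        SELECT b.IdBoat, b.Name, COUNT(*) AS cnt
        FROM Trip t
        JOIN Boat b ON b.IdBoat = t.IdBoat
        GROUP BY b.IdBoat, b.Name
    ),
    m AS (                 -- the highest trip count
        SELECT MAX(cnt) AS max_cnt FROM x
    )
    SELECT x.IdBoat, x.Name, x.cnt
    FROM x
    JOIN m ON m.max_cnt = x.cnt
"""
busiest = conn.execute(query).fetchall()
print(busiest)  # [(1, 'SS Sparrow', 2)]
```

If two boats tie for the most trips, the join on max_cnt returns both of them.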

SELECT
B.IdBoat,
B.Name,
T.Trips
FROM
Boat AS B
INNER JOIN
(
SELECT
IdBoat,
COUNT(*) AS Trips,
RANK() OVER (ORDER BY COUNT(*) DESC) AS TripsRank
FROM
Trip
GROUP BY
IdBoat
)
AS T
ON T.IdBoat = B.IdBoat
WHERE
T.TripsRank = 1
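
For the ranking approach, the rank has to be computed across all boats. Partitioning by IdBoat after grouping by IdBoat leaves one row per partition, so every boat would rank first. The sketch below shows both variants side by side on the question's sample data. It assumes SQLite 3.25 or later (for window functions), run through Python's sqlite3, and it computes the counts in a CTE first; the column names RankOverall and RankPerBoat are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Trip (IdTrip INTEGER PRIMARY KEY, IdBoat INTEGER);
    INSERT INTO Trip VALUES (1, 1), (2, 1), (3, 2);
""")

ranked = conn.execute("""
    WITH counts AS (
        SELECT IdBoat, COUNT(*) AS Trips
        FROM Trip
        GROUP BY IdBoat
    )
    SELECT IdBoat,
           Trips,
           -- one ranking over every boat: only the busiest gets 1
           RANK() OVER (ORDER BY Trips DESC) AS RankOverall,
           -- one ranking per boat: each partition has a single row
           RANK() OVER (PARTITION BY IdBoat ORDER BY Trips DESC) AS RankPerBoat
    FROM counts
    ORDER BY IdBoat
""").fetchall()

for row in ranked:
    print(row)
# (1, 2, 1, 1)
# (2, 1, 2, 1)
```

Filtering on RankOverall = 1 keeps only boat 1, while filtering on RankPerBoat = 1 keeps every boat.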

A better method than either of the other two answers is to use ORDER BY:
SELECT TOP (1) B.IdBoat, B.Name, COUNT(T.IdTrip) as cnt
FROM Trip T INNER JOIN
Boat B
ON T.IdBoat = B.IdBoat
GROUP BY B.IdBoat, B.Name
ORDER BY cnt DESC;
There is no need for subqueries or CTEs or window functions.
If you want ties, then you can use TOP (1) WITH TIES.

Related

GROUP BY Subquery returns more than one row

I'm looking for a way to solve the following situation. I need to return only one number for each "p.pays". This query is supposed to list "nom" from the table "Pays" for countries where at least half of the athletes ("Athlete") appear in the table "Resultat", but my subquery returns more than one row. Is there a way to match "p.code" in both the query and the subquery so that it returns only one row per "p.code"?
SELECT p.nom , count(*) FROM Athlete a
INNER JOIN Pays p ON a.pays = p.code
GROUP BY p.code HAVING count(*)/2 >= (SELECT count(*) FROM Athlete a
INNER JOIN Pays p ON a.pays = p.code
INNER JOIN Resultat r ON a.code = r.athlete
GROUP BY p.code);
Expected result: show the countries ("Pays") where at least half of the athletes ("Athlete") have won a medal, i.e. appear in the "Resultat" table:
p.nom     | count(*)
----------|---------
Albania   | 134       <-- total number of athletes ("Athlete") in the country ("Pays")
Argentina | 203
...       | ...
You want two counts of athletes per country:
- all athletes
- the athletes that appear in resultat
Use a conditional count for this:
SELECT p.nom, count(*)
FROM pays p
INNER JOIN athlete a ON a.pays = p.code
GROUP BY p.code
HAVING COUNT(*) FILTER (WHERE a.code IN (SELECT athlete FROM resultat)) * 2
>= COUNT(*)
ORDER BY p.nom;
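
Here is a small sketch of the conditional count, assuming the goal is the one stated in the question: countries where at least half of the athletes appear in resultat. It runs in SQLite through Python's sqlite3, and it writes the conditional count as SUM(CASE ...) instead of FILTER so that it also works on engines without FILTER. The rows are invented for the illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pays (code TEXT PRIMARY KEY, nom TEXT);
    CREATE TABLE athlete (code INTEGER PRIMARY KEY, pays TEXT);
    CREATE TABLE resultat (athlete INTEGER);
    -- Invented sample rows, not the asker's data.
    INSERT INTO pays VALUES ('ALB', 'Albania'), ('ARG', 'Argentina');
    INSERT INTO athlete VALUES (1, 'ALB'), (2, 'ALB'),
                               (3, 'ARG'), (4, 'ARG'), (5, 'ARG');
    -- Athlete 1 has two results; IN (...) still counts that athlete once.
    INSERT INTO resultat VALUES (1), (1), (3);
""")

medal_countries = conn.execute("""
    SELECT p.nom,
           COUNT(*) AS athletes,
           SUM(CASE WHEN a.code IN (SELECT athlete FROM resultat)
                    THEN 1 ELSE 0 END) AS medalists
    FROM pays p
    JOIN athlete a ON a.pays = p.code
    GROUP BY p.code, p.nom
    -- Multiply instead of dividing to avoid integer division.
    HAVING 2 * SUM(CASE WHEN a.code IN (SELECT athlete FROM resultat)
                        THEN 1 ELSE 0 END) >= COUNT(*)
    ORDER BY p.nom
""").fetchall()
print(medal_countries)  # [('Albania', 2, 1)]
```

Albania qualifies with 1 medalist out of 2 athletes, while Argentina does not with 1 out of 3.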

SQL Join and count relations

Having a PostgreSQL query problem, and wondering if there's an efficient way to get this in a single query. Let's take the following simple table structure. Think of it as the traditional many to many relationship.
users <-> user_collections <-> collections
Given a user's id, I'd like to first get all of their collections. This is the simple part, for which I have a query:
SELECT c.id, c.name, c.description, c.created_at, c.updated_at
FROM collections c
JOIN user_collections uc ON c.id = uc.collection_id
WHERE uc.user_id = $1
ORDER BY created_at DESC
So for example:
users
id | email
1 | user1@example.com
2 | user2@example.com
3 | user3@example.com
user_collections
id | user_id | collection_id
1 | 1 | 1
2 | 2 | 1
3 | 3 | 1
collections
id | name | description
1 | Example | Demo collection
In the above case, querying for collections for user one would yield the first collection. However I'd also like to get a count of how many users are associated with each collection. In this case, a total count of 3, since all three members share this collection. A member count if you will. Is there a sensible way to do this in one query, or is two probably better?
An alternative to a correlated subquery is to pre-calculate the number of users in each collection, then join that to your existing query.
For example:
with collection_counts as (
select
collection_id
, count(1) as collection_count
from user_collections
group by collection_id
)
SELECT
c.id
, c.name
, c.description
, c.created_at
, c.updated_at
, cc.collection_count
FROM collections c
JOIN user_collections uc ON c.id = uc.collection_id
join collection_counts as cc on c.id = cc.collection_id
WHERE uc.user_id = $1
ORDER BY created_at DESC
You need a correlated subquery here to get the total number of users for each collection (background on correlated subqueries: https://www.geeksforgeeks.org/sql-correlated-subqueries/):
SELECT c.* , (select count(user_id) from user_collections sc where sc.collection_id=uc.collection_id) as GroupCount
FROM collections c
JOIN user_collections uc ON c.id = uc.collection_id
WHERE uc.user_id = $1
ORDER BY created_at DESC
One method is conditional aggregation:
SELECT c.id, c.name, c.description, c.created_at, c.updated_at,
COUNT(*) as num_users
FROM collections c JOIN
user_collections uc
ON c.id = uc.collection_id
GROUP BY c.id
HAVING COUNT(*) FILTER (WHERE uc.user_id = $1) > 0
ORDER BY created_at DESC;
That said, it might be faster to do:
SELECT c.id, c.name, c.description, c.created_at, c.updated_at,
(SELECT COUNT(*)
FROM user_collections uc
WHERE c.id = uc.collection_id
) as num_users
FROM collections c
WHERE EXISTS (SELECT 1
FROM user_collections uc
WHERE c.id = uc.collection_id AND
uc.user_id = $1
)
ORDER BY created_at DESC;
This would be faster for two reasons:
It avoids the outer aggregation. Aggregations on larger data are generally more expensive.
It calculates the count only for the collections that are in the result set.
This can also make use of indexes on the table -- which if you care about performance, you should have.
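
The EXISTS variant can be tried on the question's sample data with the sketch below, which runs in SQLite through Python's sqlite3. The second collection ('Private') and the helper name collections_for are made up for the illustration, the created_at and updated_at columns are left out, and the $1 placeholder becomes ? because that is sqlite3's parameter style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE user_collections (
        id INTEGER PRIMARY KEY, user_id INTEGER, collection_id INTEGER);
    INSERT INTO collections VALUES
        (1, 'Example', 'Demo collection'),
        (2, 'Private', 'Hypothetical extra row');
    INSERT INTO user_collections VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1),
        (4, 2, 2);  -- hypothetical: only user 2 has collection 2
""")

def collections_for(user_id):
    """Return (id, name, num_users) for every collection the user belongs to."""
    return conn.execute("""
        SELECT c.id, c.name,
               (SELECT COUNT(*)
                FROM user_collections uc
                WHERE uc.collection_id = c.id) AS num_users
        FROM collections c
        WHERE EXISTS (SELECT 1
                      FROM user_collections uc
                      WHERE uc.collection_id = c.id
                        AND uc.user_id = ?)
        ORDER BY c.id
    """, (user_id,)).fetchall()

print(collections_for(1))  # [(1, 'Example', 3)]
print(collections_for(2))  # [(1, 'Example', 3), (2, 'Private', 1)]
```

The count covers all members of each collection, not only the user being queried, because the count subquery has no user filter.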

How to get the lowest ID present for each category SQL

I have the following tables in SQL:
Diagnosis: DiagnosisID, DiagnosisDescription
Member: MemberID, FirstName, LastName
DiagnosisCategoryMap: DiagnosisCategoryID, DiagnosisID
MemberDiagnosis: MemberID, DiagnosisID
What I need to do is find the diagnosis with the lowest DiagnosisID present for each member's category.
This is the sql I have so far:
SELECT MD.MemberID AS MID,
MD.DiagnosisID AS DID,
DM.DiagnosisCategoryID AS CID
FROM
MemberDiagnosis MD
INNER JOIN DiagnosisCategoryMap DM ON MD.DiagnosisID = DM.DiagnosisID
Which gives me this result set:
MID DID CID
1 2 2
1 4 3
3 3 3
3 4 3
The result set I need should look like this:
MID DID CID
1 2 2
3 3 3
What am I missing in my query?
I have tried a GROUP BY, but that (of course) did not work out well because I could not aggregate properly for the grouping.
I am using SQL Server, and that is all I can use.
Use the MIN aggregate with GROUP BY to get the minimum DiagnosisID for each MemberID and DiagnosisCategoryID:
SELECT MD.MemberID AS MID,
MIN(MD.DiagnosisID) AS DID,
DM.DiagnosisCategoryID AS CID
FROM
MemberDiagnosis MD
INNER JOIN DiagnosisCategoryMap DM ON MD.DiagnosisID = DM.DiagnosisID
GROUP BY
MD.MemberID,
DM.DiagnosisCategoryID
Break the problem down into smaller steps.
First, verify that you can get the lowest Diagnosis Id for each Member with the following:
select MemberId as MID, min(DiagnosisId) as DID
from MemberDiagnosis
group by MemberId
When you have verified that that works, join the DiagnosisCategoryMap table...
select MID, DID, DiagnosisCategoryId as CID
from
(
select MemberId as MID, min(DiagnosisId) as DID
from MemberDiagnosis
group by MemberId
) src
inner join DiagnosisCategoryMap dcm
on dcm.DiagnosisId = src.DID
SELECT A.MemberID AS MID, MIN(C.DiagnosisID) AS DID, C.DiagnosisCategoryID AS CID
FROM
Member A INNER JOIN MemberDiagnosis B
ON A.MemberID=B.MemberID
INNER JOIN DiagnosisCategoryMap C
ON B.DiagnosisID=C.DiagnosisID
GROUP BY A.MemberID, C.DiagnosisCategoryID;
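
Grouping by member and category does not return the same rows as taking the minimum per member first. The sketch below runs both on data inferred from the question's first result set (the underlying table rows are an assumption reconstructed from that output), using SQLite through Python's sqlite3. On this data, only the per-member version reproduces the two-row result the asker listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MemberDiagnosis (MemberID INTEGER, DiagnosisID INTEGER);
    CREATE TABLE DiagnosisCategoryMap (DiagnosisCategoryID INTEGER, DiagnosisID INTEGER);
    -- Rows inferred from the MID / DID / CID output shown in the question.
    INSERT INTO MemberDiagnosis VALUES (1, 2), (1, 4), (3, 3), (3, 4);
    INSERT INTO DiagnosisCategoryMap VALUES (2, 2), (3, 3), (3, 4);
""")

# Lowest DiagnosisID per member AND category.
per_member_and_category = conn.execute("""
    SELECT MD.MemberID AS MID,
           MIN(MD.DiagnosisID) AS DID,
           DM.DiagnosisCategoryID AS CID
    FROM MemberDiagnosis MD
    INNER JOIN DiagnosisCategoryMap DM ON MD.DiagnosisID = DM.DiagnosisID
    GROUP BY MD.MemberID, DM.DiagnosisCategoryID
    ORDER BY MID, CID
""").fetchall()

# Lowest DiagnosisID per member, then look up that diagnosis's category.
per_member = conn.execute("""
    SELECT src.MID, src.DID, dcm.DiagnosisCategoryID AS CID
    FROM (
        SELECT MemberID AS MID, MIN(DiagnosisID) AS DID
        FROM MemberDiagnosis
        GROUP BY MemberID
    ) src
    INNER JOIN DiagnosisCategoryMap dcm ON dcm.DiagnosisID = src.DID
    ORDER BY src.MID
""").fetchall()

print(per_member_and_category)  # [(1, 2, 2), (1, 4, 3), (3, 3, 3)]
print(per_member)               # [(1, 2, 2), (3, 3, 3)]
```

Which of the two is right depends on whether member 1's diagnosis in category 3 should appear. The sketch also assumes each diagnosis maps to exactly one category.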

How to get only one from each value in a column

I have this query:
SELECT
r.rev_id, rs.name, COUNT(ws.user_id) as likes
FROM
Reviews AS r
LEFT JOIN
Wasliked AS ws ON r.rev_id = ws.rev_id
LEFT JOIN
Restaurants AS rs ON rs.rid = r.rest_id
GROUP BY
rs.name, r.rev_id
ORDER BY
likes DESC
and the result is:
rev_id name likes
------------------------
7 rest1 5
10 rest1 3
6 rest1 2
2 rest3 2
1 rest2 2
5 rest3 1
8 rest4 1
But I want the result to be like this:
rev_id name likes
--------------------------
7 rest1 5
2 rest3 2
1 rest2 2
taking the 3 highest results with different names.
I have already tried to only group by rs.name instead of rs.name,r.rev_id but that causes an error.
Thanks in advance
So you want the top value for each name, limited to three rows. This suggests row_number():
SELECT TOP 3 rev_id, name, likes
FROM (SELECT r.rev_id, rs.name, COUNT(ws.user_id) as likes,
ROW_NUMBER() OVER (PARTITION BY rs.name ORDER BY COUNT(ws.user_id)) as seqnum
FROM Reviews r left join
Wasliked ws
on r.rev_id = ws.rev_id left join
Restaurants rs
on rs.rid = r.rest_id
GROUP BY rs.name, r.rev_id
) x
WHERE seqnum = 1
ORDER BY likes desc;
You can also do it like this, if you do not mind repeating the aggregation query:
select top 3 t1.*
from (
select r.rev_id, rs.name, count(ws.user_id) as likes
from reviews as r
left join wasliked as ws on r.rev_id=ws.rev_id
left join restaurants as rs on rs.rid=r.rest_id
group by rs.name,r.rev_id
) t1
inner join (
select name, max(likes) as likes
from (
select r.rev_id, rs.name, count(ws.user_id) as likes
from reviews as r
left join wasliked as ws on r.rev_id=ws.rev_id
left join restaurants as rs on rs.rid=r.rest_id
group by rs.name,r.rev_id) tmp
group by name
) t2 on t1.name = t2.name and t1.likes = t2.likes
order by t1.likes desc
@Gordon Linoff's answer is the better way to do this. His SQL is almost right, but as written it returns the row with the fewest likes per name. When you change
ROW_NUMBER() OVER (PARTITION BY rs.name ORDER BY COUNT(ws.user_id)) as seqnum
to
ROW_NUMBER() OVER (PARTITION BY rs.name ORDER BY COUNT(ws.user_id) DESC) as seqnum
it will give you the right result.
Your query has the form show rows with max(foo). The first line of attack is Group By, but sometimes, you want more information about the aggregate. In this case, having computed likes by rev_id and name, you want only those rows having max(likes) for each name. That calls for an existence test:
with T (rev_id, name, likes) as (
SELECT r.rev_id, rs.name, COUNT(ws.user_id) as likes
FROM Reviews as r
left join Wasliked as ws on r.rev_id=ws.rev_id
left join Restaurants as rs on rs.rid=r.rest_id
GROUP BY rs.name,r.rev_id
)
select * from T as L
where exists (
select 1 from T
where name = L.name
group by name
having max(likes) = L.likes
)
order by likes desc
That's about right.
I prefer my version to the others provided so far. It doesn't use the nonstandard top N formulation, and the query is cast in terms of the logical operation you need, i.e. quantification.
With practice, where exists gets easier, and will save you writing many more complicated queries.
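
To see the row_number() approach with the descending order applied, the sketch below starts from the already aggregated rows shown in the question, stored in a hypothetical review_likes table, and runs in SQLite 3.25 or later through Python's sqlite3. SQLite has no TOP, so LIMIT 3 stands in for it, and the rev_id tie-breaker in the final ORDER BY is an assumption added to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table holding the aggregated result from the question.
    CREATE TABLE review_likes (rev_id INTEGER, name TEXT, likes INTEGER);
    INSERT INTO review_likes VALUES
        (7, 'rest1', 5), (10, 'rest1', 3), (6, 'rest1', 2),
        (2, 'rest3', 2), (1, 'rest2', 2), (5, 'rest3', 1), (8, 'rest4', 1);
""")

top_three = conn.execute("""
    SELECT rev_id, name, likes
    FROM (
        SELECT rev_id, name, likes,
               -- DESC puts the most liked review of each restaurant first
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY likes DESC) AS seqnum
        FROM review_likes
    ) x
    WHERE seqnum = 1
    ORDER BY likes DESC, rev_id DESC
    LIMIT 3
""").fetchall()

for row in top_three:
    print(row)
# (7, 'rest1', 5)
# (2, 'rest3', 2)
# (1, 'rest2', 2)
```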

Select Most Recent Date with Inner Join

Running into a wall when trying to pull info from tables similar to those below. Not sure how to approach this.
The results should have the most recent TRANSAMT for each ACCNUM along with NAME and address.
Select A.ACCNUM, MAX(B.TRANSAMT) as BAMT, B.ADDRESS from
From TableA A inner join TableB on A.ACCNUM = B.ACCNUM
This is what i have so far. Any help would be appreciated.
TableA
ACCNUM NAME ADDRESS
00001 R. GRANT Miami, FL
00002 B. PAUL Dallas, TX
TableB
ACCNUM TRANSAMT TRANSDATE
00001 150 1/1/2015
00001 200 13/2/2015
00002 100 2/1/205
00003 50 18/2/2015
You can use the ANSI standard row_number() function in most databases. It lets you number each account's transactions from most recent to oldest and keep only the first:
select a.accnum, a.name, b.transamt, a.address
from tableA a left join
(select b.*, row_number() over (partition by accnum order by transdate desc) as seqnum
from tableB b
) b
on a.accnum = b.accnum and b.seqnum = 1;
Note: I changed the join to a left join. This will keep all records in tableA, even those with no matches. I am not sure if that is the intention of your query.
You can use row_number to order the rows for each account number, most recent first.
select accnum, bamt, name, address
from (
select A.ACCNUM, B.TRANSAMT as BAMT, A.ADDRESS, A.NAME,
row_number() over(partition by A.ACCNUM order by B.TRANSDATE desc) as rn
From TableA A
inner join TableB B on A.ACCNUM = B.ACCNUM
) t
where rn = 1;
Please note this will not work in MySQL versions before 8.0, which do not support window functions.
This one uses no ROW_NUMBER():
with find_max as (
select ACCNUM, max(TRANSDATE) as TRANSDATE from TableB group by ACCNUM)
select A.ACCNUM, B.TRANSAMT,
B.TRANSDATE, A.ADDRESS, A.NAME
from TableA as A
join find_max on find_max.ACCNUM = A.ACCNUM
join TableB B on B.ACCNUM = find_max.ACCNUM and B.TRANSDATE = find_max.TRANSDATE
First find the max date for each ACCNUM, then join both tables to it.
This will work on most databases.
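
As a check, here is a minimal sketch of the max-date join on the question's sample data, run in SQLite through Python's sqlite3. Two assumptions are labeled in the code: the dates are rewritten as ISO strings so that MAX() compares them chronologically, and the value shown as 2/1/205 is read as 2 January 2015.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ACCNUM TEXT PRIMARY KEY, NAME TEXT, ADDRESS TEXT);
    CREATE TABLE TableB (ACCNUM TEXT, TRANSAMT INTEGER, TRANSDATE TEXT);
    INSERT INTO TableA VALUES
        ('00001', 'R. GRANT', 'Miami, FL'),
        ('00002', 'B. PAUL', 'Dallas, TX');
    -- Assumption: d/m/yyyy dates converted to ISO yyyy-mm-dd text.
    -- Assumption: '2/1/205' in the question means 2 January 2015.
    INSERT INTO TableB VALUES
        ('00001', 150, '2015-01-01'),
        ('00001', 200, '2015-02-13'),
        ('00002', 100, '2015-01-02'),
        ('00003', 50,  '2015-02-18');
""")

latest = conn.execute("""
    WITH find_max AS (      -- most recent transaction date per account
        SELECT ACCNUM, MAX(TRANSDATE) AS TRANSDATE
        FROM TableB
        GROUP BY ACCNUM
    )
    SELECT A.ACCNUM, A.NAME, B.TRANSAMT, B.TRANSDATE, A.ADDRESS
    FROM TableA A
    JOIN find_max ON find_max.ACCNUM = A.ACCNUM
    JOIN TableB B ON B.ACCNUM = find_max.ACCNUM
                 AND B.TRANSDATE = find_max.TRANSDATE
    ORDER BY A.ACCNUM
""").fetchall()

for row in latest:
    print(row)
# ('00001', 'R. GRANT', 200, '2015-02-13', 'Miami, FL')
# ('00002', 'B. PAUL', 100, '2015-01-02', 'Dallas, TX')
```

Account 00003 is dropped because it has no row in TableA. If an account has two transactions on its latest date, this join returns both, whereas the row_number() versions return one.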