sql subquery join group by - sql

I am trying to get a list of our users from our database along with the number of people from the same cohort as them - which in this case is defined as being from the same medical school at the same time.
medical_school_id is stored in the doctor_record table
graduation_dt is stored in the doctor_record table as well.
I have managed to write this query out using a subquery which does a select statement counting the number of others for each row but this takes forever. My logic is telling me that I ought to run a simple GROUP BY query once first and then somehow JOIN the medical_school_id on to that.
The group by query is as follows
select count(ca.id) , cdr.medical_school_id, cdr.graduation_dt
from account ca
LEFT JOIN doctor cd on ca.id = cd.account_id
LEFT JOIN doctor_record cdr on cd.gmc_number = cdr.gmc_number
GROUP BY cdr.medical_school_id, cdr.graduation_dt
The long select query is
select a.id, a.email , dr.medical_school_id,
(select count(ba.id) from account ba
LEFT JOIN doctor bd on ba.id = bd.account_id
LEFT JOIN doctor_record bdr on bd.gmc_number = bdr.gmc_number
WHERE bdr.medical_school_id = dr.medical_school_id AND bdr.graduation_dt = dr.graduation_dt) AS med_count
from account a
LEFT JOIN doctor d on a.id = d.account_id
LEFT JOIN doctor_record dr on d.gmc_number = dr.gmc_number
If you could push me in the right direction that would be amazing

I think you just want window functions:
select a.id, a.email, dr.medical_school_id, dr.graduation_dt,
count(*) over (partition by dr.medical_school_id, dr.graduation_dt) as cohort_size
from account a left join
doctor d
on a.id = d.account_id left join
doctor_record dr
on d.gmc_number = dr.gmc_number;
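To see the window-function approach on concrete data, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The column types, the sample rows and the use of SQLite (3.25 or newer for window functions) are assumptions made for illustration; the original question does not say which engine is in use.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE doctor (account_id INTEGER, gmc_number TEXT);
CREATE TABLE doctor_record (gmc_number TEXT, medical_school_id INTEGER, graduation_dt TEXT);
INSERT INTO account VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO doctor VALUES (1, 'G1'), (2, 'G2'), (3, 'G3');
INSERT INTO doctor_record VALUES
    ('G1', 10, '2010-07-01'),
    ('G2', 10, '2010-07-01'),
    ('G3', 20, '2011-07-01');
""")

# One pass over the joined rows; each row gets the size of its own partition.
cohort_rows = conn.execute("""
SELECT a.id, a.email,
       COUNT(*) OVER (PARTITION BY dr.medical_school_id, dr.graduation_dt) AS cohort_size
FROM account a
LEFT JOIN doctor d ON a.id = d.account_id
LEFT JOIN doctor_record dr ON d.gmc_number = dr.gmc_number
ORDER BY a.id
""").fetchall()

for row in cohort_rows:
    print(row)
```

Accounts 1 and 2 share a school and graduation date, so both report a cohort of 2, while account 3 is alone in its cohort. The count includes the row itself; subtract 1 if only the other cohort members are wanted.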

Using your same code for group by:
SELECT label.[id]
, label.[email]
, label.[medical_school_id]
, label.[graduation_dt]
, cnt.[cohort_size]
FROM
(
SELECT acc.[id]
, acc.[email]
, doc_rec.medical_school_id
, doc_rec.graduation_dt
FROM
account acc
LEFT JOIN
doctor doc
ON
acc.id = doc.account_id
LEFT JOIN
doctor_record doc_rec
ON
doc.gmc_number = doc_rec.gmc_number
) label
LEFT JOIN
(
SELECT count(acco.id) AS cohort_size
, doc_reco.medical_school_id
, doc_reco.graduation_dt
FROM
account acco
LEFT JOIN
doctor doct
ON
acco.id = doct.account_id
LEFT JOIN
doctor_record doc_reco
ON
doct.gmc_number = doc_reco.gmc_number
GROUP BY
doc_reco.medical_school_id,
doc_reco.graduation_dt
) cnt
ON
cnt.[medical_school_id]=label.[medical_school_id]
AND
cnt.[graduation_dt]=label.[graduation_dt]
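The "aggregate once, then join back" idea can be checked on a small data set. The sketch below uses SQLite from Python with made-up rows, and it simplifies the aggregate by counting doctor_record rows directly rather than joining through account first; both are assumptions for illustration only.

```python
import sqlite3

# Hypothetical sample data; account 4 has no doctor row at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE doctor (account_id INTEGER, gmc_number TEXT);
CREATE TABLE doctor_record (gmc_number TEXT, medical_school_id INTEGER, graduation_dt TEXT);
INSERT INTO account VALUES (1, 'a@example.com'), (2, 'b@example.com'),
                           (3, 'c@example.com'), (4, 'd@example.com');
INSERT INTO doctor VALUES (1, 'G1'), (2, 'G2'), (3, 'G3');
INSERT INTO doctor_record VALUES
    ('G1', 10, '2010-07-01'),
    ('G2', 10, '2010-07-01'),
    ('G3', 20, '2011-07-01');
""")

# The derived table is computed once per cohort, then joined to every user row.
joined_rows = conn.execute("""
SELECT a.id, c.cohort_size
FROM account a
LEFT JOIN doctor d ON a.id = d.account_id
LEFT JOIN doctor_record dr ON d.gmc_number = dr.gmc_number
LEFT JOIN (
    SELECT medical_school_id, graduation_dt, COUNT(*) AS cohort_size
    FROM doctor_record
    GROUP BY medical_school_id, graduation_dt
) c ON c.medical_school_id = dr.medical_school_id
   AND c.graduation_dt = dr.graduation_dt
ORDER BY a.id
""").fetchall()

for row in joined_rows:
    print(row)
```

One difference from the window-function version: an account with no doctor record has NULL join keys, and NULL never equals NULL in a join condition, so that account gets a NULL cohort size here instead of being counted in a "no cohort" group.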

how about something like this?
select a.doctor_id
, count(*) - 1
from doctor_record a
left join doctor_record b on a.medical_school_id = b.medical_school_id
and a.graduation_dt = b.graduation_dt
group by a.doctor_id
Subtract 1 from the count so that you're not counting the doctor in the "other folks in same cohort" number
I'm defining "same cohort" as "same medical school & graduation date".
I'm unclear on what GMC number is and how it is related. Is it something to do with cohort?

How to build the SQL query for given question?

I have 2 SQL problems for which I need SQL query.
Table - Booking
Table - Adventure
Table - Tourist
Table - Location
Query1: Display TourId, TourName and Email of those tourist(s) who have booked all types of adventures. (Hint: Use the concept of Joins).
My Try:
Select DISTINCT T.TourId, T.TourName, T.Email
From Tourist T
INNER JOIN Booking B ON B.TourId = T.TourId
INNER JOIN Location L ON L.LocId = B.Loc
INNER JOIN Adventure A ON A.AdvId = L.AdvId
AND A.AdvType in (Select DISTINCT AdvType From Adventure)
Query2: For each booking, Identify the location whose bookingamount is greater than the average bookingamount of all the bookings done for that location. Display LocId, LocName and Rating for the identified location(s). (Hint: Use the concept of subqueries)
My Try:
Select B.Loc, L.LocName, L.Rating
From Booking B
INNER JOIN Location L ON B.Loc = L.LocId
AND BookingAmount > (Select AVG(B.BookingAmount) from Booking B Group By B.Loc)
Query 1:
select distinct tourist.tourid, tourist.tourname, tourist.email
from tourist, booking, location, adventure
where 1=1
and tourist.tourid = booking.tourid
and booking.loc = location.locid
and location.advid = adventure.advid
and adventure.advtype = 'A'
Query 2:
select locid,locname,rating
from location
where locid in (select booking.loc from booking, (select
b.loc,avg(b.bookingamount) as avg_ba from booking b group by
b.loc) aa
where booking.loc = aa.loc and
booking.bookingamount > aa.avg_ba)
Note: if it is a database design for any production server, I must say it needs to be changed ASAP.
Another Note: Please do not ever use pictures as references. It is very difficult to get information from pictures
Query 1:-
Select DISTINCT T.TourId, T.TourName, T.Email
From Tourist T
INNER JOIN Booking B ON B.TourId = T.TourId
INNER JOIN Location L ON L.LocId = B.Loc
INNER JOIN Adventure A ON A.AdvId = L.AdvId
GROUP BY T.TourId, T.TourName, T.Email
HAVING COUNT(DISTINCT A.AdvType) = (SELECT COUNT(DISTINCT AdvType) FROM Adventure);
Query 2:-
Select DISTINCT B.Loc, L.LocName, L.Rating
From Booking B
INNER JOIN Location L ON B.Loc = L.LocId
WHERE B.BookingAmount > (Select AVG(B2.BookingAmount) from Booking B2 WHERE B2.Loc = B.Loc);
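Both problems can be tried out on toy data. The sketch below uses SQLite from Python; since the tables were posted as pictures, the column lists and all rows are assumptions inferred from the queries in this thread. Query 1 is written as a "booked every type" check (the number of distinct types a tourist has booked equals the number of types that exist), and Query 2 uses a subquery correlated on the location.

```python
import sqlite3

# Assumed schema, reconstructed from the column names used in the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Adventure (AdvId INTEGER, AdvType TEXT);
CREATE TABLE Location (LocId INTEGER, LocName TEXT, Rating INTEGER, AdvId INTEGER);
CREATE TABLE Tourist (TourId INTEGER, TourName TEXT, Email TEXT);
CREATE TABLE Booking (BookId INTEGER, TourId INTEGER, Loc INTEGER, BookingAmount REAL);
INSERT INTO Adventure VALUES (1, 'A'), (2, 'G'), (3, 'W');
INSERT INTO Location VALUES (100, 'Alps', 5, 1), (200, 'Gorge', 4, 2), (300, 'Waves', 3, 3);
INSERT INTO Tourist VALUES (1, 'Ann', 'ann@example.com'), (2, 'Bob', 'bob@example.com');
INSERT INTO Booking VALUES (1, 1, 100, 500), (2, 1, 200, 300),
                           (3, 1, 300, 200), (4, 2, 100, 100);
""")

# Query 1: tourists whose bookings cover every adventure type.
all_types = conn.execute("""
SELECT T.TourId, T.TourName, T.Email
FROM Tourist T
JOIN Booking B ON B.TourId = T.TourId
JOIN Location L ON L.LocId = B.Loc
JOIN Adventure A ON A.AdvId = L.AdvId
GROUP BY T.TourId, T.TourName, T.Email
HAVING COUNT(DISTINCT A.AdvType) = (SELECT COUNT(DISTINCT AdvType) FROM Adventure)
""").fetchall()

# Query 2: locations with a booking above that location's own average.
above_avg = conn.execute("""
SELECT DISTINCT L.LocId, L.LocName, L.Rating
FROM Booking B
JOIN Location L ON L.LocId = B.Loc
WHERE B.BookingAmount > (SELECT AVG(B2.BookingAmount)
                         FROM Booking B2
                         WHERE B2.Loc = B.Loc)
""").fetchall()

print(all_types)
print(above_avg)
```

Ann has booked types A, G and W, so only she is returned by Query 1. For Query 2, location 100 has bookings of 500 and 100 (average 300), so the 500 booking qualifies it; the other locations have a single booking each, which can never exceed its own average.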

SQL many to many select people with multiple vacancies

I am working with SQL Server through SSMS right now. How can I select all people with multiple (>2) vacancies?
I am trying something like this, but I don't understand how to write the "more than 2 vacancies" part.
SELECT dbo.applicants.FirstName, dbo.vacancy.Name
FROM dbo.applicants INNER JOIN
dbo.VacancyApplicant ON dbo.applicants.id = dbo.VacancyApplicant.ApplicantId INNER JOIN
dbo.vacancy ON dbo.VacancyApplicant.VacancyId = dbo.vacancy.id WHERE dbo.vacancy.Name='third vacancy'
SELECT A.FirstName, VC.Name
FROM dbo.applicants A INNER JOIN
dbo.VacancyApplicant V ON A.id = V.ApplicantId INNER JOIN
dbo.vacancy VC ON V.VacancyId = VC.id
WHERE EXISTS(
SELECT 1
FROM dbo.VacancyApplicant VA
WHERE VA.ApplicantId = A.id
GROUP BY VA.ApplicantId
HAVING COUNT(1)>2
)
GROUP BY and HAVING are your basic answer. Below is a simple solution; it might not be ideal, but it should give you the idea.
I find the target applicant ids in a subquery that uses GROUP BY and HAVING; the outer query then joins to it to output the FirstName and LastName of each applicant.
SELECT a.FirstName, a.LastName FROM
dbo.applicants a INNER JOIN
(
SELECT dbo.applicants.id
FROM dbo.applicants INNER JOIN
dbo.VacancyApplicant ON dbo.applicants.id = dbo.VacancyApplicant.ApplicantId INNER JOIN
dbo.vacancy ON dbo.VacancyApplicant.VacancyId = dbo.vacancy.id AND dbo.vacancy.Name='third vacancy'
GROUP BY dbo.applicants.id
HAVING COUNT(dbo.vacancy.id) > 2
) targetIds ON a.id = targetIds.id
"more than 2 vacancies"?
Your question only mentions vacancies but your query is filtering for a particular name. I assume you really want more than two of that name.
If I understand correctly, you want aggregation:
SELECT a.FirstName, v.Name
FROM dbo.applicants a INNER JOIN
dbo.VacancyApplicant va
ON a.id = va.ApplicantId INNER JOIN
dbo.vacancy v
ON va.VacancyId = v.id
WHERE v.Name = 'third vacancy'
GROUP BY a.FirstName, v.Name
HAVING COUNT(*) > 2;
Note the use of table aliases. They make the query easier to write and to read.
WITH TempCTE AS (
SELECT DISTINCT ap.FirstName
,vc.Name
,COUNT (va.VacancyId) OVER (PARTITION BY ap.id) AS NoOfVacancies
FROM dbo.applicants ap
JOIN dbo.VacancyApplicant va
ON ap.id = va.ApplicantId
JOIN dbo.vacancy vc
ON va.VacancyId = vc.id
)
SELECT FirstName,[Name], NoOfVacancies FROM TempCTE
WHERE NoOfVacancies > 2
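For the plain reading of the question (applicants attached to more than two vacancies, whatever their names), the GROUP BY and HAVING pattern can be sketched as follows. This runs on SQLite from Python rather than SQL Server, so the dbo. prefix is dropped, and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical many-to-many data: Ann applied to 3 vacancies, Bob to 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applicants (id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE vacancy (id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE VacancyApplicant (ApplicantId INTEGER, VacancyId INTEGER);
INSERT INTO applicants VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO vacancy VALUES (1, 'first vacancy'), (2, 'second vacancy'), (3, 'third vacancy');
INSERT INTO VacancyApplicant VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 3);
""")

# Group the link table per applicant and keep groups with more than 2 rows.
busy_applicants = conn.execute("""
SELECT a.FirstName, COUNT(*) AS NoOfVacancies
FROM applicants a
JOIN VacancyApplicant va ON va.ApplicantId = a.id
GROUP BY a.id, a.FirstName
HAVING COUNT(*) > 2
""").fetchall()

print(busy_applicants)
```

Grouping by a.id as well as a.FirstName keeps two different applicants who share a first name from being merged into one group.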

How to create distinct count from queries with several tables

I am trying to create one single query that will give me a distinct count for both the ActivityID and the CommentID. My query in MS Access looks like this:
SELECT
tbl_Category.Category, Count(tbl_Activity.ActivityID) AS CountOfActivityID,
Count(tbl_Comments.CommentID) AS CountOfCommentID
FROM tbl_Category LEFT JOIN
(tbl_Activity LEFT JOIN tbl_Comments ON
tbl_Activity.ActivityID = tbl_Comments.ActivityID) ON
tbl_Category.CategoryID = tbl_Activity.CategoryID
WHERE
(((tbl_Activity.UnitID)=5) AND ((tbl_Comments.PeriodID)=1))
GROUP BY
tbl_Category.Category;
I know the answer must somehow include SELECT DISTINCT but am not able to get it to work. Do I need to create multiple subqueries?
This is really painful in MS Access. I think the following does what you want to do:
SELECT ac.Category, ac.num_activities, aco.num_comments
FROM (SELECT ca.Category, COUNT(*) as num_activities
FROM (SELECT DISTINCT c.Category, a.ActivityID
FROM (tbl_Category as c INNER JOIN
tbl_Activity as a
ON c.CategoryID = a.CategoryID
) INNER JOIN
tbl_Comments as co
ON a.ActivityID = co.ActivityID
WHERE a.UnitID = 5 AND co.PeriodID = 1
) as ca
GROUP BY ca.Category
) as ac LEFT JOIN
(SELECT cc.Category, COUNT(*) as num_comments
FROM (SELECT DISTINCT c.Category, co.CommentID
FROM (tbl_Category as c INNER JOIN
tbl_Activity as a
ON c.CategoryID = a.CategoryID
) INNER JOIN
tbl_Comments as co
ON a.ActivityID = co.ActivityID
WHERE a.UnitID = 5 AND co.PeriodID = 1
) as cc
GROUP BY cc.Category
) as aco
ON aco.Category = ac.Category;
Note that your LEFT JOINs are superfluous because the WHERE clause turns them into INNER JOINs. This adjusts the logic for that purpose. The filtering is also very tricky, because it uses both tables, requiring that both subqueries have both JOINs.
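You can use DISTINCT inside the aggregate. Note that Access SQL does not support COUNT(DISTINCT ...), so this form only works on engines that do (SQL Server, for example):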
SELECT
tbl_Category.Category, Count(DISTINCT tbl_Activity.ActivityID) AS CountOfActivityID,
Count(DISTINCT tbl_Comments.CommentID) AS CountOfCommentID
FROM tbl_Category LEFT JOIN
(tbl_Activity LEFT JOIN tbl_Comments ON
tbl_Activity.ActivityID = tbl_Comments.ActivityID) ON
tbl_Category.CategoryID = tbl_Activity.CategoryID
WHERE
(((tbl_Activity.UnitID)=5) AND ((tbl_Comments.PeriodID)=1))
GROUP BY
tbl_Category.Category;
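The reason the plain COUNT over-counts, and how a DISTINCT subquery per measure fixes it without COUNT(DISTINCT ...), can be seen on a tiny data set. This is a sketch on SQLite from Python, not Access, and the rows are invented; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical data: one category, two activities, three comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_Category (CategoryID INTEGER, Category TEXT);
CREATE TABLE tbl_Activity (ActivityID INTEGER, CategoryID INTEGER, UnitID INTEGER);
CREATE TABLE tbl_Comments (CommentID INTEGER, ActivityID INTEGER, PeriodID INTEGER);
INSERT INTO tbl_Category VALUES (1, 'Safety');
INSERT INTO tbl_Activity VALUES (1, 1, 5), (2, 1, 5);
INSERT INTO tbl_Comments VALUES (1, 1, 1), (2, 1, 1), (3, 2, 1);
""")

# Shared FROM/WHERE part, reused by every query below.
joined = """
FROM tbl_Category c
JOIN tbl_Activity a ON a.CategoryID = c.CategoryID
JOIN tbl_Comments co ON co.ActivityID = a.ActivityID
WHERE a.UnitID = 5 AND co.PeriodID = 1
"""

# Plain COUNT counts joined rows, so activity 1 is counted once per comment.
naive = conn.execute(
    f"SELECT c.Category, COUNT(a.ActivityID), COUNT(co.CommentID) {joined} GROUP BY c.Category"
).fetchall()

# De-duplicate each id in its own subquery first, then count and join on Category.
distinct_counts = conn.execute(f"""
SELECT ac.Category, ac.num_activities, cc.num_comments
FROM (SELECT x.Category, COUNT(*) AS num_activities
      FROM (SELECT DISTINCT c.Category, a.ActivityID {joined}) AS x
      GROUP BY x.Category) AS ac
JOIN (SELECT y.Category, COUNT(*) AS num_comments
      FROM (SELECT DISTINCT c.Category, co.CommentID {joined}) AS y
      GROUP BY y.Category) AS cc
  ON cc.Category = ac.Category
""").fetchall()

print(naive)
print(distinct_counts)
```

The naive query reports 3 activities because activity 1 appears on two joined rows; the subquery version reports the 2 activities and 3 comments that actually exist.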

Access Subquery On multiple conditions

This SQL query needs to be done in ACCESS.
I am trying to do a subquery on the total sales, but I want to link the sale to the province AND to the product. The query below works with one condition or the other, (po.product_name = allp.all_products) or (p.province = allp.all_province), but it will not take both.
I will be adding every month to this query once I can figure out the subquery with two criteria.
Select
p.province as [Province],
po.product_name as [Product],
all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id)
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province
)
as allp
on (po.product_name = allp.all_products) AND (p.province = allp.all_province);
Make the first select sql into a table by giving it an alias and join table 1 to table 2. I don't have your table structure or data to test it but I think this will lead you down the right path:
select table1.*, table2.*
from
(Select
p.province as [Province],
po.product_name as [Product]
--removed this ,all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id)) as table1
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province --check your group by, I dont think you want pp1.price here if you want to aggregate
) as table2 --changed from allp
on (table1.product = table2.all_products) AND (table1.province = table2.all_province);

List all donations made, by both individual alumni and business donors. Name, ID of the donor, date and amount of the donation must be displayed

Need help with this problem. I have the following tables, but I cannot seem to get any data out with the query below.
Corporate(CorporateID(PK), CorporateName, CorporateAddress)
Donation(DonationID(PK), TypeOfDonations)
Alumnus(AlumnusID(PK), CityPK(FK), AlumnusName, EmailAddress, WorkPhoneNumber, HomePhoneNumber, Address)
Donation_Made(CorporateDonationID(PK), DonationID(FK), CorporateID(FK), AlumnusID(FK), DonationAmount, DateOfDonation)
SELECT Z.DONATIONID, A.ALUMNUSNAME, C.CORPORATENAME, Z.DATEOFDONATION, Z.DONATIONAMOUNT
FROM ALUMNUS A,
(SELECT * FROM DONATION D LEFT JOIN DONATION_MADE DM
ON D.DONATIONID = DM.DONATIONID)Z LEFT JOIN CORPORATE C
ON C.CORPORATEID = DM.CORPORATEID AND A.ALUMNUSID=DM.ALUMNUSID AND Z.TYPEOFDONATIONS= 'MONETARY';
You should start with the fact table, i.e. DONATION_MADE, and then (outer) join the related tables:
SELECT DM.DONATIONID,
A.ALUMNUSNAME,
C.CORPORATENAME,
DM.DATEOFDONATION,
DM.DONATIONAMOUNT
FROM DONATION_MADE DM
INNER JOIN DONATION D
ON D.DONATIONID = DM.DONATIONID
LEFT JOIN ALUMNUS A
ON A.ALUMNUSID = DM.ALUMNUSID
LEFT JOIN CORPORATE C
ON C.CORPORATEID = DM.CORPORATEID
WHERE D.TYPEOFDONATIONS = 'MONETARY';
You need to use outer join to get data from both Corporate and Alumnus tables.
select a.AlumnusID, a.AlumnusName, c.CorporateID, c.CorporateName, dm.DateOfDonation, dm.DonationAmount
from Donation_Made dm
left outer join Alumnus a on dm.AlumnusID = a.AlumnusID
left outer join Corporate c on dm.CorporateID = c.CorporateID
inner join Donation d on dm.DonationID = d.DonationID
where d.TypeOfDonations='MONETARY';
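Starting from Donation_Made and outer joining both donor tables can be tried on sample data. The sketch below uses SQLite from Python with invented rows and trimmed-down tables. The COALESCE calls that fold the two donor columns into a single name and id are an addition for illustration, not part of either answer, and they assume each donation row has exactly one of AlumnusID and CorporateID filled in.

```python
import sqlite3

# Hypothetical rows; only the columns needed for the query are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Corporate (CorporateID INTEGER, CorporateName TEXT);
CREATE TABLE Alumnus (AlumnusID INTEGER, AlumnusName TEXT);
CREATE TABLE Donation (DonationID INTEGER, TypeOfDonations TEXT);
CREATE TABLE Donation_Made (CorporateDonationID INTEGER, DonationID INTEGER,
                            CorporateID INTEGER, AlumnusID INTEGER,
                            DonationAmount REAL, DateOfDonation TEXT);
INSERT INTO Corporate VALUES (1, 'Acme');
INSERT INTO Alumnus VALUES (1, 'Alice');
INSERT INTO Donation VALUES (1, 'MONETARY'), (2, 'EQUIPMENT');
INSERT INTO Donation_Made VALUES
    (1, 1, NULL, 1, 100, '2020-01-01'),
    (2, 1, 1, NULL, 500, '2020-02-01'),
    (3, 2, 1, NULL, 0,   '2020-03-01');
""")

# Every monetary donation survives the LEFT JOINs, whichever donor type made it.
donations = conn.execute("""
SELECT COALESCE(a.AlumnusName, c.CorporateName) AS DonorName,
       COALESCE(dm.AlumnusID, dm.CorporateID)   AS DonorID,
       dm.DateOfDonation,
       dm.DonationAmount
FROM Donation_Made dm
INNER JOIN Donation d ON d.DonationID = dm.DonationID
LEFT JOIN Alumnus a ON a.AlumnusID = dm.AlumnusID
LEFT JOIN Corporate c ON c.CorporateID = dm.CorporateID
WHERE d.TypeOfDonations = 'MONETARY'
ORDER BY dm.DateOfDonation
""").fetchall()

for row in donations:
    print(row)
```

With inner joins to both Alumnus and Corporate, no row would come back here, because no single donation has both ids set; the outer joins are what keep the alumni and corporate donations in the same result.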