SQL Query - max(count()) - possible without CTE? - sql

I'm studying for my Database Systems exam at the moment, having some trouble with an exercise creating a query.
I have four tables:
A referent-table with personal data of referents,
A course-table with course data (with the responsible referent as foreign key),
A workshop-table with workshop data (with the corresponding course as foreign key),
A booking-table which manages bookings (with the corresponding workshop which has been booked as a foreign key)
My exercise is to find out how much money a referent earns (there's a price-column in workshop)
It's not very difficult to list how much money he earns per workshop; I created this query to show me:
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid
This returns this:
2;"Anna";0.60
4;"Ahmed";3.5
1;"Hans";
2;"Anna";13.20
3;"Wolfgang";
As you can see, it works fine.
I now have two rows for Anna (because she is responsible for two courses) and want a single row with the sum of both rows.
Unfortunately, the only way I have found to do this is with a Common Table Expression (CTE). With a CTE it works:
WITH incomepercourse AS (
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid
)
SELECT referentid, name, SUM(income) FROM incomepercourse GROUP BY referentid, name
this returns:
3;"Wolfgang";
4;"Ahmed";3.50
2;"Anna";13.80
1;"Hans";
Is there any way to avoid a CTE?
My professor didn't talk about CTE, and it also isn't in his lecture notes - so there has to be some other, simpler way.
Is there anyone out there who knows a better way to achieve this?
Thanks in Advance!

You can move the body of the CTE into a derived table (a subquery in the FROM clause) and aggregate over it, as shown below.
SELECT tbl1.referentid, tbl1.name, SUM(tbl1.income) FROM (
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid) tbl1
GROUP BY tbl1.referentid, tbl1.name
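To check that the derived-table form really is equivalent to the CTE, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, only the referent and g22_courses tables are modelled, and the per-course query is simplified to return the course price directly, so treat it as an illustration rather than the exercise's actual data.

```python
import sqlite3

# Hypothetical sample data; only the two tables needed for the sum are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE referent (referentid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE g22_courses (id INTEGER PRIMARY KEY,
                          responsiblerefid INTEGER, price REAL);
INSERT INTO referent VALUES (1, 'Hans'), (2, 'Anna'), (3, 'Wolfgang'), (4, 'Ahmed');
INSERT INTO g22_courses VALUES (10, 2, 0.5), (11, 2, 13.25), (12, 4, 3.5);
""")

# Simplified per-course query: one row per referent and course.
per_course = """
SELECT r.referentid, r.name, c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON c.responsiblerefid = r.referentid
"""

# CTE form, as in the question.
cte_rows = conn.execute(f"""
WITH incomepercourse AS ({per_course})
SELECT referentid, name, SUM(income)
FROM incomepercourse
GROUP BY referentid, name
ORDER BY referentid
""").fetchall()

# Derived-table form: the same subquery moved into the FROM clause.
derived_rows = conn.execute(f"""
SELECT t.referentid, t.name, SUM(t.income)
FROM ({per_course}) AS t
GROUP BY t.referentid, t.name
ORDER BY t.referentid
""").fetchall()

print(derived_rows)
# [(1, 'Hans', None), (2, 'Anna', 13.75), (3, 'Wolfgang', None), (4, 'Ahmed', 3.5)]
```

Both forms return the same rows; referents without courses get NULL (None in Python) because SUM over no non-NULL values is NULL.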

You do not need to join the workshop and booking tables if you do not use them.
If the price is in the course table, as in your first query, I think this query gives what you are asking for:
SELECT referent.referentid, referent.name, sum(price)
FROM referent LEFT JOIN g22_courses ON g22_courses.responsiblerefid=referent.referentid
GROUP BY referent.referentid, referent.name
If the price is in the workshop table, just add a join to this table.

Okay, I solved it.
It was a problem with my understanding of the problem after all... ._.
I have now managed to write a query like the one a_horse_with_no_name suggested in his comment:
SELECT r.referentid,
r.name,
SUM(c.price) AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid
ORDER BY r.referentid
This solves my problem perfectly, returning the right values.
Thank you!

Related

Trying to write a query for the most popular product (meaning the product with the most distinct customers) without using a view

This is the schema:
Customer(CID,Name,City,State)
Order(OID,CID,Date)
Product(PID,ProductName,Price)
LineItem(LID,OID,PID,Number,TotalPrice)
with mostCustomers (PID, customers) as
(
select
li.PID, count(distinct o.CID)
from LineItem li
-- Order is a reserved word, so the table name has to be quoted
inner join "Order" o on o.OID = li.OID
group by li.PID
),
maxCustomers (customers) as (select max(customers) from mostCustomers)
select p.ProductName
from mostCustomers mc
inner join maxCustomers mx on mx.customers = mc.customers
inner join Product p on p.PID = mc.PID;
The query above should work on most databases, but not all. You should give sample data and expected output, and tag your question with the database you are using.
Please do that next time.
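If CTEs are off the table as well as views, the same idea can be written with a HAVING clause and a nested subquery. The sketch below runs it on SQLite through Python's sqlite3 module; the sample rows are invented and the tables only carry the columns the query needs, so it is an illustration under those assumptions.

```python
import sqlite3

# Hypothetical sample data with only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (PID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE "Order" (OID INTEGER PRIMARY KEY, CID INTEGER);
CREATE TABLE LineItem (LID INTEGER PRIMARY KEY, OID INTEGER, PID INTEGER);
INSERT INTO Product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO "Order" VALUES (1, 1), (2, 2), (3, 1);
INSERT INTO LineItem VALUES (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 3, 2);
""")

# Keep the products whose distinct-customer count equals the maximum
# distinct-customer count over all products.
most_popular = conn.execute("""
SELECT p.ProductName
FROM LineItem li
JOIN "Order" o ON o.OID = li.OID
JOIN Product p ON p.PID = li.PID
GROUP BY p.PID, p.ProductName
HAVING COUNT(DISTINCT o.CID) = (
    SELECT MAX(customers)
    FROM (SELECT COUNT(DISTINCT o2.CID) AS customers
          FROM LineItem li2
          JOIN "Order" o2 ON o2.OID = li2.OID
          GROUP BY li2.PID) AS per_product
)
""").fetchall()

print(most_popular)  # [('Widget',)]
```

Widget was bought by customers 1 and 2, Gadget only by customer 1 (twice), so only Widget is returned. Ties would return every product that shares the maximum.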

SQL - How do I select rows from one table depending on data from two other tables?

I have an SQL question. It's a simple problem, but I'm not an SQL guy at all.
Here is the situation, I have three tables:
CUSTOMER(
PK(customer_id)
)
LOAN(
PK(loan_id),
customer_id,
behavior_id
)
BEHAVIOR(
PK(behavior_id),
unpaid_number
)
// PK(x): x is a primary key.
I would like to select all of the CUSTOMERs who have an unpaid_number >= 1.
Can anybody show me how to do this?
Thanks
You are looking for INNER JOIN. Use it like this:
SELECT * FROM CUSTOMER c
INNER JOIN LOAN l ON c.customer_id = l.customer_id
INNER JOIN BEHAVIOR b ON b.behavior_id = l.behavior_id
WHERE b.unpaid_number>=1
Use inner join
SELECT c.* FROM CUSTOMER c INNER JOIN LOAN l ON l.customer_id = c.Customer_id INNER JOIN BEHAVIOR b ON b.behavior_id = l.behavior_id WHERE unpaid_number >=1
Actually, if you want all customers, you presumably want one row per customer, regardless of the number of matching rows in behavior.
That would suggest using exists or in:
select c.*
from customer c
where exists (select 1
from loan l join
behavior b
on b.behavior_id = l.behavior_id
where b.unpaid_number >= 1 and
l.customer_id = c.customer_id
);
This is particularly important if you are considering using select distinct.
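To make the difference concrete, here is a small sketch run on SQLite through Python's sqlite3 module. The rows are invented, and it assumes the LOAN foreign key column is named customer_id. Customer 1 has two loans with unpaid_number >= 1, so the plain join returns that customer twice while EXISTS returns it once.

```python
import sqlite3

# Hypothetical sample data; LOAN.customer_id is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (customer_id INTEGER PRIMARY KEY);
CREATE TABLE BEHAVIOR (behavior_id INTEGER PRIMARY KEY, unpaid_number INTEGER);
CREATE TABLE LOAN (loan_id INTEGER PRIMARY KEY,
                   customer_id INTEGER, behavior_id INTEGER);
INSERT INTO CUSTOMER VALUES (1), (2), (3);
INSERT INTO BEHAVIOR VALUES (1, 2), (2, 1), (3, 0);
INSERT INTO LOAN VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3);
""")

# Plain inner joins: one output row per matching loan.
join_rows = conn.execute("""
SELECT c.customer_id
FROM CUSTOMER c
JOIN LOAN l ON l.customer_id = c.customer_id
JOIN BEHAVIOR b ON b.behavior_id = l.behavior_id
WHERE b.unpaid_number >= 1
ORDER BY c.customer_id
""").fetchall()

# EXISTS: one output row per customer, however many loans match.
exists_rows = conn.execute("""
SELECT c.customer_id
FROM CUSTOMER c
WHERE EXISTS (SELECT 1
              FROM LOAN l
              JOIN BEHAVIOR b ON b.behavior_id = l.behavior_id
              WHERE b.unpaid_number >= 1
                AND l.customer_id = c.customer_id)
ORDER BY c.customer_id
""").fetchall()

print(join_rows)    # [(1,), (1,)]
print(exists_rows)  # [(1,)]
```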
Please try the code below:
SELECT c.*
FROM CUSTOMER c
INNER JOIN LOAN l
ON l.customer_id = c.Customer_id
INNER JOIN BEHAVIOR b
ON b.behavior_id = l.behavior_id
WHERE unpaid_number >=1
Try this:
SELECT LOAN.customer_id FROM LOAN
WHERE LOAN.behavior_id IN
(SELECT BEHAVIOR.behavior_id
from BEHAVIOR where BEHAVIOR.unpaid_number>=1)

SQL Get aggregate as 0 for non existing row using inner joins

I am using SQL Server to query these three tables that look like (there are some extra columns but not that relevant):
Customers -> Id, Name
Addresses -> Id, Street, StreetNo, CustomerId
Sales -> AddressId, Week, Total
And I would like to get the total sales per week and customer (showing at the same time the address details). I have come up with this query
SELECT a.Name, b.Street, b.StreetNo, c.Week, SUM (c.Total) as Total
FROM Customers a
INNER JOIN Addresses b ON a.Id = b.CustomerId
INNER JOIN Sales c ON b.Id = c.AddressId
GROUP BY a.Name, c.Week, b.Street, b.StreetNo
and even though my SQL skills are close to none, it looks like it's doing its job. But now I would like to show 0 whenever a customer doesn't have sales for a particular week (weeks are just integers). I wonder if I should somehow get the distinct values of the weeks in the Sales table and then loop through them (not sure how).
Any help?
Thanks
Use CROSS JOIN to generate the rows for all customers and weeks. Then use LEFT JOIN to bring in the data that is available:
SELECT c.Name, a.Street, a.StreetNo, w.Week,
COALESCE(SUM(s.Total), 0) as Total
FROM Customers c CROSS JOIN
(SELECT DISTINCT s.Week FROM Sales s) w LEFT JOIN
Addresses a
ON c.Id = a.CustomerId LEFT JOIN
Sales s
ON s.Week = w.Week AND s.AddressId = a.Id
GROUP BY c.Name, a.Street, a.StreetNo, w.Week;
Using table aliases is good, but the aliases should be abbreviations for the table names. So, a for Addresses not Customers.
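Here is a minimal sketch of the CROSS JOIN plus LEFT JOIN pattern on invented data. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and it assumes the key columns are Customers.Id and Addresses.Id as listed in the question. Bob has no sales in week 2, and that combination comes back with a total of 0.

```python
import sqlite3

# Hypothetical sample data following the column lists in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Addresses (Id INTEGER PRIMARY KEY, Street TEXT,
                        StreetNo INTEGER, CustomerId INTEGER);
CREATE TABLE Sales (AddressId INTEGER, Week INTEGER, Total INTEGER);
INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Addresses VALUES (1, 'Main', 1, 1), (2, 'Oak', 2, 2);
INSERT INTO Sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 7);
""")

# CROSS JOIN builds every customer/week pair; the LEFT JOINs attach the
# sales that exist, and COALESCE turns the missing sums into 0.
rows = conn.execute("""
SELECT c.Name, a.Street, a.StreetNo, w.Week,
       COALESCE(SUM(s.Total), 0) AS Total
FROM Customers c
CROSS JOIN (SELECT DISTINCT Week FROM Sales) w
LEFT JOIN Addresses a ON a.CustomerId = c.Id
LEFT JOIN Sales s ON s.Week = w.Week AND s.AddressId = a.Id
GROUP BY c.Name, a.Street, a.StreetNo, w.Week
ORDER BY c.Name, w.Week
""").fetchall()

for row in rows:
    print(row)
# ('Ann', 'Main', 1, 1, 10)
# ('Ann', 'Main', 1, 2, 5)
# ('Bob', 'Oak', 2, 1, 7)
# ('Bob', 'Oak', 2, 2, 0)
```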
You should generate the week numbers rather than using DISTINCT. This is better in terms of performance and reliability. Then use a LEFT JOIN on the Sales table instead of an INNER JOIN:
SELECT a.Name
,b.Street
,b.StreetNo
,weeks.[Week]
,COALESCE(SUM(c.Total),0) as Total
FROM Customers a
INNER JOIN Addresses b ON a.Id = b.CustomerId
CROSS JOIN (
-- Generate a sequence of 52 integers (13 x 4)
SELECT ROW_NUMBER() OVER (ORDER BY a.x) AS [Week]
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x)
CROSS JOIN (SELECT x FROM (VALUES(1),(1),(1),(1)) b(x)) b
) weeks
LEFT JOIN Sales c ON b.Id = c.AddressId AND c.[Week] = weeks.[Week]
GROUP BY a.Name
,b.Street
,b.StreetNo
,weeks.[Week]
Please try the following...
SELECT Name,
Street,
StreetNo,
Week,
SUM( CASE
WHEN Total IS NULL THEN
0
ELSE
Total
END ) AS Total
FROM Customers a
JOIN Addresses b ON a.Id = b.CustomerId
RIGHT JOIN Sales c ON b.Id = c.AddressId
GROUP BY a.Name,
c.Week,
b.Street,
b.StreetNo;
I have modified your statement in three places. The first is I changed your join to Sales to a RIGHT JOIN. This will join as it would with an INNER JOIN, but it will also keep the records from the table on the right side of the JOIN that do not have a matching record or group of records on the left, placing NULL values in the resulting dataset's fields that would have come from the left of the JOIN. A LEFT JOIN works in the same way, but with any extra records in the table on the left being retained.
I have removed the word INNER from your surviving INNER JOIN. Where JOIN is not preceded by a join type, an INNER JOIN is performed. Both JOIN and INNER JOIN are correct, and the prevailing convention seems to be to leave INNER out where the RDBMS allows it (which SQL Server does). Which you go with is entirely up to you; I have left it out here for illustrative purposes.
The third change is a CASE expression that checks whether Total is NULL, which it will be if no matching row was joined for that Customer and Week. SUM() over only NULL values returns NULL, so the CASE expression substitutes 0 for each NULL before summing. Non-NULL values of Total are summed as usual for that grouping.
Please note that I am assuming that Total will not have any NULL values other than from the RIGHT JOIN. Please advise me if this assumption is incorrect.
Please also note that I have assumed that either there will be no missing Weeks for a Customer in the Sales table or that you are not interested in listing them if there are. Again, please advise me if this assumption is incorrect.
If you have any questions or comments, then please feel free to post a Comment accordingly.

Get some data which corresponds to the maximum date

I have these 3 tables:
Table ORG:
Fields:historyid, personid
Table PERSON:
Fields: id
Table HISTORY:
Fields: id,date,personid
Both HISTORY and ORG are linked to PERSON with a 1:N relationship. Also, ORG is linked to HISTORY with a 1:N relationship. For each person, I want to get just one row from table ORG: the one that corresponds to the HISTORY row with the highest date. The following SQL gives the highest date for a certain person. However, I do not know how to combine it with the above requirement.
SELECT ash1.id
FROM
(SELECT * FROM history a WHERE a.personid=person.id) ash1
LEFT JOIN
(SELECT * FROM history b WHERE b.personid=person.id) ash2
ON ash1.personid=ash2.personid
AND ash1.date < ash2.date
WHERE ash2.date IS NULL
I think you can do it by using MAX() and GROUP BY:
SELECT
o.historyid AS o_hist,
o.personid AS o_per,
h.id AS h_id,
MAX(h.date) AS h_date,
h.personid AS h_person
FROM
org o
LEFT JOIN
person p ON p.id = o.personid
LEFT JOIN
history h ON h.id = o.historyid AND h.personid = p.id
GROUP BY o_per
Try the below query..
;WITH CTE_HighestHistory
AS (SELECT PersonID,MAX([Date]) HDate
FROM History
GROUP BY PersonID)
SELECT o.*,h.*
FROM org o
LEFT JOIN History h ON o.Historyid=h.Id and o.PersonID=h.PersonId
INNER JOIN CTE_HighestHistory ch ON h.Personid=ch.Personid and h.[Date]=ch.HDate
WHERE EXISTS(SELECT 1 FROM Person p WHERE p.Id=o.PersonID )
There are multiple ways to approach this, depending on the database. However, your data structure is awkward. Why does org have historyid? That doesn't really make sense to me.
In any case, based on your description, this should work:
select o.*, h.*
from org o join
history h
on h.personid = o.personid
where h.date = (select max(h2.date)
from history h2
where h2.personid = h.personid
);
You might want to start the from clause as:
from (select distinct personid from org) o
That way, each person appears only once, even if they are repeated in the table.
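A quick way to try the correlated-subquery approach is the sketch below, run on SQLite through Python's sqlite3 module. The rows are invented and dates are stored as ISO text, which is an assumption about the data. Person 1 appears twice in org, so the variant with select distinct in the from clause is used.

```python
import sqlite3

# Hypothetical sample data; dates are ISO strings so MAX() orders them correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY);
CREATE TABLE history (id INTEGER PRIMARY KEY, date TEXT, personid INTEGER);
CREATE TABLE org (historyid INTEGER, personid INTEGER);
INSERT INTO person VALUES (1), (2);
INSERT INTO history VALUES (1, '2020-01-01', 1),
                           (2, '2021-06-01', 1),
                           (3, '2019-03-01', 2);
INSERT INTO org VALUES (1, 1), (2, 1), (3, 2);
""")

# For each person in org, keep only the history row with the latest date.
latest = conn.execute("""
SELECT o.personid, h.id, h.date
FROM (SELECT DISTINCT personid FROM org) o
JOIN history h ON h.personid = o.personid
WHERE h.date = (SELECT MAX(h2.date)
                FROM history h2
                WHERE h2.personid = h.personid)
ORDER BY o.personid
""").fetchall()

print(latest)  # [(1, 2, '2021-06-01'), (2, 3, '2019-03-01')]
```

If two history rows for the same person share the maximum date, both are returned, so a tie-breaker would be needed in that case.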

Multiple COUNT in 1 SQLITE Query

Using SQLite.
SELECT c.*,
COUNT(m.course_id) AS "meeting_count",
COUNT(r.meeting_id) AS "race_count"
FROM course c
LEFT JOIN meeting m ON m.course_id = c.id
LEFT JOIN race r ON r.meeting_id = m.id
GROUP BY c.id
Course has meetings has races.
Trying to select the correct count for course meetings and course races. The problem is the above query is returning the same count for "meeting_count" as "race_count". What am I doing wrong?
Try adding DISTINCT, but count the meeting's own key rather than course_id (which is the same for every row in the group): COUNT(DISTINCT m.id)
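The effect is easy to reproduce with Python's sqlite3 module. The sketch below uses invented rows and assumes that meeting and race each have an id primary key. Joining races repeats each meeting once per race, so a plain COUNT on the meeting side counts joined rows, while COUNT(DISTINCT m.id) counts meetings.

```python
import sqlite3

# Hypothetical sample data: course 1 has two meetings with 3 and 1 races,
# course 2 has no meetings at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE meeting (id INTEGER PRIMARY KEY, course_id INTEGER);
CREATE TABLE race (id INTEGER PRIMARY KEY, meeting_id INTEGER);
INSERT INTO course VALUES (1, 'Ascot'), (2, 'Epsom');
INSERT INTO meeting VALUES (1, 1), (2, 1);
INSERT INTO race VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

joins = """
FROM course c
LEFT JOIN meeting m ON m.course_id = c.id
LEFT JOIN race r ON r.meeting_id = m.id
GROUP BY c.id
ORDER BY c.id
"""

# Both counts see one row per race, so they come out equal.
naive = conn.execute(
    "SELECT c.id, COUNT(m.course_id), COUNT(r.meeting_id) " + joins
).fetchall()

# Counting distinct meeting ids collapses the repeated meeting rows.
fixed = conn.execute(
    "SELECT c.id, COUNT(DISTINCT m.id), COUNT(r.id) " + joins
).fetchall()

print(naive)  # [(1, 4, 4), (2, 0, 0)]
print(fixed)  # [(1, 2, 4), (2, 0, 0)]
```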