Multiply total number of values in column by value in a different table - sql

I am trying to count all the values in one column and then multiply this number by a value in a different table. So far I have:
SELECT CLUB_FEE * COUNT(MEMBER_ID) AS VALUE
FROM CLUB, SUBSCRIPTION
WHERE CLUB_ID = 'CLUB1';
This is not working however, can anyone please help?
I also need help doing this for multiple clubs. Is it possible to do it all in one statement for all clubs and then get the average?

Presumably, you intend something like this:
SELECT MAX(c.CLUB_FEE) * COUNT(MEMBER_ID) AS VALUE
FROM CLUB c JOIN
SUBSCRIPTION s
ON c.CLUB_ID = s.CLUB_ID
WHERE c.CLUB_ID = 'CLUB1';
You can also write this as:
SELECT SUM(c.CLUB_FEE) AS VALUE
FROM CLUB c JOIN
SUBSCRIPTION s
ON c.CLUB_ID = s.CLUB_ID
WHERE c.CLUB_ID = 'CLUB1';
I thought the first version would be clearer, because the OP specifies COUNT() in the question.
If you want it for all clubs that have subscribers:
SELECT SUM(c.CLUB_FEE) AS VALUE
FROM CLUB c JOIN
SUBSCRIPTION s
ON c.CLUB_ID = s.CLUB_ID
GROUP BY c.CLUB_ID;

From inspecting the explain plans, it seems the following version may be a bit more efficient (since it avoids a join and uses only one aggregation). If you need this for ALL clubs at the same time, then probably all solutions will have the same "optimizer cost" (they will all do a join at some point).
select club_fee * (select count(member_id) from subscription where club_id = 'CLUB1')
from club
where club_id = 'CLUB1'
So now the only aggregate function is pushed into a subquery and the rest does not need either a join or another aggregate function.
Of course, this only matters if performance is important; it may very well not be.

Related

Query build to find records where all of a series of records have a value

Let me explain a little bit about what I am trying to do because I dont even know the vocab to use to ask. I have an Access 2016 database that records staff QA data. When a staff member misses a QA we assign a job aid that explains the process and they can optionally send back a worksheet showing they learned about what was missed. If they do all of these ina 3 month period they get a credit on their QA score. So I have a series of records all of whom have a date we assigned the work(RA1) and MAY have a work returned date(RC1).
In the below image "lavalleer" has earned the credit because both of her sheets got returned. "maduncn" Did not earn the credit because he didn't do one.
I want to create a query that returns to me only the people that are like "lavalleer". I tried hitting google and searched here and access.programmers.co.uk but I'm only coming up with instructions to use Not null statements. That wouldn't work for me because if I did a IS Not Null on "maduncn" I would get the 4 records but it would exclude the null.
What I need to do is build a query where I can see staff that have dates in ALL of their RC1 fields. If any of their RC1 fields are blank I dont want them to return.
Consider:
SELECT * FROM tablename WHERE NOT UserLogin IN (SELECT UserLogin FROM tablename WHERE RCI IS NULL);
You could use a not exists clause with a correlated subquery, e.g.
select t.* from YourTable t where not exists
(select 1 from YourTable u where t.userlogin = u.userlogin and u.rc1 is null)
Here, select 1 is used purely for optimisation - we don't care what the query returns, just that it has records (or doesn't have records).
Or, you could use a left join to exclude those users for which there is a null rc1 record, e.g.:
select t.* from YourTable t left join
(select u.userlogin from YourTable u where u.rc1 is null) v on t.userlogin = v.userlogin
where v.userlogin is null
In all of the above, change all occurrences of YourTable to the name of your table.

What's the most efficient way to exclude possible results from an SQL query?

I have a subscription database containing Customers, Subscriptions and Publications tables.
The Subscriptions table contains ALL subscription records and each record has three flags to mark the status: isActive, isExpire and isPending. These are Booleans and only one flag can be True - this is handled by the application.
I need to identify all customers who have not renewed any magazines to which they have previously subscribed and I'm not sure that I've written the most efficient SQL query. If I find a lapsed subscription I need to ignore it if they already have an active or pending subscription for that particular magazine.
Here's what I have:
SELECT DISTINCT Customers.id, Subscriptions.publicationName
FROM Subscriptions
LEFT JOIN Customers
ON Subscriptions.id_Customer = Customers.id
LEFT JOIN Publications
ON Subscriptions.id_Publication = Publications.id
WHERE Subscriptions.isExpired = 1
AND NOT EXISTS
( SELECT * FROM Subscriptions s2
WHERE s2.id_Publication = Subscriptions.id_Publication
AND s2.id_Customer = Subscriptions.id_Customer
AND s2.isPending = 1 )
AND NOT EXISTS
( SELECT * FROM Subscriptions s3
WHERE s3.id_Publication = Subscriptions.id_Publication
AND s3.id_Customer = Subscriptions.id_Customer
AND s3.isActive = 1 )
I have just over 50,000 subscription records and this query takes almost an hour to run which tells me that there's a lot of looping or something going on where for each record the SQL engine is having to search again to find any 'isPending' and 'isActive' records.
This is my first post so please be gentle if I've missed out any information in my question :) Thanks.
I don't have your complete database structure, so I can't test the following query but it may contain some optimization. I will leave it to you to test, but will explain why I have changed, what I have changed.
select Distinct Customers.id, Subscriptions.publicationName
from Subscriptions
join Customers on Subscriptions.id_Customer = Customer.id
join Publications
ON Subscriptions.id_Publication = Publications.id
Where Subscriptions.isExpired = 1
And Not Exists
(select * from Subscriptions s2
join Customers on s2.id_Customer = Customer.id
join Publications
ON s2.id_Publication = Publications.id
where s2.id_Customer = s2.id_customer and
(s2.isPending = 1 or s2.isActive = 1))
If you have no resulting data in Customer or Publications DB, then the Subscription information isn't useful, so I eliminated the LEFT join in favor of simply join. Combine the two Exists subqueries. These are pretty intensive if I recall so the fewer the better. Last thing which I did not list above but may be worth looking into is, can you run a subquery with specific data fields returned and use it in an Exists clause? The use of Select * will return all data fields which slows down processing. I'm not sure if you can limit your result unfortunately, because I don't have an equivalent DB available to me that I can test on (the google probably knows).
I suspect there are further optimizations that could be made on this query. Eliminating the Exists clause in favor of an 'IN' clause may help, but I can't think of a way right now, seeing how you've got to match two unique fields (customer id and the relevant subscription). Let me know if this helps at all.
With a table of 50k rows, you should be able to run a query like this in seconds.

Cannot find correct number of values in a table that are not in another table, though I can do otherwise

I want to retrieve the course_id in table course that is not in the table takes. Table takes only contains course_id of courses taken by students. The problem is that if I have:
select count (distinct course.course_id)
from course, takes
where course.course_id = (takes.course_id);
the result is 85 which is smaller than the total number of course_id in table course, which is 200. The result is correct.
But I want to find the number of course_id that are not in the table takes, and I have:
select count (distinct course.course_id)
from course, takes
where course.course_id != (takes.course_id);
The result is 200, which is equal the number of course_id in table course. What is wrong with my code?
This SQL will give you the count of course_id in table course that aren't in the table takes:
select count (*)
from course c
where not exists (select *
from takes t
where c.course_id = t.course_id);
You didn't specify your DBMS, however, this SQL is pretty standard so it should work in the popular DBMSs.
There are a few different ways to accomplish what you're looking for. My personal favorite is the LEFT JOIN condition. Let me walk you through it:
Fact One: You want to return a list of courses
Fact Two: You want to
filter that list to not include anything in the Takes table.
I'd go about this by first mentally selecting a list of courses:
SELECT c.Course_ID
FROM Course c
and then filtering out the ones I don't want. One way to do this is to use a LEFT JOIN to get all the rows from the first table, along with any that happen to match in the second table, and then filter out the rows that actually do match, like so:
SELECT c.Course_ID
FROM
Course c
LEFT JOIN -- note the syntax: 'comma joins' are a bad idea.
Takes t ON
c.Course_ID = t.Course_ID -- at this point, you have all courses
WHERE t.Course_ID IS NULL -- this phrase means that none of the matching records will be returned.
Another note: as mentioned above, comma joins should be avoided. Instead, use the syntax I demonstrated above (INNER JOIN or LEFT JOIN, followed by the table name and an ON condition).

SQL Counting and Joining

I'm taking a database course this semester, and we're learning SQL. I understand most simple queries, but I'm having some difficulty using the count aggregate function.
I'm supposed to relate an advertisement number to a property number to a branch number so that I can tally up the amount of advertisements by branch number and compute their cost. I set up what I think are two appropriate new views, but I'm clueless as to what to write for the select statement. Am I approaching this the correct way? I have a feeling I'm over complicating this bigtime...
with ad_prop(ad_no, property_no, overseen_by) as
(select a.ad_no, a.property_no, p.overseen_by
from advertisement as a, property as p
where a.property_no = p.property_no)
with prop_branch(property_no, overseen_by, allocated_to) as
(select p.property_no, p.overseen_by, s.allocated_to
from property as p, staff as s
where p.overseen_by = s.staff_no)
select distinct pb.allocated_to as branch_no, count( ??? ) * 100 as ad_cost
from prop_branch as pb, ad_prop as ap
where ap.property_no = pb.property_no
group by branch_no;
Any insight would be greatly appreciated!
You could simplify it like this:
advertisement
- ad_no
- property_no
property
- property_no
- overseen_by
staff
- staff_no
- allocated_to
SELECT s.allocated_to AS branch, COUNT(*) as num_ads, COUNT(*)*100 as ad_cost
FROM advertisement AS a
INNER JOIN property AS p ON a.property_no = p.property_no
INNER JOIN staff AS s ON p.overseen_by = s.staff_no
GROUP BY s.allocated_to;
Update: changed above to match your schema needs
You can condense your WITH clauses into a single statement. Then, the piece I think you are missing is that columns referenced in the column definition have to be aggregated if they aren't included in the GROUP BY clause. So you GROUP BY your distinct column then apply your aggregation and math in your column definitions.
SELECT
s.allocated_to AS branch_no
,COUNT(a.ad_no) AS ad_count
,(ad_count * 100) AS ad_cost
...
GROUP BY s.allocated_to
i can tell you that you are making it way too complicated. It should be a select statement with a couple of joins. You should re-read the chapter on joins or take a look at the following link
http://www.sql-tutorial.net/SQL-JOIN.asp
A join allows you to "combine" the data from two tables based on a common key between the two tables (you can chain more tables together with more joins). Once you have this "joined" table, you can pretend that it is really one table (aliases are used to indicate where that column came from). You understand how aggregates work on a single table right?
I'd prefer not to give you the answer so that you can actually learn :)

Select based on the number of appearances of an id in another table

I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I know right now with the HAVING statement.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine if EVERY row in a table has a certain value for a field. Declaring a variable and counting the rows in a sub-selection and filtering out my results by IDs that appear that many times works, but It's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood you problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> cid);
Probably you will need some indexes on these columns.