Unable to join tables from 2 tables with many to many condition - sql

I have the diagram below. I am trying to first compute the total number of suppliers supplying the specified part, and then extract the information of the suppliers supplying that part.
SELECT count(*) totalCount, s_suppkey, s_name
FROM supplier INNER JOIN part ON s_suppkey = p_suppkey
WHERE p_partkey = 123
GROUP BY s_suppkey, s_name;
But I keep getting this error: ORA-00904: "P_SUPPKEY": invalid identifier

Based on your diagram, there is no such column in the part table.

You (probably) need to use the bridging table, partly hidden off to the left of your image, that connects the parts and suppliers in a many-to-many relationship.
SELECT COUNT(*) OVER () totalCount,
s_suppkey,
s_name
FROM supplier s
WHERE EXISTS(
SELECT 1
FROM part_supplied_by ps
WHERE s.s_suppkey = ps.s_suppkey
AND ps.p_partkey = 123
);
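As a rough sketch of the bridging-table idea, the snippet below builds a tiny in-memory SQLite database and runs the same kind of EXISTS filter. The table part_supplied_by and its columns s_suppkey and p_partkey are assumptions carried over from the answer (the hidden part of the diagram may name them differently), the sample rows are made up, and SQLite stands in for Oracle. The total is taken from the number of rows returned rather than from a window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (s_suppkey INTEGER PRIMARY KEY, s_name TEXT);
    CREATE TABLE part (p_partkey INTEGER PRIMARY KEY, p_name TEXT);
    -- Hypothetical bridging table: one row per (supplier, part) pair.
    CREATE TABLE part_supplied_by (s_suppkey INTEGER, p_partkey INTEGER);
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Bolt Co'), (3, 'Cogs Ltd');
    INSERT INTO part VALUES (123, 'widget'), (456, 'gear');
    INSERT INTO part_supplied_by VALUES (1, 123), (2, 123), (3, 456), (1, 456);
""")

# Keep only suppliers that have a bridging row for part 123.
suppliers_of_123 = conn.execute("""
    SELECT s.s_suppkey, s.s_name
    FROM supplier s
    WHERE EXISTS (
        SELECT 1
        FROM part_supplied_by ps
        WHERE ps.s_suppkey = s.s_suppkey
          AND ps.p_partkey = 123
    )
    ORDER BY s.s_suppkey
""").fetchall()

total_count = len(suppliers_of_123)
print(total_count, suppliers_of_123)
```

Supplier 3 only supplies part 456, so it is filtered out even though it exists in the supplier table.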

Related

Trying to display values from multiple tables in SQL

I have two tables in SQL, one of which contains the employee ID (emp_id) and their first, and last name. Another table contains the employee ID with their total sales. I want to see the first name, last name, and their total sales of only those records where the total sales are higher than 25,000. The first table's name is employee, and the second table's name is works_with
The code that I used is:
SELECT employee.first_name, employee.last_name,works_with.total_sales
FROM employee
WHERE employee.emp_id IN (
SELECT works_with.emp_id
FROM works_with
WHERE works_with.total_sales>25000
);
I'm getting the following error:
"Msg 4104, Level 16, State 1, Line 25
The multi-part identifier "works_with.total_sales" could not be bound."
How can I fix this error?
Thanks
Your query works if you write it like this:
SELECT employee.first_name, employee.last_name
FROM employee
WHERE employee.emp_id IN (
SELECT works_with.emp_id
FROM works_with
WHERE works_with.total_sales>25000
);
Your SELECT fails because the outer query doesn't have access to the column works_with.total_sales.
You can add works_with.total_sales to your SELECT by using a JOIN clause like this:
SELECT employee.first_name, employee.last_name, works_with.total_sales
FROM employee
JOIN works_with
on employee.emp_id=works_with.emp_id
WHERE works_with.total_sales>25000
Or list both tables in the FROM clause (the older implicit-join syntax) like this:
SELECT employee.first_name, employee.last_name, works_with.total_sales
FROM employee, works_with
WHERE employee.emp_id=works_with.emp_id
and works_with.total_sales>25000
Explaining the error
The multi-part identifier "works_with.total_sales" could not be bound
This tells you that works_with.total_sales isn't available in the outer query: works_with appears only inside the subquery, not in the outer FROM clause.
So the column cannot be bound to anything in your SELECT list.
See https://www.w3schools.com/sql/sql_join.asp for the JOIN clause in T-SQL.
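To make the scoping rule concrete, here is a small sketch using Python's sqlite3 module with invented sample rows. SQLite reports the problem as "no such column" rather than SQL Server's Msg 4104, but the cause is the same: a table that appears only inside the subquery is invisible to the outer SELECT list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE works_with (emp_id INTEGER, total_sales INTEGER);
    INSERT INTO employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO works_with VALUES (1, 30000), (2, 12000);
""")

# works_with appears only inside the subquery, so the outer SELECT cannot see it.
try:
    conn.execute("""
        SELECT employee.first_name, works_with.total_sales
        FROM employee
        WHERE employee.emp_id IN (
            SELECT emp_id FROM works_with WHERE total_sales > 25000
        )
    """)
    outer_select_failed = False
except sqlite3.OperationalError:
    outer_select_failed = True

# Joining puts works_with in the FROM clause, so its columns can be selected.
joined_rows = conn.execute("""
    SELECT employee.first_name, employee.last_name, works_with.total_sales
    FROM employee
    JOIN works_with ON employee.emp_id = works_with.emp_id
    WHERE works_with.total_sales > 25000
""").fetchall()

print(outer_select_failed, joined_rows)
```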
This query might work given your requirements. A JOIN could be used to link those two tables in your question.
select t1.first_name,
t1.last_name,
t2.total_sales
from employee t1
join works_with t2
on t2.emp_id = t1.emp_id
where t2.total_sales > 25000
You can join both tables and show all rows where total_sales is greater than 25000:
SELECT employee.first_name, employee.last_name,works_with.total_sales
FROM employee INNER JOIN works_with ON works_with.emp_id = employee.emp_id
WHERE works_with.total_sales>25000;

How to use WITH clause and select clause

click here to view screenshot of table
Question: write a query to display the customer number, first name, and last name of those clients whose total loan amount is the maximum and who have taken loans from at least 2 bank branches.
I have tried the following query but I'm getting this error
Msg 8120, Level 16, State 1, Line 7
Column 'customer.fname' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Code:
with l as
(
select custid, sum(loan_amount) as tot
from loan
group by custid
having count(bid) >= 2
)
select
concat(c.fname, c.ltname) as name,
max(l.tot)
from
customer as c, l
where
l.custid = c.custid
You need to have a GROUP BY to select both aggregated and non-aggregated data, so you need to decide how you want the data grouped. You could do either
SELECT CONCAT(c.fname,c.ltname) as name, MAX(l.tot)
FROM customer AS c
INNER JOIN l ON l.custid=c.custid
GROUP BY c.fname,c.ltname
or
SELECT CONCAT(c.fname,c.ltname) as name, MAX(l.tot)
FROM customer AS c
INNER JOIN l ON l.custid=c.custid
GROUP BY concat(c.fname,c.ltname)
Please note the following:
I converted the "old" join syntax to the more acceptable INNER JOIN syntax
You probably want a space between the first and last name if you're displaying the results.
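Here is a minimal sketch of the grouped query running end to end, using SQLite through Python with made-up customers and loans. Two caveats: SQLite would not raise Msg 8120 for the ungrouped version, so this only shows the corrected form, and the || operator is used in place of CONCAT (with the space suggested above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (custid INTEGER PRIMARY KEY, fname TEXT, ltname TEXT);
    CREATE TABLE loan (custid INTEGER, bid INTEGER, loan_amount INTEGER);
    INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
    -- Ann and Cy each have two loans; Bob has only one and is filtered out.
    INSERT INTO loan VALUES (1, 10, 1000), (1, 20, 2000), (2, 10, 9000),
                            (3, 10, 500), (3, 30, 700);
""")

totals = conn.execute("""
    WITH l AS (
        SELECT custid, SUM(loan_amount) AS tot
        FROM loan
        GROUP BY custid
        HAVING COUNT(bid) >= 2
    )
    SELECT c.fname || ' ' || c.ltname AS name, MAX(l.tot) AS max_tot
    FROM customer AS c
    INNER JOIN l ON l.custid = c.custid
    GROUP BY c.fname, c.ltname
    ORDER BY max_tot DESC
""").fetchall()

print(totals)
```

Each group here is a single customer, so MAX(l.tot) is just that customer's total; sorting by it puts the largest borrower first.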

Join statement and comparison

The database being used for this question is structured as follows with Primary Keys bolded, and Foreign Keys ' '.
Countries (Name, Country_ID, area_sqkm, population)
Teams (team_id, name, 'country_id', description, manager)
Stages (stage_id, took_place, start_loc, end_loc, distance, description)
Riders (rider_id, name, 'team_id', year_born, height_cms, weight_kgs, 'country_id', bmi)
Results ('stage_id', 'rider_id', time_seconds)
I am stuck at the question of:
Q: Bradley Wiggins won the tour. Write a query to find the riders who beat him in at least 4 stages, i.e., riders who had a better time than Wiggins in at least 4 of the 21 stages.
I am currently at :
SELECT ri.name
from riders ri
INNER JOIN results re ON ri.name = re.name
WHERE ri.name = 'BRADLEY Wiggins' IN ...
I am unsure how to move on to comparing the two time_seconds values.
How can I go about getting to the solution?
Thank you
The task is indeed a little complicated, as it involves several concepts.
The first of these is a self join, i.e. you'll have to select from the same table twice. You want Bradley's results and the others' results, so as to be able to compare them.
select ...
from results bradley
join results other on ...
Or:
select ...
from (select * from results where ...) bradley
join (select * from results where ...) other on ...
Let's use the first option. We add a WHERE clause so as to get Bradley, and we add the ON clause to get non-Bradleys at the same stage with a better result:
select ...
from results bradley
join results other on other.rider_id <> bradley.rider_id
and other.stage_id = bradley.stage_id
and other.time_seconds < bradley.time_seconds
where bradley.rider_id = (select rider_id from riders where name = 'BRADLEY Wiggins')
The last part is to find riders with at least four better results. This is called aggregation. You want to see riders, so you group by rider_id. And you want to count, so you use COUNT. Moreover you want to restrict results based on COUNT, so you put this in the HAVING clause:
select other.rider_id
from results bradley
join results other on other.rider_id <> bradley.rider_id
and other.stage_id = bradley.stage_id
and other.time_seconds < bradley.time_seconds
where bradley.rider_id = (select rider_id from riders where name = 'BRADLEY Wiggins')
group by other.rider_id
having count(*) >= 4;
As to getting the riders' data, e.g. their names, there are a couple of options:
1. Join the table and put the columns both in your SELECT clause and your GROUP BY clause. You would do this if you wanted data from both sets, i.e. riders' data plus the result count.
2. Subselect the value if you only want one value (e.g. the name). That's simple, but it really only makes sense when you want a single value from the riders table.
You'd change your SELECT clause thus:
select (select name from riders where rider_id = other.rider_id) as name
3. Write an outer query around the query you already have.
This would be:
select *
from riders
where rider_id in
(
select other.rider_id
from results bradley
join results other on other.rider_id <> bradley.rider_id
and other.stage_id = bradley.stage_id
and other.time_seconds < bradley.time_seconds
where bradley.rider_id = (select rider_id from riders where name = 'BRADLEY Wiggins')
group by other.rider_id
having count(*) >= 4
);
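The self join, aggregation, and outer query can be tried out on a toy data set. The sketch below uses SQLite via Python, keeps only the riders and results tables, and invents five stages instead of 21; the rider names other than Wiggins and all times are hypothetical. Column names follow the schema in the question (rider_id, not id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE riders (rider_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE results (stage_id INTEGER, rider_id INTEGER, time_seconds INTEGER);
    INSERT INTO riders VALUES (1, 'BRADLEY Wiggins'), (2, 'Fast Rider'), (3, 'Slow Rider');
""")

# Invented times per stage; rider 2 beats Wiggins in 4 stages, rider 3 in only 2.
times_by_rider = {
    1: [100, 100, 100, 100, 100],
    2: [90, 90, 90, 90, 110],
    3: [95, 95, 120, 120, 120],
}
rows = [
    (stage_id, rider_id, t)
    for rider_id, times in times_by_rider.items()
    for stage_id, t in enumerate(times, start=1)
]
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)

beat_wiggins = conn.execute("""
    SELECT name
    FROM riders
    WHERE rider_id IN (
        SELECT other.rider_id
        FROM results bradley
        JOIN results other ON other.rider_id <> bradley.rider_id
                          AND other.stage_id = bradley.stage_id
                          AND other.time_seconds < bradley.time_seconds
        WHERE bradley.rider_id = (SELECT rider_id FROM riders
                                  WHERE name = 'BRADLEY Wiggins')
        GROUP BY other.rider_id
        HAVING COUNT(*) >= 4
    )
""").fetchall()

print(beat_wiggins)
```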

SQL Access: how to obtain output involving multiple tables without running 2 queries?

I would like to find out the most popular genre of film for a certain age group, for example 20-30 year-olds. I'm quite new to SQL and would appreciate any help I can get, apologies if this is too minor.
The relevant tables for this query are:
FILM {FID (PK), ..., Film_Title}
MEMBER {MID (PK), ..., Date_of_Birth}
LIST {MID (FK), FID (FK)}
GENRE {GID (PK), Genre}
FILM_ACTOR_DIRECTOR_GENRE {FID (FK), ..., GID (FK)}
FILM and MEMBER table should be quite self-explanatory, while a LIST is a selection of films a MEMBER wishes to rent. It's like a shopping basket. Each member only has one list and each list can contain many films. FILM_ACTOR_DIRECTOR_GENRE contains Genre belonging to each film. Each film can only have one genre.
So far I have managed to get an output which shows:
Genre # People Aged 20-30
------- -------------------
Action 5
Comedy 4
Horror 2
etc. etc.
However it involves creating a table and then running another query. Is there a way to obtain the most popular genre within a particular age group without having to run 2 separate queries?
The 2 queries I've used are:
SELECT DISTINCT Genre.Genre_Name, Member.Date_of_Birth
INTO Genre_by_Age
FROM
((((Genre
INNER JOIN Film_Actor_Director_Genre ON Genre.GID = Film_Actor_Director_Genre.GID)
INNER JOIN Film ON Film_Actor_Director_Genre.FID = Film.FID)
INNER JOIN List ON Film.FID = List.FID)
INNER JOIN Member ON Member.MID = List.MID)
WHERE (((Member.[Date_of_Birth]) Between #4/16/1995# And #4/16/1985#));
for creating the new table with information I want, and:
SELECT Genre_Name, COUNT(*) as Number_of_People_aged_20_to_30
FROM Genre_by_Age
GROUP BY Genre_Name
ORDER BY COUNT(*) DESC;
to obtain the output shown above.
Is there a way to obtain the above result without running 2 separate queries? Thanks for your time!
How about using a subquery?
SELECT Genre_Name, COUNT(*) as Number_of_People_aged_20_to_30
FROM (SELECT DISTINCT Genre.Genre_Name, Member.Date_of_Birth
FROM ((((Genre
INNER JOIN Film_Actor_Director_Genre ON Genre.GID = Film_Actor_Director_Genre.GID)
INNER JOIN Film ON Film_Actor_Director_Genre.FID = Film.FID)
INNER JOIN List ON Film.FID = List.FID)
INNER JOIN Member ON Member.MID = List.MID)
WHERE (((Member.[Date_of_Birth]) Between #4/16/1995# And #4/16/1985#))
) as t
GROUP BY Genre_Name
ORDER BY COUNT(*) DESC;
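For anyone who wants to experiment with the subquery form outside Access, here is a rough equivalent in SQLite via Python. It is a sketch under several assumptions: the Film table is skipped because the join only needs FID, dates are stored as ISO strings instead of Access #...# literals, BETWEEN is written with the earlier date first, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genre (GID INTEGER PRIMARY KEY, Genre_Name TEXT);
    CREATE TABLE Film_Actor_Director_Genre (FID INTEGER, GID INTEGER);
    CREATE TABLE List (MID INTEGER, FID INTEGER);
    CREATE TABLE Member (MID INTEGER PRIMARY KEY, Date_of_Birth TEXT);
    INSERT INTO Genre VALUES (1, 'Action'), (2, 'Comedy');
    INSERT INTO Film_Actor_Director_Genre VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO Member VALUES (1, '1990-01-01'), (2, '1992-05-05'), (3, '1970-01-01');
    INSERT INTO List VALUES (1, 10), (1, 11), (2, 10), (2, 12), (3, 12);
""")

# The derived table t plays the role of the temporary Genre_by_Age table.
genre_counts = conn.execute("""
    SELECT Genre_Name, COUNT(*) AS Number_of_People_aged_20_to_30
    FROM (
        SELECT DISTINCT g.Genre_Name, m.Date_of_Birth
        FROM Genre g
        INNER JOIN Film_Actor_Director_Genre fg ON g.GID = fg.GID
        INNER JOIN List l ON fg.FID = l.FID
        INNER JOIN Member m ON m.MID = l.MID
        WHERE m.Date_of_Birth BETWEEN '1985-04-16' AND '1995-04-16'
    ) AS t
    GROUP BY Genre_Name
    ORDER BY COUNT(*) DESC
""").fetchall()

print(genre_counts)
```

Member 1 has two Action films on their list but is counted once for Action because of the DISTINCT in the inner query.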
I think this should work:
SELECT Genre.Genre_Name, count(Member.MID) as Number_of_People_aged_20_to_30
FROM
((((Genre
INNER JOIN Film_Actor_Director_Genre ON Genre.GID = Film_Actor_Director_Genre.GID)
INNER JOIN Film ON Film_Actor_Director_Genre.FID = Film.FID)
INNER JOIN List ON Film.FID = List.FID)
INNER JOIN Member ON Member.MID = List.MID)
WHERE (((Member.[Date_of_Birth]) Between #4/16/1995# And #4/16/1985#))
GROUP BY Genre.Genre_Name
ORDER BY count(Member.MID) DESC;

Using group by and having clause

Using the following schema:
Supplier (sid, name, status, city)
Part (pid, name, color, weight, city)
Project (jid, name, city)
Supplies (sid, pid, jid, quantity)
1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
2. Get supplier numbers and names for suppliers of the same part to at least two different projects.
These were my answers:
1.
SELECT s.sid, s.name
FROM Supplier s, Supplies su, Project pr
WHERE s.sid = su.sid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid) >= 2
2.
SELECT s.sid, s.name
FROM Suppliers s, Supplies su, Project pr, Part p
WHERE s.sid = su.sid AND su.pid = p.pid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid)>=2
Can anyone confirm if I wrote this correctly? I'm a little confused as to how the Group By and Having clause works
The semantics of Having
To better understand having, you need to see it from a theoretical point of view.
A group by is a query that takes a table and summarizes it into another table. You summarize the original table by grouping the original table into subsets (based upon the attributes that you specify in the group by). Each of these groups will yield one tuple.
The Having is simply equivalent to a WHERE clause after the group by has executed and before the select part of the query is computed.
Let's say your query is:
select a, b, count(*)
from Table
where c > 100
group by a, b
having count(*) > 10;
The evaluation of this query can be seen as the following steps:
1. Perform the WHERE, eliminating rows that do not satisfy it.
2. Group the table into subsets based upon the values of a and b (each tuple in each subset has the same values of a and b).
3. Eliminate subsets that do not satisfy the HAVING condition.
4. Process each subset, outputting the values as indicated in the SELECT part of the query. This creates one output tuple per subset left after step 3.
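The four steps can be mimicked by hand in plain Python, which may help to see where HAVING fits. This is only an illustration of the logical order, not of how a database actually executes the query; the rows are invented, and the threshold is lowered from 10 to a hypothetical MIN_GROUP_SIZE of 2 so the sample stays small.

```python
from collections import defaultdict

# Hypothetical rows of Table as (a, b, c) tuples.
table = [
    ("x", 1, 150), ("x", 1, 200), ("x", 1, 50),
    ("y", 2, 300),
    ("x", 1, 120),
]
MIN_GROUP_SIZE = 2  # stands in for the 10 in "HAVING count(*) > 10"

# Step 1: WHERE c > 100
kept = [row for row in table if row[2] > 100]

# Step 2: GROUP BY a, b
groups = defaultdict(list)
for a, b, c in kept:
    groups[(a, b)].append(c)

# Step 3: HAVING count(*) > MIN_GROUP_SIZE
surviving = {key: vals for key, vals in groups.items() if len(vals) > MIN_GROUP_SIZE}

# Step 4: SELECT a, b, count(*), one output tuple per surviving group
result = [(a, b, len(vals)) for (a, b), vals in surviving.items()]

print(result)
```

The row ("x", 1, 50) is dropped in step 1, before grouping, so it never counts toward the group size; the ("y", 2) group is dropped in step 3 because it has only one row.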
You can extend this to any complex query: Table can be any complex query that returns a table (a cross product, a join, a UNION, etc.).
In fact, having is syntactic sugar and does not extend the power of SQL. Any given query:
SELECT list
FROM table
GROUP BY attrList
HAVING condition;
can be rewritten as:
SELECT list from (
SELECT listatt
FROM table
GROUP BY attrList) as Name
WHERE condition;
The listatt is a list that includes the GROUP BY attributes and the expressions used in list and condition. It might be necessary to name some expressions in this list (with AS). For instance, the example query above can be rewritten as:
select a, b, count
from (select a, b, count(*) as count
from Table
where c > 100
group by a, b) as someName
where count > 10;
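A quick way to convince yourself of the equivalence is to run both forms against the same data. The sketch below does that in SQLite via Python; the table is named t because TABLE is a reserved word, the count is aliased as n, the rows are invented, and the threshold is 2 rather than 10 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", 1, 150), ("x", 1, 200), ("x", 1, 120), ("x", 1, 50),
     ("y", 2, 300), ("y", 2, 400)],
)

# Form 1: filter the groups with HAVING.
with_having = conn.execute("""
    SELECT a, b, COUNT(*)
    FROM t
    WHERE c > 100
    GROUP BY a, b
    HAVING COUNT(*) > 2
    ORDER BY a, b
""").fetchall()

# Form 2: group in a derived table, then filter it with an ordinary WHERE.
with_subquery = conn.execute("""
    SELECT a, b, n
    FROM (SELECT a, b, COUNT(*) AS n
          FROM t
          WHERE c > 100
          GROUP BY a, b) AS grouped
    WHERE n > 2
    ORDER BY a, b
""").fetchall()

print(with_having, with_subquery)
```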
The solution you need
Your solution seems to be correct:
SELECT s.sid, s.name
FROM Supplier s, Supplies su, Project pr
WHERE s.sid = su.sid AND su.jid = pr.jid
GROUP BY s.sid, s.name
HAVING COUNT (DISTINCT pr.jid) >= 2
You join the three tables, then use sid as the grouping attribute (name is functionally dependent on it, so it does not affect the number of groups, but you must include it in the GROUP BY, otherwise it cannot appear in the SELECT part of the statement). Then you remove the groups that do not satisfy your condition: the number of distinct pr.jid values must be >= 2, which is what you wanted originally.
Best solution to your problem
I personally prefer a simpler cleaner solution:
You only need to group Supplies (sid, pid, jid, quantity) by sid to
find the sid of those that supply at least two projects.
Then join it to the Supplier table to get the supplier name.
SELECT sid, name FROM
(SELECT sid FROM Supplies
GROUP BY sid
HAVING count(DISTINCT jid) >= 2
) AS T1
NATURAL JOIN
Supplier;
It may also be faster to execute, because the join is performed only on the suppliers that pass the filter rather than on every row of Supplies, although this depends on the optimizer.
--dmg
The WHERE clause cannot be used with aggregate functions such as count(), min(), or sum(), so the HAVING clause exists to fill that gap. For an example of the HAVING clause, see this link:
http://www.sqlfundamental.com/having-clause.php
First of all, you should use the JOIN syntax rather than FROM table1, table2, and you should always limit the grouping to as few fields as you need.
Although I haven't tested it, your first query seems fine to me, but it could be rewritten as:
SELECT s.sid, s.name
FROM
Supplier s
INNER JOIN (
SELECT su.sid
FROM Supplies su
GROUP BY su.sid
HAVING COUNT(DISTINCT su.jid) > 1
) g
ON g.sid = s.sid
Or simplified as:
SELECT sid, name
FROM Supplier s
WHERE (
SELECT COUNT(DISTINCT su.jid)
FROM Supplies su
WHERE su.sid = s.sid
) > 1
However, your second query seems wrong to me, because you should also GROUP BY pid. DISTINCT keeps a supplier from being listed once for each qualifying part.
SELECT DISTINCT s.sid, s.name
FROM
Supplier s
INNER JOIN (
SELECT su.sid
FROM Supplies su
GROUP BY su.sid, su.pid
HAVING COUNT(DISTINCT su.jid) > 1
) g
ON g.sid = s.sid
As you may have noticed in the query above, I used the INNER JOIN syntax to perform the filtering, however it can be also written as:
SELECT s.sid, s.name
FROM Supplier s
WHERE EXISTS (
SELECT 1
FROM Supplies su
WHERE su.sid = s.sid
GROUP BY su.pid
HAVING COUNT(DISTINCT su.jid) > 1
)
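To see why grouping by pid matters for the second question, it can help to run both variants on data where the answers differ. This is a sketch in SQLite via Python with invented rows: supplier 1 reaches two projects but with different parts, supplier 2 sends the same part to two projects, and supplier 3 supplies a single project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Supplies (sid INTEGER, pid TEXT, jid TEXT, quantity INTEGER);
    INSERT INTO Supplier VALUES (1, 'S1'), (2, 'S2'), (3, 'S3');
    INSERT INTO Supplies VALUES
        (1, 'P1', 'J1', 5), (1, 'P2', 'J2', 5),   -- two projects, different parts
        (2, 'P1', 'J1', 5), (2, 'P1', 'J2', 5),   -- same part, two projects
        (3, 'P1', 'J1', 5);                       -- one project only
""")

# Question 1: any parts, at least two different projects (group by supplier).
any_parts = conn.execute("""
    SELECT s.sid, s.name
    FROM Supplier s
    INNER JOIN (SELECT sid FROM Supplies
                GROUP BY sid
                HAVING COUNT(DISTINCT jid) >= 2) g ON g.sid = s.sid
    ORDER BY s.sid
""").fetchall()

# Question 2: the same part to at least two projects (group by supplier and part).
same_part = conn.execute("""
    SELECT DISTINCT s.sid, s.name
    FROM Supplier s
    INNER JOIN (SELECT sid FROM Supplies
                GROUP BY sid, pid
                HAVING COUNT(DISTINCT jid) >= 2) g ON g.sid = s.sid
    ORDER BY s.sid
""").fetchall()

print(any_parts, same_part)
```

Supplier 1 shows up only in the first result, which is exactly the case the original second query would have reported incorrectly.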
What type of SQL database are you using (MSSQL, Oracle, etc.)?
I believe what you have written is correct.
You could also write the first query like this:
SELECT s.sid, s.name
FROM Supplier s
WHERE (SELECT COUNT(DISTINCT pr.jid)
FROM Supplies su, Project pr
WHERE su.sid = s.sid
AND pr.jid = su.jid) >= 2
It's a little more readable, and less mind-bending than trying to do it with GROUP BY. Performance may differ though.
1. Get supplier numbers and names for suppliers of parts supplied to at least two different projects.
SELECT S.SID, S.NAME
FROM SUPPLIES SP
JOIN SUPPLIER S
ON SP.SID = S.SID
WHERE PID IN
(SELECT PID FROM SUPPLIES GROUP BY PID HAVING COUNT(DISTINCT JID) >= 2)
I am not clear about your second question.