SQL: Return unique entries of only first table in a multi table join - sql

I'm relatively new to sql so this may probably come off as a simple questions.
I have 3 tables A,B,C
B has a FK to A, and C has a FK to B
I would like to get all the distinct entries in A that has an entry in B that corresponds to an entry in C. With JOINs I currently have:
SELECT DISTINCT *
FROM Peoples p
INNER JOIN Contracts c
ON p.pkey=c.person_pkey
JOIN audits a ON c.contract_pkey=a.contract_issued_for;
So this returns the list of all people (with duplicates) with an audit. How do I get it so that instead of return columns of all the tables put together, but rather just all the columns that belong to table peoples and are unique entries?
Thanks in advance!

Replace
SELECT DISTINCT *
with
SELECT DISTINCT p.*

SELECT DISTINCT p.*
FROM Peoples p
...

The minimal change to your query would be to change * to p.*. However, I think a better approach is to use IN clauses rather than JOINs:
SELECT *
FROM Peoples
WHERE pkey IN
( SELECT person_pkey
FROM Contracts
WHERE contract_pkey IN
( SELECT contract_issued_for
FROM audits
)
)
;
This makes clear what the query is really doing: it's finding people who have contracts that audits have been issued for, and is not otherwise interested in those contracts or audits.

Related

SQL Server question - subqueries in column result with a join?

I have a distinct list of part numbers from one table. It is basically a table that contains a record of all the company's part numbers. I want to add columns that will pull data from different tables but only pertaining to the part number on that row of the distinct part list.
For example: if I have part A, B, C from the unique part list I want to add columns for Purchase quantity, repair quantity, loan quantity, etc... from three totally unique tables.
So it's almost like I need 3 subqueries that will sum of that data from the different tables for each part.
Can anybody steer me in the direction of how to do this? Please and thank you so much!
One method is correlated subqueries. Something like this:
select p.*,
(select count(*)
from purchases pu
where pu.part_id = p.part_id
) as num_purchases,
(select count(*)
from repairs r
where r.part_id = p.part_id
) as num_repairs,
(select count(*)
from loans l
where l.part_id = p.part_id
) as num_loans
from parts p;
Another option is joins with aggregation before the join. Or lateral joins (which are quite similar to correlated subqueries).

Select all in table where a column in table 1 exists in another table and a second column equals a variable in second table

I know the title is confusing but its the best I could explain it. Basically im developing a cinema listings website for a company which owns two cinemas. So I have a database which has the two tables "Films" and "Listings" with data for both cinemas in them.
I'm trying to select all films and their data for one cinema if the films name shows up in the listings (since the two cinemas share all films but in the table but the may not have the same films showing)
Here is what i have come up with but I run into a problem as when the "SELECT DISTINCT" returns more than one result it obviously cant be matched with the FilmName on tbl Films.
How can i check this value for all FilmNames on tblFilms?
SELECT *
FROM tblFilms
WHERE FilmName = (SELECT DISTINCT FilmName FROM tblListings WHERE Cimema = 1)
use IN if the subquery return multiple values,
SELECT *
FROM tblFILMS
WHERE FilmName IN (SELECT DISTINCT FilmName FROM tblListings WHERE Cimema = 1)
Another way to solve thius is by using JOIN (which I recommend)
SELECT DISTINCT a.*
FROM tblFILMS a
INNER JOIN tblListings b
ON a.FilmName = b.FilmName AND
b.Cimema = 1
for faster query execution, add an INDEX on FilmName on both tables.
If you have your schemas for the tables, that would help.
That said, I believe what you want to look at is the JOIN keyword. (inner/outer/left/etc). That's exactly what JOIN is meant to do (ie your title).

Retrieve different row from same table

i hava a set of following tables
customer(cus_id,cus_name);
jointAccount(cus_id,acc_number,relationship);
account(acc_number,cus_id)
now i want to create a select statement to list all the jointAccounts,
it should included the both customer name, and relationship.
I have no idea how to retrieve both different user name, is that possible to do this?
Generally speaking, yes. I'm assuming you mean you want to get customer info for both sides of the joint account per your jointAccount table. Not sure what database you're using so this answer is assuming MySQL.
You can join on the same table twice in a single SQL query. I'm assuming you have not yet created your tables, as you have cus_id listed twice in the jointAccount table. Typically these would be something like cus_id1 and cus_id2, which I've used in my sample query below.
Example:
SELECT c1.cus_id AS cust1_id, c1.cus_name AS cust1_name
, c2.cus_id AS cust2_id, c2.cus_name AS cust2_name, j.relationship
FROM customer c1
INNER JOIN jointAccount j
ON c1.cus_id = j.cus_id1
, customer c2
INNER JOIN jointAccount j
ON c2.cus_id = j.cus_id2
I haven't tested this but that's the general idea.
try this query:
SELECT * FROM jointAccount a LEFT JOIN customer c ON a.cus_id = c.cus_id;
just replace the * with the name of the columns you need.

Some SQL Questions

I have been using SQL for years, but have mostly been using the query designer within SQL Studio (etc.) to put together my queries. I've recently found some time to actually "learn" what everything is doing and have set myself the following fairly simple tasks. Before I begin, I'd like to ask the SOF community their thoughts on the questions, possible answers and any tips they may have.
The questions are;
Find all records w/ a duplicate in a particular column (e.g. a linking id is in more than 1 record throughout table)
SUM price from a linked table within the same query (select within a select?)
Explain the difference between the 4 joins; LEFT, RIGHT, OUTER, INNER
Copy data from one table to another based on SELECT and WHERE criteria
Input welcomed & appreciated.
Chris
I recommend that you start by following some tutorials on this topic. Your questions are not uncommon questions for someone moving from a beginner to intermediate level in SQL. SQLZoo is an excellent resource for learning SQL so consider following that.
In response to your questions:
1) Find all records with a duplicate in a particular column
There are two steps here: find duplicate records and select those records. To find the duplicate records you should be doing something along the lines of:
select possible_duplicate_field, count(*)
from table
group by possible_duplicate_field
having count(*) > 1
What we're doing here is selecting everything from a table, then grouping it by the field we want to check for duplicates. The count function then gives me a count of the number of items within that group. The HAVING clause indicates that we want to filter AFTER the grouping to only show the groups which have more than one entry.
This is all fine in itself but it doesn't give you the actual records that have those values on them. If you knew the duplicate values then you'd write this:
select * from table where possible_duplicate_field = 'known_duplicate_value'
We can use the SELECT within a select to get a list of the matches:
select *
from table
where possible_duplicate_field in (
select possible_duplicate_field
from table
group by possible_duplicate_field
having count(*) > 1
)
2) SUM price from a linked table within the same query
This is a simple JOIN between two tables with a SUM of the two:
select sum(tableA.X + tableB.Y)
from tableA
join tableB on tableA.keyA = tableB.keyB
What you're doing here is joining two tables together where those two tables are linked by a key field. In this case, this is a natural join which operates as you would expect (i.e. get me everything from the left table which has a matching record in the right table).
3) Explain the difference between the 4 joins; LEFT, RIGHT, OUTER, INNER
Consider two tables A and B. The concept of "LEFT" and "RIGHT" in this case are slightly clearer if you read your SQL from left to right. So, when I say:
select x from A join B ...
The left table is "A" and the right table is "B". Now, when you explicitly say "LEFT" the SQL statement you are declaring which of the two tables you are joining is the primary table. What I mean by this is: Which table do I scan through first? Incidentally, if you omit the LEFT or RIGHT, then SQL implicitly uses LEFT.
For INNER and OUTER you are declaring what to do when matches don't exist in one of the tables. INNER declares that you want everything in the primary table (as declared using LEFT or RIGHT) where there is a matching record in the secondary table. Hence, if the primary table contains keys "X", "Y" and "Z", and the secondary table contains keys "X" and "Z", then an INNER will only return "X" and "Z" records from the two tables.
When OUTER is used, we're saying: Give me everything from the primary table and anything that matches from the secondary table. Hence, in the previous example, we'd get "X", "Y" and "Z" records in the output record set. However, there would be NULLs in the fields which should have come from the secondary table for key value "Y" as it doesn't exist in the secondary table.
4) Copy data from one table to another based on SELECT and WHERE criteria
This is pretty trivial and I'm surprised you've never encountered it. It's a simple nested SELECT in an INSERT statement (this may not be supported by your database - if not, try the next option):
insert into new_table select * from old_table where x = y
This assumes the tables have the same structure. If you have different structures then you'll need to specify the columns:
insert into new_table (list, of, fields)
select list, of, fields from old_table where x = y
Let's say you have 2 tables named :
[OrderLine] with the columns [Id, OrderId, ProductId, Qty, Status]
[Product] with [Id, Name, Price]
1) all orderline of command having more than 1 line (it's technically the same as looking for duplicates on OrderId :) :
select OrderId, count(*)
from OrderLine
group by OrderId
having count(*) > 1
2) total price for all order line of the order 1000
select sum(p.Price * ol.Qty) as Price
from OrderLine ol
inner join Product p on ol.ProductId = p.Id
where ol.OrderId = 1000
3) difference between joins:
a inner join b => take all a that has a match with b. if b is not found, a will be not be returned
a left join b => take all a, match them with b, include a even if b is not found
a righ join b => b left join a
a outer join b => (a left join b) union ( a right join b)
4) copy order lines to a history table :
insert into OrderLinesHistory
(CopiedOn, OrderLineId, OrderId, ProductId, Qty)
select
getDate(), Id, OrderId, ProductId, Qty
from
OrderLine
where
status = 'Closed'
To answer #4 and to perhaps show at least some understanding of SQL and the fact this isn't HW, just me trying to learn best practise;
SET NOCOUNT ON;
DECLARE #rc int
if #what = 1
BEGIN
select id from color_mapper where product = #productid and color = #colorid;
select #rc = ##rowcount
if #rc = 0
BEGIN
exec doSavingSPROC #colorid, #productid;
END
END
END

Select based on the number of appearances of an id in another table

I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I know right now with the HAVING statement.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine if EVERY row in a table has a certain value for a field. Declaring a variable and counting the rows in a sub-selection and filtering out my results by IDs that appear that many times works, but It's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood you problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> cid);
Probably you will need some indexes on these columns.