sql multiple outer joins on multiple fields - sql

SQL beginner here.
I am trying to outer-join one table (pgm_update) to two other tables (family and family_act_visits). A pgm_update record may correspond to a family record or a family_act_visits record or neither; my results should return data for all three cases. Because of some bad architecture, the joins have to be on multiple columns. Each of these individual queries works, but I haven't been able to combine them into a single query.
SELECT p.last_name_wo, p.activity, p.participation, fav.*
FROM family_act_visits fav
RIGHT JOIN pgm_update p ON fav.folks_fk=p.folks_fk and fav.activity=p.activity
JOIN activities a on p.activity=a.activity
WHERE p.participation in ('c','a') and a.act_start_date>current_date()
SELECT p.last_name_wo, p.activity, p.participation, f.*
FROM family f
RIGHT JOIN pgm_update p ON f.folks_fk=p.folks_fk and f.activity=p.activity
JOIN activities a on p.activity=a.activity
WHERE p.participation in ('c','a') and a.act_start_date>current_date()
One of my attempts at the full query is:
SELECT p.last_name_wo, p.activity, p.participation, fav.*
FROM family_act_visits fav, family f
RIGHT JOIN pgm_update p ON fav.folks_fk=p.folks_fk and fav.activity=p.activity
RIGHT JOIN pgm_update p2 ON f.folks_fk=p2.folks_fk and f.activity=p2.activity
JOIN activities a on p.activity=a.activity
WHERE p.participation in ('c','a') and a.act_start_date>current_date()
This gets the error message "Unknown column 'fav.folks_fk' in 'on clause'"
Hope this long post contains all the info needed....thanks!

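Firstly, JOINing on multiple conditions does not indicate poor design.
Are you sure you want a RIGHT JOIN? Perhaps you are looking for a LEFT JOIN.
Your query is a bit complicated, and the comma in the FROM clause makes it a CROSS JOIN, perhaps unintentionally. The error message looks like MySQL's; there, JOIN binds more tightly than the comma, so fav is not in scope for the first ON clause, which is why you get "Unknown column 'fav.folks_fk'".
Perhaps you just want UNION ALL, which lets you append datasets that have the same shape (the same number of columns, in the same order):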
SELECT
p.last_name_wo
,p.activity
,p.participation
,fav.*
FROM
family_act_visits fav
RIGHT JOIN pgm_update p
ON fav.folks_fk=p.folks_fk
AND fav.activity=p.activity
JOIN activities a
ON p.activity=a.activity
WHERE
p.participation IN ('c','a')
AND a.act_start_date>current_date()
UNION ALL
SELECT
p.last_name_wo
,p.activity
,p.participation
,f.*
FROM
family f
RIGHT JOIN pgm_update p
ON f.folks_fk=p.folks_fk
AND f.activity=p.activity
JOIN activities a
ON p.activity=a.activity
WHERE
p.participation IN ('c','a')
AND a.act_start_date>current_date()
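If the goal is one row per pgm_update record with whatever family or family_act_visits data exists, another option is to start from pgm_update and LEFT JOIN both tables. Here is a minimal sketch using Python's sqlite3 as a stand-in for the real database. The payload columns (family_note, visit_note) and the sample rows are made up for illustration, and SQLite spells the date function current_date without parentheses.

```python
import sqlite3

# Hypothetical minimal schema: only the join/filter columns from the question,
# plus one invented payload column per lookup table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pgm_update (folks_fk INTEGER, activity TEXT, last_name_wo TEXT, participation TEXT);
CREATE TABLE family (folks_fk INTEGER, activity TEXT, family_note TEXT);
CREATE TABLE family_act_visits (folks_fk INTEGER, activity TEXT, visit_note TEXT);
CREATE TABLE activities (activity TEXT, act_start_date TEXT);

INSERT INTO activities VALUES ('camp', '2999-06-01');
INSERT INTO pgm_update VALUES (1, 'camp', 'Adams', 'c');  -- matches family only
INSERT INTO pgm_update VALUES (2, 'camp', 'Baker', 'a');  -- matches family_act_visits only
INSERT INTO pgm_update VALUES (3, 'camp', 'Clark', 'c');  -- matches neither
INSERT INTO family VALUES (1, 'camp', 'family row');
INSERT INTO family_act_visits VALUES (2, 'camp', 'visit row');
""")

# pgm_update is the preserved (left) table; each lookup table is joined
# independently on both columns, so a miss in one does not affect the other.
rows = conn.execute("""
SELECT p.last_name_wo, f.family_note, fav.visit_note
FROM pgm_update p
LEFT JOIN family f
       ON f.folks_fk = p.folks_fk AND f.activity = p.activity
LEFT JOIN family_act_visits fav
       ON fav.folks_fk = p.folks_fk AND fav.activity = p.activity
JOIN activities a
       ON a.activity = p.activity
WHERE p.participation IN ('c', 'a')
  AND a.act_start_date > current_date
ORDER BY p.last_name_wo
""").fetchall()

for row in rows:
    print(row)
```

All three pgm_update rows come back, with NULLs (None in Python) wherever a lookup table has no match. Unlike UNION ALL, this puts the family and family_act_visits columns side by side, so the two tables do not need the same shape.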

Related

Joining two different columns from one table to the same column in a different table?

I am working on a query that has fields called ios_app_id, android_app_id and app_id.
The app_id from downloads table can be either ios_app_id or android_app_id in the products table.
Is it correct that because of that I cannot just run a simple join of the downloads and products tables on p.ios_app_id = d.app_id and then join again on p.android_app_id = d.app_id? Would that cause an incorrect number of records?
select p.product_id, d.date, d.downloads
from products p
inner join downloads d
on p.ios_app_id = d.app_id
UNION
select p.product_id, d.date, d.downloads
from products p
inner join downloads d
on p.android_app_id = d.app_id
I would try:
select p.product_id, d.date, d.downloads
from products p
inner join downloads d
on p.ios_app_id = d.app_id
inner join downloads d
on p.android_app_id = d.app_id
Basically I am trying to understand why the union here is needed instead of just joining the two fields twice? Thank you
Just join twice:
select p.product_id,
coalesce(di.date, da.date),
coalesce(di.downloads, da.downloads)
from products p left join
downloads di
on p.ios_app_id = di.app_id left join
downloads da
on p.android_app_id = da.app_id;
This should be more efficient than your method with union. Basically, it attempts joining using the two ids. The coalesce() combines the results into a single column.
Remember that the purpose of an INNER JOIN is to get the values that exist in BOTH sets of data (let's call them table A and table B), using a specific column to join them. In your example, if you do the INNER JOIN twice, the first join takes the complete PRODUCTS table as table A and gives you all the products that have downloads of the iOS app. But now (and this is the key part) that result becomes your new dataset, so it is table A for the next inner join. That's the issue: you would want to join the whole table again, not just the result of the first join, but that's not how it works. This is why you need the UNION: you obtain the two results independently and then add them together.
An alternative would be to use LEFT JOIN, but you could get null values and duplicates, and it's not very "clean". So, for your particular case, I think using UNION is much clearer and easier to understand.
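To see how the approaches differ in practice, here is a small sketch using Python's sqlite3. The table and column names follow the question; the sample rows are invented (product 1 has downloads under both app ids, product 2 only under its iOS id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER, ios_app_id INTEGER, android_app_id INTEGER);
CREATE TABLE downloads (app_id INTEGER, date TEXT, downloads INTEGER);

INSERT INTO products VALUES (1, 10, 20);    -- on both platforms
INSERT INTO products VALUES (2, 30, NULL);  -- iOS only
INSERT INTO downloads VALUES (10, '2024-01-01', 5);
INSERT INTO downloads VALUES (20, '2024-01-01', 7);
INSERT INTO downloads VALUES (30, '2024-01-02', 3);
""")

# UNION: each platform is matched independently, then the results are stacked.
union_rows = conn.execute("""
SELECT p.product_id, d.date, d.downloads
FROM products p JOIN downloads d ON p.ios_app_id = d.app_id
UNION
SELECT p.product_id, d.date, d.downloads
FROM products p JOIN downloads d ON p.android_app_id = d.app_id
ORDER BY 1, 3
""").fetchall()

# INNER JOIN twice: a product survives only if BOTH joins find a match.
inner_twice_rows = conn.execute("""
SELECT p.product_id, di.downloads, da.downloads
FROM products p
JOIN downloads di ON p.ios_app_id = di.app_id
JOIN downloads da ON p.android_app_id = da.app_id
""").fetchall()

# LEFT JOIN twice with COALESCE: one row per matched pair, iOS value preferred.
left_twice_rows = conn.execute("""
SELECT p.product_id,
       COALESCE(di.date, da.date),
       COALESCE(di.downloads, da.downloads)
FROM products p
LEFT JOIN downloads di ON p.ios_app_id = di.app_id
LEFT JOIN downloads da ON p.android_app_id = da.app_id
ORDER BY p.product_id
""").fetchall()

print(union_rows)        # three rows: both platforms for product 1, iOS for product 2
print(inner_twice_rows)  # one row: only product 1 matches on both ids
print(left_twice_rows)   # two rows: product 1 shows only its iOS figure
```

With this data the UNION keeps every download row, the double INNER JOIN drops product 2 entirely, and the double LEFT JOIN with COALESCE returns one row per product but hides product 1's Android figure. Which of these is "correct" depends on whether you want one row per download record or one row per product.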
If you use a left join in the first query, it will work.
create table all_products as
select p.product_id, p.android_app_id, d.date, d.downloads
from products p
left join downloads d
on p.ios_app_id = d.app_id;
select a.product_id, coalesce(a.date, d.date), coalesce(a.downloads, d.downloads)
from all_products a
left join downloads d
on a.android_app_id = d.app_id;

How can i apply left outer join conditions on four tables?

I was trying to join four tables, but I could not get the right result.
I have four tables: students, college, locations, and departments. All of them have the same column, sid, which can be used in the join conditions.
I want the columns listed in the select statement below for all matched rows from the four tables, plus the unmatched rows from the left table, which is what a left outer join does.
I have tried this syntax:
select
students.sname,
college.cname,
locations.loc,
departments.dept
from students, college, locations, departments
where students.sid=college.sid(+)
and college.sid=locations.sid(+)
and locations.sid=departments.sid(+);
Is this right?
This is an old-fashioned way of outer-joining in an Oracle database. For Oracle this statement is correct; in other DBMS it is invalid.
Anyway, nowadays (as of Oracle 9i; in other DBMS much longer) you should use standard SQL joins instead.
select
s.sname,
c.cname,
l.loc,
d.dept
from students s
left outer join college c on c.sid = s.sid
left outer join locations l on l.sid = c.sid
left outer join departments d on d.sid = l.sid;
Given what you've shown it would appear that what you really want is
select s.sname,
c.cname,
l.loc,
d.dept
from students s
LEFT OUTER JOIN college c
ON c.SID = s.SID
LEFT OUTER JOIN locations l
ON l.SID = s.SID
LEFT OUTER JOIN departments d
ON d.SID = s.SID
The issue in your original query is that because an OUTER JOIN is an optional join, you can end up with NULL values being returned in one of the join fields, which then prevents any of the downstream tables from being joined. I agree with @ThorstenKettner, who observes in a comment that SID is apparently a "student ID", and that it's not reasonable or appropriate to have a "student ID" field on tables named COLLEGE, LOCATIONS, or DEPARTMENTS. Perhaps you need to update your design to allow any number of students to be associated with one of these entities, perhaps using a "join" table.
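Here is a small sketch of that NULL-propagation effect, using Python's sqlite3 and invented sample rows. Bob has no college row but does have location and department rows, so the two ways of writing the joins give different answers for him.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (sid INTEGER, sname TEXT);
CREATE TABLE college (sid INTEGER, cname TEXT);
CREATE TABLE locations (sid INTEGER, loc TEXT);
CREATE TABLE departments (sid INTEGER, dept TEXT);

INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO college VALUES (1, 'Trinity');            -- Bob has no college row
INSERT INTO locations VALUES (1, 'Dublin'), (2, 'Cork');
INSERT INTO departments VALUES (1, 'Physics'), (2, 'History');
""")

# Chained: each table joins to the previous one, so a NULL c.sid
# can never equal l.sid, and everything downstream is NULL too.
chained = conn.execute("""
SELECT s.sname, c.cname, l.loc, d.dept
FROM students s
LEFT OUTER JOIN college c ON c.sid = s.sid
LEFT OUTER JOIN locations l ON l.sid = c.sid
LEFT OUTER JOIN departments d ON d.sid = l.sid
ORDER BY s.sid
""").fetchall()

# Anchored: every table joins back to students, so one missing
# row does not hide the others.
anchored = conn.execute("""
SELECT s.sname, c.cname, l.loc, d.dept
FROM students s
LEFT OUTER JOIN college c ON c.sid = s.sid
LEFT OUTER JOIN locations l ON l.sid = s.sid
LEFT OUTER JOIN departments d ON d.sid = s.sid
ORDER BY s.sid
""").fetchall()

print(chained)   # Bob's loc and dept are None
print(anchored)  # Bob's loc and dept are filled in
```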
Best of luck.

Struggling to interpret this SQL join syntax

Two pretty similar queries:
Query #1:
SELECT *
FROM employee e
LEFT JOIN employee_payments ep
INNER JOIN payments p ON ep.payment_id = p.id
ON ep.employee_id = e.id
Query #2:
SELECT *
FROM employee e
LEFT JOIN employee_payments ep ON ep.employee_id = e.id
INNER JOIN payments p ON ep.payment_id = p.id
But obviously crucially different syntax.
The way I learn these new syntax concepts best is to interpret them as plain English. So how could you describe what these are selecting?
I would expect them to produce the same results, but it feels to me like the LEFT JOIN in the second query acts as an INNER JOIN somehow, since only a fraction of my result set is returned (i.e. the employees with payments).
If the first query 'says' "give me all employees, along with any available employee_payments (that have already been joined with their payment record)"- what does the second query say?
I suppose you might put it as "Take all employees along with any available employee_payments. Join this with the payment records."
The "Join this with the payment records" is what filters out employees that don't have any associated employee_payments records: the attempt to join with the payment records will fail.
You wrote: "it feels to me like the LEFT JOIN in the second query acts as an INNER JOIN somehow".
It's not the LEFT JOIN that's doing the filtering, but it does indeed give the exact same result as if the LEFT JOIN had been an INNER JOIN.
In order to understand the logical order [1] in which joins happen, you need to look at the ON clauses. For each ON clause that you encounter, you pair it with the closest previous JOIN clause that hasn't already been paired with an ON clause. This means that your first query is:
INNER JOIN ep to p (producing, say, ep')
LEFT JOIN e to ep'
And your second query is:
LEFT JOIN e to ep (producing, say, e')
INNER JOIN e' to p
Since the conditions of the INNER JOIN rely upon columns present in ep, this is why the different join orders matter here.
[1] The logical join order determines the final shape of the result set. SQL Server is free to perform joins in any order it sees fit, but it must produce results consistent with the logical join order.
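The two logical orders can be tried out on a toy dataset. This is a sketch using Python's sqlite3 rather than SQL Server, with invented rows; because not every engine accepts the nested ON syntax of Query #1, the inner join is written here as an explicit derived table (ep_p, a made-up alias), which has the same logical meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER, name TEXT);
CREATE TABLE payments (id INTEGER, amount INTEGER);
CREATE TABLE employee_payments (employee_id INTEGER, payment_id INTEGER);

INSERT INTO employee VALUES (1, 'paid'), (2, 'unpaid');
INSERT INTO payments VALUES (100, 50);
INSERT INTO employee_payments VALUES (1, 100);
""")

# Query #1 logic: INNER JOIN ep to p first, then LEFT JOIN e to that result.
query_1 = conn.execute("""
SELECT e.name, ep_p.amount
FROM employee e
LEFT JOIN (
    SELECT ep.employee_id, p.amount
    FROM employee_payments ep
    INNER JOIN payments p ON ep.payment_id = p.id
) AS ep_p ON ep_p.employee_id = e.id
ORDER BY e.id
""").fetchall()

# Query #2 logic: LEFT JOIN e to ep first, then INNER JOIN that result to p.
# For the 'unpaid' employee ep.payment_id is NULL, so the INNER JOIN drops the row.
query_2 = conn.execute("""
SELECT e.name, p.amount
FROM employee e
LEFT JOIN employee_payments ep ON ep.employee_id = e.id
INNER JOIN payments p ON ep.payment_id = p.id
ORDER BY e.id
""").fetchall()

print(query_1)  # both employees, the unpaid one with amount None
print(query_2)  # only the paid employee
```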

how to join these tables in sql

I have these tables and would like to query them to show all clients and their groups (if any); the following image describes the case:
How to join tables to get the result using sql server?
This looks like a lesson teaching CROSS JOIN. Because you want a row in your result for each combination of client and group, whether or not it is valid, you cross join those tables and then check whether there is a matching record in client_group. In a working application this cross join could get unwieldy very quickly: with a few thousand groups and a few thousand clients you'd have many millions of results.
Something like this should get your cartesian result and see if a matching record is found:
SELECT
c.id AS client_id, g.id AS group_id,
CASE WHEN cg.client_id IS NULL THEN 0 ELSE 1 END AS is_member
FROM
client c
CROSS JOIN [group] g
LEFT JOIN client_group cg ON c.id = cg.client_id AND g.id = cg.group_id
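A runnable sketch of the same idea, using Python's sqlite3 instead of SQL Server. The table names follow the answer; the name columns, the is_member alias, and the sample rows are assumptions, since the original image is not available. SQLite quotes the reserved word group with double quotes rather than square brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client (id INTEGER, name TEXT);
CREATE TABLE "group" (id INTEGER, name TEXT);
CREATE TABLE client_group (client_id INTEGER, group_id INTEGER);

INSERT INTO client VALUES (1, 'c1'), (2, 'c2');
INSERT INTO "group" VALUES (10, 'g1'), (20, 'g2');
INSERT INTO client_group VALUES (1, 10), (2, 20);
""")

# CROSS JOIN builds every client/group pair; the LEFT JOIN then marks
# which of those pairs actually exist in client_group.
rows = conn.execute("""
SELECT c.id AS client_id,
       g.id AS group_id,
       CASE WHEN cg.client_id IS NULL THEN 0 ELSE 1 END AS is_member
FROM client c
CROSS JOIN "group" g
LEFT JOIN client_group cg
       ON c.id = cg.client_id AND g.id = cg.group_id
ORDER BY c.id, g.id
""").fetchall()

for row in rows:
    print(row)
```

Two clients and two groups give four rows, two of them flagged as real memberships. The row count is always clients times groups, which is why this gets large quickly.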
More on joining:
What is the difference between Left, Right, Outer and Inner Joins?

Outer Joining SQL Tables?

I have three tables in the database:
Activity table with activity_id, activity_type
Category table with category_id, category_name
Link table with mapping between activity_id and category_id
I need to write a select statement to get the following data:
activity_id, activity_type, Category_name.
The issue is some of the activity_id have no entry in the link table.
If I write:
select a.activity_id, a.activity_type, c.category_name
from activity a, category c, link l
where a.activity_id = l.activity_id and c.category_id = l.category_id
then I do not get the data for the activity_ids that are not present in the link table.
I need to get data for all the activities with empty or null value as category_name for those not having any linking for category_id.
Please help me with it.
PS. I am using MS SQL Server DB
I believe you're looking for a LEFT OUTER JOIN for your activity table to return all rows.
SELECT
a.activity_id, a.activity_type, c.category_name
FROM activity a
LEFT OUTER JOIN link l
ON a.activity_id = l.activity_id
LEFT OUTER JOIN category c
ON c.category_id = l.category_id;
You should use proper explicit joins:
select a.activity_id, a.activity_type, c.category_name
from activity a
LEFT JOIN link l
ON a.activity_id = l.activity_id
LEFT JOIN category c
ON l.category_id = c.category_id
If writing this type of logic will be part of your ongoing responsibilities, I would strongly suggest that you do some research on joins, including the interactions between joins and where clauses. Joins and where clauses combine to form the backbone of query writing, regardless of the technology used to retrieve the data.
Most critical join information to understand:
Left Outer Join: retrieves all rows from the 'left' table, plus any matching rows from the joined table (NULLs where there is no match)
Inner Join: retrieves only rows that have a match in both tables
Where clauses: filter the joined result, regardless of whether the joins were inner or outer.
In the example you posted, the where clause is limiting your overall data to rows that exist in all 3 tables. Replacing the where clause with appropriate join logic will do the trick:
select a.activity_id, a.activity_type, c.category_name
from activity a
left outer join link l --return all activity rows regardless of whether the link exists
on a.activity_id = l.activity_id
left outer join category c --return all activity rows regardless of whether the category exists
on c.category_id = l.category_id
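To make the interaction between joins and where clauses concrete, here is a sketch using Python's sqlite3 rather than SQL Server, with invented sample rows ('swim' has no entry in the link table). It also shows that a WHERE condition on the outer-joined table quietly undoes the outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (activity_id INTEGER, activity_type TEXT);
CREATE TABLE category (category_id INTEGER, category_name TEXT);
CREATE TABLE link (activity_id INTEGER, category_id INTEGER);

INSERT INTO activity VALUES (1, 'run'), (2, 'swim');  -- 'swim' has no link row
INSERT INTO category VALUES (7, 'outdoor');
INSERT INTO link VALUES (1, 7);
""")

# The original comma join: the WHERE clause requires a match in all three tables.
where_rows = conn.execute("""
SELECT a.activity_id, a.activity_type, c.category_name
FROM activity a, category c, link l
WHERE a.activity_id = l.activity_id AND c.category_id = l.category_id
ORDER BY a.activity_id
""").fetchall()

# LEFT OUTER JOINs keep every activity row.
left_rows = conn.execute("""
SELECT a.activity_id, a.activity_type, c.category_name
FROM activity a
LEFT OUTER JOIN link l ON a.activity_id = l.activity_id
LEFT OUTER JOIN category c ON c.category_id = l.category_id
ORDER BY a.activity_id
""").fetchall()

# Same joins, but a WHERE filter on the outer-joined table removes the
# NULL-extended row again, so the result matches the inner join.
filtered_rows = conn.execute("""
SELECT a.activity_id, a.activity_type, c.category_name
FROM activity a
LEFT OUTER JOIN link l ON a.activity_id = l.activity_id
LEFT OUTER JOIN category c ON c.category_id = l.category_id
WHERE c.category_name = 'outdoor'
ORDER BY a.activity_id
""").fetchall()

print(where_rows)     # only 'run'
print(left_rows)      # 'run' and 'swim', the latter with category None
print(filtered_rows)  # only 'run' again
```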
Best of luck!
What about?
select a.activity_id, a.activity_type, c.category_name from category c
left join link l on c.category_id = l.category_id
left join activity a on l.activity_id = a.activity_id
Actually, it seems the first join could be an inner join, because you didn't mention that any elements might be missing there.