JOIN Tables with all specified fields matching - sql

I have two tables. I'm selecting all the values in the first table and trying to get the associated rows in the second table which match BOTH of the specified fields.
So in this example, I want only the rows in the CarsTable and the associated columns in the TrucksTable in which BOTH the Tires and Windows values match (if just one value matches, I don't want it). I'm not even certain a join is the correct operation. Any ideas?
SELECT * FROM CarsTable, TrucksTable
LEFT JOIN TrucksTable t1
ON
t1.Tires = Cars.Tires
LEFT JOIN Trucks t2
ON
t2.Windows = Cars.Windows

You need to do some research on joins...your from clause is all over the place. Carstable,truckstable as you have here is a cross join..and I don't understand what the left joins you are doing are. I think you only have two tables:
SELECT c.*, t.* FROM CarsTable c
inner join TrucksTable t
on t.tires = c.tires and t.windows = c.windows
inner join will function as a filter...if the record in carstable finds no matches in the truckstable by the keys listed, it won't appear. If you want all cartable rows even those that find no match, use left join (in this case, t.* will be null)

Related

Find left table and right table when we join more than 2 tables?

Suppose I have three tables here one of them is a Country table second one is a city table and the third one is a customer table, and I am writing a query here like this
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM Country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON customer.city_id = city.id
I want to know here who is the left table of the customer table and who is the left table of the city table as well as who is the right table of the country table and who is the right table of the customer table
'Left' is always the table at the left of the JOIN keyword, no matter if left, right, inner or full outer join, analogously 'right'.
Left and right joins, though, have different semantics:
With a normal (inner) join a data row is only added if there's a matching entry in both tables, e.g. if you JOIN ON l.id = r.id then if l contains an id with value 7 a row for is only added if there's a matching id with value 7 in r as well (and vice versa).
A left (outer) join instead adds a row for id 7 in l even if there's no matching id in r – then the corresponding columns for table r are filled with nulls.
Analogously a right (outer) join, all entries of r are added with nulls for l's columns if no matching entry found.
A full outer join is a combination of left and right outer join, any entry of l and r is used with nulls on either side for the counter part if none exists.
In your concrete query you'd list all countries even if there don't exist any cities – and if they do exist, for each country all of them even if there don't exist any customers there. In the latter case the columns for the customer are all filled with null values, and if no city exists either, additionally all city columns.

Minus operation gives wrong answer

I am trying to get the result of the minus operation of join tables which means that I am finding unmatched records.
I tried:
SELECT count(*) FROM mp_v1 mp
left join cm_v1 sop
on mp.study_name=sop.study_name and
sop.site_id=sop.site_id
--where mp.study_name='1101'
MINUS
SELECT count(*) FROM iv_mpv1 mp
inner join cm sop
on mp.study_name=sop.study_name and
sop.site_id=sop.site_id
--where mp.study_name='1101'
output: the count of this gives me 171183251
but when I run the first query individually I get 171183251 for left outer join and 171070345 for inner join so the output needs to be 112906. I am not sure where my query is wrong. Could anyone please give your opinion.
If you want unmatched records you wouldn't use MINUS on the counts. The query would look more like:
SELECT COUNT(*)
FROM ((SELECT *
FROM mp_v1 mp LEFT JOIN
cm_v1 sop
USING (study_name, site_id)
) MINUS
(SELECT *
FROM iv_mpv1 mp LEFT JOIN
iv_cmv1 sop
USING (study_name, site_id)
)
) x;
Also note that MINUS removes duplicates, so if you have duplicates within each set of tables, then they only count as one row.
The SELECT * assumes that the tables have the same columns and compatible types -- which makes sense given the gist of the question. You may need to list the particular columns you care about.

joining to a table that might not exist

So I have joined a table to itself in order to find the most recently added note of a certain type (;NoteTypesID=20625;). BUT when there is NOTHING of that type, it completely excludes the entire record's results from my dataset, instead of giving me nulls for the fields I am trying to pull from the ;PROPOSALNOTEPAD; table.
---select ... from ... (including the prop table)
left join PROPOSALNOTEPAD PN_SolicitationStrategy on prop.ID = PN_SolicitationStrategy.ParentId
join ( select parentID, max(DateAdded) as MaxDateAdded FROM PROPOSAL_NOTEPAD where NoteTypeID = 20625 group by ParentId) as PN_max_SolicitationStrategy
ON PN_SolicitationStrategy.ParentId = PN_max_SolicitationStrategy.ParentId AND PN_SolicitationStrategy.DateAdded = PN_max_SolicitationStrategy.MaxDateAdded
I need to be able to test if this is null: ;select parentID FROM PROPOSAL_NOTEPAD where NoteTypeID = 20625;. And then do my joins based on that result. How do I do that? How do I join based on a condition?
Use LEFT OUTER JOIN:
left join PROPOSALNOTEPAD PN_SolicitationStrategy on prop.ID = PN_SolicitationStrategy.ParentId
LEFT OUTER JOIN ( select parentID, max(DateAdded) as MaxDateAdded FROM PROPOSAL_NOTEPAD where NoteTypeID = 20625 group by ParentId) as PN_max_SolicitationStrategy
ON PN_SolicitationStrategy.ParentId = PN_max_SolicitationStrategy.ParentId AND PN_SolicitationStrategy.DateAdded = PN_max_SolicitationStrategy.MaxDateAdded
Inner Join always brings the common records from both table. In your case, there is no matching record from 1st table and PN_max_SolicitationStrategy table. That's why you are getting no records.
Using LEFT OUTER JOIN will bring you all the records from 1st table and matching records from the 2nd (PN_max_SolicitationStrategy) table.
Here is simple tutorial to learn about JOINs: https://www.w3schools.com/sql/sql_join_left.asp

Why is LEFT JOIN deleting rows?

I have been using sql for a long time, but I am now working in Databricks and I am getting a very strange result. I have a table called block_durations with a set of ids (called block_ts), and I have another table called mergetable, which I want to left join to that table. Mergetable is indexed by acct_id and block_ts, so it has many different records for each block_ts. I want to keep the rows in block_durations that don't match, and if there are multiple matches in mergetable I want there to be multiple corresponding entries in the resulting join, as you would expect from a left join.
But this is not happening. In order to demonstrate this, I am showing the result of joining mergetable, after filtering for a single acct_id so that there is at most one match per block_ts.
select count(*) from mergetable where acct_id = '0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98'
16579
select count(*) from block_durations
82817
select count(*) from
(
SELECT
mt.*,
bd.block_duration
FROM
block_durations bd
left outer JOIN mergetable mt
ON mt.block_ts = bd.block_ts
where acct_id='0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98'
) countTable
16579
As you can see, even though there are >80000 records in block_durations, most of them are getting lost in the left join. Why is this happening? I thought the whole point of a left join is that the non-matching rows of the left table are kept. This is exactly the behavior I would expect from an inner join -- and indeed when I switch to an inner join nothing changes.
Could someone please help me figure out what's going on?
-Paul
All rows from left side of the join are preserved, but later on you run WHERE ... condition on that which removed rows not matching the condition.
Merge your WHERE condition into JOIN condition:
SELECT
mt.*,
bd.block_duration
FROM
block_durations bd
left outer JOIN mergetable mt
ON mt.block_ts = bd.block_ts AND acct_id='0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98'
You can also filter mergetable before you run JOIN on the results:
SELECT
mt.*,
bd.block_duration
FROM
block_durations bd
left outer JOIN (SELECT * FROM mergetable WHERE acct_id='0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98') mt
ON mt.block_ts = bd.block_ts

When would you INNER JOIN a LEFT JOINed table /

I came across below code today.
SELECT StaffGroup.*
FROM StaffGroup
LEFT OUTER JOIN StaffByGroup
ON StaffByGroup.StaffGroupId = StaffGroup.StaffGroupId
INNER JOIN StaffMember
ON StaffMember.StaffMemberId = StaffByGroup.StaffMemberId
WHERE StaffByGroup.StaffGroupId IS NULL
The main table StaffGroup is being LEFT JOINed with StaffByGroup table and then StaffByGroup table is being INNER JOINed with StaffMember table.
I thought the INNER JOIN is trying to filter out the records which exist in StaffGroup and StaffByGroup but do not exist in StaffMember.
But this is not how it is working. The query does not return any records.
Am I missing something in understanding the logic of the query ? Have you ever used INNER JOIN with a table which has been used with LEFT JOIN in earlier part of the query ?
Actually you are missing one concept:
The main table StaffGroup is being LEFT Joined with StaffByGroup table and then this creates a virtual table say VT1 with all records from StaffGroup and matching records from StaffByGroup based on your match/filter condition in ON predicate.Then not StaffByGroup table but this VT1 is being INNER Joined with StaffMember table based on match/filter condition in ON predicate.
So basically the inner join is trying to filter out those records from StaffGroup and hence StaffByGroup which do not have a StaffMemberId.
Adding your where condition adds a final filter like from the final virtual table created by all the above joins remove all such records which don't have a StaffGroupId which in turn might be removing all rows collected in VT1 as all of them will be having some value for StaffGroupId.
To get all records from StaffGroup which have no StaffGroupId along with details from StaffMember for all such records you can add condition in ON predicate as:
SELECT StaffGroup.*
FROM StaffGroup
LEFT OUTER JOIN StaffByGroup
ON StaffByGroup.StaffGroupId = StaffGroup.StaffGroupId and StaffByGroup.StaffGroupId IS NULL
INNER JOIN StaffMember
ON StaffMember.StaffMemberId = StaffByGroup.StaffMemberId
This query looks fundamentally flawed - I guess what was originally intended is
SELECT StaffGroup.*
FROM StaffGroup
LEFT OUTER JOIN
(SELECT * FROM StaffByGroup
INNER JOIN StaffMember
ON StaffMember.StaffMemberId = StaffByGroup.StaffMemberId) StaffByGroup
ON StaffByGroup.StaffGroupId = StaffGroup.StaffGroupId
WHERE StaffByGroup.StaffGroupId IS NULL
which returns all groups from StaffGroup that dont' have existing staffmembers assigned to them (the INNER JOIN with StaffMember filters out those rows from StaffByGroup that don't have a matching row in StaffMember - probably because there exists no foreign key between them)
You are getting 0 records because of your where clause
where StaffByGroup.StaffGroupId is null
The left join links all the records from tbl A which are contained in tbl B and since you have specified StaffGROUPID as your key and then looked for Nulls values in your key, its 100% clear that you will end up with no results