Hive left join - conditions on where


Given two tables
Table A
idA v1
1   1
2   1
3   2
and
Table B
idB v2 v3
1   1  a
2   2  b
2   1  a
I want to get all the values from Table A, plus the information on Table B (v3) where the two ids should be the same. This is easy - left outer join!
select *
from A
left join B on A.idA = B.idB
However, what if I also need v1 = v2? I thought that I could just use where
select *
from A
left join B on A.idA = B.idB
where B.idB is null or A.v1 = B.v2
Unfortunately, this removes all rows from the left table (A) that did not match any on B (in this example, idA = 3). Any solution?
EDIT: as @irnerd points out, the problem as stated is very simple (just extend the on clause). The actual issue comes when v1 becomes a timestamp that has to be between v2 and v4 (timestamps), as in
select *
from A
left join B on A.idA = B.idB and a.v1 between b.v2 and b.v4
The previous query works fine in Oracle, but in Hive I get error...

Just extend the join clause with the second join criterion:
select *
from A
left join B on A.idA = B.idB
and A.v1 = B.v2
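To see why the extra condition belongs in the on clause, here is a minimal sketch that runs both variants against the sample tables from the question. It uses Python's built-in sqlite3 module with an in-memory database rather than Hive (an assumption made only so the example is self-contained), and the function name compare_on_vs_where is hypothetical.

```python
import sqlite3


def compare_on_vs_where():
    """Run the ON-clause and WHERE-clause variants on the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE A (idA INTEGER, v1 INTEGER);
        CREATE TABLE B (idB INTEGER, v2 INTEGER, v3 TEXT);
        INSERT INTO A VALUES (1, 1), (2, 1), (3, 2);
        INSERT INTO B VALUES (1, 1, 'a'), (2, 2, 'b'), (2, 1, 'a');
    """)
    # Condition in ON: unmatched rows of A are kept, with NULLs for B.
    on_rows = conn.execute("""
        SELECT A.idA, A.v1, B.v3
        FROM A
        LEFT JOIN B ON A.idA = B.idB AND A.v1 = B.v2
        ORDER BY A.idA
    """).fetchall()
    # Condition in WHERE: NULL-extended rows fail the comparison and are dropped.
    where_rows = conn.execute("""
        SELECT A.idA, A.v1, B.v3
        FROM A
        LEFT JOIN B ON A.idA = B.idB
        WHERE A.v1 = B.v2
        ORDER BY A.idA
    """).fetchall()
    conn.close()
    return on_rows, where_rows


on_rows, where_rows = compare_on_vs_where()
print(on_rows)     # the row for idA = 3 is present, with None for v3
print(where_rows)  # the row for idA = 3 is missing
```

The on clause decides which rows of B attach to each row of A, while the where clause filters the joined result afterwards, which is why only the first variant keeps idA = 3.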

Related

SQL join on three tables, lines that exist in 2 tables but not the third

Please I need your help.
Suppose that we have 3 tables A, B and C as shown in the image below:
I want to get the rows of table A whether or not they exist in table B, and the rows of table C whether or not they exist in table B, using one SQL query.
I have tried this, but it doesn't work:
SELECT A.ATS0804, C.ATS0207, A.ATS0959, A.ATS0802, B.ATS0827
FROM
ISUT183.ENS0042 B
RIGHT JOIN ISUT183.ENS0038 A
ON B.ENS0038K = A.ATS0804
RIGHT JOIN ISUT183.EN00041 C
ON B.EN00041K = C.AT02812
WHERE ( C.ATS0207 = '0001757430'
AND B.ATS0823 = '9999-01-01'
AND A.ATS0803 = '9999-01-01'
AND A.ATS0959 = '61384352001'
AND A.ATS0802 ='01.01.2010'
) ;

You can do a cross join too:
with AB as (
select * from A left outer join B on A.ID1=B.ID1
),
AC as (
select * from C left outer join B on C.ID2=B.ID2
)
select * from AB CROSS JOIN AC

Use WHERE EXISTS and WHERE NOT EXISTS clauses.

If you test equality on a column of table B in the where clause, the left or right outer join no longer keeps the rows where B is null.
You don't have a join between A and C, so you can do a UNION ALL,
but the columns in the two select clauses must have the same types (ID1 the same type as ID2):
select * from (
select 'A-B' typejoin, A.ID1 as IDA_OR_C, B.ID1 as IDB from A left outer join B on A.ID1=B.ID1
union all
select 'A-C' typejoin, C.ID2 as IDA_OR_C, B.ID2 as IDB from C left outer join B on C.ID2=B.ID2
) tmp
where ....
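As a rough illustration of the UNION ALL approach, the sketch below runs it with Python's sqlite3 module on a few made-up rows. The data, the column layout of B (one ID1 and one ID2 column), and the function name union_of_outer_joins are assumptions for the example, not the asker's real schema.

```python
import sqlite3


def union_of_outer_joins():
    """Stack the A-B and C-B left outer joins into one result set."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: B links to A through ID1 and to C through ID2.
    conn.executescript("""
        CREATE TABLE A (ID1 INTEGER);
        CREATE TABLE B (ID1 INTEGER, ID2 INTEGER);
        CREATE TABLE C (ID2 INTEGER);
        INSERT INTO A VALUES (1), (2);
        INSERT INTO B VALUES (1, 10);
        INSERT INTO C VALUES (10), (20);
    """)
    rows = conn.execute("""
        SELECT * FROM (
            SELECT 'A-B' AS typejoin, A.ID1 AS IDA_OR_C, B.ID1 AS IDB
            FROM A LEFT OUTER JOIN B ON A.ID1 = B.ID1
            UNION ALL
            SELECT 'A-C' AS typejoin, C.ID2 AS IDA_OR_C, B.ID2 AS IDB
            FROM C LEFT OUTER JOIN B ON C.ID2 = B.ID2
        ) tmp
        ORDER BY typejoin, IDA_OR_C
    """).fetchall()
    conn.close()
    return rows


for row in union_of_outer_joins():
    print(row)  # rows without a match in B show None in the last column
```

Any filter on B's columns in the outer where clause would need an explicit "or IDB is null" to keep the unmatched rows.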

How do I select group wise from a second table

This one is hard to explain and I'm sure I will facepalm when I see the solution, but I just can't get it right...
I have three tables:
Table A contains new records that I want to do something with.
Table B contains all activities from Table C of a specific type (done beforehand).
Table C is sort of a "master" table that contains all activities as well as a customer id and a lot of other stuff.
I need to select all activities that are in Table A from Table B. So far so good.
The part I can't get together is that I also need all the activities from Table B that have the same customer id as an activity contained in Table A.
This is what I'm after:
activity
2
3
4
5
6
The trick here is to get activity 2, because activity 2 is also done by customer 2, even though it is not in Table A.
Here are the tables:
Table A (new records)
activity
3
4
5
6
Table B (all records of a specific type from Table C)
activity
1
2 <-- How do I get this record as well?
3
4
5
6
Table C (all records)
activity customer
1 1
2 2
3 2
4 3
5 3
6 4
7 5
I tried something like this...
SELECT *
FROM table_b b
INNER JOIN table_c c
ON c.activity = b.activity
INNER JOIN table_a a
ON a.activity = b.activity
... but of course it only yields:
activity
3
4
5
6
How can I get activity 2 as well here?

To return this as a single column, I would recommend staging the customer ids of the activities in table_b that are also in table_a in a CTE (common table expression), then selecting the activities in table_c and joining to the CTE to keep only activities with a matching customer id.
Example of a CTE: (Note: the semicolon ; before the WITH keyword is a workaround for an issue in SQL Server 2005 with multiple statements. It is not necessary if you are on a newer version or not running batch statements.)
;WITH cte_1 AS (
SELECT distinct c.customer --(only need a distinct result of customer ids)
from table_b b
join table_a a on b.activity = a.activity --(gets only activities in b that are in a)
join table_c c on b.activity = c.activity --(gets customer id of activities in table b)
)
SELECT a.activity
FROM table_c a
JOIN cte_1 b ON a.customer = b.customer
Alternatively, you could do this in three joins with a select distinct. However, I find the CTE an easier way to develop and think about this problem, regardless of how you decide to implement your solution, although the three-join solution will most likely scale and perform better over time with a growing data set.
Example:
SELECT distinct d.activity
from table_b b
join table_a a on b.activity = a.activity --(gets only activities in b that are in a)
join table_c c on b.activity = c.activity --(gets customer id of activities in table b)
join table_c d ON c.customer = d.customer
Both would output:
2
3
4
5
6
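The stated output can be checked with a small sketch that loads the question's sample tables into an in-memory SQLite database through Python's sqlite3 module and runs the three-join query. SQLite stands in for SQL Server here, and the function name related_activities is hypothetical.

```python
import sqlite3


def related_activities():
    """Return activities sharing a customer with any activity in table_a."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_a (activity INTEGER);
        CREATE TABLE table_b (activity INTEGER);
        CREATE TABLE table_c (activity INTEGER, customer INTEGER);
        INSERT INTO table_a VALUES (3), (4), (5), (6);
        INSERT INTO table_b VALUES (1), (2), (3), (4), (5), (6);
        INSERT INTO table_c VALUES
            (1, 1), (2, 2), (3, 2), (4, 3), (5, 3), (6, 4), (7, 5);
    """)
    rows = conn.execute("""
        SELECT DISTINCT d.activity
        FROM table_b b
        JOIN table_a a ON b.activity = a.activity  -- activities in b that are in a
        JOIN table_c c ON b.activity = c.activity  -- customer of each such activity
        JOIN table_c d ON c.customer = d.customer  -- every activity of that customer
        ORDER BY d.activity
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(related_activities())
```

Note that the final list is read from table_c, so on other data it could include activities that are not in table_b; an extra join back to table_b would restrict it if that matters.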

Here is one way to do it
SELECT *
FROM TableB b1
WHERE EXISTS (SELECT 1
FROM Tablec c1
WHERE EXISTS (SELECT 1
FROM TableA a
INNER JOIN Tablec c
ON a.activity = c.activity
WHERE c.customer = c1.customer)
AND c1.activity = b1.activity)

Can you try doing a left join?
SELECT *
FROM table_b b
INNER JOIN table_c c
ON c.activity = b.activity
LEFT JOIN table_a a
ON b.activity = a.activity

Join on 3 tables

I currently have 3 tables:
Table A
Table B
Table C
There is a link between A & B and a link between B & C (A-B-C).
The thing is that :
It is possible to have a row in A but not in B
It is possible to have a row in B but not in A
It is possible to have a row in B but not in C
In the end I would like to have a query that gives me the following rows (where X represents the ID of the corresponding table):
TableA|TableB|TableC
X | X | X
X | null | null
null | X | X
X | X | null
I managed to have the case with TableA & TableB with the following query :
SELECT A.ID, B.ID
FROM TABLEA A
LEFT JOIN TABLEB B on (join condition)
UNION
SELECT A.ID,B.ID
FROM TABLEB B
LEFT JOIN TABLEA A on (join condition)
Thank you for any help you may provide

What you need is a FULL OUTER JOIN, however, you have tagged your post with sybase - it depends what you mean by that. Sybase ASE doesn't support FULL OUTER JOIN syntax, but SQL Anywhere does.

If I understood correctly, a FULL OUTER JOIN should do the job:
SELECT a.id,b.id,c.id
FROM TableA a
FULL OUTER JOIN TableB b on a.id = b.id
FULL OUTER JOIN TableC c on COALESCE(a.id,b.id) = c.id

Query with join equivalency?

Are these two queries equivalent (assuming varying/any kinds of data in the table)? Are there any scenarios in which they would return different results?
Query 1:
select * from tablea a
left join tableb b on a.keyacol = b.keybcol
inner join tablec c on c.keyccol = b.keybcol;
Query 2:
select * from tablea a
left join (
select b.*, c.* from tableb b
inner join tablec c on c.keyccol = b.keybcol
) sub on a.keyacol = sub.keybcol;

No, they are not equivalent. Example:
CREATE TABLE a
( keya int ) ;
CREATE TABLE b
( keyb int ) ;
CREATE TABLE c
( keyc int ) ;
INSERT INTO a
VALUES
(1) ;
INSERT INTO b
VALUES
(1),(2) ;
INSERT INTO c
VALUES
(2) ;
Results:
SELECT *
FROM a
LEFT JOIN b
ON a.keya = b.keyb
INNER JOIN c
ON c.keyc = b.keyb ;
Result
----------------------
| keya | keyb | keyc |
----------------------
SELECT *
FROM a
LEFT JOIN
( SELECT b.*, c.*
FROM b
INNER JOIN c
ON c.keyc = b.keyb
) sub
ON a.keya = sub.keyb ;
Result
----------------------
| keya | keyb | keyc |
----------------------
| 1 | NULL | NULL |
----------------------
As to why this happens, a LEFT JOIN b INNER JOIN c is parsed as (a LEFT JOIN b) INNER JOIN c which is equivalent to (a INNER JOIN b) INNER JOIN c because the condition on the INNER join cancels the LEFT join.
You can also write the second query in this form - without subquery - which is parsed as a LEFT JOIN (b INNER JOIN c) because of the different placing of the ON clauses:
SELECT *
FROM a
LEFT JOIN
b
INNER JOIN c
ON c.keyc = b.keyb
ON a.keya = b.keyb ;
Result
----------------------
| keya | keyb | keyc |
----------------------
| 1 | NULL | NULL |
----------------------
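The two results above can be reproduced with a short sketch using Python's sqlite3 module and the same three one-column tables. SQLite is an assumption chosen for portability, the function name compare_join_nesting is hypothetical, and the columns are listed explicitly instead of using select * so the output shape is unambiguous.

```python
import sqlite3


def compare_join_nesting():
    """Compare (a LEFT JOIN b) INNER JOIN c with a LEFT JOIN (b INNER JOIN c)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (keya INTEGER);
        CREATE TABLE b (keyb INTEGER);
        CREATE TABLE c (keyc INTEGER);
        INSERT INTO a VALUES (1);
        INSERT INTO b VALUES (1), (2);
        INSERT INTO c VALUES (2);
    """)
    # Query 1: the inner join is applied to the result of the left join.
    flat = conn.execute("""
        SELECT a.keya, b.keyb, c.keyc
        FROM a
        LEFT JOIN b ON a.keya = b.keyb
        INNER JOIN c ON c.keyc = b.keyb
    """).fetchall()
    # Query 2: b and c are inner joined first, then left joined to a.
    nested = conn.execute("""
        SELECT a.keya, sub.keyb, sub.keyc
        FROM a
        LEFT JOIN (
            SELECT b.keyb, c.keyc
            FROM b
            INNER JOIN c ON c.keyc = b.keyb
        ) sub ON a.keya = sub.keyb
    """).fetchall()
    conn.close()
    return flat, nested


flat, nested = compare_join_nesting()
print(flat)    # no rows
print(nested)  # one row, with None for keyb and keyc
```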

They're not equivalent.
Essentially, there are four scenarios here, for records on A:
corresponding records on B and C exist;
corresponding records exist on B but not C;
corresponding records exist on C but not B;
corresponding records don't exist on B or C.
Both queries will return the same values for scenario 1.
However, they will return different values for the other scenarios - the inner join on a value of B to a value C in the first query means that you will be attempting to join a null to a value of C in the other scenarios.

The INNER JOIN keyword returns rows when there is at least one match in
both tables/selections. If there are rows in tableb that do not have
matches in tablec, those rows will NOT be listed.
The LEFT JOIN keyword returns all the rows from the left table
(tablea), even if there are no matches in the right table (tableb or
sub selection).
Since the left join returns all rows, the two queries can differ when some rows have no corresponding match in the inner join of the first query. Those rows won't be selected by the first query but will appear in the second, since its outermost join is a left join.
Another thing that might differ is the columns. But since both queries use select *, you will select all columns from all tables/selections:
Query 1 returns all columns from tablea, tableb and tablec
Query 2 returns all columns from tablea and selection sub (returns all columns from tableb and tablec) = returns all columns from tablea, tableb and tablec
so this isn't a problem.
So no, they are not equivalent.

Bidirectional outer join

Suppose we have a table A:
itemid mark
1 5
2 3
and table B:
itemid mark
1 3
3 5
I want to join A*B on A.itemid=B.itemid both right and left ways. i.e. result:
itemid A.mark B.mark
1 5 3
2 3 NULL
3 NULL 5
Is there a way to do it in one query in MySQL?

It's called a full outer join and it's not supported natively in MySQL, judging from its docs. You can work around this limitation using UNION as described in the comments to the page I linked to.
[edit] Since others posted snippets, here you go. You can see explanation on the linked page.
SELECT *
FROM A LEFT JOIN B ON A.itemid = B.itemid
UNION ALL
SELECT *
FROM A RIGHT JOIN B ON A.itemid = B.itemid
WHERE A.itemid IS NULL
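Here is a minimal sketch of this UNION ALL workaround on the question's two tables, run through Python's sqlite3 module instead of MySQL so it is self-contained. Two things differ from the snippet above and are assumptions of the sketch: the second branch is written as B LEFT JOIN A (equivalent to A RIGHT JOIN B, and it also works on SQLite versions without RIGHT JOIN), and the columns are listed explicitly so that itemid is filled in from whichever table has the row. The function name emulated_full_outer_join is hypothetical.

```python
import sqlite3


def emulated_full_outer_join():
    """Emulate A FULL OUTER JOIN B with two left joins and UNION ALL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE A (itemid INTEGER, mark INTEGER);
        CREATE TABLE B (itemid INTEGER, mark INTEGER);
        INSERT INTO A VALUES (1, 5), (2, 3);
        INSERT INTO B VALUES (1, 3), (3, 5);
    """)
    rows = conn.execute("""
        -- every row of A, with B's mark where it exists
        SELECT A.itemid, A.mark, B.mark
        FROM A LEFT JOIN B ON A.itemid = B.itemid
        UNION ALL
        -- rows of B that have no partner in A
        SELECT B.itemid, A.mark, B.mark
        FROM B LEFT JOIN A ON A.itemid = B.itemid
        WHERE A.itemid IS NULL
        ORDER BY 1
    """).fetchall()
    conn.close()
    return rows


for row in emulated_full_outer_join():
    print(row)
```

The IS NULL filter on the second branch is what prevents matched rows from appearing twice, so UNION ALL can be used instead of the slower de-duplicating UNION.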

This could do with some work, but here is some SQL:
select distinct T.itemid, A.mark as "A.mark", B.mark as "B.mark"
from (select * from A union select * from B) T
left join A on T.itemid = A.itemid
left join B on T.itemid = B.itemid;
This relies on the left join, which returns all the rows in the original table (in this case this is the subselect table T). If there are no matches in the joined table, then it will set the column to NULL.

This works for me on SQL Server:
select isnull(a.itemid, b.itemid) as itemid, a.mark, b.mark
from a
full outer join b on b.itemid = a.itemid
