I have a query like this:
with t1 as (
select
a.col1 as 'c1',
a.col2 as 'c2',
b.col1 as 'c3',
b.col2 as 'c4'
from table1 a left join table2 b
on a.col1 = b.col1
)
select
c.c1,
c.c2,
c.c3,
c.c4
from t1 c
I want to turn this whole statement into another CTE, T2, so that I can select from what is currently the outer query. I need this to perform calculations on the data, rename the resulting columns, perform further calculations on the renamed columns, and then do that one more time. I can't figure out how to make the whole statement a "table" that I can then select from.
I have tried nesting another ;with as () inside it, and either it's not possible or I'm not doing it right; my guess is the latter.
Thanks in advance!
Is this what you want?
with t1 as (
select a.col1 as c1, a.col2 as c2, b.col1 as c3, b.col2 as c4
from table1 a left join
table2 b
on a.col1 = b.col1
),
t2 as (
select c.c1, c.c2, c.c3, c.c4
from t1 c
)
select *
from t2;
You can define multiple CTEs in a single WITH clause, separated by commas. A later CTE can select from any CTE defined before it.
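To make the "calculate, rename, calculate again" part concrete, here is a minimal sketch run through Python's built-in sqlite3 module. SQLite is my assumption (the question does not name a database), and the sample rows, the total and doubled columns, and the t3 step are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 INTEGER);
    CREATE TABLE table2 (col1 INTEGER, col2 INTEGER);
    INSERT INTO table1 VALUES (1, 10), (2, 20);
    INSERT INTO table2 VALUES (1, 5);
""")

query = """
WITH t1 AS (
    SELECT a.col1 AS c1, a.col2 AS c2, b.col1 AS c3, b.col2 AS c4
    FROM table1 a
    LEFT JOIN table2 b ON a.col1 = b.col1
),
t2 AS (
    -- first calculation, exposed under a new column name
    SELECT c1, c2 + COALESCE(c4, 0) AS total
    FROM t1
),
t3 AS (
    -- second calculation, built on the renamed column from t2
    SELECT c1, total, total * 2 AS doubled
    FROM t2
)
SELECT c1, total, doubled FROM t3 ORDER BY c1
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 15, 30), (2, 20, 40)]
```

Each CTE only needs to see the ones defined before it, so no nesting of WITH clauses is required.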
Suppose you have two tables A and B and you are trying to write a JOIN query, is the following possible:
SELECT A.col1, B.col1
FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
Will this return a table of col1 from table A and col1 from table B where there is a match in the second column across the tables and the third column of table B is 'hello'?
I.e. it will only return rows that are matching in col2 and this is further reduced to the cases where col3 in table B is 'hello'?
Yes, you can. The query below joins A only with the records of B where col3 = 'hello':
SELECT A.col1, B.col1
FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
The query below joins all records of B with A and then applies the WHERE filter to the joined result:
SELECT A.col1, B.col1
FROM A JOIN B on A.col2 = B.col2
WHERE B.col3 = 'hello'
Both give the same result when no other tables are joined.
Yes you can.
You can specify any kind of boolean condition in the ON clause.
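The condition does not have to involve a column from each table, or any column at all, so all of the following are valid (the last two assume another table C that already appears earlier in the FROM clause):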
SELECT A.col1, B.col1 FROM A JOIN B on 1=1
SELECT A.col1, B.col1 FROM A JOIN B on B.col3 = 'hello'
SELECT A.col1, B.col1 FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
SELECT A.col1, B.col1 FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = C.col3)
SELECT A.col1, B.col1 FROM A LEFT JOIN B on (C.col3 = 'bye')
Be aware, though, that the optimizer can usually produce a much faster plan when the condition is limited to key fields.
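As a rough illustration of how differently these ON conditions behave, the sketch below counts the rows each one produces. It uses Python's built-in sqlite3 module (SQLite is my assumption, not something stated above), and the rows in A and B and the count_rows helper are made up.

```python
import sqlite3

# Hypothetical sample data for tables A and B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col1 TEXT, col2 INTEGER);
    CREATE TABLE B (col1 TEXT, col2 INTEGER, col3 TEXT);
    INSERT INTO A VALUES ('a1', 1), ('a2', 2);
    INSERT INTO B VALUES ('b1', 1, 'hello'), ('b2', 2, 'bye'), ('b3', 3, 'hello');
""")

def count_rows(on_clause):
    """Count the rows produced by an inner join of A and B on the given condition."""
    sql = f"SELECT COUNT(*) FROM A JOIN B ON {on_clause}"
    return conn.execute(sql).fetchone()[0]

# Always true: every row of A pairs with every row of B (2 x 3).
always_true = count_rows("1=1")
# Only B is filtered: every row of A pairs with each 'hello' row of B (2 x 2).
filter_only = count_rows("B.col3 = 'hello'")
# Key match plus filter: only a1/b1 qualifies.
key_and_filter = count_rows("(A.col2 = B.col2 AND B.col3 = 'hello')")

print(always_true, filter_only, key_and_filter)  # 6 4 1
```

The first two conditions are legal but behave like (filtered) cross joins, which is one reason conditions on key columns tend to be much cheaper.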
For an inner join, these two statements are equivalent:
SELECT A.col1, B.col1
FROM A JOIN
B
ON A.col2 = B.col2 AND B.col3 = 'hello';
and:
SELECT A.col1, B.col1
FROM A JOIN
B
ON A.col2 = B.col2
WHERE B.col3 = 'hello';
Both should have the same execution plans as well.
Some people prefer putting filtering conditions in the WHERE clause, so the query is more clear about "conditions between tables" versus "filters on the result set". I tend to agree with this sentiment, although I'm not dogmatic about it.
OUTER JOINs are different. For an outer join, it makes a big difference where the conditions go. In that case, you generally do not have a choice, so you use ON or WHERE to get the logic that you want.
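A small experiment shows the difference. This is only a sketch using Python's built-in sqlite3 module (SQLite and the sample rows are my assumptions; the run helper is hypothetical), but the behaviour it shows is standard SQL.

```python
import sqlite3

# Hypothetical sample data: a2 has a matching row in B, but with col3 = 'bye'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col1 TEXT, col2 INTEGER);
    CREATE TABLE B (col1 TEXT, col2 INTEGER, col3 TEXT);
    INSERT INTO A VALUES ('a1', 1), ('a2', 2);
    INSERT INTO B VALUES ('b1', 1, 'hello'), ('b2', 2, 'bye');
""")

def run(join_type, tail):
    """Run the example query with the given join type and ON/WHERE tail."""
    sql = f"SELECT A.col1, B.col1 FROM A {join_type} B {tail} ORDER BY A.col1"
    return conn.execute(sql).fetchall()

in_on = "ON A.col2 = B.col2 AND B.col3 = 'hello'"
in_where = "ON A.col2 = B.col2 WHERE B.col3 = 'hello'"

# Inner join: the filter gives the same rows in either place.
inner_on = run("JOIN", in_on)
inner_where = run("JOIN", in_where)

# Left join, filter in ON: a2 is kept and padded with NULL.
left_on = run("LEFT JOIN", in_on)
# Left join, filter in WHERE: applied after the join, so a2 is removed.
left_where = run("LEFT JOIN", in_where)

print(inner_on, inner_where)  # [('a1', 'b1')] [('a1', 'b1')]
print(left_on)                # [('a1', 'b1'), ('a2', None)]
print(left_where)             # [('a1', 'b1')]
```

With the filter in WHERE, the LEFT JOIN effectively turns back into an inner join.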
Is there a way to join 2 tables together on one and only one of the possible conditions? Joining on condition "a" or "b" could duplicate rows, but I'm looking to only join once. I came up with a potential solution, but I'm wondering if there is a more slick way to do it.
For example:
SELECT *
FROM TableA a
LEFT JOIN TableB b
ON a.col1 = b.col1
OR (a.col1 != b.col1 AND a.col2 = b.col2)
This would join the tables on col1 OR col2 BUT NOT BOTH. Is there a cleaner way of doing this?
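Not more efficient, but I think clearer. Note that this version also excludes pairs that match on both columns, which the query in the question still joins through its first condition: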
SELECT *
FROM TableA a
LEFT JOIN TableB b
ON (a.col1 = b.col1 or a.col2 = b.col2)
AND NOT (a.col1 = b.col1 and a.col2 = b.col2)
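To see which pairs each condition actually joins, here is a minimal sketch using Python's built-in sqlite3 module. SQLite, the sample rows, the extra tag column, and the matched_tags helper are all assumptions added for illustration.

```python
import sqlite3

# Hypothetical data: one row in TableA, and one TableB row per matching case.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (col1 INTEGER, col2 TEXT);
    CREATE TABLE TableB (col1 INTEGER, col2 TEXT, tag TEXT);
    INSERT INTO TableA VALUES (1, 'x');
    INSERT INTO TableB VALUES
        (1, 'x', 'both'),
        (1, 'y', 'col1 only'),
        (2, 'x', 'col2 only'),
        (3, 'z', 'neither');
""")

def matched_tags(on_clause):
    """Return the tags of the TableB rows joined under the given ON condition."""
    sql = f"""
        SELECT b.tag
        FROM TableA a LEFT JOIN TableB b ON {on_clause}
        ORDER BY b.tag
    """
    return [row[0] for row in conn.execute(sql)]

# Condition from the question: col1 match, or col2 match when col1 differs.
question = matched_tags(
    "a.col1 = b.col1 OR (a.col1 != b.col1 AND a.col2 = b.col2)")

# Condition from this answer: exactly one of the two columns matches.
exactly_one = matched_tags(
    "(a.col1 = b.col1 OR a.col2 = b.col2) "
    "AND NOT (a.col1 = b.col1 AND a.col2 = b.col2)")

print(question)     # ['both', 'col1 only', 'col2 only']
print(exactly_one)  # ['col1 only', 'col2 only']
```

So the two conditions differ only for rows that match on both columns; which behaviour is wanted depends on the data.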
Your method works. If you only want one (or a handful) of columns from b, I would suggest:
SELECT a.*, COALESCE(b.col3, b2.col3)
FROM TableA a LEFT JOIN
TableB b
ON a.col1 = b.col1 LEFT JOIN
TableB b2
ON a.col1 <> b2.col1 AND a.col2 = b2.col2;
Removing the OR from the JOIN conditions allows the optimizer to generate a better execution plan.
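Here is a minimal sketch of the two-join version on toy data, run through Python's built-in sqlite3 module. SQLite and the sample rows are my assumptions, and it says nothing about execution plans; it only shows how COALESCE picks the col1 match first and falls back to the col2 match.

```python
import sqlite3

# Hypothetical data: row 1 matches on col1, row 2 only on col2, row 3 not at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (col1 INTEGER, col2 TEXT);
    CREATE TABLE TableB (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO TableA VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO TableB VALUES (1, 'q', 'by col1'), (9, 'y', 'by col2');
""")

rows = conn.execute("""
    SELECT a.col1, COALESCE(b.col3, b2.col3)
    FROM TableA a
    LEFT JOIN TableB b
        ON a.col1 = b.col1
    LEFT JOIN TableB b2
        ON a.col1 <> b2.col1 AND a.col2 = b2.col2
    ORDER BY a.col1
""").fetchall()

print(rows)  # [(1, 'by col1'), (2, 'by col2'), (3, None)]
```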
Checking to see if this is possible in Hive:
Select a.col1,b.col1
from tableA a join tableB b on a.col1 = b.col1
lateral view explode(numcred) tableA as creds
where creds.id = 9;
I cannot find the answer in the docs. In short:
I want to JOIN two tables AND apply LATERAL VIEW EXPLODE to tableA.
It seems simple enough, but it throws a syntax error.
select a.col1
,b.col1
from (Select a.col1
from tableA a
lateral view explode(numcred) e as creds
where e.creds.id = 9
) a
join tableB b
on a.col1 = b.col1
Not at my computer now, so no way to test this, but my guess is you'll have to write an inner query. Something like this:
SELECT
a.col1,
b.col1
FROM (
SELECT
dummy.col1
FROM table_a dummy
LATERAL VIEW EXPLODE(numcred) tableA as creds
WHERE
creds.id = 9
) a
JOIN tableB b
ON
a.col1 = b.col1
I'm trying to select the top record of an outer-joined table:
if there are no records in table B, the columns should be NULL;
if there are multiple records, only the first one should be selected.
I built this query, but I get an error:
SELECT DISTINCT
A.Col1 , A.Col2, B.Col2, B.Col3
FROM
A LEFT OUTER JOIN (SELECT TOP 1 * FROM B WHERE B.Col1=A.Col1) A ON B.Col1=A.Col1
The multi-part identifier "B.Col1" could not be bound.
Anyone know how to resolve this?
If you want only one match, use OUTER APPLY (without an ORDER BY in the subquery, TOP 1 returns an arbitrary matching row):
SELECT A.Col1 , A.Col2, B.Col2, B.Col3
FROM A OUTER APPLY
(SELECT TOP 1 *
FROM B
WHERE B.Col1 = A.Col1
) B;
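OUTER APPLY is SQL Server syntax, so it cannot be tried in SQLite directly. As a rough stand-in, the sketch below swaps in a different technique: a LEFT JOIN on a correlated subquery that picks one rowid per row of A. It runs through Python's built-in sqlite3 module; the sample rows and the choice of ORDER BY Col2 as the meaning of "first" are assumptions.

```python
import sqlite3

# Hypothetical data: A row 1 has two candidates in B, A row 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (Col1 INTEGER, Col2 TEXT);
    CREATE TABLE B (Col1 INTEGER, Col2 TEXT, Col3 TEXT);
    INSERT INTO A VALUES (1, 'x'), (2, 'y');
    INSERT INTO B VALUES (1, 'b', 'late'), (1, 'a', 'early');
""")

# Not OUTER APPLY: a correlated subquery selects the single B row to join.
rows = conn.execute("""
    SELECT A.Col1, A.Col2, B.Col2, B.Col3
    FROM A
    LEFT JOIN B
        ON B.rowid = (SELECT b2.rowid
                      FROM B AS b2
                      WHERE b2.Col1 = A.Col1
                      ORDER BY b2.Col2   -- assumed definition of "first"
                      LIMIT 1)
    ORDER BY A.Col1
""").fetchall()

print(rows)  # [(1, 'x', 'a', 'early'), (2, 'y', None, None)]
```

The result has the same shape the question asks for: one row per row of A, with NULLs when B has no match.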
I am trying to add the rows of one table to another table, but it is not working the way I want.
Both tables have three columns, and I only want to add a row from table 2 to table 1 if its first two column values are not already in table 1 (I don't care about the third column).
This is what I have so far:
INSERT INTO table1 (col1, col2, col3)
SELECT a.col1, a.col2, a.col3
FROM table2 as a
WHERE NOT EXISTS (SELECT b.col1, b.col2
FROM table1 as b
WHERE a.col1 = b.col1 AND a.col2 = b.col2);
I checked around and it seems this should work, but it doesn't. Can anyone see why?
I often have trouble when there are two fields to search for. One way is to combine them together:
INSERT INTO table1 (col1, col2, col3)
SELECT a.col1, a.col2, a.col3 from table2 as a
WHERE concat(a.col1,':', a.col2)
NOT IN (SELECT concat(col1,':',col2) from table1);
Another way is a left join:
INSERT INTO table1 (col1, col2, col3)
SELECT a.col1, a.col2, a.col3
from table2 as a
LEFT OUTER JOIN table1 as b
ON a.col1 = b.col1
AND a.col2 = b.col2
WHERE b.col1 IS NULL AND b.col2 IS NULL;
In the second example, it is better to test a primary key column of table1 in the WHERE clause, since a primary key is never NULL in a matched row.
Try this:
merge into table1 a
using
(select col1,col2,col3 from table2) b
on
(b.col1=a.col1 and b.col2=a.col2)
when not matched then
insert (col1,col2,col3)
values
(b.col1,b.col2,b.col3);
Try this:
INSERT INTO table1 (col1, col2, col3)
(SELECT a.col1, a.col2, a.col3
FROM table2 a
WHERE NOT EXISTS (SELECT b.col1, b.col2
FROM table1 b
WHERE a.col1 = b.col1 AND a.col2 = b.col2));
I don't think the AS keyword is needed for the table aliases.
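For what it's worth, the NOT EXISTS form behaves as intended on a small dataset. The sketch below uses Python's built-in sqlite3 module; SQLite and the sample rows are my assumptions, so it does not explain why the statement failed on the asker's database.

```python
import sqlite3

# Hypothetical data: (1, 'a') already exists in table1, (2, 'b') does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT, col3 TEXT);
    INSERT INTO table1 VALUES (1, 'a', 'old');
    INSERT INTO table2 VALUES (1, 'a', 'dup'), (2, 'b', 'new');
""")

insert_sql = """
INSERT INTO table1 (col1, col2, col3)
SELECT a.col1, a.col2, a.col3
FROM table2 AS a
WHERE NOT EXISTS (SELECT 1
                  FROM table1 AS b
                  WHERE a.col1 = b.col1 AND a.col2 = b.col2)
"""
conn.execute(insert_sql)
conn.execute(insert_sql)  # running it again inserts nothing new

rows = conn.execute(
    "SELECT col1, col2, col3 FROM table1 ORDER BY col1").fetchall()
print(rows)  # [(1, 'a', 'old'), (2, 'b', 'new')]
```

One case this does not cover: rows where col1 or col2 is NULL never compare equal, so they would be inserted on every run.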