I have two tables, A and B, and I need to left join them with multiple alternative cases in the ON condition.
Is there an efficient way of doing this in BigQuery or in SQL in general?
select * from table_A A
left join table_B B
on
-- case 1
(A.column1 = B.column1
and A.column2 = B.column2
and A.column3 = B.column3
and A.column4 = B.column4
and A.column5 = B.column5)
OR -- case 2
(A.column1 = B.column1
and A.column3 = B.column3
and A.column4 = B.column4
and A.column5 = B.column5)
OR -- case 3
(A.column1 = B.column1
and A.column2 = B.column2
and A.column4 = B.column4)
OR -- case 4
(A.column1 = B.column1
and A.column3 = B.column3
and A.column5 = B.column5)
where
[some condition OR some condition]
My main goal is that, for a given row, if case 1 matches, the other cases are not tried. If the first case does not match, the second is checked, then the third, and so on, so that each row gets its single best possible match.
The cases are there to help get a 100% join between tables A and B.
In the first case we compare all 5 fields of both tables; if some of the fields are null, the next case should be checked, and so on.
If I understand correctly, the general approach in SQL is multiple left joins:
select a.*, coalesce(b1.col, b2.col, b3.col, b4.col) as col
from table_A A left join
table_B B1
on A.column1 = B1.column1 and
A.column2 = B1.column2 and
A.column3 = B1.column3 and
A.column4 = B1.column4 and
A.column5 = B1.column5 left join
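table_B B2 -- case 2, only tried when case 1 found nothing
on B1.column1 is null and
A.column1 = B2.column1 and
A.column3 = B2.column3 and
A.column4 = B2.column4 and
A.column5 = B2.column5 left join
table_B B3 -- case 3, only tried when cases 1 and 2 found nothing
on B1.column1 is null and
B2.column1 is null and
A.column1 = B3.column1 and
A.column2 = B3.column2 and
A.column4 = B3.column4 left join
table_B B4 -- case 4, only tried when cases 1 to 3 found nothing
on B1.column1 is null and
B2.column1 is null and
B3.column1 is null and
A.column1 = B4.column1 and
A.column3 = B4.column3 and
A.column5 = B4.column5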
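The fallback-join idea can be tried out on a toy data set. The following is only a minimal sketch using Python's built-in sqlite3 module rather than BigQuery; the `id` and `col` columns and the sample rows are made up for illustration, and only cases 1 and 2 are included to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_A (id INTEGER, column1, column2, column3, column4, column5);
CREATE TABLE table_B (column1, column2, column3, column4, column5, col);
INSERT INTO table_A VALUES
  (1, 'x', 'a', 'b', 'c', 'd'),
  (2, 'y', NULL, 'b', 'c', 'd'),
  (3, 'z', 'a', 'b', 'c', 'd');
INSERT INTO table_B VALUES
  ('x', 'a', 'b', 'c', 'd', 'full match'),
  ('y', 'q', 'b', 'c', 'd', 'case 2 match');
""")

# B2 is only joined when B1 found nothing (B1.column1 IS NULL),
# so a row that already matched case 1 is not matched again by case 2.
QUERY = """
SELECT A.id, COALESCE(B1.col, B2.col) AS col
FROM table_A A
LEFT JOIN table_B B1
  ON A.column1 = B1.column1 AND A.column2 = B1.column2
 AND A.column3 = B1.column3 AND A.column4 = B1.column4
 AND A.column5 = B1.column5
LEFT JOIN table_B B2
  ON B1.column1 IS NULL
 AND A.column1 = B2.column1 AND A.column3 = B2.column3
 AND A.column4 = B2.column4 AND A.column5 = B2.column5
ORDER BY A.id
"""
rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 'full match'), (2, 'case 2 match'), (3, None)]
```

Row 2 has a NULL in column2, so case 1 cannot match and the case 2 join supplies the value; row 3 has no partner at all and is kept with a NULL, as expected from a left join.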
You want to get the "best" matching B rows. I.e. if there are rows matching case 1, you want to stick with these, but if there are none, then you want to try with case 2, etc.
What you can do is combine the conditions, so as to join all possible matches first. Then look at the matches and dismiss all except the best ones. Ranking can be done with RANK.
select *
from
(
select
*,
rank() over (partition by A.id
order by
case when A.column2 = B.column2
and A.column3 = B.column3
and A.column4 = B.column4
and A.column5 = B.column5 then 1
when A.column3 = B.column3
and A.column4 = B.column4
and A.column5 = B.column5 then 2
when A.column2 = B.column2
and A.column4 = B.column4 then 3
else 4
end) as rnk
from table_A A
left join table_B B
on A.column1 = B.column1
and
(
(A.column2 = B.column2 and A.column4 = B.column4)
or
(A.column3 = B.column3 and A.column5 = B.column5)
)
where [some condition OR some condition]
) ranked
where rnk = 1;
(My query assumes some ID in table_A. If your table doesn't have a unique ID, use whatever column(s) uniquely identify a row in the table.)
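To see the "join all candidates, then keep the best-ranked one" approach in action, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The `id` and `label` columns and the sample rows are assumptions made for the example, and the WHERE placeholder from the question is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_A (id INTEGER, column1, column2, column3, column4, column5);
CREATE TABLE table_B (column1, column2, column3, column4, column5, label);
INSERT INTO table_A VALUES
  (1, 'x', 'a', 'b', 'c', 'd'),
  (2, 'y', 'a', 'b', 'c', 'd'),
  (3, 'z', 'a', 'b', 'c', 'd');
INSERT INTO table_B VALUES
  ('x', 'a',  'b', 'c', 'd', 'full'),
  ('x', 'zz', 'b', 'c', 'd', 'case2'),
  ('y', 'a',  'q', 'c', 'q', 'case3'),
  ('y', 'q',  'b', 'q', 'd', 'case4');
""")

# Join every candidate first, then rank the candidates per A row
# and keep only the best-ranked ones.
QUERY = """
SELECT id, label
FROM (
  SELECT A.id AS id, B.label AS label,
         RANK() OVER (PARTITION BY A.id ORDER BY CASE
           WHEN A.column2 = B.column2 AND A.column3 = B.column3
            AND A.column4 = B.column4 AND A.column5 = B.column5 THEN 1
           WHEN A.column3 = B.column3 AND A.column4 = B.column4
            AND A.column5 = B.column5 THEN 2
           WHEN A.column2 = B.column2 AND A.column4 = B.column4 THEN 3
           ELSE 4 END) AS rnk
  FROM table_A A
  LEFT JOIN table_B B
    ON A.column1 = B.column1
   AND ((A.column2 = B.column2 AND A.column4 = B.column4)
     OR (A.column3 = B.column3 AND A.column5 = B.column5))
) ranked
WHERE rnk = 1
ORDER BY id
"""
best = conn.execute(QUERY).fetchall()
print(best)  # [(1, 'full'), (2, 'case3'), (3, None)]
```

Note that RANK keeps ties, so if two B rows match an A row with the same best case, both are returned; ROW_NUMBER could be used instead if exactly one row per A row is required.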
One solution is to use temporary data storage (a temp table, cursors, or whatever) and a parameterized loop to fill it. The problem is that pure SQL has no loops, so you have to use BigQuery's scripting language; have a look here: https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting
Below are two options I see, both for BigQuery Standard SQL (thank you to @Thorsten-Kettner for helping me understand the OP's logic/requirements)
Option 1 - separate joins for each case; then combine all and finally pick the winner for each record in A
#standardSQL
SELECT * EXCEPT(priority, identity)
FROM (
SELECT AS VALUE ARRAY_AGG(t ORDER BY priority LIMIT 1)[OFFSET(0)]
FROM (
SELECT *, 1 priority, FORMAT('%t', A) identity
FROM table_A A LEFT JOIN table_B B
USING(column1,column2,column3,column4,column5) -- Case 1
WHERE [SOME condition OR SOME condition]
UNION ALL
SELECT *, 2 priority, FORMAT('%t', A) identity
FROM table_A A LEFT JOIN table_B B
USING(column1,column3,column4,column5) -- Case 2
WHERE [SOME condition OR SOME condition]
UNION ALL
SELECT *, 3 priority, FORMAT('%t', A) identity
FROM table_A A LEFT JOIN table_B B
USING(column1,column2,column4) -- Case 3
WHERE [SOME condition OR SOME condition]
UNION ALL
SELECT *, 4 priority, FORMAT('%t', A) identity
FROM table_A A LEFT JOIN table_B B
USING(column1,column3,column5) -- Case 4
WHERE [SOME condition OR SOME condition]
) t
GROUP BY identity
)
Option 2 - pick all potential candidates in one query, calculating on the fly which case each entry belongs to, and finally pick the winner for each row in A
#standardSQL
SELECT * EXCEPT(priority, identity)
FROM (
SELECT AS VALUE ARRAY_AGG(t ORDER BY priority LIMIT 1)[OFFSET(0)]
FROM (
SELECT A.*,
B.* EXCEPT(column1,column2,column3,column4,column5),
FORMAT('%t', A) identity,
CASE
WHEN (A.column1,A.column2,A.column3,A.column4,A.column5) = (B.column1,B.column2,B.column3,B.column4,B.column5) THEN 1
WHEN (A.column1,A.column3,A.column4,A.column5) = (B.column1,B.column3,B.column4,B.column5) THEN 2
WHEN (A.column1,A.column2,A.column4) = (B.column1,B.column2,B.column4) THEN 3
WHEN (A.column1,A.column3,A.column5) = (B.column1,B.column3,B.column5) THEN 4
ELSE 5
END AS priority,
FROM table_A A LEFT JOIN table_B B
ON A.column1 = B.column1
OR A.column2 = B.column2
OR A.column3 = B.column3
OR A.column4 = B.column4
OR A.column5 = B.column5
WHERE [SOME condition OR SOME condition]
) t
WHERE priority < 5
GROUP BY identity
)
Note: the two versions above are similar and different at the same time - it is a matter of preference which one to pick. Also note that the above is not tested and was just written on the fly, so it might need additional tuning - but most likely not :o)
The setting is simple: I want to retrieve all rows from table A that are not present in table B. Because a unique row is identified by 4 columns, I need a way to write the WHERE clause so that it works correctly.
My solution is to concatenate the 4 columns and use that as "one" column/key to do the outer join:
select *
from table_A
where filter_condition = 0
and (column1 || column2 || column3 || column4) not in (
select A.column1 || A.column2 || A.column3 || A.column4
from table_A A -- 1618727
inner join table_B B
on A.column1 = B.column1
and A.column2 = B.column2
and A.column3 = B.column3
and A.column4 = B.column4
and filter_condition = 0
)
My question is, is this a good way of doing this or am I doing something fundamentally wrong?
To be clear, the desired result is simply to get back only the rows of table_A that I "lose" due to the INNER JOIN with table_A and table_B.
You seem to be looking for not exists:
select a.*
from table_a a
where a.filter_condition = 0
and not exists (
select 1
from table_b b
where
a.column1 = b.column1
and a.column2 = b.column2
and a.column3 = b.column3
and a.column4 = b.column4
)
This will give you all records in table_a that do not have a corresponding record in table_b.
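One reason to prefer NOT EXISTS over a concatenated key is that concatenation can make different rows look identical. The sketch below (Python's sqlite3 module, with made-up sample rows) shows two distinct rows in table_A that both concatenate to 'abcxy', so the concatenation query wrongly drops the unmatched one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_A (column1, column2, column3, column4, filter_condition);
CREATE TABLE table_B (column1, column2, column3, column4);
INSERT INTO table_A VALUES
  ('a',  'bc', 'x', 'y', 0),
  ('ab', 'c',  'x', 'y', 0);
INSERT INTO table_B VALUES
  ('a',  'bc', 'x', 'y');
""")

# Concatenated key, as in the question: both A rows become 'abcxy'.
concat_rows = conn.execute("""
SELECT column1, column2
FROM table_A
WHERE filter_condition = 0
  AND (column1 || column2 || column3 || column4) NOT IN (
    SELECT A.column1 || A.column2 || A.column3 || A.column4
    FROM table_A A
    INNER JOIN table_B B
      ON A.column1 = B.column1 AND A.column2 = B.column2
     AND A.column3 = B.column3 AND A.column4 = B.column4
     AND A.filter_condition = 0)
""").fetchall()

# NOT EXISTS compares column by column, so no collision is possible.
not_exists_rows = conn.execute("""
SELECT a.column1, a.column2
FROM table_A a
WHERE a.filter_condition = 0
  AND NOT EXISTS (
    SELECT 1 FROM table_B b
    WHERE a.column1 = b.column1 AND a.column2 = b.column2
      AND a.column3 = b.column3 AND a.column4 = b.column4)
""").fetchall()

print(concat_rows)      # []
print(not_exists_rows)  # [('ab', 'c')]
```

A separator between the concatenated parts would reduce the risk, but only as long as the separator never occurs in the data.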
Using a LEFT JOIN between A and B and checking for a NULL row in B is probably easier:
SELECT *
FROM table_A A
LEFT JOIN table_B B ON A.column1 = B.column1
AND A.column2 = B.column2
AND A.column3 = B.column3
AND A.column4 = B.column4
WHERE B.column1 IS NULL
AND A.filter_condition = 0
You should be able to use tuples (aka row constructors) in PostgreSQL:
select *
from table_a
where filter_condition = 0
and (column1, column2, column3, column4) not in
(
select column1, column2, column3, column4
from table_b
);
If the columns can be null, then better use NOT EXISTS, as null=null results in "unknown" rather than in true or false.
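The NULL caveat is easy to reproduce. This sketch uses Python's sqlite3 module and, as a simplifying assumption, a single key column (hypothetical tables `a` and `b`) instead of the four-column tuple; the same three-valued logic applies to tuples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (x);
CREATE TABLE b (y);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (1), (NULL);
""")

# 2 NOT IN (1, NULL) evaluates to "unknown", so the row is filtered out.
not_in_rows = conn.execute(
    "SELECT x FROM a WHERE x NOT IN (SELECT y FROM b)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists; the NULL never matches.
not_exists_rows = conn.execute(
    "SELECT x FROM a WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.y = a.x)"
).fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(2,)]
```

A single NULL in the subquery is enough to make NOT IN return no rows at all, which is usually not what is intended.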
I am trying to write a condition so that, for a certain ID, when either of two values from two different tables is greater than a number, I display a row with both values. Otherwise, I don't want to display any row. What is the correct syntax for this?
if(select
a.Column1 > 2 or
b.Column2 > 2
from
Table1 a join Table2 b on a.ID = b.ID)
begin
select
a.Column1,
b.Column2
from
Table1 a join Table2 b on a.ID = b.ID
end
else
begin
Don't Select
end
You just need to add it as a where condition. If your where condition fails for a given row, that row wouldn't be selected.
select
a.Column1,
b.Column2
from
Table1 a join Table2 b on a.ID = b.ID
where a.column1 > 2 or b.column2 > 2
@vkp's answer is probably what you want, but the literal translation of the query you have written -- without using control-flow statements -- is this:
select
a.Column1,
b.Column2
from
Table1 a join Table2 b on a.ID = b.ID
where exists (select 1 from Table1 c join Table2 d on c.ID = d.ID where c.Column1 > 2 or d.Column2 > 2);
This will return nothing at all if no record in the join has Table1.Column1 > 2 or Table2.Column2 > 2; otherwise it will return all records.
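The difference between the row-wise WHERE filter and the all-or-nothing EXISTS version can be checked with a small sketch. It uses Python's sqlite3 module and made-up sample values; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER, Column1 INTEGER);
CREATE TABLE Table2 (ID INTEGER, Column2 INTEGER);
INSERT INTO Table1 VALUES (1, 5), (2, 1);
INSERT INTO Table2 VALUES (1, 0), (2, 1);
""")

WHERE_QUERY = """
SELECT a.Column1, b.Column2
FROM Table1 a JOIN Table2 b ON a.ID = b.ID
WHERE a.Column1 > 2 OR b.Column2 > 2
ORDER BY a.ID
"""

EXISTS_QUERY = """
SELECT a.Column1, b.Column2
FROM Table1 a JOIN Table2 b ON a.ID = b.ID
WHERE EXISTS (SELECT 1 FROM Table1 c JOIN Table2 d ON c.ID = d.ID
              WHERE c.Column1 > 2 OR d.Column2 > 2)
ORDER BY a.ID
"""

# One joined row qualifies: WHERE keeps just that row, EXISTS keeps all rows.
where_rows = conn.execute(WHERE_QUERY).fetchall()
exists_rows = conn.execute(EXISTS_QUERY).fetchall()
print(where_rows)   # [(5, 0)]
print(exists_rows)  # [(5, 0), (1, 1)]

# Once no joined row qualifies, both queries return nothing.
conn.execute("UPDATE Table1 SET Column1 = 1 WHERE ID = 1")
where_rows_after = conn.execute(WHERE_QUERY).fetchall()
exists_rows_after = conn.execute(EXISTS_QUERY).fetchall()
print(where_rows_after, exists_rows_after)  # [] []
```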
I have following sql select:
select ...
from table1 a, table2 b
where
a.column = 'ABC' and
a.column2 = b.column2
I would like to only check if a.column2 = b.column2 when a.column = 'ABC'.
How do I do that?
Thanks
I'm not sure from your question tag if you're trying to figure out how to do this with a JOIN specifically (as opposed to how you did it with the WHERE clause), but anyway -- a couple of ways:
1) --with WHERE clause
select ...
from
table1 a
INNER JOIN table2 b
ON a.column2 = b.column2
where
a.column = 'ABC'
2) --WITHOUT WHERE CLAUSE
select ...
from
table1 a
INNER JOIN table2 b
ON a.column2 = b.column2
AND a.column = 'ABC'
Try this. It will check column2 only when column is 'ABC':
select ...
from table1 a, table2 b
where
(a.column = 'ABC' and
a.column2 = b.column2) or a.column <> 'ABC'
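The answers above do not all return the same rows, which a quick experiment makes visible. This is a sketch with Python's sqlite3 module; the sample rows and the `label` column in table2 are invented for the example, and `column` is quoted because it is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 ("column", column2);
CREATE TABLE table2 (column2, label);
INSERT INTO table1 VALUES ('ABC', 1), ('ABC', 9), ('XYZ', 1);
INSERT INTO table2 VALUES (1, 'b1'), (2, 'b2');
""")

ORDER = ' ORDER BY a."column", a.column2, b.label'
COLS = 'SELECT a."column", a.column2, b.label '

# 1) join condition in ON, filter in WHERE
q1 = conn.execute(COLS + """
FROM table1 a INNER JOIN table2 b ON a.column2 = b.column2
WHERE a."column" = 'ABC'""" + ORDER).fetchall()

# 2) everything in ON (equivalent to 1 for an inner join)
q2 = conn.execute(COLS + """
FROM table1 a INNER JOIN table2 b
  ON a.column2 = b.column2 AND a."column" = 'ABC'""" + ORDER).fetchall()

# 3) OR variant: non-'ABC' rows skip the column2 check entirely,
#    so they are paired with every row of table2.
q3 = conn.execute(COLS + """
FROM table1 a, table2 b
WHERE (a."column" = 'ABC' AND a.column2 = b.column2)
   OR a."column" <> 'ABC'""" + ORDER).fetchall()

print(q1)  # [('ABC', 1, 'b1')]
print(q2)  # [('ABC', 1, 'b1')]
print(q3)  # [('ABC', 1, 'b1'), ('XYZ', 1, 'b1'), ('XYZ', 1, 'b2')]
```

Which result is wanted depends on whether rows with a.column other than 'ABC' should appear in the output at all; the question leaves that open.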
Hi guys,
can anybody please help me with a subquery in Oracle Database 10g? I need to replace the values of a column in the first table with the values of another column in the second table.
I currently use this statement:
SELECT
CASE WHEN A.column1 = 'A' THEN 'aaa'
WHEN A.column1 = 'B' THEN 'bbb'
.......
WHEN A.column1 = 'X' THEN 'xxx'
ELSE 'bad' END AS COLUMN1, A.*
FROM TRANSACTION_TABLE A, CATEGORY_TABLE B
WHERE A.column1 IS NOT NULL
AND A.column1 <> ' '
This is not an elegant approach, so I'm trying to use a subselect from CATEGORY_TABLE B like the following:
SELECT A.column1, A.*
FROM TRANSACTION_TABLE A, CATEGORY_TABLE B
WHERE A.column1 IS NOT NULL
AND A.column1 = B.column_b_1
AND A.column1 <> ' '
AND A.column1 IN (SELECT B.column_b_1_descr FROM CATEGORY_TABLE B
WHERE B.FIELDNAME = 'column1' AND A.column1 = B.column_b_1)
So, I cannot get any results by using the subquery and don't want to continue using the CASE against many conditions, just want to replace the A.column1 values with the descriptive values from B.column_b_1_descr , as they're easier to read.
I would appreciate any feedback.
Thanks
Unless I'm misunderstanding your question...
CATEGORY_TABLE:
name | value
A aaa
B bbb
C ccc
...
SELECT B.value AS COLUMN1, A.*
FROM TRANSACTION_TABLE A, CATEGORY_TABLE B
WHERE A.column1 = B.name
or
SELECT t2.value as COLUMN1, t1.*
FROM TRANSACTION_TABLE t1
INNER JOIN CATEGORY_TABLE t2 ON t1.column1 = t2.name;
The where clause isn't needed, since an inner join automatically excludes rows with null values or no matches.
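A small sketch of the lookup join, using Python's sqlite3 module instead of Oracle, with invented sample rows and a hypothetical `id` column. It also shows one possible variation (an assumption, not part of the answer above): a LEFT JOIN with COALESCE, in case the ELSE 'bad' behaviour of the original CASE expression should be kept for codes that have no category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TRANSACTION_TABLE (id INTEGER, column1);
CREATE TABLE CATEGORY_TABLE (name, value);
INSERT INTO TRANSACTION_TABLE VALUES (1, 'A'), (2, 'B'), (3, 'Q'), (4, NULL);
INSERT INTO CATEGORY_TABLE VALUES ('A', 'aaa'), ('B', 'bbb');
""")

# Inner join: rows with a NULL or unknown code simply disappear.
inner_rows = conn.execute("""
SELECT t1.id, t2.value AS COLUMN1
FROM TRANSACTION_TABLE t1
INNER JOIN CATEGORY_TABLE t2 ON t1.column1 = t2.name
ORDER BY t1.id
""").fetchall()

# Variation: keep unknown codes and label them 'bad', like the CASE ... ELSE.
left_rows = conn.execute("""
SELECT t1.id, COALESCE(t2.value, 'bad') AS COLUMN1
FROM TRANSACTION_TABLE t1
LEFT JOIN CATEGORY_TABLE t2 ON t1.column1 = t2.name
WHERE t1.column1 IS NOT NULL
ORDER BY t1.id
""").fetchall()

print(inner_rows)  # [(1, 'aaa'), (2, 'bbb')]
print(left_rows)   # [(1, 'aaa'), (2, 'bbb'), (3, 'bad')]
```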