How to combine CASE, INNER JOIN, and GROUP BY - sql

I'm dealing with an issue; I need to join two tables, group by their ID's and use CASE statement to compare values from those 2 tables. I have been trying to use a temp table and then SELECT from it.
Purpose is to test if values in CORE correspond to values in MART.
Ideally I want to have one query, where I will see column CORE_X_MART and can use where statement on it.
Group by is essential because otherwise I have ID duplicates in the temporary table.
My code:
drop table if exists #tNDWH_4034
select a.ID, b.ID, a.col2 as MART_Value, b.col2 as CORE_Value,
case when a.col2 = b.col2 then 'Match' else 'Mismatch' end as CORE_X_MART
into #tNDWH_4034
from tab1 as a
inner join tab2 as b on a.ID = b.ID
where a.CurrentFlag = 1
group by a.ID, b.ID;
select * from #tNDWH_4034
where CORE_X_MART = 'Mismatch';
I'm using SQL server.

You don't need temp table. You can go for derived table to achieve the purpose, to have them in a single query.
SELECT * FROM
(select a.ID, b.ID, a.col2 as MART_Value, b.col2 as CORE_Value,
case when a.col2 = b.col2 then 'Match' else 'Mismatch' end as CORE_X_MART
from tab1 as a
inner join tab2 as b
on a.ID = b.ID
where a.CurrentFlag = 1
group by a.ID, b.ID) as t
WHERE t.CORE_X_MART = 'Mismatch'

maybe:
select a.ID, b.ID, a.col2 as MART_Value, b.col2 as CORE_Value,
case when a.col2 = b.col2 then 'Match' else 'Mismatch' end as CORE_X_MART
into #tNDWH_4034
from tab1 as a
inner join tab2 as b on (a.ID = b.ID)
where a.CurrentFlag = 1
group by a.ID, b.ID, a.col2, b.col2
,case when a.col2 = b.col2 then 'Match' else 'Mismatch' end --this line is probably not required

You don't need the temp table, group by, or the case either. You are looking for mismatches only so just use the not equal to operator <> to filter your results.
select distinct a.ID, a.col2 as MART_Value, b.col2 as CORE_Value
from tab1 as a
inner join tab2 as b on a.ID = b.ID
where a.CurrentFlag = 1
and a.col2 <> b.col2

Related

Update table based on multiple conditions without using CASE

I need to update a table based on multiple conditions and the update needs to be done in one update statement. In addition, the restriction is that I CANNOT use the following construct due to performance issues since there are about 18 CASE expressions in my update:
UPDATE A
SET A.col1 = CASE WHEN B.col = someValue THEN B.Col2 END,
B.Col2 = CASE WHEN b.col = someOtherValue THEN B.Col2 END,
.
.
--18th CASE stmt
B.Col18 = CASE WHEN b.col = YetAnotherValue THEN B.Col2 END
FROM
tableA A
INNER JOIN
tableB B ON A.someColumn = B.someColumn
Any suggestions will be appreciated .
I suspect that you actually want to aggregate before updating:
UPDATE A
SET A.col1 = B.col1,
B.Col2 = B.col2,
. . .
FROM tableA A JOIN
(SELECT B.someColumn,
MAX(CASE WHEN B.col = someValue THEN B.Col2 END) as col1,
MAX(CASE WHEN b.col = someOtherValue THEN B.Col2 END) as col2,
. . .
FROM tableB B
GROUP BY B.someColumn
) B
ON A.someColumn = B.someColumn

Can you use an AND statement in a JOIN in SQL

Suppose you have two tables A and B and you are trying to write a JOIN query, is the following possible:
SELECT A.col1, B.col1
FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
Will this return a table of col1 from table A and col2 from table B where there is a match in the second column across the tables and the third column of table B is 'hello'?
I.e. it will only return rows that are matching in col2 and this is further reduced to the cases where col3 in table B is 'hello'?
Yes. You can use:
Below will Join the Records in B table (Col3='hello') with A:
SELECT A.col1, B.col1
FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
Below will Join all Records in B table with A, And performing where at Result of A and B:
SELECT A.col1, B.col1
FROM A JOIN B on A.col2 = B.col2
WHERE B.col3 = 'hello'
Both will give the same result when no other tables joined.
Yes you can.
You can specify any kind of boolean condition in the ON clause.
It is not mandatory that any column is involved in the condition so all of the following are valid:
SELECT A.col1, B.col1 FROM A JOIN B on 1=1
SELECT A.col1, B.col1 FROM A JOIN B on B.col3 = 'hello'
SELECT A.col1, B.col1 FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = 'hello')
SELECT A.col1, B.col1 FROM A JOIN B on (A.col2 = B.col2 AND B.col3 = C.col3)
SELECT A.col1, B.col1 FROM A LEFT JOIN B on (C.col3 = 'bye')
But pay attention, if you limit the condition to only key fields the optimizer engine will improve the performances very much.
For an inner join, these two statements are equivalent:
SELECT A.col1, B.col1
FROM A JOIN
B
ON A.col2 = B.col2 AND B.col3 = 'hello';
and:
SELECT A.col1, B.col1
FROM A JOIN
B
ON A.col2 = B.col2
WHERE B.col3 = 'hello';
Both should have the same execution plans as well.
Some people prefer putting filtering conditions in the WHERE clause, so the query is more clear about "conditions between tables" versus "filters on the result set". I tend to agree with this sentiment, although I'm not dogmatic about it.
OUTER JOINs are different. For an outer join, it makes a big different where the conditions go. In that case, you generally do not have a choice, so you use ON or WHERE to get the logic that you want.

Teradata wildcard in join definition

I have joined tables like bellow:
select a.*, b.col4, b.col5 from table a
inner join table b
on a.col2=b.col2
and a.col3=b.col3
It can happen that in b.col2, b.col3 can be value '*', which should be something like wildcard, meaninng, that in this case we can join value of b.col2 on any value of a.col2 or value b.col3 on any value a.col3.
Would you please help me define it?
It sounds like you have a default. One method is multiple comparison:
select a.*,
coalesce(b.col4, bdef3.col4, bdef2.col4, bdef.col4) as col4, b.col5
coalesce(b.col5, bdef3.col5, bdef2.col5, bdef.col5) as col5
from tablea a left join
tableb b
on b.col2 = a.col2 and b.col3 = a.col3 left join
tableb bdef3
on b.col2 = a.col2 and b.col3 = '*' left join
tableb bdef2
on b.col2 = '*' and b.col3 = a.col3 left join
tableb bdef
on b.col2 = '*' and b.col3 = '*';
You may want a where clause if you want to guarantee some match:
where (b.col2 is not null or bdef3.col2 is not null or bdef2.col2 is not null or bdef.col2 is not null)
I think the above is more efficient, but you can express this more succinctly as:
select a.*, b.col4, b.col5
from tablea a left join
tableb b
on (b.col2 = a.col2 or b.col2 = '*') and
(b.col3 = a.col3 or b.col3 = '*')
qualify 1 = row_number() over (partition by a.id order by (case when b.col2 = '*' then 2 else 1 end), (case when b.col3 = '*' then 2, else 1 end))

Left Join Table with Exclusive OR

Is there a way to join 2 tables together on one and only one of the possible conditions? Joining on condition "a" or "b" could duplicate rows, but I'm looking to only join once. I came up with a potential solution, but I'm wondering if there is a more slick way to do it.
For example:
SELECT *
FROM TableA a
LEFT JOIN TableB b
ON a.col1 = b.col1
OR (a.col1 != b.col1 AND a.col2 = b.col2)
This would join the tables on col1 OR col2 BUT NOT BOTH. Is there a cleaner way of doing this?
Not more efficient but I think more clear
SELECT *
FROM TableA a
LEFT JOIN TableB b
ON (a.col1 = b.col1 or a.col2 = b.col2)
AND NOT (a.col1 = b.col1 and a.col2 = b.col2)
Your method works. If you only want one (or a handful) of columns from b, I would suggest:
SELECT a.*, COALESCE(b.col3, b2.col3)
FROM TableA a LEFT JOIN
TableB b
ON a.col1 = b.col1 LEFT JOIN
TableB b2
ON a.col1 <> b2.col1 AND a.col2 = b2.col2;
Removing the OR from the JOIN conditions allows the optimizer to generate a better execution plan.

How can I search for values in three columns, but return rows that don't match all?

I have two tables, say A and B. I wish to compare three or more columns in both tables and to return any rows in table B that don't match all of the compared columns.
I've looked at doing a left join function from recommendations, but can't quite figure it out.
Please help!
You can use left join or not exists for this. Here is one method:
select b.*
from tableb as b
where not exists (select 1
from tablea as a
where a.col1 = b.col1 and a.col2 = b.col2 and a.col3 = b.col3
);
how about something like this
Select b.col1,b.col2,b.col3 from
tableb b left outer join tablea a
on ( b.col1 != a.co11 and b.col2 != a.co12 and b.col3 != a.co13 )