Find rows where one column value matches and the other does not - sql

I have two tables A and B
Table A
CODE TYPE
A 1
A 2
A 3
B 1
C 1
C 2
Table B
CODE TYPE
A 1
A 2
A 4
B 2
C 1
C 3
I want to return the rows where CODE appears in both tables but TYPE does not, and where CODE has more than one TYPE in each table. My result would be:
CODE TYPE SOURCE
A 3 Table A
A 4 Table B
C 2 Table A
C 3 Table B
Any help with this?

I think this covers both of your conditions.
select code, coalesce(typeA, typeB) as type, src
from
(
select
coalesce(a.code, b.code) as code,
a.type as typeA,
b.type as typeB,
case when b.type is null then 'A' when a.type is null then 'B' end as src,
count(a.code) over (partition by coalesce(a.code, b.code)) as countA,
count(b.code) over (partition by coalesce(a.code, b.code)) as countB
from
A a full outer join B b
on b.code = a.code and b.type = a.type
) T
where
countA >= 2 and countB >= 2
and (typeA is null or typeB is null)

You can use a full join to see if the code matches and check if the type is null on either of the tables.
select coalesce(a.code,b.code) code, coalesce(a.type,b.type) type,
case when b.type is null then 'A' when a.type is null then 'B' end src
from a
full join b on a.code = b.code and a.type = b.type
where a.type is null or b.type is null
To limit the results to codes which have more than one type, use
select x.code, coalesce(a.type,b.type) type,
case when b.type is null then 'Table A' when a.type is null then 'Table B' end src
from a
full join b on a.code = b.code and a.type = b.type
join (select a.code from a join b on a.code = b.code
group by a.code having count(*) > 1) x on x.code = a.code or x.code = b.code
where a.type is null or b.type is null
order by 1

Using union
with tu as (
select CODE, TYPE, src='Table A'
from TableA
union all
select CODE, TYPE, src='Table B'
from TableB
)
select CODE, TYPE, max(src)
from tu t1
where exists (select 1 from tu t2 where t2.CODE=t1.CODE and t2.src=t1.src and t1.TYPE <> t2.TYPE)
group by CODE, TYPE
having count(*)=1
order by CODE, TYPE
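The two conditions can also be checked without a full outer join. The following is a minimal sketch, assuming the sample data from the question is loaded into an in-memory SQLite database through Python's sqlite3 module; the CTE names tagged and eligible are made up for illustration, and this is not any answerer's exact method.

```python
import sqlite3

# Load the sample rows from the question into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (code TEXT, type INTEGER);
    CREATE TABLE b (code TEXT, type INTEGER);
    INSERT INTO a VALUES ('A',1),('A',2),('A',3),('B',1),('C',1),('C',2);
    INSERT INTO b VALUES ('A',1),('A',2),('A',4),('B',2),('C',1),('C',3);
""")

query = """
WITH tagged AS (
    -- stack both tables and remember where each row came from
    SELECT code, type, 'Table A' AS source FROM a
    UNION ALL
    SELECT code, type, 'Table B' AS source FROM b
),
eligible AS (
    -- codes that have at least two distinct types in each table
    SELECT code FROM a GROUP BY code HAVING COUNT(DISTINCT type) >= 2
    INTERSECT
    SELECT code FROM b GROUP BY code HAVING COUNT(DISTINCT type) >= 2
)
SELECT t.code, t.type, t.source
FROM tagged t
JOIN eligible e ON e.code = t.code
WHERE NOT EXISTS (
    -- keep only (code, type) pairs missing from the other table
    SELECT 1 FROM tagged o
    WHERE o.code = t.code AND o.type = t.type AND o.source <> t.source
)
ORDER BY t.code, t.type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Code B drops out because it has only one type in each table, which is the second condition from the question.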

Related

SQL query to find only those customer ids which have 2 source values

I have 2 tables, one which stores the customer id and the other table which stores customer id along with the information about different sources which use that customer information. Example:
TABLE A
Customer Id
1
2
3
..
TABLE B
Customer Id Source
1 'AA'
2 'AA'
1 'AB'
2 'AB'
2 'AC'
3 'AA'
3 'AB'
3 'AE'
4 'AA'
4 'AB'
I want to write a SQL query that returns the customers that have only AA and AB as sources (no other sources).
I have written the query below, but it does not work correctly:
select a.customer_id
from A a, B b
where a.customer_id = b.customer_id
and b.source IN ('AA','AB')
group by a.customer_id
having count(*) = 2;
A rather efficient solution is a couple of exists subqueries:
select a.*
from a
where
exists(select 1 from b where b.customer_id = a.customer_id and b.source = 'AA')
and exists(select 1 from b where b.customer_id = a.customer_id and b.source = 'AB')
and not exists(select 1 from b where b.customer_id = a.customer_id and b.source not in ('AA', 'AB'))
With an index on b(customer_id, source), this should run quickly.
Another option is aggregation:
select customer_id
from b
group by customer_id
having
max(case when source = 'AA' then 1 else 0 end) = 1
and max(case when source = 'AB' then 1 else 0 end) = 1
and max(case when source not in ('AA', 'AB') then 1 else 0 end) = 0
This assumes that the customer_id/source combination has no duplicates:
select a.customer_id
from A a join B b
on a.customer_id = b.customer_id
group by a.customer_id
-- both 'AA' and 'AB', but no other
having sum(case when b.source IN ('AA','AB') then 1 else -1 end) = 2
It might be more efficient to aggregate before the join:
select a.customer_id
from A a join
( select customer_id
from B b
group by customer_id
-- both 'AA' and 'AB', but no other
having sum(case when source IN ('AA','AB') then 1 else -1 end) = 2
) b
on a.customer_id = b.customer_id
You can use aggregation:
select b.customer_id
from b
where b.source in ('AA', 'AB')
group by b.customer_id
having count(distinct b.source) = 2;
That said, your version should work. However, you should learn to use proper, explicit, standard, readable JOIN syntax. The join, however, is not needed in this case.
If you want only those two sources, you need to tweak the logic:
select b.customer_id
from b
group by b.customer_id
having sum(case when b.source = 'AA' then 1 else 0 end) > 0 and -- has AA
sum(case when b.source = 'AB' then 1 else 0 end) > 0 and -- has AB
count(distinct b.source) = 2;
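To see why filtering in the WHERE clause is not enough, here is a small sketch that runs both variants against the sample rows of table B in an in-memory SQLite database. It assumes only table b is needed; the variable names naive and strict are illustrative.

```python
import sqlite3

# Sample rows of table B from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (customer_id INTEGER, source TEXT);
    INSERT INTO b VALUES
        (1,'AA'),(2,'AA'),(1,'AB'),(2,'AB'),(2,'AC'),
        (3,'AA'),(3,'AB'),(3,'AE'),(4,'AA'),(4,'AB');
""")

# Filtering in WHERE discards the other sources before grouping,
# so customers with extra sources still count exactly two rows.
naive_sql = """
SELECT customer_id
FROM b
WHERE source IN ('AA', 'AB')
GROUP BY customer_id
HAVING COUNT(*) = 2
ORDER BY customer_id
"""

# Grouping all rows first lets HAVING see every source of a customer.
strict_sql = """
SELECT customer_id
FROM b
GROUP BY customer_id
HAVING SUM(CASE WHEN source = 'AA' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN source = 'AB' THEN 1 ELSE 0 END) > 0
   AND COUNT(DISTINCT source) = 2
ORDER BY customer_id
"""

naive = [r[0] for r in conn.execute(naive_sql)]
strict = [r[0] for r in conn.execute(strict_sql)]
print("filter in WHERE:", naive)
print("check in HAVING:", strict)
```

On this data the first query also returns customers 2 and 3, while the second returns only customers 1 and 4.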

Conditional Left Join SQL

table A
----------------------------
NAME | CODE | BRANCH
----------------------------
bob | PL | B
david | AA | B
susan | PL | C
joe | AB | C
alfred | PL | B
table B
----------------------------
CODE | DESCRIPTION
----------------------------
PL | code 1
PB | code 2
PC | code 3
table C
----------------------------
CODE | DESCRIPTION
----------------------------
AA | code 4
AB | code 5
AC | code 6
Is there any way to join tables A, B and C without joining all of the tables?
select A.*, COALESCE(B.DESCRIPTION, C.DESCRIPTION) AS DESCRIPTION from A
left join B on A.CODE = B.CODE
left join C on A.CODE = C.CODE
In my real case there will be more than 10 tables to join on the same column,
so I need a conditional left join, something like this:
SELECT A.*, DESCRIPTION
FROM A LEFT JOIN (
CASE
WHEN A.CODE = 'B' THEN SELECT * FROM B
WHEN A.CODE = 'C' THEN SELECT * FROM C
END
) BC ON A.CODE = BC.CODE
You cannot use CASE to implement flow control. In SQL, CASE is an expression that returns a single value.
You can instead use the following query:
select A.*,
CASE A.BRANCH
WHEN 'B' THEN B.DESCRIPTION
WHEN 'C' THEN C.DESCRIPTION
END AS DESCRIPTION
from A
left join B on A.CODE = B.CODE AND A.BRANCH = 'B'
left join C on A.CODE = C.CODE AND A.BRANCH = 'C'
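With ten or more lookup tables, one alternative to writing a separate left join per table is to stack the lookup tables with UNION ALL and join once on both the branch and the code. This is a sketch of that alternative, not the query above; it assumes, as the query above does, that BRANCH names the lookup table, and the alias lk is made up. It runs against the sample data in an in-memory SQLite database.

```python
import sqlite3

# Sample tables A, B and C from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, code TEXT, branch TEXT);
    CREATE TABLE b (code TEXT, description TEXT);
    CREATE TABLE c (code TEXT, description TEXT);
    INSERT INTO a VALUES
        ('bob','PL','B'),('david','AA','B'),('susan','PL','C'),
        ('joe','AB','C'),('alfred','PL','B');
    INSERT INTO b VALUES ('PL','code 1'),('PB','code 2'),('PC','code 3');
    INSERT INTO c VALUES ('AA','code 4'),('AB','code 5'),('AC','code 6');
""")

query = """
SELECT a.name, a.code, a.branch, lk.description
FROM a
LEFT JOIN (
    -- tag every lookup row with the branch it belongs to
    SELECT 'B' AS branch, code, description FROM b
    UNION ALL
    SELECT 'C' AS branch, code, description FROM c
) AS lk
  ON lk.branch = a.branch AND lk.code = a.code
ORDER BY a.rowid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding another lookup table then means adding one more UNION ALL branch rather than another join and another CASE arm. Rows whose code is missing from the table for their branch (david and susan here) get a NULL description.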
You could use this to generate queries. Then you write a PL/SQL block to loop through all these queries and execute dynamically to give you separate results.
SELECT 'SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN '
|| CASE WHEN A.BRANCH = 'B' THEN 'TABLEB B' END
|| CASE WHEN A.BRANCH = 'C' THEN 'TABLEC C' END
|| ' ON '
|| 'A.CODE = '
|| CASE WHEN A.BRANCH = 'B' THEN 'B.CODE' END
|| CASE WHEN A.BRANCH = 'C' THEN 'C.CODE' END
v_query
FROM TableA A;
Output
V_QUERY
--------------------------------------------------------------------------------
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEC C ON A.CODE = C.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEC C ON A.CODE = C.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE

Group by the union of two columns

How can GROUP BY based on the union of two columns be achieved performantly? There may be NULL values in either column. Something like (obviously this doesn't work):
SELECT a.val, b.val
FROM a
LEFT JOIN b on a.id = b.id
GROUP BY UNION(a.val, b.val)
With results like:
a.val | b.val
-----------
1 1
2 2
NULL 3
4 NULL
5 5
Thanks!
Why can't you use NVL?
SELECT NVL(a.val, b.val) FROM a LEFT JOIN b on a.id = b.id
GROUP BY NVL(a.val, b.val)
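NVL is Oracle-specific; COALESCE is the standard equivalent and behaves the same way with two arguments. A minimal sketch of the grouping in an in-memory SQLite database follows. The table contents are hypothetical, chosen to resemble the result shown in the question, with one extra row (id 6) so that one group contains more than one row.

```python
import sqlite3

# Hypothetical data: a.val is NULL for ids 3 and 6, b has no row for id 4.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val INTEGER);
    CREATE TABLE b (id INTEGER, val INTEGER);
    INSERT INTO a VALUES (1,1),(2,2),(3,NULL),(4,4),(5,5),(6,NULL);
    INSERT INTO b VALUES (1,1),(2,2),(3,3),(5,5),(6,1);
""")

query = """
SELECT COALESCE(a.val, b.val) AS val, COUNT(*) AS n
FROM a
LEFT JOIN b ON a.id = b.id
GROUP BY COALESCE(a.val, b.val)
ORDER BY val
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ids 1 and 6 land in the same group because the combined value is 1 for both.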

Optimizing SQL Server view

I am trying to optimize a big query that contains about 10 subqueries on a table with 30+ columns and over 2 million records.
I would like to reduce the number of selects on this massive table, but I don't know how to rewrite the following query to avoid them.
I would like to apply some kind of filter on a so that I can simply write a WHERE clause in each column subquery instead of querying tableA again, but I have no idea how to do this:
SELECT
col1, col2,
(SELECT COUNT(col3)
FROM tableA ta
INNER JOIN tableB b ON ta.TaskId = b.col6
INNER JOIN tableC c ON b.Id = c.TaskId
WHERE c.ResultCode = 1
AND ta.col4 = a.col4
AND ta.col5 = a.col5) as Executed,
(SELECT COUNT(col3)
FROM tableA ta
INNER JOIN tableB b ON ta.TaskId = b.col6
INNER JOIN tableC c ON b.Id = c.TaskId
WHERE c.ResultCode = 9
AND ta.col4 = a.col4
AND ta.col5 = a.col5) as NotExecuted
FROM
tableA a
GROUP BY
col1, col2, col4, col5
I have a few questions about your query (asked in a comment), but this might be what you are looking for:
SELECT
col1 ,
col2 ,
COUNT(CASE WHEN c.ResultCode = 1 THEN 1
ELSE NULL
END) AS Executed ,
COUNT(CASE WHEN c.ResultCode = 9 THEN 1
ELSE NULL
END) AS NotExecuted
FROM
tableA a
JOIN tableB b ON a.id = b.tableA_id
JOIN tableC c ON b.id = c.tableB_id
WHERE
c.ResultCode IN ( 1, 9 )
GROUP BY
col1 ,
col2;
If you do not need it in separate columns, then you can aggregate it in separate rows. That would probably work faster:
SELECT
col1 ,
col2 ,
CASE c.ResultCode
WHEN 1 THEN 'Executed'
WHEN 9 THEN 'Not Executed'
END,
COUNT(*)
FROM
tableA a
JOIN tableB b ON a.id = b.tableA_id
JOIN tableC c ON b.id = c.tableB_id
WHERE
c.ResultCode IN ( 1, 9 )
GROUP BY
col1 ,
col2 ,
c.ResultCode;
SELECT
col1, col2, sum(case when c.ResultCode = 1 THEN 1 ELSE 0 END) as Executed,
sum(case when c.ResultCode = 9 THEN 1 ELSE 0 END) as NotExecuted
FROM tableA a
INNER JOIN tableB b ON a.TaskId = b.col6
INNER JOIN tableC c ON b.Id = c.TaskId
GROUP BY col1, col2
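The conditional-aggregation idea can be tried on a toy dataset. The sketch below uses an in-memory SQLite database; the join keys follow the last query above, while the rows themselves are hypothetical and far smaller than the real tables.

```python
import sqlite3

# Hypothetical rows; only the columns used by the query are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (TaskId INTEGER, col1 TEXT, col2 TEXT);
    CREATE TABLE tableB (Id INTEGER, col6 INTEGER);
    CREATE TABLE tableC (TaskId INTEGER, ResultCode INTEGER);
    INSERT INTO tableA VALUES (1,'x','p'),(2,'x','p'),(3,'y','q');
    INSERT INTO tableB VALUES (10,1),(11,2),(12,3);
    INSERT INTO tableC VALUES (10,1),(10,9),(11,1),(12,9),(12,5);
""")

# One pass over the joined rows; each SUM counts one result code.
query = """
SELECT a.col1, a.col2,
       SUM(CASE WHEN c.ResultCode = 1 THEN 1 ELSE 0 END) AS Executed,
       SUM(CASE WHEN c.ResultCode = 9 THEN 1 ELSE 0 END) AS NotExecuted
FROM tableA a
INNER JOIN tableB b ON a.TaskId = b.col6
INNER JOIN tableC c ON b.Id = c.TaskId
GROUP BY a.col1, a.col2
ORDER BY a.col1, a.col2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The row with ResultCode 5 is joined but contributes 0 to both sums, so it does not need to be filtered out for the counts to be right.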

How can I get the exact match join for the below scenario?

How can I join the tables below?
TableA
ID ID_C
1 1
TableB
ID ID_A Value
1 1 a
2 1 b
TableC
ID
1
TableD
ID ID_C Value
1 1 a
2 1 b
in order to get a result like
Result
ID ID_B Value ID_C ID_D Value
1 1 a 1 1 a
1 2 b 1 2 b
My result should not contain the row 1 2 b 1 1 b. The two Value columns do not always hold the same values, so they cannot be used in a join condition.
To make it simpler:
Resultant Table
ID Value
1 a
1 b
2 a
3 c
TableA
ID Value
1 a
2 g
3 d
TableB
ID ID_A
1 1
2 1
3 2
4 3
Now I need to join the Resultant Table with TableA and TableB in order to get some columns from each, using ResultantTable.ID = TableA.ID and TableB.ID_A = TableA.ID, since ID_A is a foreign key.
Joining with TableB produces duplicates. Since ID = 1 occurs twice, I get 4 records where ID = 1 when there are only 2. This could be handled with DISTINCT or GROUP BY, but I need the other columns to be displayed as well. How do I do both?
SELECT A.ID, B.ID, B.Value, C.ID, D.ID, D.Value
FROM TableA A
INNER JOIN TableB B ON A.ID = B.ID_A
INNER JOIN TableC C ON A.ID_C = C.ID
INNER JOIN TableD D ON B.ID = D.ID AND C.ID = D.ID_C
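The effect of the extra join condition can be checked on the first set of sample tables. The sketch below uses an in-memory SQLite database and assumes, as the query above does, that B.ID and D.ID identify corresponding rows; the variable names exact and loose are illustrative.

```python
import sqlite3

# First set of sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER, ID_C INTEGER);
    CREATE TABLE TableB (ID INTEGER, ID_A INTEGER, Value TEXT);
    CREATE TABLE TableC (ID INTEGER);
    CREATE TABLE TableD (ID INTEGER, ID_C INTEGER, Value TEXT);
    INSERT INTO TableA VALUES (1,1);
    INSERT INTO TableB VALUES (1,1,'a'),(2,1,'b');
    INSERT INTO TableC VALUES (1);
    INSERT INTO TableD VALUES (1,1,'a'),(2,1,'b');
""")

base = """
SELECT A.ID, B.ID, B.Value, C.ID, D.ID, D.Value
FROM TableA A
INNER JOIN TableB B ON A.ID = B.ID_A
INNER JOIN TableC C ON A.ID_C = C.ID
INNER JOIN TableD D ON {d_condition}
ORDER BY B.ID, D.ID
"""

# Pairing B and D rows by ID yields one D row per B row.
exact = conn.execute(
    base.format(d_condition="B.ID = D.ID AND C.ID = D.ID_C")
).fetchall()

# Without that pairing, every B row meets every D row of the same C.
loose = conn.execute(base.format(d_condition="C.ID = D.ID_C")).fetchall()

print(exact)
print(len(loose))
```

Dropping the B.ID = D.ID condition produces four rows, including the unwanted 1 2 b 1 1 a style combinations, which is the duplication described in the question.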
Are you telling us that the "Value" field in TableB should not differ from the "Value" field in TableD? Could we replace B.ID = D.ID with B.Value = D.Value to solve your problem?
Are you sure that is the way it is supposed to work?
Try:
SELECT A.ID, B.ID ID_B, B.Value Value_B, C.ID ID_C, D.ID ID_D, D.Value Value_D
FROM TableA A
JOIN TableB B ON A.ID = B.ID_A
JOIN TableC C ON A.ID_C = C.ID
JOIN TableD D ON B.Value = D.Value AND C.ID = D.ID_C