SQL Server 2008: compare two tables in the same database and report which column changed

I need to get the difference between two tables.
I need to compare the Product, Qty and Price columns of two tables and report either that a record is new or which column value has changed.
Example Table A
Product | Qty | Price | Comments
A 20 500 xyz
B 50 200 xyz
C 90 100 abc
Example Table B
Product | Qty | Price | Comments
A 20 500 sd
B 70 200 cv
C 90 200 wsd
D 50 500 xyz
Currently I am using EXCEPT, which returns all new or mismatched rows.
select Product,Qty,Price
from TableB
except
select Product,Qty,Price
from TableA
Product | Qty | Price
B 70 200
C 90 200
D 50 500
But I need the result set like below
Product | Result
B Updated Qty
C Updated Price
D New

You can do this using LEFT JOIN:
SELECT b.Product,
b.Qty,
b.Price,
Result = CASE WHEN a.product IS NULL THEN 'New'
ELSE 'Updated: ' +
STUFF( CASE WHEN a.Qty != b.Qty THEN ',Qty' ELSE '' END +
CASE WHEN a.Price != b.Price THEN ',Price' ELSE '' END,
1, 1, '')
END
FROM TableB b
LEFT JOIN TableA a
ON a.Product = b.Product
WHERE a.Product IS NULL
OR a.Qty != b.Qty
OR a.Price != b.Price;
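To try the LEFT JOIN idea without a SQL Server instance, here is a minimal sketch that runs the same logic against SQLite from Python. It loads the sample rows from the question. SQLite has no STUFF(), so the leading comma is trimmed with SUBSTR() instead; that substitution, and the label format 'Updated Qty' (without the colon), are my choices to match the requested output, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Product TEXT, Qty INT, Price INT, Comments TEXT);
    CREATE TABLE TableB (Product TEXT, Qty INT, Price INT, Comments TEXT);
    INSERT INTO TableA VALUES
        ('A', 20, 500, 'xyz'), ('B', 50, 200, 'xyz'), ('C', 90, 100, 'abc');
    INSERT INTO TableB VALUES
        ('A', 20, 500, 'sd'), ('B', 70, 200, 'cv'),
        ('C', 90, 200, 'wsd'), ('D', 50, 500, 'xyz');
""")

# Build ',Qty' / ',Price' fragments, concatenate them, then drop the
# leading comma with SUBSTR(..., 2) (SQLite stand-in for STUFF).
rows = conn.execute("""
    SELECT b.Product,
           CASE WHEN a.Product IS NULL THEN 'New'
                ELSE 'Updated ' || SUBSTR(
                         CASE WHEN a.Qty   <> b.Qty   THEN ',Qty'   ELSE '' END ||
                         CASE WHEN a.Price <> b.Price THEN ',Price' ELSE '' END, 2)
           END AS Result
    FROM TableB b
    LEFT JOIN TableA a ON a.Product = b.Product
    WHERE a.Product IS NULL OR a.Qty <> b.Qty OR a.Price <> b.Price
    ORDER BY b.Product
""").fetchall()

for product, result in rows:
    print(product, result)
```

Note that `<>` never matches when either side is NULL, so nullable Qty or Price columns would need extra handling.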

Not the most concise approach for sure, but readable and probably efficient:
SELECT B.Product,
Result = 'Updated Qty'
FROM TableB B
LEFT OUTER JOIN TableA A
ON B.Product = A.Product
WHERE A.Product IS NOT NULL
AND A.Qty <> B.Qty
AND A.Price = B.Price
UNION ALL
SELECT B.Product,
Result = 'Updated Price'
FROM TableB B
LEFT OUTER JOIN TableA A
ON B.Product = A.Product
WHERE A.Product IS NOT NULL
AND A.Price <> B.Price
AND A.Qty = B.Qty
UNION ALL
SELECT B.Product,
Result = 'Updated Qty and Price'
FROM TableB B
LEFT OUTER JOIN TableA A
ON B.Product = A.Product
WHERE A.Product IS NOT NULL
AND A.Price <> B.Price
AND A.Qty <> B.Qty
UNION ALL
SELECT B.Product,
Result = 'New'
FROM TableB B
LEFT OUTER JOIN TableA A
ON B.Product = A.Product
WHERE A.Product IS NULL
If you need to order the result, you have to do that in an outer query.


Optimizing SQL query having DISTINCT keyword and functions

I have this query that generates about 40,000 records and the execution time of this query is about 1 minute 30 seconds.
SELECT DISTINCT
a.ID,
a.NAME,
a.DIV,
a.UID,
(select NAME from EMPLOYEE where UID= a.UID and UID<>'') as boss_id,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 1 and id = a.ID) as TERM1,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 2 and id = a.ID) as TERM2,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 3 and id = a.ID) as TERM3,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 4 and id = a.ID) as TERM4,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 5 and id = a.ID) as TERM5,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 6 and id = a.ID) as TERM6,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 7 and id = a.ID) as TERM7,
(select DATE(MAX(create_time)) from XYZ where XYZ_ID= 8 and id = a.ID) as TERM8
FROM EMPLOYEE a
WHERE ID LIKE 'D%'
I tried using GROUP BY and different kinds of joins to improve the execution time, but without success. Both tables, ABC and XYZ, are indexed.
Also, I think that the root cause of this problem is either the DISTINCT keyword or the MAX function.
How can I optimize the above query to bring the execution time down to under a minute?
Any help is appreciated.
Query is not tested, this is just an idea on how you could get this done in two different ways.
(SQL Server solutions here)
Using LEFT JOIN for each ID should look something like this:
SELECT a.ID,
a.NAME,
a.DIV,
a.UID,
b.Name as boss_id,
MAX(xyz1.create_time) as TERM1,
MAX(xyz2.create_time) as TERM2,
MAX(xyz3.create_time) as TERM3,
MAX(xyz4.create_time) as TERM4,
MAX(xyz5.create_time) as TERM5,
MAX(xyz6.create_time) as TERM6,
MAX(xyz7.create_time) as TERM7,
MAX(xyz8.create_time) as TERM8
FROM EMPLOYEE a
JOIN EMPLOYEE b on a.UID = b.UID and b.UID <> ''
LEFT JOIN XYZ xyz1 on a.ID = xyz1.ID and xyz1.XYZ_ID = 1
LEFT JOIN XYZ xyz2 on a.ID = xyz2.ID and xyz2.XYZ_ID = 2
LEFT JOIN XYZ xyz3 on a.ID = xyz3.ID and xyz3.XYZ_ID = 3
LEFT JOIN XYZ xyz4 on a.ID = xyz4.ID and xyz4.XYZ_ID = 4
LEFT JOIN XYZ xyz5 on a.ID = xyz5.ID and xyz5.XYZ_ID = 5
LEFT JOIN XYZ xyz6 on a.ID = xyz6.ID and xyz6.XYZ_ID = 6
LEFT JOIN XYZ xyz7 on a.ID = xyz7.ID and xyz7.XYZ_ID = 7
LEFT JOIN XYZ xyz8 on a.ID = xyz8.ID and xyz8.XYZ_ID = 8
WHERE a.ID LIKE 'D%'
GROUP BY a.ID, a.NAME, a.DIV, a.UID, b.Name
Using PIVOT would look something like this:
select * from (
SELECT DISTINCT
a.ID,
a.NAME,
a.DIV,
a.UID,
b.NAME as boss_id,
xyz.xyz_id,
xyz.create_time
FROM EMPLOYEE a
JOIN EMPLOYEE b on a.UID = b.UID and b.UID <> ''
LEFT JOIN (SELECT CAST(MAX(create_time) AS DATE) create_time, XYZ_ID, ID
from XYZ
where XYZ_ID between 1 and 8
group by XYZ_ID, ID) xyz on a.ID = xyz.ID
WHERE a.ID LIKE 'D%') src
PIVOT (
max(create_time) for xyz_id IN ([1], [2], [3], [4],
[5], [6], [7], [8])
) PIV
Give it a shot
I would recommend group by and conditional aggregation:
SELECT e.ID, e.NAME, e.DIV, e.UID,
DATE(MAX(CASE WHEN XYZ_ID = 1 THEN create_time END)) as term1,
DATE(MAX(CASE WHEN XYZ_ID = 2 THEN create_time END)) as term2,
DATE(MAX(CASE WHEN XYZ_ID = 3 THEN create_time END)) as term3,
DATE(MAX(CASE WHEN XYZ_ID = 4 THEN create_time END)) as term4,
DATE(MAX(CASE WHEN XYZ_ID = 5 THEN create_time END)) as term5,
DATE(MAX(CASE WHEN XYZ_ID = 6 THEN create_time END)) as term6,
DATE(MAX(CASE WHEN XYZ_ID = 7 THEN create_time END)) as term7,
DATE(MAX(CASE WHEN XYZ_ID = 8 THEN create_time END)) as term8
FROM EMPLOYEE e LEFT JOIN
XYZ
ON xyz.ID = e.id
WHERE e.ID LIKE 'D%'
GROUP BY e.ID, e.NAME, e.DIV, e.UID;
I don't understand the logic for boss_id, so I left that out. This should improve the performance significantly.
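To see what the conditional aggregation returns, here is a minimal sketch run against SQLite from Python. The sample rows, the reduced column list (ID and NAME only) and the restriction to two terms are assumptions made for illustration; like the answer above, it leaves boss_id out. SQLite's DATE() is used here because the question's query calls DATE(); other engines may need CAST or CONVERT instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (ID TEXT, NAME TEXT);
    CREATE TABLE XYZ (ID TEXT, XYZ_ID INT, create_time TEXT);
    -- Made-up rows: D1 has two entries for term 1 and one for term 2,
    -- D2 has none, and E1 is filtered out by the LIKE 'D%' predicate.
    INSERT INTO EMPLOYEE VALUES ('D1', 'Ann'), ('D2', 'Bo'), ('E1', 'Cy');
    INSERT INTO XYZ VALUES
        ('D1', 1, '2020-01-05 10:00:00'),
        ('D1', 1, '2020-03-01 09:30:00'),
        ('D1', 2, '2020-02-10 08:00:00'),
        ('E1', 1, '2020-04-01 12:00:00');
""")

# One pass over XYZ: each CASE keeps create_time only for its own XYZ_ID,
# so MAX() per column yields the latest timestamp for that term.
terms = conn.execute("""
    SELECT e.ID, e.NAME,
           DATE(MAX(CASE WHEN x.XYZ_ID = 1 THEN x.create_time END)) AS term1,
           DATE(MAX(CASE WHEN x.XYZ_ID = 2 THEN x.create_time END)) AS term2
    FROM EMPLOYEE e
    LEFT JOIN XYZ x ON x.ID = e.ID
    WHERE e.ID LIKE 'D%'
    GROUP BY e.ID, e.NAME
    ORDER BY e.ID
""").fetchall()

for row in terms:
    print(row)
```

The point of the rewrite is that XYZ is joined once instead of being probed by eight correlated subqueries per employee; whether that is actually faster depends on the engine and the indexes.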

Conditional Left Join SQL

table A
----------------------------
NAME | CODE | BRANCH
----------------------------
bob | PL | B
david | AA | B
susan | PL | C
joe | AB | C
alfred | PL | B
table B
----------------------------
CODE | DESCRIPTION
----------------------------
PL | code 1
PB | code 2
PC | code 3
table C
----------------------------
CODE | DESCRIPTION
----------------------------
AA | code 4
AB | code 5
AC | code 6
Is there any way to join tables A, B and C without joining every table for every row?
select A.*, COALESCE(B.DESCRIPTION, C.DESCRIPTION) AS DESCRIPTION from A
left join B on A.CODE = B.CODE
left join C on A.CODE = C.CODE
In my real case there will be more than 10 tables to join on the same column.
So I need a conditional left join, something like this:
SELECT A* , DESCRIPTION
FROM A LEFT JOIN (
CASE
WHEN A.CODE = 'B' THEN SELECT * FROM B
WHEN A.CODE = 'C' THEN SELECT * FROM C
END
) BC ON A.CODE = BC.CODE
You cannot use CASE to implement flow control. In SQL CASE is an expression that returns a single value.
You can instead use the following query:
select A.*,
CASE A.BRANCH
WHEN 'B' THEN B.DESCRIPTION
WHEN 'C' THEN C.DESCRIPTION
END AS DESCRIPTION
from A
left join B on A.CODE = B.CODE AND A.BRANCH = 'B'
left join C on A.CODE = C.CODE AND A.BRANCH = 'C'
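As a quick check of the query above, here is a minimal sketch that loads the sample tables from the question into SQLite and runs it from Python. Only the NAME column is selected (instead of A.*) to keep the output short; that is my simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NAME TEXT, CODE TEXT, BRANCH TEXT);
    CREATE TABLE B (CODE TEXT, DESCRIPTION TEXT);
    CREATE TABLE C (CODE TEXT, DESCRIPTION TEXT);
    INSERT INTO A VALUES
        ('bob', 'PL', 'B'), ('david', 'AA', 'B'), ('susan', 'PL', 'C'),
        ('joe', 'AB', 'C'), ('alfred', 'PL', 'B');
    INSERT INTO B VALUES ('PL', 'code 1'), ('PB', 'code 2'), ('PC', 'code 3');
    INSERT INTO C VALUES ('AA', 'code 4'), ('AB', 'code 5'), ('AC', 'code 6');
""")

# The BRANCH test inside each ON clause means a row of A can only ever
# match the lookup table that belongs to its branch.
described = conn.execute("""
    SELECT A.NAME,
           CASE A.BRANCH
                WHEN 'B' THEN B.DESCRIPTION
                WHEN 'C' THEN C.DESCRIPTION
           END AS DESCRIPTION
    FROM A
    LEFT JOIN B ON A.CODE = B.CODE AND A.BRANCH = 'B'
    LEFT JOIN C ON A.CODE = C.CODE AND A.BRANCH = 'C'
    ORDER BY A.NAME
""").fetchall()

for name, description in described:
    print(name, description)
```

With this sample data, david and susan come back with no description, because their codes exist only in the table for the other branch. If the branch should not restrict the lookup, the COALESCE version from the question is the one to use.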
You could use this to generate queries. Then you write a PL/SQL block to loop through all these queries and execute dynamically to give you separate results.
SELECT 'SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN '
|| CASE WHEN A.BRANCH = 'B' THEN 'TABLEB B' END
|| CASE WHEN A.BRANCH = 'C' THEN 'TABLEC C' END
|| ' ON '
|| 'A.CODE = '
|| CASE WHEN A.BRANCH = 'B' THEN 'B.CODE' END
|| CASE WHEN A.BRANCH = 'C' THEN 'C.CODE' END
v_query
FROM TableA A;
Output
V_QUERY
--------------------------------------------------------------------------------
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEC C ON A.CODE = C.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEC C ON A.CODE = C.CODE
SELECT A.* , DESCRIPTION
FROM TABLEA A LEFT JOIN TABLEB B ON A.CODE = B.CODE

SQL - SUM within subquery

I have the following code that looks at the SalesVol of different products and groups it by transaction_week
SELECT a.transaction_week,
SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
WHERE series in (62,236,501,52)
GROUP BY a.transaction_week
ORDER BY a.transaction_week
| tw | SalesVol |
| 1 | 4768 |
| 2 | 4567 |
| 3 | 4354 |
| 4 | 4678 |
I want to be able to have multiple subqueries where I change the series numbers for example.
SELECT a.transaction_week,
(SELECT SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
WHERE series in (62,236,501,52)) as personal care
(SELECT SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
WHERE series in (37,202,203,456)) as white goods
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
GROUP BY a.transaction_week
ORDER BY a.transaction_week
I can't get the subqueries to work: each one gives me the overall sum instead of a value grouped by transaction_week.
Instead of using subqueries, add series to the condition of the CASE expressions:
SELECT a.transaction_week,
sum(CASE WHEN series IN (62,236,501,52) AND record_type IN (6,37,13)
THEN quantity ELSE 0 END) as personal_care,
sum(CASE WHEN series IN (37,202,203,456) AND record_type IN (6,37,13)
THEN quantity ELSE 0 END) as white_goods
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
GROUP BY a.transaction_week
ORDER BY a.transaction_week;
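Here is a minimal sketch of that conditional aggregation, run against SQLite from Python. The question does not show the real table layout, so the single `sales` table and its rows are hypothetical stand-ins for the three joined tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened table standing in for the joined tables.
    CREATE TABLE sales (transaction_week INT, series INT,
                        record_type INT, quantity INT);
    INSERT INTO sales VALUES
        (1,  62,  6,  10),   -- personal care, counted
        (1,  37, 13,   4),   -- white goods, counted
        (1,  62, 99, 100),   -- record_type 99 is ignored
        (2, 236, 37,   7),   -- personal care, counted
        (2, 202,  6,   3),   -- white goods, counted
        (2, 203,  6,   2);   -- white goods, counted
""")

# Each SUM only adds quantity when both the series list and the
# record_type list match; every other row contributes 0.
weekly = conn.execute("""
    SELECT transaction_week,
           SUM(CASE WHEN series IN (62, 236, 501, 52)
                     AND record_type IN (6, 37, 13)
                    THEN quantity ELSE 0 END) AS personal_care,
           SUM(CASE WHEN series IN (37, 202, 203, 456)
                     AND record_type IN (6, 37, 13)
                    THEN quantity ELSE 0 END) AS white_goods
    FROM sales
    GROUP BY transaction_week
    ORDER BY transaction_week
""").fetchall()

for week, personal_care, white_goods in weekly:
    print(week, personal_care, white_goods)
```

Because both sums are computed in the same grouped pass, each column is automatically broken down by transaction_week, which is what the uncorrelated subqueries in the question were missing.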
You are just missing the correlation on a.transaction_week in your subqueries. The JOINs in the outer query are unnecessary.
SELECT a.transaction_week,
(
SELECT SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a2
LEFT JOIN table 2 b ON b.Date = a2.transaction_date
LEFT JOIN table 3 c ON c.sku = a2.product
WHERE series in (62,236,501,52) AND a2.transaction_week = a.transaction_week
) as personal_care,
(
SELECT SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a2
LEFT JOIN table 2 b ON b.Date = a2.transaction_date
LEFT JOIN table 3 c ON c.sku = a2.product
WHERE series in (37,202,203,456) AND a2.transaction_week = a.transaction_week
) as white_goods
FROM table 1 a
GROUP BY a.transaction_week
ORDER BY a.transaction_week
Try this; it should run quickly and meet your requirement:
SELECT a.transaction_week ,
whitegoods.SalesVol AS 'White Goods' ,
personalcare.SalesVol1 AS 'Personal Care'
FROM table1 a
LEFT JOIN table2 b ON b.[Date] = a.transaction_date
LEFT JOIN table3 c ON c.sku = a.product
CROSS APPLY ( SELECT SUM(CASE WHEN record_type IN ( 6, 37, 13 )
THEN quantity
ELSE 0
END) AS SalesVol
FROM table1 a2
WHERE b.[Date] = a2.transaction_date
AND c.sku = a2.product
AND series IN ( 37, 202, 203, 456 )
AND a2.transaction_week = a.transaction_week
) whitegoods
CROSS APPLY ( SELECT SUM(CASE WHEN record_type IN ( 6, 37, 13 )
THEN quantity
ELSE 0
END) AS SalesVol1
FROM table1 a2
WHERE b.[Date] = a2.transaction_date
AND c.sku = a2.product
AND series IN ( 62, 236, 501, 52 )
AND a2.transaction_week = a.transaction_week
) personalcare
GROUP BY a.transaction_week
ORDER BY a.transaction_week
You could also use the UNION operator, which returns one row per week and category rather than one column per category. Each branch needs its own GROUP BY:
SELECT tbl1.transaction_week, tbl1.category, tbl1.SalesVol
FROM
(SELECT a.transaction_week as transaction_week,
'personal care' as category,
SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
WHERE series in (62,236,501,52)
GROUP BY a.transaction_week
UNION ALL
SELECT a.transaction_week as transaction_week,
'white goods' as category,
SUM(CASE WHEN record_type IN (6,37,13) THEN quantity ELSE 0 END) as SalesVol
FROM table 1 a
LEFT JOIN table 2 b ON b.Date = a.transaction_date
LEFT JOIN table 3 c ON c.sku = a.product
WHERE series in (37,202,203,456)
GROUP BY a.transaction_week
) AS tbl1
ORDER BY tbl1.transaction_week, tbl1.category

Find rows where one column value match and other does not

I have two tables A and B
Table A
CODE TYPE
A 1
A 2
A 3
B 1
C 1
C 2
Table B
CODE TYPE
A 1
A 2
A 4
B 2
C 1
C 3
I want to return rows where the CODE is in both tables but the TYPE is not, and where the CODE has more than one TYPE in both tables. So my result would be:
CODE TYPE SOURCE
A 3 Table A
A 4 Table B
C 2 Table A
C 3 Table B
Any help with this?
I think this covers both of your conditions.
select code, coalesce(typeA, typeB) as type, src
from
(
select
coalesce(a.code, b.code) as code,
a.type as typeA,
b.type as typeB,
case when b.type is null then 'A' when a.type is null then 'B' end as src,
count(a.code) over (partition by coalesce(a.code, b.code)) as countA,
count(b.code) over (partition by coalesce(a.code, b.code)) as countB
from
A a full outer join B b
on b.code = a.code and b.type = a.type
) T
where
countA >= 2 and countB >= 2
and (typeA is null or typeB is null)
You can use a full join to see if the code matches and check if the type is null on either of the tables.
select coalesce(a.code,b.code) code, coalesce(a.type,b.type) type,
case when b.type is null then 'A' when a.type is null then 'B' end src
from a
full join b on a.code = b.code and a.type = b.type
where a.type is null or b.type is null
To limit the results to codes which have more than one type, use
select x.code, coalesce(a.type,b.type) type,
case when b.type is null then 'Table A' when a.type is null then 'Table B' end src
from a
full join b on a.code = b.code and a.type = b.type
join (select a.code from a join b on a.code = b.code
group by a.code having count(*) > 1) x on x.code = a.code or x.code = b.code
where a.type is null or b.type is null
order by 1
Using union
with tu as (
select CODE, TYPE, src='Table A'
from TableA
union all
select CODE, TYPE, src='Table B'
from TableB
)
select CODE, TYPE, max(src)
from tu t1
where exists (select 1 from tu t2 where t2.CODE=t1.CODE and t2.src=t1.src and t1.TYPE <> t2.TYPE)
group by CODE, TYPE
having count(*)=1
order by CODE, TYPE
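Here is a minimal sketch of the UNION ALL idea run against SQLite from Python, using the sample rows from the question. It is a variation, not the exact query above: the extra `eligible` CTE is my addition to enforce "more than one TYPE in both tables" explicitly, and the sketch assumes a (CODE, TYPE) pair is never duplicated within one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (CODE TEXT, TYPE INT);
    CREATE TABLE TableB (CODE TEXT, TYPE INT);
    INSERT INTO TableA VALUES ('A',1), ('A',2), ('A',3), ('B',1), ('C',1), ('C',2);
    INSERT INTO TableB VALUES ('A',1), ('A',2), ('A',4), ('B',2), ('C',1), ('C',3);
""")

unmatched = conn.execute("""
    WITH tu AS (
        SELECT CODE, TYPE, 'Table A' AS src FROM TableA
        UNION ALL
        SELECT CODE, TYPE, 'Table B' AS src FROM TableB
    ),
    eligible AS (
        -- Codes that have more than one TYPE in each of the two sources.
        SELECT CODE
        FROM (SELECT CODE, src
              FROM tu
              GROUP BY CODE, src
              HAVING COUNT(DISTINCT TYPE) > 1) per_source
        GROUP BY CODE
        HAVING COUNT(*) = 2
    )
    -- A (CODE, TYPE) pair seen exactly once exists in only one table;
    -- MAX(src) then simply returns that single source label.
    SELECT t.CODE, t.TYPE, MAX(t.src) AS SOURCE
    FROM tu t
    JOIN eligible e ON e.CODE = t.CODE
    GROUP BY t.CODE, t.TYPE
    HAVING COUNT(*) = 1
    ORDER BY t.CODE, t.TYPE
""").fetchall()

for code, type_, source in unmatched:
    print(code, type_, source)
```

Code B drops out because it has only one TYPE in each table, which matches the expected result in the question.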

T-SQL Removing multiple LEFT JOIN

I have the following query. It returns ColA and ColB from TableA and UserName from the Users table, then displays several values from TableB as additional columns. It works, but is there a better way than using these multiple LEFT JOINs?
SELECT a.COlA, a.ColB, u.UserName,
b1.Value,
b2.Value,
b3.Value,
b4.Value
FROM TableA a JOIN Users u ON a.UserId = u.UserId
LEFT JOIN TableB b1 ON a.EventId = b1.EventId AND b1.Code = 5
LEFT JOIN TableB b2 ON a.EventId = b2.EventId AND b2.Code = 15
LEFT JOIN TableB b3 ON a.EventId = b3.EventId AND b3.Code = 18
LEFT JOIN TableB b4 ON a.EventId = b4.EventId AND b4.Code = 40
WHERE (a.UserId = 3) ORDER BY u.UserName ASC
TableB looks like:
Id | EventId | Code | Value
----------------------------
1 | 1 | 5 | textA
2 | 1 | 15 | textB
3 | 1 | 18 | textC
Sometimes Code is missing but for each event there are no duplicated Codes (so each LEFT JOIN is just another cell in the same result record).
I cannot understand why you want to change something that is working, but here's another way (which does those LEFT joins, but in a different way):
SELECT a.COlA, a.ColB, u.UserName,
( SELECT b.Value FROM TableB b WHERE a.EventId = b.EventId AND b.Code = 5 ),
( SELECT b.Value FROM TableB b WHERE a.EventId = b.EventId AND b.Code = 15 ),
( SELECT b.Value FROM TableB b WHERE a.EventId = b.EventId AND b.Code = 18 ),
( SELECT b.Value FROM TableB b WHERE a.EventId = b.EventId AND b.Code = 40 )
FROM TableA a JOIN Users u ON a.UserId = u.UserId
WHERE (a.UserId = 3)
ORDER BY u.UserName ASC
Another option is a single LEFT JOIN with conditional aggregation:
SELECT
a.COlA, a.ColB, u.UserName
,MAX(CASE WHEN b.Code = 5 THEN b.Value END) AS V5
,MAX(CASE WHEN b.Code = 15 THEN b.Value END) AS V15
,MAX(CASE WHEN b.Code = 18 THEN b.Value END) AS V18
,MAX(CASE WHEN b.Code = 40 THEN b.Value END) AS V40
,COUNT(CASE WHEN b.Code NOT IN (5,15,18,40) THEN 1 ELSE NULL END) AS CountVOther
FROM TableA a
INNER JOIN Users u ON a.UserId = u.UserId
LEFT JOIN TableB b ON (a.EventId = b.EventId)
WHERE (a.UserId = 3)
GROUP BY a.colA, a.colB, u.Username
ORDER BY u.UserName ASC
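To check the single-join approach on data shaped like the TableB sample, here is a minimal sketch run against SQLite from Python. The Users and TableA rows, the fourth TableB row and the extra a.EventId in the GROUP BY are assumptions of mine; grouping by EventId keeps two events apart even if they happen to share ColA and ColB. The CASE expressions test Code and return Value, since Code is the column that identifies which cell a row belongs to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users  (UserId INT, UserName TEXT);
    CREATE TABLE TableA (EventId INT, ColA TEXT, ColB TEXT, UserId INT);
    CREATE TABLE TableB (Id INT, EventId INT, Code INT, Value TEXT);
    -- Hypothetical user and events; TableB rows 1-3 mirror the question.
    INSERT INTO Users  VALUES (3, 'carol');
    INSERT INTO TableA VALUES (1, 'a1', 'b1', 3), (2, 'a2', 'b2', 3);
    INSERT INTO TableB VALUES
        (1, 1, 5, 'textA'), (2, 1, 15, 'textB'),
        (3, 1, 18, 'textC'), (4, 2, 40, 'textD');
""")

# TableB is joined once; each MAX(CASE ...) picks the Value of one Code.
# A missing Code leaves its column NULL, just like the LEFT JOIN version.
events = conn.execute("""
    SELECT a.ColA, a.ColB, u.UserName,
           MAX(CASE WHEN b.Code = 5  THEN b.Value END) AS V5,
           MAX(CASE WHEN b.Code = 15 THEN b.Value END) AS V15,
           MAX(CASE WHEN b.Code = 18 THEN b.Value END) AS V18,
           MAX(CASE WHEN b.Code = 40 THEN b.Value END) AS V40
    FROM TableA a
    JOIN Users u ON a.UserId = u.UserId
    LEFT JOIN TableB b ON a.EventId = b.EventId
    WHERE a.UserId = 3
    GROUP BY a.EventId, a.ColA, a.ColB, u.UserName
    ORDER BY a.EventId
""").fetchall()

for row in events:
    print(row)
```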