Get column name where values differ between two rows - sql

I have a table with lots of columns. Sometimes I need to find the differences between two rows. I can do it by scrolling across the screen, but that is tedious. I'm looking for a query that will do this for me, something like
SELECT columns_for_id_1 != columns_for_id_2
FROM xyz
WHERE id in (1,2)
Table:
id col1 col2 col3 col4
1 qqq www eee rrr
2 qqq www XXX rrr
Result:
"Different columns: id, col3"
Is there a simple way to do this?
UPDATE
Another example as wanted:
What I have (the table has more than 50 columns, not only 7):
Id| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
==============================================
1 | aaa | bbb | ccc | ddd | eee | fff |
----------------------------------------------
2 | aaa | XXX | ccc | YYY | eee | fff |
Query:
SELECT *
FROM table
WHERE Id = 1 OR Id = 2
AND "columns value differs"
Query result: "Id, Col2, Col4"
OR something like:
Id|Col2 |Col4 |
===============
1 |bbb |ddd |
---------------
2 |XXX |YYY |
Right now I have to scroll through more than 50 columns to see if the rows are the same, which is inefficient and error-prone. I don't want a long query like
SELECT (COMPARE Id1.Col1 with Id2.Col1 if different then print "Col1 differs", COMPARE Id1.Col2 with Id2.Col2...) because I could compare the rows myself faster ;)

Something like this:
SELECT col, MIN(VAL) AS val1, MAX(val) AS val2
FROM (
SELECT id, val, col
FROM (
SELECT id, [col1], [col2], [col3], [col4]
FROM mytable
WHERE id IN (1,2)) AS src
UNPIVOT (
val FOR col IN ([col1], [col2], [col3], [col4])) AS unpvt ) AS t
GROUP BY col
HAVING MIN(val) <> MAX(val)
Output:
col val1 val2
================
col3 eee XXX
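One caveat with UNPIVOT: it needs every listed column to have the same data type, and it silently drops NULL values, so a column that is NULL in one row and filled in the other would not be reported. A possible workaround, sketched here under the assumption that every column can be cast to NVARCHAR(4000) (my choice, not something from the question), is to unpivot with CROSS APPLY (VALUES ...) instead:

```sql
-- Sketch only: mytable and col1..col4 are the names used above;
-- the NVARCHAR(4000) cast target is an assumption.
SELECT v.col,
       MIN(v.val) AS val1,
       MAX(v.val) AS val2
FROM mytable AS t
CROSS APPLY (VALUES
        ('col1', CAST(t.col1 AS NVARCHAR(4000))),
        ('col2', CAST(t.col2 AS NVARCHAR(4000))),
        ('col3', CAST(t.col3 AS NVARCHAR(4000))),
        ('col4', CAST(t.col4 AS NVARCHAR(4000)))
    ) AS v(col, val)
WHERE t.id IN (1, 2)
GROUP BY v.col
HAVING MIN(v.val) <> MAX(v.val)   -- both values present and different
    OR COUNT(v.val) = 1;          -- one value is NULL, the other is not
```

When only one of the two values is NULL, val1 and val2 both show the non-NULL value, because MIN and MAX skip NULLs.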

Try this simple query; it may help you:
SELECT (CASE WHEN a.col1 <> b.col1 THEN 'Different Col1'
WHEN a.col2 <> b.col2 THEN 'Different Col2'
...
ELSE 'No Different' END) --You can add only required columns here
FROM xyz AS a
INNER JOIN xyz AS b ON b.id = 1 --First Record
WHERE a.id = 2 --Second record to compare
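A single CASE expression stops at the first branch that matches, so the query above names at most one column. To list every differing column, one option is a separate CASE per column, joined with CONCAT (available from SQL Server 2012, which I am assuming here). A minimal sketch using the four columns from the question:

```sql
-- Sketch only: xyz and col1..col4 come from the question;
-- the output alias different_columns is made up for illustration.
SELECT CONCAT(
           CASE WHEN a.col1 <> b.col1 THEN 'col1 ' END,
           CASE WHEN a.col2 <> b.col2 THEN 'col2 ' END,
           CASE WHEN a.col3 <> b.col3 THEN 'col3 ' END,
           CASE WHEN a.col4 <> b.col4 THEN 'col4 ' END
       ) AS different_columns   -- CONCAT treats a NULL branch as ''
FROM xyz AS a
INNER JOIN xyz AS b ON b.id = 1   -- first record
WHERE a.id = 2;                   -- second record to compare
```

Like the original, this does not flag a column that is NULL in only one of the two rows, because a comparison with NULL is never true.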

If you are on SQL Server 2012 or later, you can also use the LEAD/LAG window functions to do this. The MSDN reference is here - https://msdn.microsoft.com/en-us/library/hh213125.aspx
select
id,
col1,
col2,
col3,
col4,
-- strip the trailing comma; STUFF returns NULL for an empty list, which CONCAT treats as ''
concat('Different columns: ', stuff(diff_cols, len(diff_cols), 1, '')) diff_cols
from
(
SELECT
id,
col1,
col2,
col3,
col4,
-- one CASE per column, so every differing column is listed, not just the first;
-- LEAD has no default here, so the last row compares against NULL and lists nothing
concat
(
CASE WHEN LEAD(id) OVER (ORDER BY id) <> id THEN 'id,' END,
CASE WHEN LEAD(col1) OVER (ORDER BY id) <> col1 THEN 'col1,' END,
CASE WHEN LEAD(col2) OVER (ORDER BY id) <> col2 THEN 'col2,' END,
CASE WHEN LEAD(col3) OVER (ORDER BY id) <> col3 THEN 'col3,' END,
CASE WHEN LEAD(col4) OVER (ORDER BY id) <> col4 THEN 'col4,' END
) diff_cols
FROM xyz
) tmp

Related

Row count discrepancy between Intersect and Except queries

I'm getting some strange behaviour using INTERSECT and EXCEPT. tb1 has the fewer rows of the two tables, and the difference in row count between tb1 and the INTERSECT query results is 143 (intersect = 9782, tb1 = 9925).
But when I run the same query with EXCEPT, it returns 24 rows. My understanding is that it should have returned 143 rows, being the rows that didn't match in the INTERSECT query. Could someone help me understand why this might be?
It is possible that both datasets contain duplicate rows (they are subsets of larger data). Could this be the cause of the difference?
SELECT
amount
,date
FROM tb1
INTERSECT
SELECT
amount
,date
FROM tb2
As you're probably already aware, the difference between UNION and UNION ALL is that the former returns a unique result, while the latter doesn't.
The same can be said about INTERSECT versus INTERSECT ALL.
And also about EXCEPT versus EXCEPT ALL.
So when there are duplicates, the totals can differ from what you expect.
Here's a simplified demo to illustrate.
create table TableA (
col1 int not null,
col2 varchar(8)
);
create table TableB (
col1 int not null,
col2 varchar(8)
);
insert into TableA (Col1, Col2) values
(1,'A') -- only A
, (3,'AB') -- 1 in both
, (4,'AAB'), (4,'AAB') -- 2 in A, 1 in B
, (5,'ABB') -- 1 in A, 2 in B
, (6,'AABB'), (6,'AABB') -- 2 in both
, (7, NULL); -- 1 NULL in both
8 rows affected
insert into TableB (Col1, Col2) values
(2,'B') -- only B
, (3,'AB') -- 1 in both
, (4,'AAB') -- 2 in A, 1 in B
, (5,'ABB'), (5,'ABB') -- 1 in A, 2 in B
, (6,'AABB'), (6,'AABB') -- 2 in both
, (7, null); -- 1 NULL in both
8 rows affected
select Col1, Col2 from TableA
intersect
select Col1, Col2 from TableB
order by Col1, Col2
col1 | col2
---: | :---
3 | AB
4 | AAB
5 | ABB
6 | AABB
7 | null
select Col1, Col2 from TableA
intersect all
select Col1, Col2 from TableB
order by Col1, Col2
col1 | col2
---: | :---
3 | AB
4 | AAB
5 | ABB
6 | AABB
6 | AABB
7 | null
select Col1, Col2 from TableA
except
select Col1, Col2 from TableB
order by Col1, Col2
col1 | col2
---: | :---
1 | A
select Col1, Col2 from TableA
except all
select Col1, Col2 from TableB
order by Col1, Col2
col1 | col2
---: | :---
1 | A
4 | AAB
Demo on db<>fiddle here
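Applied to the numbers in the question, and assuming the EXCEPT query was tb1 EXCEPT tb2 on the same two columns: INTERSECT and EXCEPT each return distinct rows, and together they cover every distinct (amount, date) pair in tb1. That gives 9782 + 24 = 9806 distinct pairs, so the remaining 9925 - 9806 = 119 rows of tb1 would be repeated copies. A rough way to check this:

```sql
-- amount and date are the column names from the question;
-- quote date if your database treats it as a keyword.

-- Every row in tb1, duplicates included
SELECT COUNT(*) AS total_rows
FROM tb1;

-- Distinct (amount, date) pairs in tb1
SELECT COUNT(*) AS distinct_rows
FROM (SELECT DISTINCT amount, date FROM tb1) AS d;

-- The pairs that occur more than once, with their number of copies
SELECT amount, date, COUNT(*) AS copies
FROM tb1
GROUP BY amount, date
HAVING COUNT(*) > 1;
```

If total_rows minus distinct_rows comes out at 119, duplicates account for the whole gap.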

Modify query so as to add new row with sum of values in some column

I have a table in SQL Server like the one below, produced by the following code:
select col1, count(*) as col2,
case when col1 = 'aaa' then 'xxx'
when col1 = 'bbb' then 'yyy'
when col1 = 'ccc' then 'zzz'
else 'ttt'
end 'col3'
from table1
group by col1
col1 | col2 | col3
----------------------
aaa | 10 | xxx
bbb | 20 | yyy
ccc | 30 | yyy
How can I modify my query in SQL Server to add a new row with the sum of the values in col2? I need something like this:
col1 | col2 | col3
----------------------
aaa | 10 | xxx
bbb | 20 | yyy
ccc | 30 | yyy
sum | 60 | sum of values in col2
You could use ROLLUP for this. The documentation explains how this works. https://learn.microsoft.com/en-us/sql/t-sql/queries/select-group-by-transact-sql?view=sql-server-ver15
select col1, count(*) as col2,
case when col1 = 'aaa' then 'xxx'
when col1 = 'bbb' then 'yyy'
when col1 = 'ccc' then 'zzz'
else 'ttt'
end 'col3'
from table1
group by rollup(col1)
---EDIT---
Here is the updated code demonstrating how coalesce works.
select coalesce(col1, 'sum') as col1
, count(*) as col2
, case when col1 = 'aaa' then 'xxx'
when col1 = 'bbb' then 'yyy'
when col1 = 'ccc' then 'zzz'
else 'ttt'
end 'col3'
from table1
group by rollup(col1)
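One thing to watch in the COALESCE version: on the total row col1 is NULL, so the CASE falls through to 'ttt', and a genuine NULL in col1 would also be labelled 'sum'. A sketch that uses GROUPING(col1), which returns 1 only on the row produced by ROLLUP, might look like this (the label text is taken from the question):

```sql
-- Sketch only: table1 and the column names come from the question.
SELECT CASE WHEN GROUPING(col1) = 1 THEN 'sum' ELSE col1 END AS col1,
       COUNT(*) AS col2,
       CASE WHEN GROUPING(col1) = 1 THEN 'sum of values in col2'
            WHEN col1 = 'aaa' THEN 'xxx'
            WHEN col1 = 'bbb' THEN 'yyy'
            WHEN col1 = 'ccc' THEN 'zzz'
            ELSE 'ttt'
       END AS col3
FROM table1
GROUP BY ROLLUP(col1);   -- adds one extra row aggregated over all groups
```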
I tend to like GROUPING SETS for such items
Declare @YourTable Table ([col1] varchar(50),[col2] int,[col3] varchar(50))
Insert Into @YourTable Values
('aaa',10,'xxx')
,('bbb',20,'yyy')
,('ccc',30,'yyy')
Select col1 = coalesce(col1,'sum')
,col2 = sum(Col2)
,col3 = coalesce(col3,'sum of values in col2')
from @YourTable
Group by grouping sets ( (col1,col3)
,()
)
Results
col1 col2 col3
aaa 10 xxx
bbb 20 yyy
ccc 30 yyy
sum 60 sum of values in col2

sql selecting unique rows based on a specific column

I have a table like this:
Col1      Col2  Col3  Col4
asasa     1     d     44
asasa     2     sd    34
asasa     3     f     3
dssd      4     d     2
sdsdsd    5     sd    11
dssd      1     dd    34
xxxsdsds  2     d     3
erewer    3     sd    3
I am trying to filter out something like this based on Col1
Col1      Col2  Col3  Col4
asasa     1     d     44
dssd      4     d     2
sdsdsd    5     sd    11
xxxsdsds  2     d     3
erewer    3     sd    3
I am trying to get the all unique rows based on the values in Col1. If I have duplicates in Col1, the first row should be taken.
I tried SELECT Col1 FROM tblname GROUP BY Col1 and got the unique Col1 values, but extending it with * gives me an error.
You should be able to achieve your goal using something like the following:
WITH CTE AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col2) AS rn FROM MyTable
)
SELECT * FROM CTE WHERE rn = 1
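This creates a CTE (Common Table Expression) that adds a ROW_NUMBER partitioned by Col1 and ordered by Col2.
The outer SELECT then keeps only the rows from the CTE whose row number is 1. Note that a table has no inherent row order, so "first" has to be defined by the ORDER BY; with ORDER BY Col2, the dssd group returns the row with Col2 = 1 rather than the one with Col2 = 4 shown in the question.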
Try this:
;WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY (SELECT NULL)) AS RN
FROM tblname
)
SELECT Col1, Col2, Col3, Col4 FROM CTE WHERE RN = 1;
Depending on the flavor of SQL you are using, window functions may help.
In SQL Server, this can be accomplished with the FIRST_VALUE window function like so:
DROP TABLE IF EXISTS #vals;
CREATE TABLE #vals (COL1 VARCHAR(10), COL2 INT, COL3 VARCHAR(5), COL4 INT);
INSERT INTO #vals (COL1, COL2, COL3, COL4)
VALUES ('asasa', 1, 'd', 44),
('asasa', 2, 'sd', 34),
('asasa', 3, 'f', 3),
('dssd' , 4, 'd', 2),
('sdsdsd', 5, 'sd', 11),
('dssd', 1, 'dd', 34),
('xxxsdsds', 2, 'd', 3),
('erewer', 3, 'sd', 3);
SELECT *
FROM #vals
SELECT DISTINCT COL1,
FIRST_VALUE(COL2) OVER (PARTITION BY COL1 ORDER BY Col1) AS Col2,
FIRST_VALUE(COL3) OVER (PARTITION BY COL1 ORDER BY Col1) AS Col3,
FIRST_VALUE(COL4) OVER (PARTITION BY COL1 ORDER BY Col1) AS Col4
FROM #vals AS v1
This returns:
|COL1 | Col2 | Col3 | Col4|
|-----------|-----------|-----------|-------|
|asasa | 1 | d | 44 |
|dssd | 4 | d | 2 |
|erewer | 3 | sd | 3 |
|sdsdsd | 5 | sd | 11 |
|xxxsdsds | 2 | d | 3 |
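which may then be ORDERed in whatever way is needed. Note that ordering each window by Col1, the partition column itself, makes every row in a partition tie, so which row counts as first is not guaranteed; order by a column that defines the intended sequence to make the result deterministic.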
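SELECT DISTINCT may do the trick, although it de-duplicates on every selected column together, so it returns one row per Col1 value only when Col1 is the only column in the select list. Here is a good reference: https://www.w3schools.com/sql/sql_distinct.asp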

How to select last three non-NULL columns across multiple columns

For example, if my dataset looks like this:
id | col1 | col2 | col3 | col4 | col5 | col6
---+------+------+------+------+------+-----
A | a1 | a2 | a3 | a4 | a5 | a6
B | b1 | b2 | b3 | b4 | NULL | NULL
C | c1 | c2 | c3 | NULL | NULL | NULL
The desired output would be:
id | col1 | col2 | col3 | col4 | col5 | col6
---+------+------+------+------+------+-----
A | a4 | a5 | a6 |
B | b2 | b3 | b4 |
C | c1 | c2 | c3 |
Does anyone know how to achieve that?
I just found this thread: https://dba.stackexchange.com/questions/210431/select-first-and-last-non-empty-blank-column-of-a-record-mysql
This allows me to pick the last non-null column, but I have no idea how to get the second-to-last and third-to-last columns at the same time.
This will do what you request (db <> fiddle)
Edit: The initial version probably didn't do what you want if there were fewer than three NOT NULL values in a row. This version will shift them left.
SELECT Id,
CA.Col1,
CA.Col2,
CA.Col3,
NULL AS Col4,
NULL AS Col5,
NULL AS Col6
FROM YourTable
CROSS APPLY (SELECT MAX(CASE WHEN RN = 1 THEN val END) AS Col1,
MAX(CASE WHEN RN = 2 THEN val END) AS Col2,
MAX(CASE WHEN RN = 3 THEN val END) AS Col3
FROM (SELECT val,
ROW_NUMBER() OVER (ORDER BY ord) AS RN
FROM
(SELECT TOP 3 *
FROM (VALUES(1, col1),
(2, col2),
(3, col3),
(4, col4),
(5, col5),
(6, col6) ) v(ord, val)
WHERE val IS NOT NULL
ORDER BY ord DESC
) d1
) d2
) CA
You can also use PIVOT and UNPIVOT to achieve the desired result.
Try the following:
;with cte as
(
select id, cols, col as val, ROW_NUMBER() over (partition by id order by cols desc) rn
from #t
unpivot
(
col for cols in ([col1], [col2], [col3], [col4], [col5], [col6])
)upvt
)
select id, ISNULL([3], '') as col1, ISNULL([2], '') as col2, ISNULL([1], '') as col3, '' col4, '' col5, '' col6
from
(
select id, val, rn from cte
)t
pivot
(
max(val) for rn in ([1], [2], [3])
)pvt
order by 1
Please find the db<>fiddle here.

Query to get previous value

I have a scenario where I need the previous value in a column, but it must not be the same as the current value.
Table A:
+------+------+-------------+
| Col1 | Col2 | Lead_Col2 |
+------+------+-------------+
| 1 | A | NULL |
| 2 | B | A |
| 3 | B | A |
| 4 | C | B |
| 5 | C | B |
| 6 | C | B |
| 7 | D | C |
+------+------+-------------+
As shown above, I need the previous value of Col2 that is not the same as the current value.
Try:
select *
from (select col1,
col2,
lag(col2, 1) over(order by col1) as prev_col2
from table_a)
where col2 <> prev_col2
The name lead_col2 is misleading, because you really want a lag.
Here is a brute force method that uses a correlated subquery to get the index of the value and then joins the value in:
select aa.col1, aa.col2, aprev.col2 as prev_col2
from (select a.col1, a.col2,
-- col1 of the latest earlier row whose col2 differs from this row's
(select max(a2.col1)
from a a2
where a2.col1 < a.col1 and a2.col2 <> a.col2
) as prev_col1
from a
) aa left join
a aprev
on aa.prev_col1 = aprev.col1
EDIT:
You can also use logic with lead() and IGNORE NULLS. If a value is the last in its sequence, keep that value; otherwise set it to NULL. Then use lag() with IGNORE NULLS (this needs a database that supports the option):
select col1, col2,
lag(col3) ignore nulls over (order by col1) as prev_col2
from (select col1, col2,
(case when col2 <> lead(col2) over (order by col1) then col2
end) as col3
from a
) a;
Try this:
select t.col1
,t.col2
,first_value(lag_col2) over (partition by col2 order by ord) lag_col2
from (select t.*
,case when lag_col2 = col2 then 1 else 0 end ord
from (select t.*
,lag (col2) over (order by col1) lag_col2
from table1 t
)t
)t
order by col1
SQL Fiddle