How to select last three non-NULL columns across multiple columns - sql

For example, if my dataset looks like this:
id | col1 | col2 | col3 | col4 | col5 | col6
---+------+------+------+------+------+-----
A | a1 | a2 | a3 | a4 | a5 | a6
B | b1 | b2 | b3 | b4 | NULL | NULL
C | c1 | c2 | c3 | NULL | NULL | NULL
The desired output would be:
id | col1 | col2 | col3 | col4 | col5 | col6
---+------+------+------+------+------+-----
A | a4 | a5 | a6 |
B | b2 | b3 | b4 |
C | c1 | c2 | c3 |
Does anyone know how to achieve that?
I just found this thread: https://dba.stackexchange.com/questions/210431/select-first-and-last-non-empty-blank-column-of-a-record-mysql
This allows me to pick the last non-NULL column, but I have no idea how to get the second-to-last and third-to-last columns at the same time as well.

This will do what you request (db <> fiddle)
Edit: The initial version probably didn't do what you want if there were fewer than three NOT NULL values in a row. This version will shift them left.
SELECT Id,
CA.Col1,
CA.Col2,
CA.Col3,
NULL AS Col4,
NULL AS Col5,
NULL AS Col6
FROM YourTable
CROSS APPLY (SELECT MAX(CASE WHEN RN = 1 THEN val END) AS Col1,
MAX(CASE WHEN RN = 2 THEN val END) AS Col2,
MAX(CASE WHEN RN = 3 THEN val END) AS Col3
FROM (SELECT val,
ROW_NUMBER() OVER (ORDER BY ord) AS RN
FROM
(SELECT TOP 3 *
FROM (VALUES(1, col1),
(2, col2),
(3, col3),
(4, col4),
(5, col5),
(6, col6) ) v(ord, val)
WHERE val IS NOT NULL
ORDER BY ord DESC
) d1
) d2
) CA

You can also use pivot and unpivot to achieve the desired result.
Try the following:
;with cte as
(
select id, cols, col as val, ROW_NUMBER() over (partition by id order by cols desc) rn
from #t
unpivot
(
col for cols in ([col1], [col2], [col3], [col4], [col5], [col6])
)upvt
)
select id, ISNULL([3], '') as col1, ISNULL([2], '') as col2, ISNULL([1], '') as col3, '' col4, '' col5, '' col6
from
(
select id, val, rn from cte
)t
pivot
(
max(val) for rn in ([1], [2], [3])
)pvt
order by 1
Please find the db<>fiddle here.
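For engines that lack CROSS APPLY, UNPIVOT or PIVOT, the same idea can be sketched in portable SQL. The following is a minimal sketch, not either answerer's method: it runs the query through Python's sqlite3 module, and the table name t, the CTE names and the choice of SQLite are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id TEXT, col1 TEXT, col2 TEXT, col3 TEXT,"
    " col4 TEXT, col5 TEXT, col6 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("A", "a1", "a2", "a3", "a4", "a5", "a6"),
        ("B", "b1", "b2", "b3", "b4", None, None),
        ("C", "c1", "c2", "c3", None, None, None),
    ],
)

query = """
WITH unpivoted AS (          -- one row per (id, column position, value)
    SELECT id, 1 AS ord, col1 AS val FROM t
    UNION ALL SELECT id, 2, col2 FROM t
    UNION ALL SELECT id, 3, col3 FROM t
    UNION ALL SELECT id, 4, col4 FROM t
    UNION ALL SELECT id, 5, col5 FROM t
    UNION ALL SELECT id, 6, col6 FROM t
),
last_three AS (              -- rank the non-NULL values from the right
    SELECT id, ord, val,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY ord DESC) AS rn_desc
    FROM unpivoted
    WHERE val IS NOT NULL
),
shifted AS (                 -- renumber the kept values from the left
    SELECT id, val,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY ord) AS rn
    FROM last_three
    WHERE rn_desc <= 3
)
SELECT id,
       MAX(CASE WHEN rn = 1 THEN val END) AS col1,
       MAX(CASE WHEN rn = 2 THEN val END) AS col2,
       MAX(CASE WHEN rn = 3 THEN val END) AS col3
FROM shifted
GROUP BY id
ORDER BY id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One difference worth noting: a row whose six columns are all NULL disappears from this result, because it contributes nothing to the filtered CTE. Left-joining the result back to t would keep such rows.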

Related

sql selecting unique rows based on a specific column

I have a table like this:
Col1 Col2 Col3 Col4
asasa 1 d 44
asasa 2 sd 34
asasa 3 f 3
dssd 4 d 2
sdsdsd 5 sd 11
dssd 1 dd 34
xxxsdsds 2 d 3
erewer 3 sd 3
I am trying to filter out something like this based on Col1
Col1 Col2 Col3 Col4
asasa 1 d 44
dssd 4 d 2
sdsdsd 5 sd 11
xxxsdsds 2 d 3
erewer 3 sd 3
I am trying to get all unique rows based on the values in Col1. If I have duplicates in Col1, the first row should be taken.
I tried SELECT Col1 FROM tblname GROUP BY Col1 and got the unique Col1 values, but extending it with * gives me an error.
You should be able to achieve your goal using something like the following:
WITH CTE AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col2) AS rn FROM MyTable
)
SELECT * FROM CTE WHERE rn = 1
It creates a CTE (Common Table Expression) that adds a ROW_NUMBER partitioned by Col1 and ordered by the data in Col2.
In the outer select, we then only grab the rows from the CTE where the row number generated is 1.
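The choice of ORDER BY column decides which row counts as "first". As a rough check, the sketch below runs the ROW_NUMBER pattern in SQLite via Python's sqlite3 module. The seq column (recording load order) and the helper first_per_group are hypothetical additions for illustration; the original table has no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical column recording the order in which rows were loaded.
conn.execute(
    "CREATE TABLE MyTable (seq INTEGER, Col1 TEXT, Col2 INTEGER,"
    " Col3 TEXT, Col4 INTEGER)"
)
data = [
    ("asasa", 1, "d", 44), ("asasa", 2, "sd", 34), ("asasa", 3, "f", 3),
    ("dssd", 4, "d", 2), ("sdsdsd", 5, "sd", 11), ("dssd", 1, "dd", 34),
    ("xxxsdsds", 2, "d", 3), ("erewer", 3, "sd", 3),
]
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [(i, *row) for i, row in enumerate(data, start=1)],
)


def first_per_group(order_column):
    """Return one row per Col1, the first according to order_column."""
    # Only fixed, known column names are ever spliced into the SQL text.
    if order_column not in ("Col2", "seq"):
        raise ValueError("unexpected column name")
    sql = f"""
        WITH cte AS (
            SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY Col1
                                      ORDER BY {order_column}) AS rn
            FROM MyTable
        )
        SELECT Col1, Col2, Col3, Col4 FROM cte WHERE rn = 1 ORDER BY Col1
    """
    return conn.execute(sql).fetchall()


by_col2 = first_per_group("Col2")  # smallest Col2 in each group
by_seq = first_per_group("seq")    # earliest loaded row in each group
print(by_col2)
print(by_seq)
```

Ordering by Col2 picks ('dssd', 1, 'dd', 34) for the dssd group, whereas the output requested in the question keeps ('dssd', 4, 'd', 2). Reproducing "first row as listed" needs a column that actually records that order, since a table has no inherent row order.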
Try this
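;WITH CTE AS (
SELECT *,
-- ORDER BY (SELECT NULL) leaves the choice of row within each Col1 group arbitrary
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY(SELECT NULL))RN
FROM tblname
)
SELECT Col1, Col2, Col3, Col4 FROM CTE WHERE RN = 1;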
Depending on the flavor of SQL that you are using, window functions may help you.
In SQL Server, this can be accomplished with the FIRST_VALUE window function like so:
DROP TABLE IF EXISTS #vals;
CREATE TABLE #vals (COL1 VARCHAR(10), COL2 INT, COL3 VARCHAR(5), COL4 INT);
INSERT INTO #vals (COL1, COL2, COL3, COL4)
VALUES ('asasa', 1, 'd', 44),
('asasa', 2, 'sd', 34),
('asasa', 3, 'f', 3),
('dssd' , 4, 'd', 2),
('sdsdsd', 5, 'sd', 11),
('dssd', 1, 'dd', 34),
('xxxsdsds', 2, 'd', 3),
('erewer', 3, 'sd', 3);
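SELECT *
FROM #vals;
-- Note: ORDER BY Col1 inside PARTITION BY Col1 ties every row in a group,
-- so which row counts as "first" is not guaranteed. Order by a column that
-- defines the intended sequence if one exists.
SELECT DISTINCT COL1,
FIRST_VALUE(COL2) OVER (PARTITION BY COL1 ORDER BY Col1) AS Col2,
FIRST_VALUE(COL3) OVER (PARTITION BY COL1 ORDER BY Col1) AS Col3,
FIRST_VALUE(COL4) OVER (PARTITION BY COL1 ORDER BY Col1) AS Col4
FROM #vals AS v1;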
This returns:
|COL1 | Col2 | Col3 | Col4|
|-----------|-----------|-----------|-------|
|asasa | 1 | d | 44 |
|dssd | 4 | d | 2 |
|erewer | 3 | sd | 3 |
|sdsdsd | 5 | sd | 11 |
|xxxsdsds | 2 | d | 3 |
which may then be ORDERed in whatever way is needed.
SELECT DISTINCT should do the trick. Here is a good reference: https://www.w3schools.com/sql/sql_distinct.asp

How to use SQL DISTINCT to remove duplicates from multiple columns?

Let's say I have a table with lots of duplicated values. I want to remove the duplicates for each column individually. Using DISTINCT removes duplicate combinations of columns so other columns still contain duplicated values.
Original table is:
Col1 | Col2 | Col3
-----+------+------
a1 | b1 | c1
a1 | b2 | c1
a2 | b1 | NULL
a2 | b2 | c1
a3 | b1 | c1
a3 | NULL | NULL
My desired result is:
Col1 | Col2 | Col3
-----+------+------
a1 | b1 | c1
a2 | b2 | NULL
a3 | NULL | NULL
I can get this result with several separate queries:
SELECT DISTINCT Col1
FROM TABLE
SELECT DISTINCT Col2
FROM TABLE
SELECT DISTINCT Col3
FROM TABLE
But how can I do it in a single query and return the result in one result set?
Thanks
I'd use a group by...
;WITH c1 AS (
SELECT col1
, ROW_NUMBER() OVER (ORDER BY col1) AS [r]
FROM #foo
WHERE col1 IS NOT NULL
GROUP BY col1
)
, c2 AS (
SELECT col2
, ROW_NUMBER() OVER (ORDER BY col2) as [r]
FROM #foo
WHERE col2 IS NOT NULL
GROUP BY col2
)
, c3 AS (
SELECT col3
, ROW_NUMBER() OVER (ORDER BY col3) as [r]
FROM #foo
WHERE col3 IS NOT NULL
GROUP BY col3
)
select c1.col1
, c2.col2
, c3.col3
from c1 LEFT join c2
on c1.r = c2.r
left join c3
on c1.r = c3.r
ORDER BY c1.r ASC;
I wasn't quite sure from the problem description what you wanted. I crafted this based on the ideal-output provided.
Here is the sample data set I used.
CREATE TABLE #foo (
col1 char(2)
, col2 char(2)
, col3 char(2)
);
INSERT INTO #foo (col1, col2, col3)
VALUES ('a1', 'b2', null)
, ('a1', 'b1', 'c1')
, ('a2', Null, 'c1')
, ('a2', 'b1', null)
, ('a3', null, 'c1')
GO
Hope this helps!
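To see what the query above produces for that sample data, here is a minimal sketch that runs an equivalent statement in SQLite through Python's sqlite3 module. The permanent table name foo, the helper numbered and the use of SELECT DISTINCT in place of GROUP BY are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [
        ("a1", "b2", None),
        ("a1", "b1", "c1"),
        ("a2", None, "c1"),
        ("a2", "b1", None),
        ("a3", None, "c1"),
    ],
)


def numbered(col):
    """SQL text: distinct non-NULL values of one column, numbered in order."""
    # col is always one of the three fixed names used below.
    return (
        f"SELECT {col}, ROW_NUMBER() OVER (ORDER BY {col}) AS r "
        f"FROM (SELECT DISTINCT {col} FROM foo WHERE {col} IS NOT NULL) AS d"
    )


query = f"""
WITH c1 AS ({numbered('col1')}),
     c2 AS ({numbered('col2')}),
     c3 AS ({numbered('col3')})
SELECT c1.col1, c2.col2, c3.col3
FROM c1
LEFT JOIN c2 ON c1.r = c2.r
LEFT JOIN c3 ON c1.r = c3.r
ORDER BY c1.r
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because c1 drives both joins, any column with more distinct values than col1 would lose its extra values. If that can happen, the numbering would need to come from a source that covers the longest column.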
You can UNION those three queries together:
SELECT DISTINCT Col1 FROM TABLE
UNION
SELECT DISTINCT Col2 FROM TABLE
UNION
SELECT DISTINCT Col3 FROM TABLE
This requires that all three fields be of the same type (can't mix numbers and strings and dates).
This smells of bad design though. If you find yourself unioning these often then perhaps change your table to look like the UNION'd results.
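Note that the UNION form returns a single column holding every distinct value, not three side-by-side columns. A small sketch using the question's data, run in SQLite through Python's sqlite3 module (the table name tbl is an assumption, since TABLE is a reserved word):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Col1 TEXT, Col2 TEXT, Col3 TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("a1", "b1", "c1"),
        ("a1", "b2", "c1"),
        ("a2", "b1", None),
        ("a2", "b2", "c1"),
        ("a3", "b1", "c1"),
        ("a3", None, None),
    ],
)

rows = conn.execute(
    """
    SELECT DISTINCT Col1 FROM tbl
    UNION
    SELECT DISTINCT Col2 FROM tbl
    UNION
    SELECT DISTINCT Col3 FROM tbl
    """
).fetchall()

# Every result row has exactly one field; collect the values into a set.
values = {row[0] for row in rows}
print(values)
```

UNION also removes duplicates across the three queries, so NULL appears only once in the combined column.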

Get column name where values differ between two rows

I have a table with lots of columns. Sometimes I need to find differences between two rows. I can do it by scrolling across the screen, but that is tedious. I'm looking for a query that will do this for me, something like
SELECT columns_for_id_1 != columns_for_id_2
FROM xyz
WHERE id in (1,2)
Table:
id col1 col2 col3 col4
1 qqq www eee rrr
2 qqq www XXX rrr
Result:
"Different columns: id, col3"
Is there a simple way to do this?
UPDATE
Another example as wanted:
What I have (table has more than 50 column, not only 7):
Id| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
==============================================
1 | aaa | bbb | ccc | ddd | eee | fff |
----------------------------------------------
2 | aaa | XXX | ccc | YYY | eee | fff |
Query:
SELECT *
FROM table
WHERE Id = 1 OR Id = 2
AND "columns value differs"
Query result: "Id, Col2, Col4"
OR something like:
Id|Col2 |Col4 |
===============
1 |bbb |ddd |
---------------
2 |XXX |YYY |
Right now I have to scroll through more than 50 columns to see if rows are the same, it's not efficient and prone to mistakes. I don't want any long query like
SELECT (COMPARE Id1.Col1 with Id2.Col1 if different then print "Col1 differs", COMPARE Id1.Col2 with Id2.Col2...) because I will compare the rows myself faster ;)
Something like this:
SELECT col, MIN(VAL) AS val1, MAX(val) AS val2
FROM (
SELECT id, val, col
FROM (
SELECT id, [col1], [col2], [col3], [col4]
FROM mytable
WHERE id IN (1,2)) AS src
UNPIVOT (
val FOR col IN ([col1], [col2], [col3], [col4])) AS unpvt ) AS t
GROUP BY col
HAVING MIN(val) <> MAX(val)
Output:
col val1 val2
================
col3 eee XXX
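With more than 50 columns, listing every column name in the UNPIVOT clause gets long. A different technique, shown here only as a sketch, is to compare the two rows on the client side, where the column names come from the result set itself. This uses Python's sqlite3 module with an in-memory copy of the question's table; the helper differing_columns is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows expose their column names via keys()
conn.execute(
    "CREATE TABLE xyz (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT)"
)
conn.executemany(
    "INSERT INTO xyz VALUES (?, ?, ?, ?, ?)",
    [
        (1, "qqq", "www", "eee", "rrr"),
        (2, "qqq", "www", "XXX", "rrr"),
    ],
)


def differing_columns(id_a, id_b):
    """Return the names of the columns whose values differ between two rows."""
    row_a = conn.execute("SELECT * FROM xyz WHERE id = ?", (id_a,)).fetchone()
    row_b = conn.execute("SELECT * FROM xyz WHERE id = ?", (id_b,)).fetchone()
    # Two NULLs compare equal here (None == None), unlike in SQL.
    return [name for name in row_a.keys() if row_a[name] != row_b[name]]


diff = differing_columns(1, 2)
print("Different columns: " + ", ".join(diff))
```

The column list never has to be typed out, so the same helper works unchanged when the table grows.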
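Try this simple query; it may help you. Note that a CASE expression stops at its first matching branch, so this reports only the first differing column: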
SELECT (CASE WHEN a.col1 <> b.col1 THEN 'Different Col1'
WHEN a.col2 <> b.col2 THEN 'Different Col2'
...
ELSE 'No Different' END) --You can add only required columns here
FROM xyz AS a
INNER JOIN xyz AS b ON b.id = 1 --First Record
WHERE a.id = 2 --Second record to compare
If you are on SQL Server 2012 or later, you can also use the LEAD/LAG window functions to do this. The MSDN reference is here: https://msdn.microsoft.com/en-us/library/hh213125.aspx
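select
id,
col1,
col2,
col3,
col4,
-- Remove the final character: the trailing comma, or the ':' when nothing differs.
stuff(diff_cols,len(diff_cols),1,'') diff_cols
from
(
SELECT
id,
col1,
col2,
col3,
col4,
-- One CASE per column so that every differing column is listed.
-- LEAD without a default yields NULL on the last row, so no branch matches there.
concat
(
'Different columns:',
CASE WHEN LEAD(id) OVER (ORDER BY id) <> id THEN 'id,' END,
CASE WHEN LEAD(col1) OVER (ORDER BY id) <> col1 THEN 'col1,' END,
CASE WHEN LEAD(col2) OVER (ORDER BY id) <> col2 THEN 'col2,' END,
CASE WHEN LEAD(col3) OVER (ORDER BY id) <> col3 THEN 'col3,' END,
CASE WHEN LEAD(col4) OVER (ORDER BY id) <> col4 THEN 'col4,' END
) diff_cols
FROM xyz
) tmp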

Query to get previous value

I have a scenario where I need the previous column value, but it should not be the same as the current column value.
Table A:
+------+------+-------------+
| Col1 | Col2 | Lead_Col2 |
+------+------+-------------+
| 1 | A | NULL |
| 2 | B | A |
| 3 | B | A |
| 4 | C | B |
| 5 | C | B |
| 6 | C | B |
| 7 | D | C |
+------+------+-------------+
As shown above, I need the previous Col2 value that is not the same as the current value.
Try:
select *
from (select col1,
col2,
lag(col2, 1) over(order by col1) as prev_col2
from table_a)
where col2 <> prev_col2
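A quick way to see what a plain LAG with this filter returns is to run it against the question's data. The sketch below does so in SQLite through Python's sqlite3 module; the subquery alias t is added for portability and is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "B"), (4, "C"), (5, "C"), (6, "C"), (7, "D")],
)

rows = conn.execute(
    """
    SELECT *
    FROM (SELECT col1,
                 col2,
                 LAG(col2, 1) OVER (ORDER BY col1) AS prev_col2
          FROM table_a) AS t
    WHERE col2 <> prev_col2
    ORDER BY col1
    """
).fetchall()

for row in rows:
    print(row)
```

On this data the filter keeps only the rows where the value changes (col1 = 2, 4 and 7), so the repeated rows from the expected output are dropped rather than filled with the earlier value.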
The name lead_col2 is misleading, because you really want a lag.
Here is a brute force method that uses a correlated subquery to get the index of the value and then joins the value in:
select aa.col1, aa.col2, a.col2 as prev_col2
from (select col1, col2,
(select max(a2.col1)
from a a2
where a2.col1 < a.col1 and a2.col2 <> a.col2
) as prev_col1
from a
) aa left join
a
on aa.prev_col1 = a.col1
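The correlated-subquery idea can be checked against the question's data. This is a minimal sketch in SQLite via Python's sqlite3 module, assuming col1 is the ordering column and using the illustrative names prev_col1 (the position of the previous different value) and p (the joined copy of the table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "B"), (4, "C"), (5, "C"), (6, "C"), (7, "D")],
)

rows = conn.execute(
    """
    SELECT aa.col1, aa.col2, p.col2 AS prev_col2
    FROM (SELECT col1, col2,
                 -- latest earlier row whose value differs from this one
                 (SELECT MAX(a2.col1)
                  FROM a AS a2
                  WHERE a2.col1 < a.col1 AND a2.col2 <> a.col2) AS prev_col1
          FROM a) AS aa
    LEFT JOIN a AS p ON aa.prev_col1 = p.col1
    ORDER BY aa.col1
    """
).fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps the first row, which has no earlier different value and therefore gets NULL.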
EDIT:
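You can also use logic with lead() and ignore nulls. If a value is the last in its sequence, keep that value; otherwise set it to NULL. Then use lag() with ignore nulls:
select col1, col2,
-- IGNORE NULLS goes between the function call and OVER (SQL standard / Oracle form); not every engine supports it
lag(col3) ignore nulls over (order by col1) as prev_col2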
from (select col1, col2,
(case when col2 <> lead(col2) over (order by col1) then col2
end) as col3
from a
) a;
Try this:
select t.col1
,t.col2
,first_value(lag_col2) over (partition by col2 order by ord) lag_col2
from (select t.*
,case when lag_col2 = col2 then 1 else 0 end ord
from (select t.*
,lag (col2) over (order by col1) lag_col2
from table1 t
)t
)t
order by col1

PIVOT entire column set by group

For a table like:
COL1 COL2 COL3 COL4
item1 7/29/13 cat blue
item3 7/29/13 fish purple
item1 7/30/13 rat green
item2 7/30/13 bat grey
item3 7/30/13 bird orange
How would you PIVOT to get rows by COL2, all other columns repeated across as blocks by COL1 values?
COL2 COL1 COL3 COL4 COL1 COL3 COL4 COL1 COL3 COL4
7/29/13 item1 cat blue item2 NULL NULL item3 fish purple
7/30/13 item1 rat green item2 bat grey item3 bird orange
In order to get this result you will need to do a few things:
1. get a distinct list of values from col1 and col2
2. unpivot the data in your columns col1, col3 and col4
3. pivot the result from the unpivot
To get the distinct list of dates and items (col1 and col2) along with the values from your existing table you will need to use something similar to the following:
select t.col1, t.col2,
t2.col3, t2.col4,
row_number() over(partition by t.col2
order by t.col1) seq
from
(
select distinct t.col1, c.col2
from yourtable t
cross join
(
select distinct col2
from yourtable
) c
) t
left join yourtable t2
on t.col1 = t2.col1
and t.col2 = t2.col2;
See SQL Fiddle with Demo. Once you have this list, then you will need to unpivot the data. There are several ways you can do this, using the UNPIVOT function or using CROSS APPLY:
select d.col2,
col = col+'_'+cast(seq as varchar(10)),
value
from
(
select t.col1, t.col2,
t2.col3, t2.col4,
row_number() over(partition by t.col2
order by t.col1) seq
from
(
select distinct t.col1, c.col2
from yourtable t
cross join
(
select distinct col2
from yourtable
) c
) t
left join yourtable t2
on t.col1 = t2.col1
and t.col2 = t2.col2
) d
cross apply
(
select 'col1', col1 union all
select 'col3', col3 union all
select 'col4', col4
) c (col, value);
See SQL Fiddle with Demo. This will give you data that looks like:
| COL2 | COL | VALUE |
-------------------------------------------------
| July, 29 2013 00:00:00+0000 | col1_1 | item1 |
| July, 29 2013 00:00:00+0000 | col3_1 | cat |
| July, 29 2013 00:00:00+0000 | col4_1 | blue |
| July, 29 2013 00:00:00+0000 | col1_2 | item2 |
| July, 29 2013 00:00:00+0000 | col3_2 | (null) |
| July, 29 2013 00:00:00+0000 | col4_2 | (null) |
Finally, you will apply the PIVOT function to the items in the col columns:
select col2,
col1_1, col3_1, col4_1,
col1_2, col3_2, col4_2,
col1_3, col3_3, col4_3
from
(
select d.col2,
col = col+'_'+cast(seq as varchar(10)),
value
from
(
select t.col1, t.col2,
t2.col3, t2.col4,
row_number() over(partition by t.col2
order by t.col1) seq
from
(
select distinct t.col1, c.col2
from yourtable t
cross join
(
select distinct col2
from yourtable
) c
) t
left join yourtable t2
on t.col1 = t2.col1
and t.col2 = t2.col2
) d
cross apply
(
select 'col1', col1 union all
select 'col3', col3 union all
select 'col4', col4
) c (col, value)
) src
pivot
(
max(value)
for col in (col1_1, col3_1, col4_1,
col1_2, col3_2, col4_2,
col1_3, col3_3, col4_3)
)piv;
See SQL Fiddle with Demo. If you have an unknown number of values, then you can use dynamic SQL to get the result:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(col+'_'+cast(seq as varchar(10)))
from
(
select row_number() over(partition by col2
order by col1) seq
from yourtable
) t
cross apply
(
select 'col1', 1 union all
select 'col3', 2 union all
select 'col4', 3
) c (col, so)
group by col, seq, so
order by seq, so
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT col2, ' + @cols + '
from
(
select d.col2,
col = col+''_''+cast(seq as varchar(10)),
value
from
(
select t.col1, t.col2,
t2.col3, t2.col4,
row_number() over(partition by t.col2
order by t.col1) seq
from
(
select distinct t.col1, c.col2
from yourtable t
cross join
(
select distinct col2
from yourtable
) c
) t
left join yourtable t2
on t.col1 = t2.col1
and t.col2 = t2.col2
) d
cross apply
(
select ''col1'', col1 union all
select ''col3'', col3 union all
select ''col4'', col4
) c (col, value)
) x
pivot
(
max(value)
for col in (' + @cols + ')
) p '
execute sp_executesql @query;
See SQL Fiddle with Demo. All versions will give a result:
| COL2 | COL1_1 | COL3_1 | COL4_1 | COL1_2 | COL3_2 | COL4_2 | COL1_3 | COL3_3 | COL4_3 |
----------------------------------------------------------------------------------------------------------------
| July, 29 2013 00:00:00+0000 | item1 | cat | blue | item2 | (null) | (null) | item3 | fish | purple |
| July, 30 2013 00:00:00+0000 | item1 | rat | green | item2 | bat | grey | item3 | bird | orange |
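For comparison, the same shape can be produced without UNPIVOT or PIVOT by using conditional aggregation, with the column list generated by the client instead of dynamic SQL. This is a sketch of a swapped-in technique, not the approach above: it runs in SQLite through Python's sqlite3 module, and the CTE names grid and numbered are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        ("item1", "7/29/13", "cat", "blue"),
        ("item3", "7/29/13", "fish", "purple"),
        ("item1", "7/30/13", "rat", "green"),
        ("item2", "7/30/13", "bat", "grey"),
        ("item3", "7/30/13", "bird", "orange"),
    ],
)

# The number of column blocks equals the number of distinct items.
(n_items,) = conn.execute(
    "SELECT COUNT(DISTINCT col1) FROM yourtable"
).fetchone()

# Build col1_1, col3_1, col4_1, col1_2, ... from integers and fixed names
# only, so no data values are ever spliced into the SQL text.
select_list = ",\n       ".join(
    f"MAX(CASE WHEN seq = {seq} THEN {col} END) AS {col}_{seq}"
    for seq in range(1, n_items + 1)
    for col in ("col1", "col3", "col4")
)

query = f"""
WITH grid AS (               -- every item paired with every date
    SELECT i.col1, d.col2
    FROM (SELECT DISTINCT col1 FROM yourtable) AS i
    CROSS JOIN (SELECT DISTINCT col2 FROM yourtable) AS d
),
numbered AS (                -- attach the data and number items per date
    SELECT g.col2, g.col1, t.col3, t.col4,
           ROW_NUMBER() OVER (PARTITION BY g.col2 ORDER BY g.col1) AS seq
    FROM grid AS g
    LEFT JOIN yourtable AS t
           ON t.col1 = g.col1 AND t.col2 = g.col2
)
SELECT col2,
       {select_list}
FROM numbered
GROUP BY col2
ORDER BY col2
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The dates are stored as plain text here, so ORDER BY col2 sorts them as strings; a real date type would be the safer choice for larger data.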
The dynamic UNPIVOT+PIVOT method is always cool. For a known and limited set of values, successive JOINs work nicely too (being lazy with the SELECT list):
WITH cte AS (SELECT *,ROW_NUMBER() OVER (PARTITION BY COL2 ORDER BY COL1)'RowRank'
FROM #Table1)
SELECT *
FROM cte a
LEFT JOIN cte b
ON a.COL2 = b.COL2
AND a.RowRank = b.RowRank - 1
LEFT JOIN cte c
ON b.COL2 = c.COL2
AND b.RowRank = c.RowRank - 1
WHERE a.RowRank = 1
Or if the order of the fields is to be maintained:
WITH cte AS (SELECT a.*,b.RowRank
FROM #Table1 a
JOIN (SELECT Col1,ROW_NUMBER() OVER (ORDER BY Col1)'RowRank'
FROM #Table1
GROUP BY COL1) b
ON a.Col1 = b.Col1)
SELECT *
FROM cte a
LEFT JOIN cte b
ON a.COL2 = b.COL2
AND a.RowRank = b.RowRank - 1
LEFT JOIN cte c
ON a.COL2 = c.COL2
AND a.RowRank = c.RowRank - 2
WHERE a.RowRank = 1
But this falls apart without an 'anchor' value, i.e. if no record had item1 for a given date, that date wouldn't be included.