SQL Server: make 1 column from X columns

I have an example table like this:

Column1 | Column2 | Column3 | Column4 | Column5
--------|---------|---------|---------|--------
Dog     | 456     | Long    | Short   | Small
Car     | 454     | Blue    | NULL    | NULL
Fruit   | 466     | Apple   | Pear    | NULL
And I expect a table like this, where the X columns are merged into 1 column:
Column1 | Column2 | Column3
--------|---------|--------
Dog     | 456     | Long
Dog     | 456     | Short
Dog     | 456     | Small
Car     | 454     | Blue
Fruit   | 466     | Apple
Fruit   | 466     | Pear
Thank you for your opinions; the real tables have over a thousand rows.
I could do this in Excel and then import the data back into SQL Server, but I would rather do it in SQL.

You can use unpivot as below:
Select [Column1], [Column2], [cols]
from #unpivotdata
unpivot( cols for col in([column3],[column4],[column5])) u
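For reference, a minimal sketch of the assumed #unpivotdata setup (the name and column types are guesses inferred from the question's sample data):

-- assumed temp-table setup; types inferred from the sample rows
CREATE TABLE #unpivotdata
    ([Column1] varchar(50), [Column2] int,
     [Column3] varchar(50), [Column4] varchar(50), [Column5] varchar(50));

INSERT INTO #unpivotdata VALUES
    ('Dog',   456, 'Long',  'Short', 'Small'),
    ('Car',   454, 'Blue',  NULL,    NULL),
    ('Fruit', 466, 'Apple', 'Pear',  NULL);

Note that UNPIVOT silently drops rows where the source value is NULL, which is exactly what the expected output calls for.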
Output as below:
+---------+---------+-------+
| Column1 | Column2 | cols  |
+---------+---------+-------+
| Dog     | 456     | Long  |
| Dog     | 456     | Short |
| Dog     | 456     | Small |
| Car     | 454     | Blue  |
| Fruit   | 466     | Apple |
| Fruit   | 466     | Pear  |
+---------+---------+-------+

SELECT Column1, Column2, Column3 FROM table WHERE Column3 IS NOT NULL
UNION ALL
SELECT Column1, Column2, Column4 FROM table WHERE Column4 IS NOT NULL
UNION ALL
SELECT Column1, Column2, Column5 FROM table WHERE Column5 IS NOT NULL

Using UNION ALL:
SELECT col1, col2, col3
FROM tab
WHERE col3 IS NOT NULL
UNION ALL
SELECT col1, col2, col4
FROM tab
WHERE col4 IS NOT NULL
UNION ALL
SELECT col1, col2, col5
FROM tab
WHERE col5 IS NOT NULL;

UNPIVOT would be more performant, but it requires the column list to be known up front; the following handles an unknown number of columns.
You may notice that only the "key" columns (Column1 and Column2) are identified, so the width is dynamic.
Example
Declare @YourTable Table ([Column1] varchar(50),[Column2] varchar(50),[Column3] varchar(50),[Column4] varchar(50),[Column5] varchar(50))
Insert Into @YourTable Values
 ('Dog',456,'Long','Short','Small')
,('Car',454,'Blue',NULL,NULL)
,('Fruit',466,'Apple','Pear',NULL)

Select A.[Column1]
      ,A.[Column2]
      ,[Column3] = C.Value
 From @YourTable A
 Cross Apply ( values (cast((Select A.* for XML RAW) as xml))) B(XMLData)
 Cross Apply (
              Select Field = a.value('local-name(.)','varchar(100)')
                    ,Value = a.value('.','varchar(max)')
               From B.XMLData.nodes('/row') as C1(n)
               Cross Apply C1.n.nodes('./@*') as C2(a)
               Where a.value('local-name(.)','varchar(100)') not in ('Column1','Column2')
             ) C
Returns (NULL values never surface, because FOR XML RAW omits NULL attributes):

Column1 | Column2 | Column3
--------|---------|--------
Dog     | 456     | Long
Dog     | 456     | Short
Dog     | 456     | Small
Car     | 454     | Blue
Fruit   | 466     | Apple
Fruit   | 466     | Pear


How to get distinct count over multiple columns in Hive SQL?

I have a table that looks like this, and I want to get the distinct count horizontally across the three columns, ignoring NULLs.
ID | Column1 | Column2 | Column3
---|---------|---------|--------
1  | A       | B       | C
2  | A       | A       | B
3  | A       | A       | (null)
The desired output I'm looking for is:

ID | Column1 | Column2 | Column3 | unique_count
---|---------|---------|---------|-------------
1  | A       | B       | C       | 3
2  | A       | A       | B       | 2
3  | A       | A       | (null)  | 1
One possible option would be:
WITH sample AS (
  SELECT 'A' Column1, 'B' Column2, 'C' Column3 UNION ALL
  SELECT 'A', 'A', 'B' UNION ALL
  SELECT 'A', 'A', NULL UNION ALL
  SELECT '', 'A', NULL
)
SELECT Column1, Column2, Column3, COUNT(DISTINCT NULLIF(TRIM(c), '')) unique_count
FROM (SELECT *, ROW_NUMBER() OVER () rn FROM sample) t
LATERAL VIEW EXPLODE(ARRAY(Column1, Column2, Column3)) tf AS c
GROUP BY Column1, Column2, Column3, rn;
output
+---------+---------+---------+--------------+
| column1 | column2 | column3 | unique_count |
+---------+---------+---------+--------------+
| | A | NULL | 1 |
| A | A | NULL | 1 |
| A | A | B | 2 |
| A | B | C | 3 |
+---------+---------+---------+--------------+
Alternatively, compare each column to every column to its right:
case when C1 not in (C2, C3) then 1 else 0 end +
case when C2 not in (C3) then 1 else 0 end + 1
This will not work once NULLs are involved (NOT IN never evaluates to true when a NULL enters the comparison), whether or not you intend to count them. The pattern extends to more columns by successively comparing each one to all columns to its right. The order doesn't strictly matter; there's just no point in repeating the same test over and over.
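If you do want NULL to count as its own distinct value, a sketch using Hive's NULL-safe equality operator <=> (the table name t and columns ID, C1, C2, C3 are placeholders for the question's schema) could be:

-- counts NULL as a distinct value; <=> is Hive's NULL-safe equals
SELECT ID, C1, C2, C3,
       CASE WHEN NOT (C1 <=> C2) AND NOT (C1 <=> C3) THEN 1 ELSE 0 END +
       CASE WHEN NOT (C2 <=> C3) THEN 1 ELSE 0 END + 1 AS unique_count
FROM t;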
If the values were alphabetically ordered then you could test only adjacent pairs to look for differences. While that applies to your limited sample it would not be the most general case.
Using a column pivot with a distinct count aggregate is likely to be a lot less efficient, less portable, and a lot less adaptable to a broad range of queries.

SQL Partition By Function without aggregation

I have a table with data like the following:
Column1 | Column2 | Column3 | Value
SQ03 | D | 1000040 | 1000
SQ03 | | 1000040 | 1000
SQ03 | | 1000050 | 2000
SQ03 | | 1000060 | 3000
SQ03 | L | 1000060 | 3000
SQ03 | D | 1000060 | 3000
What I need to do is get a single row per Column3 value. If a value in Column3 is unique, I need that row. If there are duplicates in Column3, I need the row where Column2 is not empty. But, as in the example above, there are Column3 values where Column2 is filled more than once; in those cases I need just one of the rows, it doesn't matter which.
So I thought of flagging which line I would need with the following solution:
select *,
    CASE
        WHEN "Column2" != ' '
            THEN 'X'
        WHEN "Column2" = ' ' AND row_number() over (PARTITION BY "Column3" ORDER BY "Column2" DESC, "Column3") = 1
            THEN 'X'
        ELSE 'O'
    END AS "FLAG"
from DUMMY
WHERE "Column1" = 'SQ03'
But the problem with this solution is that it aggregates the value from Column3: it sums the values where Column3 has duplicates.
Can anyone help me with a solution where the values are not aggregated?
EDIT:
My expected output would be this:
Column1 | Column2 | Column3 | Value
SQ03 | D | 1000040 | 1000
SQ03 | | 1000050 | 2000
SQ03 | L | 1000060 | 3000
You can use a subquery to generate row numbers for each Column3 value (ordered by Column2 DESC to make NULL values come last), and then select the rows which have row_number = 1:
SELECT Column1, Column2, Column3, Value
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Column3 ORDER BY Column2 DESC) AS rn
FROM DUMMY
WHERE Column1 = 'SQ03'
) D
WHERE rn = 1
Alternatively you can use a CTE:
WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Column3 ORDER BY Column2 DESC) AS rn
FROM DUMMY
WHERE Column1 = 'SQ03'
)
SELECT Column1, Column2, Column3, Value
FROM CTE
WHERE rn = 1
Output for both queries:
Column1 | Column2 | Column3 | Value
--------|---------|---------|------
SQ03    | D       | 1000040 | 1000
SQ03    | (null)  | 1000050 | 2000
SQ03    | L       | 1000060 | 3000
Demo on SQLFiddle
I think an aggregation function (as a window function) does what you want:
select t.*,
max(column3) over (partition by column1)
from t;

Import Unpivot results to new Table and match on Key

I currently have a few unpivot queries that yield about 2,000 rows each. I need to take the results of those queries and put them in a new table, matching on a key.
Query Example:
Select DeviceSlot
FROM tbl1
unpivot(
    DeviceSlot
    For col in(
        col1,
        col2,
        col3
    )
) AS [Unpivot]
Now I need to match the results from the query and insert them into a new table with about 20,000 rows.
Pseudo-Code for this:
Insert Into tbl2(DeviceSlot)
Select DeviceSlot
FROM tbl1
unpivot(
DeviceSlot
For col in(
col1,
col2,
col3
)
)AS Unpivot2
Where tbl1.key = tbl2.key
I've been pretty confused on how to do this, and I apologize if it is not clear.
I also have another unpivot query doing the same thing for different columns.
Not sure what you are asking for. When unpivoting to "normalize" data, the wanted "key" is typically derived during the unpivot itself; for example, below the id column of the original table is repeated in the unpivoted data to represent a foreign key for some new table.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Table1
([id] int, [col1] varchar(2), [col2] varchar(2), [col3] varchar(2))
;
INSERT INTO Table1
([id], [col1], [col2], [col3])
VALUES
(1, 'a', 'b', 'c'),
(2, 'aa', 'bb', 'cc')
;
Query 1:
select id as table1_fk, colheading, colvalue
from (
select * from table1
) t
unpivot (
colvalue for colheading in (col1, col2, col3)
) u
Results:
| table1_fk | colheading | colvalue |
|-----------|------------|----------|
| 1         | col1       | a        |
| 1         | col2       | b        |
| 1         | col3       | c        |
| 2         | col1       | aa       |
| 2         | col2       | bb       |
| 2         | col3       | cc       |
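To then land the unpivoted rows in the new table, you normally don't reference tbl2 inside the unpivot at all; you carry the key through the unpivot and insert with it. A sketch against the question's tables (the [key] column name is an assumption):

-- carry the key through the unpivot, then insert; [key] is a placeholder name
Insert Into tbl2 ([key], DeviceSlot)
Select [key], DeviceSlot
From tbl1
unpivot(
    DeviceSlot
    For col in (col1, col2, col3)
) AS u;

If tbl2 already holds the keys and only needs DeviceSlot filled in, an UPDATE ... FROM joining on the carried key would do the matching instead of an INSERT.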

transform columns to rows

I have a table table1 like below
+----+------+------+------+------+------+
| id | loc | val1 | val2 | val3 | val4 |
+----+------+------+------+------+------+
| 1 | loc1 | 10 | 190 | null | 20 |
| 2 | loc2 | 20 | null | 10 | 10 |
+----+------+------+------+------+------+
I need to combine val1 to val4 into a new column val, with a row for each, so that the output is like below.
NOTE: the data I have has val1 to val30, i.e. 30 columns per row that need to be converted into rows.
+----+------+--------+
| id | loc | val |
+----+------+--------+
| 1 | loc1 | 10 |
| 1 | loc1 | 190 |
| 1 | loc1 | null |
| 1 | loc1 | 20 |
| 2 | loc2 | 20 |
| 2 | loc2 | null |
| 2 | loc2 | 10 |
| 2 | loc2 | 10 |
+----+------+--------+
You can use a lateral join to transform columns to rows (note that unnest over an array keeps NULL elements, so the NULL rows in your expected output are preserved):
SELECT a.id,a.loc,t.vals
FROM table1 a,
unnest(ARRAY[a.val1,a.val2,a.val3,a.val4]) t(vals);
If you want to do this with dynamically added columns:
CREATE OR REPLACE FUNCTION columns_to_rows(
    out id integer,
    out loc text,
    out vals integer
)
RETURNS SETOF record AS
$body$
DECLARE
    columns_to_rows text;
BEGIN
    SELECT string_agg('a.'||attname, ',') INTO columns_to_rows
    FROM pg_attribute
    WHERE attrelid = 'your_table'::regclass  -- table name
      AND attnum > 0                         -- just the visible columns
      AND attname <> all (array['id','loc']) -- exclude some columns
      AND NOT attisdropped;                  -- column is not dropped

    RETURN QUERY
    EXECUTE format('SELECT a.id, a.loc, t.vals
                    FROM your_table a,
                    unnest(ARRAY[%s]) t(vals)', columns_to_rows);
END;
$body$
LANGUAGE plpgsql;
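Usage is then simply (with your_table hard-coded inside the function body, as above):

SELECT * FROM columns_to_rows();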
Look at this link for more detail: Columns to rows
You could use a cross join with generate_series for this:
select
    id,
    loc,
    case x.i
        when 1 then val1
        when 2 then val2
        when 3 then val3
        when 4 then val4
    end as val
from t
cross join generate_series(1, 4) x (i)
It uses the table only once and can be easily extended to accommodate more columns.
Demo
Note: in the accepted answer, the first approach reads the table many times (as many times as there are columns to be unpivoted) and the second approach is wrong, as there is no UNPIVOT in PostgreSQL.
I'm sure there's a classier approach than this.
SELECT * FROM (
select id, loc, val1 as val from #t a
UNION ALL
select id, loc, val2 as val from #t a
UNION ALL
select id, loc, val3 as val from #t a
UNION ALL
select id, loc, val4 as val from #t a
) x
order by ID
Here's my attempt with UNPIVOT, but it can't return the NULLs; perhaps perform a join for the NULLs? Anyway, I'll still try.
SELECT *
FROM (
SELECT * FROM #t
) main
UNPIVOT (
new_val
FOR val IN (val1, val2, val3, val4)
) unpiv
As mentioned in the comments, this will not work in PostgreSQL, which the user needs.
Here is a way to handle NULLs:
select p.id, p.loc, CASE WHEN p.val = 0 THEN NULL ELSE p.val END AS val
from
(
    SELECT id, loc, ISNULL(val1,0) AS val1, ISNULL(val2,0) AS val2, ISNULL(val3,0) AS val3, ISNULL(val4,0) AS val4
    FROM Table1
) T
unpivot
(
    val
    for locval in (val1, val2, val3, val4)
) p
Test
EDIT:
Best solution from my side:
select a.id, a.loc, ex.val
from (select 'val1' as [over] union all select 'val2' union all select 'val3'
      union all select 'val4') pmu
cross join (select id, loc from Table1) as a
left join
    Table1 pt
    unpivot
    (
        [val]
        for [over] in (val1, val2, val3, val4)
    ) ex
    on pmu.[over] = ex.[over] and
       a.id = ex.id
Test

SQL. Is it possible?

Oracle
select * from table1;
column1 | column2 | column3 |
a | 2010 | 1 |
a | 2011 | 2 |
a | 2012 | 3 |
b | 2010 | 4 |
b | 2011 | 5 |
b | 2012 | 6 |
c | 2010 | 7 |
c | 2011 | 8 |
c | 2012 | 9 |
Is it possible to do something like this?
column1 | 2010 | 2011 | 2012 |
a | 1 | 2 | 3 |
b | 4 | 5 | 6 |
c | 7 | 8 | 9 |
Yes
SELECT t.column1,
(SELECT SUM(column3) FROM table1
WHERE column1 = t.column1 AND column2 = 2010) AS "2010",
(SELECT SUM(column3) FROM table1
WHERE column1 = t.column1 AND column2 = 2011) AS "2011",
(SELECT SUM(column3) FROM table1
WHERE column1 = t.column1 AND column2 = 2012) AS "2012"
FROM (
SELECT DISTINCT column1 FROM table1
) t
ORDER BY t.column1
Note, I've added the SUM() aggregate function around column3 in case you have duplicate values per (column1, column2).
Depending on the database you're using, the following equivalent query might be a bit faster:
SELECT t.column1,
(SELECT SUM(column3) FROM table1
WHERE column1 = t.column1 AND column2 = 2010) AS "2010",
(SELECT SUM(column3) FROM table1
WHERE column1 = t.column1 AND column2 = 2011) AS "2011",
(SELECT SUM(column3) FROM table1
WHERE column1 = t.column1 AND column2 = 2012) AS "2012"
FROM table1 t
GROUP BY t.column1
ORDER BY t.column1
Note that you can achieve the same in a more concise way, using the PIVOT clause (as others have suggested). In Oracle 11g, this would translate to:
SELECT column1, "2010", "2011", "2012"
FROM table1
PIVOT (SUM(column3) FOR column2 IN (2010, 2011, 2012))
In any case, I don't know of any database that allows a dynamic number of columns per table expression without resorting to tricks involving XML or other means of dynamic SQL. Typically, those tricks aren't much faster than what I suggested here. This means you'll always have to foresee how many years you want to support as columns, and adapt your query accordingly.
Try PIVOT in SQL Server:
select column1 , [2010],[2011],[2012]
from your_table
PIVOT (MAX(column3) FOR column2 IN ([2010],[2011],[2012])) P
Edit 1:
If your column2 is dynamic, then try this:
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(column2)
                      from your_table
                      FOR XML PATH(''), TYPE
                     ).value('.', 'NVARCHAR(MAX)')
                     ,1,1,'')

set @query = 'SELECT column1, ' + @cols + ' from your_table
              pivot
              (
                  MAX(column3)
                  for column2 in (' + @cols + ')
              ) p '

print(@query)
execute(@query)
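For the sample data above, the dynamic SQL emitted by print(@query) would be equivalent to the static version:

SELECT column1, [2010],[2011],[2012] from your_table
pivot
(
    MAX(column3)
    for column2 in ([2010],[2011],[2012])
) p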