SQL select values of each selected columns on separate rows - sql

I have a table with hundreds of rows and tens of W columns:
Column1 | Column2_W | Column3_W | ColumnX_W
123 | A | B | x
223 | A | NULL | NULL
How can I select it so that the output would be:
Column1 | W
123 | A
123 | B
123 | x
223 | A
EDIT: I am well aware that I am working with a terrible DB "design". Unfortunately I can't change it. This question is actually part of a larger problem I was handed today. I will try the given ideas tomorrow.

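SELECT Column1, Column2_W AS W
FROM YourTable
WHERE Column2_W IS NOT NULL
UNION ALL
SELECT Column1, Column3_W
FROM YourTable
WHERE Column3_W IS NOT NULL
UNION ALL
SELECT Column1, Column4_W
FROM YourTable
WHERE Column4_W IS NOT NULL
....
ORDER BY Column1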
Better option: redesign your database! This looks like a spreadsheet, not a relational database.
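To try the UNION ALL approach without touching the real database, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite, the table name wide, the helper column pos and the function name unpivot_rows are my own assumptions for illustration; only the sample data comes from the question.

```python
import sqlite3


def unpivot_rows():
    """Unpivot the question's sample rows with UNION ALL, skipping NULLs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table mirroring the question's layout.
    conn.execute(
        "CREATE TABLE wide (Column1 INTEGER, Column2_W TEXT, "
        "Column3_W TEXT, Column4_W TEXT)"
    )
    conn.executemany(
        "INSERT INTO wide VALUES (?, ?, ?, ?)",
        [(123, "A", "B", "x"), (223, "A", None, None)],
    )
    query = """
        SELECT Column1, W
        FROM (
            SELECT Column1, Column2_W AS W, 2 AS pos FROM wide
            UNION ALL
            SELECT Column1, Column3_W, 3 FROM wide
            UNION ALL
            SELECT Column1, Column4_W, 4 FROM wide
        )
        WHERE W IS NOT NULL          -- drop the empty cells
        ORDER BY Column1, pos        -- keep the original column order
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


for row in unpivot_rows():
    print(row)
```

The pos column is only there to make the ordering within each Column1 value deterministic; drop it if the order of the W values does not matter.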

Take a look at this article: UNPIVOT: Normalizing data on the fly
Unfortunately, you're going to be stuck hand-typing the column names with any of the solutions you use. Here's a sample using UNPIVOT in SQL Server:
SELECT Column1, W
FROM YourTable
UNPIVOT (W FOR SourceColumn IN (Column2_W, Column3_W /* and so on */)) AS unpvt

If you don't have any UNPIVOT options...
SELECT
Column1,
CASE ColumnID WHEN 2 THEN Column2_W
WHEN 3 THEN Column3_W
...
WHEN X THEN ColumnX_W
END AS W
FROM
yourTable
CROSS JOIN
( SELECT 2 AS ColumnID
UNION ALL SELECT 3
...
UNION ALL SELECT X
)
AS ColumnIDs
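The CROSS JOIN plus CASE idea can be sketched the same way. This is only an illustration under assumptions: it uses SQLite from Python, three W columns, and made-up names (wide, ids, unpivot_with_cross_join). The outer NULL filter is my addition so that the result matches the output asked for in the question.

```python
import sqlite3


def unpivot_with_cross_join():
    """Unpivot by cross joining each row with a list of column ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wide (Column1 INTEGER, Column2_W TEXT, "
        "Column3_W TEXT, Column4_W TEXT)"
    )
    conn.executemany(
        "INSERT INTO wide VALUES (?, ?, ?, ?)",
        [(123, "A", "B", "x"), (223, "A", None, None)],
    )
    query = """
        SELECT Column1, W
        FROM (
            SELECT w.Column1,
                   ids.ColumnID,
                   CASE ids.ColumnID          -- pick one column per copy
                       WHEN 2 THEN w.Column2_W
                       WHEN 3 THEN w.Column3_W
                       WHEN 4 THEN w.Column4_W
                   END AS W
            FROM wide AS w
            CROSS JOIN (
                SELECT 2 AS ColumnID
                UNION ALL SELECT 3
                UNION ALL SELECT 4
            ) AS ids
        )
        WHERE W IS NOT NULL
        ORDER BY Column1, ColumnID
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(unpivot_with_cross_join())
```

Compared with UNION ALL, this reads the table once and multiplies each row by the number of W columns, which may or may not be cheaper depending on the engine.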

Related

How to get distinct count over multiple columns in Hive SQL?

I have a table that looks like this. And I want to get the distinct count horizontally across the three columns ignoring nulls.
ID | Column1 | Column 2 | Column 3
1 | A | B | C
2 | A | A | B
3 | A | A | NULL
The desired output I'm looking for is:
ID | Column1 | Column 2 | Column 3 | unique_count
1 | A | B | C | 3
2 | A | A | B | 2
3 | A | A | NULL | 1
One possible option would be
WITH sample AS (
SELECT 'A' Column1, 'B' Column2, 'C' Column3 UNION ALL
SELECT 'A', 'A', 'B' UNION ALL
SELECT 'A', 'A', NULL UNION ALL
SELECT '', 'A', NULL
)
SELECT Column1, Column2, Column3, COUNT(DISTINCT NULLIF(TRIM(c), '')) unique_count
FROM (SELECT *, ROW_NUMBER() OVER () rn FROM sample) t LATERAL VIEW EXPLODE(ARRAY(Column1, Column2, Column3)) tf AS c
GROUP BY Column1, Column2, Column3, rn;
output
+---------+---------+---------+--------------+
| column1 | column2 | column3 | unique_count |
+---------+---------+---------+--------------+
| | A | NULL | 1 |
| A | A | NULL | 1 |
| A | A | B | 2 |
| A | B | C | 3 |
+---------+---------+---------+--------------+
case when C1 not in (C2, C3) then 1 else 0 end +
case when C2 not in (C3) then 1 else 0 end + 1
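This does not count NULLs as values. Be aware, too, that NOT IN against a list containing a NULL evaluates to unknown rather than true, so a row such as (A, B, NULL) would come out as 1 instead of 2 unless the NULLs are handled explicitly. The pattern extends to more columns by successively comparing each one to all columns to its right. The order doesn't strictly matter; there's just no point in repeating the same test over and over.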
If the values were alphabetically ordered then you could test only adjacent pairs to look for differences. While that applies to your limited sample it would not be the most general case.
Using a column pivot with a distinct count aggregate is likely to be a lot less efficient, less portable, and a lot less adaptable to a broad range of queries.
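For comparison, here is a small sketch that runs both ideas side by side on an in-memory SQLite database: the comparison expression from this answer and an unpivot with COUNT(DISTINCT ...). SQLite stands in for Hive here, and the table name t, the columns c1 to c3 and the extra row 4 (A, B, NULL) are assumptions I added to show how NULLs affect each approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, c1 TEXT, c2 TEXT, c3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "A", "B", "C"),
        (2, "A", "A", "B"),
        (3, "A", "A", None),
        (4, "A", "B", None),  # extra row, not in the question
    ],
)

# Comparison-based count: each column is tested against those to its right.
case_counts = conn.execute(
    """
    SELECT id,
           CASE WHEN c1 NOT IN (c2, c3) THEN 1 ELSE 0 END
         + CASE WHEN c2 NOT IN (c3) THEN 1 ELSE 0 END
         + 1 AS unique_count
    FROM t
    ORDER BY id
    """
).fetchall()

# Unpivot-based count: COUNT(DISTINCT ...) skips NULLs on its own.
distinct_counts = conn.execute(
    """
    SELECT id, COUNT(DISTINCT c) AS unique_count
    FROM (
        SELECT id, c1 AS c FROM t
        UNION ALL SELECT id, c2 FROM t
        UNION ALL SELECT id, c3 FROM t
    )
    GROUP BY id
    ORDER BY id
    """
).fetchall()
conn.close()

print("CASE expression:", case_counts)
print("COUNT(DISTINCT):", distinct_counts)
```

Both agree on the three rows from the question. They differ on the added row 4, where the NOT IN tests evaluate to unknown and fall through to the ELSE branch.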

SQL code to get next variable in table with different value

I need to find a way in SQL Server 2014 (Management Studio) to find the next unique value in a column that shares the value of a different column.
So for example below I would want my results to be
Column 1 - A
Column 2 - 1
Column 3 - 4
As that is the first time that A has unique values in column 2 and 3
Column1 | Column2 | Column3
--------+---------+--------
A       | X       | 1
A       | X       | 2
B       | Y       | 3
A       | Z       | 4
Query:
SELECT
Column1,
LEAD(Column3) OVER (PARTITION BY Column2 ORDER BY Column3) AS FindValue
FROM
Table
If I understand it correctly I would try something like this:
-- first we find minimum values for column1, column2 variations
WITH min_values AS (
SELECT
column1,
column2,
min(column3) AS min_value
FROM
YourTable
GROUP BY column1, column2
)
-- then we find bottom 2 values for column1
,bottom_2 AS (
SELECT
column1,
min_value,
row_number() OVER (PARTITION BY column1 ORDER BY min_value ASC) AS rn
FROM
min_values
)
-- then we join the results into a single record
SELECT
b1.column1, b2.min_value, b1.min_value
FROM
bottom_2 b1
JOIN
bottom_2 b2 ON b1.column1 = b2.column1 AND b2.rn < b1.rn
WHERE b1.rn <= 2
I just checked comments above and would like to add some notes.
If you want to find next value ordered by column2 then you have to change order by from min_value to column2 in row_number() line. Otherwise, if you are looking for next inserted value then you need a timestamp or some kind of id.
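Here is a quick way to check that query against the sample data, sketched with an in-memory SQLite database from Python. It assumes SQLite 3.25 or newer for window function support; the table name t and the function name first_two_groups are made up for the example.

```python
import sqlite3


def first_two_groups():
    """Return (column1, first min, second min) per column1 value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (column1 TEXT, column2 TEXT, column3 INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [("A", "X", 1), ("A", "X", 2), ("B", "Y", 3), ("A", "Z", 4)],
    )
    query = """
        WITH min_values AS (
            -- smallest column3 for each (column1, column2) pair
            SELECT column1, column2, MIN(column3) AS min_value
            FROM t
            GROUP BY column1, column2
        ),
        bottom_2 AS (
            -- rank those minimums within each column1 value
            SELECT column1,
                   min_value,
                   ROW_NUMBER() OVER (
                       PARTITION BY column1 ORDER BY min_value
                   ) AS rn
            FROM min_values
        )
        -- pair the second-ranked row with the first-ranked one
        SELECT b1.column1, b2.min_value, b1.min_value
        FROM bottom_2 AS b1
        JOIN bottom_2 AS b2
          ON b1.column1 = b2.column1 AND b2.rn < b1.rn
        WHERE b1.rn <= 2
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


print(first_two_groups())
```

B drops out of the result because it has only one distinct column2 value, so there is no second row to pair it with.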

SQL - Select In from Array, and NOT in in same query

I have a userform built in VBA where my coworkers can enter multiple values that builds an array and places it into an IN statement, which works great. Problem is I need to also be able to display what values do not exist within the tables.
Example table
id | value
1 | value1
2 | value2
4 | value4
Then a query that could be generated would be
SELECT [id],[value] FROM [tablea] WHERE [id] IN (1,2,3,4)
Expected or desirable outcome would be as follows
id | value
1 | value1
2 | value2
3 | null
4 | value4
I've tried doing it like so;
SELECT [id],[value] FROM [tablea] WHERE [id] IN (1,2,3,4) AND [id] NOT IN (1,2,3,4)
since both arrays will be the same, this returns 0 of course.
I know I can do this with a union, and define the not in statement within the second union, but I'd like to do this without a union.. Any other thoughts?
This is on Microsoft SQL Server 2005.
I unfortunately only have access to SELECT, since I'm performing queries either via VBA or Tableau. So I cannot create a derived table or have anything to reference other than the select statement.
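You need a left join of some sort. One way would be to construct your query as follows (note that a VALUES table constructor in the FROM clause requires SQL Server 2008 or later):
select v.id, t.value
from (values (1), (2), (3), (4)
) v(id) left join
tablea t
on v.id = t.id;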
Thanks to Joel Coehoorn for the tip towards using a CTE
I was able to accomplish this like so;
WITH numbers AS (
SELECT 1 AS num UNION ALL
SELECT 2 AS num UNION ALL
SELECT 3 AS num UNION ALL
SELECT 4 AS num )
SELECT
COALESCE(id,num) as col1,
id as col2
FROM tablea
RIGHT JOIN numbers ON tablea.id = numbers.num
This would return
col1 | col2
1 | 1
2 | 2
3 | NULL
4 | 4
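Since the id list is built by client code anyway, the CTE can be generated from the list itself. Below is a minimal sketch in Python with an in-memory SQLite database standing in for SQL Server; the function name lookup_with_missing is hypothetical and the placeholder style (?) is the sqlite3 one, so both would need adapting for VBA or another driver.

```python
import sqlite3


def lookup_with_missing(conn, wanted_ids):
    """Return (id, value) for every requested id, with None when absent."""
    if not wanted_ids:
        return []
    # One "SELECT ? AS num" per requested id, glued together with UNION ALL.
    numbers_sql = " UNION ALL ".join("SELECT ? AS num" for _ in wanted_ids)
    query = f"""
        WITH numbers AS ({numbers_sql})
        SELECT numbers.num, tablea.value
        FROM numbers
        LEFT JOIN tablea ON tablea.id = numbers.num
        ORDER BY numbers.num
    """
    # The ids are passed as bound parameters, not spliced into the SQL text.
    return conn.execute(query, list(wanted_ids)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablea (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO tablea VALUES (?, ?)",
    [(1, "value1"), (2, "value2"), (4, "value4")],
)

result = lookup_with_missing(conn, [1, 2, 3, 4])
for row in result:
    print(row)
```

Driving the join from the generated list rather than from the table is what makes id 3 show up with a NULL value.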

Undo a LISTAGG in redshift

I have a table that probably resulted from a listagg, similar to this:
# select * from s;
s
-----------
a,c,b,d,a
b,e,c,d,f
(2 rows)
How can I change it into this set of rows:
a
c
b
d
a
b
e
c
d
f
In redshift, you can join against a table of numbers, and use that as the split index:
--with recursive Numbers as (
-- select 1 as i
-- union all
-- select i + 1 as i from Numbers where i <= 5
--)
with Numbers(i) as (
select 1 union
select 2 union
select 3 union
select 4 union
select 5
)
select split_part(s,',', i) from Numbers, s ORDER by s,i;
EDIT: redshift doesn't seem to support recursive subqueries, only postgres. :(
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table s(
col varchar2(20) );
insert into s values('a,c,b,d,a');
insert into s values('b,e,c,d,f');
Query 1:
SELECT REGEXP_SUBSTR(t1.col, '([^,])+', 1, t2.COLUMN_VALUE )
FROM s t1 CROSS JOIN
TABLE
(
CAST
(
MULTISET
(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(t1.col, '([^,])+')
)
AS SYS.odciNumberList
)
) t2
Results:
| REGEXP_SUBSTR(T1.COL,'([^,])+',1,T2.COLUMN_VALUE) |
|---------------------------------------------------|
| a |
| c |
| b |
| d |
| a |
| b |
| e |
| c |
| d |
| f |
As this is tagged Redshift and no answer so far gives a complete overview of undoing a LISTAGG in Redshift properly, here is code that covers its use cases:
CREATE TEMPORARY TABLE s (
s varchar(255)
);
INSERT INTO s VALUES('a,c,b,d,a');
INSERT INTO s VALUES('b,e,c,d,f');
SELECT
TRIM(split_part(s.s,',',R::smallint)) AS s
FROM s
LEFT JOIN (
SELECT
ROW_NUMBER() OVER (PARTITION BY 1) AS R
FROM any_large_table
LIMIT 1000
) extend_number
ON (SELECT MAX(regexp_count(s.s,',')+1) FROM s) >= extend_number.R
AND NULLIF(TRIM(split_part(s.s,',',extend_number.R::smallint)),'') IS NOT NULL;
DROP TABLE s;
Here “any_large_table” is any table you already have in Redshift with enough rows for your purposes, depending on how many elements the list in each record can contain (the code above allows for up to one thousand). Unfortunately, as far as I know the generate_series function does not work properly in Redshift, so this is the only way.
Another piece of advice: whenever possible, get the values before they have been aggregated with LISTAGG. As the code above shows, undoing the aggregation is fairly complex, and you save a lot of maintenance time by keeping things simple whenever the opportunity is available.
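The numbers-table technique can be tried locally with SQLite from Python. SQLite has no split_part, so this sketch registers a small Python function under that name; that emulation (including returning an empty string for an out-of-range index), the numbers table and the unlistagg helper are all assumptions for illustration, not Redshift code.

```python
import sqlite3


def split_part(text, delimiter, index):
    """Emulated split_part: 1-based index, '' when the index is out of range."""
    if text is None:
        return None
    parts = text.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else ""


def unlistagg(values, max_parts=10):
    """Split each comma-separated string into one row per element."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("split_part", 3, split_part)
    conn.execute("CREATE TABLE s (s TEXT)")
    conn.executemany("INSERT INTO s VALUES (?)", [(v,) for v in values])
    # The numbers table supplies the split index 1..max_parts.
    conn.execute("CREATE TABLE numbers (i INTEGER)")
    conn.executemany(
        "INSERT INTO numbers VALUES (?)",
        [(i,) for i in range(1, max_parts + 1)],
    )
    query = """
        SELECT split_part(s.s, ',', numbers.i) AS part
        FROM s
        JOIN numbers
          ON split_part(s.s, ',', numbers.i) <> ''   -- skip unused indexes
        ORDER BY s.rowid, numbers.i
    """
    parts = [row[0] for row in conn.execute(query)]
    conn.close()
    return parts


print(unlistagg(["a,c,b,d,a", "b,e,c,d,f"]))
```

max_parts plays the role of the LIMIT in the answer above: lists longer than that are silently truncated, so it has to be at least the largest element count you expect.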

How to create a result from sql server with a summed column

Hello, what I want is to write a query that will fetch three columns:
an nvarchar column1
an integer column2
a single cell holding the sum of column2
Is this possible? I am getting the following error:
Msg 8120, Level 16, State 1, Line 1
Column 'tablename.columnname' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
What is the correct procedure to get the data in the format I want?
Edit
Jaques' answer works, but it doesn't give me what I want. What I want is:
column 1 PID | column 2 SID | column 3 | column 4 | Column 5(Total of 4)
-------------------------------------------------------------------------
1 | 1 | ABC | 125.00 | 985.00
2 | 2 | XYZ | 420.00 |
3 | 3 | DEF | 230.00 |
4 | 4 | GHI | 210.00 |
I suspect you are using an aggregate function on some columns without listing the remaining columns in the GROUP BY clause. Your query should look like this:
select sum(column2), column1 from table1
group by column1
You can do it in the following way. A plain GROUP BY is awkward here because every non-aggregated column would have to be added to it:
Select column1, column2, SUM(column2) OVER (PARTITION BY column1) as Total from [Table]
This should work.
Going by your edited question, you can do it with a subselect, but why do you want it like that?
Select Column1, Column2, Column3, Column4, (Select SUM(Column4) from [Table]) as Column5 from [Table]
Every non-aggregated column in the select list must also appear in the GROUP BY clause.
If you want the sum next to another column, group by a column that identifies each row or group, like this:
SELECT columnId, sum(column4) as total
FROM MyTable
GROUP BY columnId
or simply don't include any extra column in the select, like this:
select sum(column4) from MyTable
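To get the layout from the edited question (the grand total shown once, on the first row), one option is a window sum wrapped in a CASE. This is only a sketch run on an in-memory SQLite database from Python, assuming SQLite 3.25 or newer for window functions; the table name items and its column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (pid INTEGER, sid INTEGER, name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (1, 1, "ABC", 125.0),
        (2, 2, "XYZ", 420.0),
        (3, 3, "DEF", 230.0),
        (4, 4, "GHI", 210.0),
    ],
)

rows = conn.execute(
    """
    SELECT pid, sid, name, amount,
           -- SUM(...) OVER () is the total across all rows;
           -- the CASE keeps it only on the first row by pid.
           CASE WHEN ROW_NUMBER() OVER (ORDER BY pid) = 1
                THEN SUM(amount) OVER ()
           END AS total
    FROM items
    ORDER BY pid
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

No GROUP BY is involved, so the Msg 8120 error does not come up. Whether the total belongs in the query at all, rather than in the report or client code, is a separate design question.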