Postgres pivot columns to rows - sql

How can I convert a table with the following five-column structure:
Id, name, col1, col2, col3
1 aaa 10 20 30
2 bbb 100 200 300
to the following structure, where the names of the Col1, Col2 and Col3 columns are shown as strings in a new column Colx?
Id, name, Colx, Value
1 aaa Col1 10
1 aaa Col2 20
1 aaa Col3 30
2 bbb Col1 100
2 bbb Col2 200
2 bbb Col3 300

You can use a subquery with UNION ALL (plain UNION would also remove duplicate rows, which is not needed here):
select id, name, colx, val from (
select id, name, 'Col1' as colx, col1 as val from test
UNION ALL
select id, name, 'Col2' as colx, col2 as val from test
UNION ALL
select id, name, 'Col3' as colx, col3 as val from test
) as query
order by id, colx
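A minimal sketch of the UNION ALL unpivot, run against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a server. The table name test and the sample rows come from the question; using SQLite in place of Postgres is an assumption made only for portability, and the query itself is plain standard SQL.

```python
import sqlite3

# In-memory database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, name TEXT, col1 INTEGER, col2 INTEGER, col3 INTEGER)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?, ?)",
    [(1, "aaa", 10, 20, 30), (2, "bbb", 100, 200, 300)],
)

# One SELECT per source column; the string literal records which column
# each value came from.
unpivot_sql = """
SELECT id, name, 'Col1' AS colx, col1 AS val FROM test
UNION ALL
SELECT id, name, 'Col2', col2 FROM test
UNION ALL
SELECT id, name, 'Col3', col3 FROM test
ORDER BY id, colx
"""
unpivoted = conn.execute(unpivot_sql).fetchall()
for row in unpivoted:
    print(row)
```

Each source row yields one output row per unpivoted column, so two input rows become six.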

Related

Modify query so as to add new row with sum of values in some column

I have a table in SQL Server like the one below, produced by this query:
select col1, count(*) as col2,
case when col1 = 'aaa' then 'xxx'
when col1 = 'bbb' then 'yyy'
when col1 = 'ccc' then 'zzz'
else 'ttt'
end 'col3'
from table1
group by col1
col1 | col2 | col3
----------------------
aaa | 10 | xxx
bbb | 20 | yyy
ccc | 30 | yyy
How can I modify my query in SQL Server to add a new row with the sum of the values in col2? I need something like this:
col1 | col2 | col3
----------------------
aaa | 10 | xxx
bbb | 20 | yyy
ccc | 30 | yyy
sum | 60 | sum of values in col2
You could use ROLLUP for this. The documentation explains how this works. https://learn.microsoft.com/en-us/sql/t-sql/queries/select-group-by-transact-sql?view=sql-server-ver15
select col1, count(*) as col2,
case when col1 = 'aaa' then 'xxx'
when col1 = 'bbb' then 'yyy'
when col1 = 'ccc' then 'zzz'
else 'ttt'
end 'col3'
from table1
group by rollup(col1)
---EDIT---
Here is the updated code demonstrating how coalesce works.
select coalesce(col1, 'sum')
, count(*) as col2
, case when col1 = 'aaa' then 'xxx'
when col1 = 'bbb' then 'yyy'
when col1 = 'ccc' then 'zzz'
else 'ttt'
end 'col3'
from table1
group by rollup(col1)
I tend to like GROUPING SETS for such items
Declare @YourTable Table ([col1] varchar(50),[col2] int,[col3] varchar(50))
Insert Into @YourTable Values
('aaa',10,'xxx')
,('bbb',20,'yyy')
,('ccc',30,'yyy')
Select col1 = coalesce(col1,'sum')
,col2 = sum(Col2)
,col3 = coalesce(col3,'sum of values in col2')
from @YourTable
Group by grouping sets ( (col1,col3)
,()
)
Results
col1 col2 col3
aaa 10 xxx
bbb 20 yyy
ccc 30 yyy
sum 60 sum of values in col2
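For engines without ROLLUP or GROUPING SETS, the same total row can be produced with UNION ALL. The sketch below is a swapped-in technique, not the method from the answers above: it uses SQLite (which lacks both features) through Python's sqlite3 module. The table name your_table and the helper column is_total are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# your_table is a hypothetical name; the rows mirror the sample data above.
conn.execute("CREATE TABLE your_table (col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("aaa", 10, "xxx"), ("bbb", 20, "yyy"), ("ccc", 30, "yyy")],
)

# Detail rows and the grand total are computed separately and stacked.
# is_total (hypothetical helper column) keeps the total row last.
total_sql = """
SELECT col1, col2, col3
FROM (
    SELECT col1, SUM(col2) AS col2, col3, 0 AS is_total
    FROM your_table
    GROUP BY col1, col3
    UNION ALL
    SELECT 'sum', SUM(col2), 'sum of values in col2', 1
    FROM your_table
)
ORDER BY is_total, col1
"""
with_total = conn.execute(total_sql).fetchall()
for row in with_total:
    print(row)
```

This scans the table twice, which is the usual trade-off against ROLLUP or GROUPING SETS where those are available.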

Can I change column order in SQL table based on a value that appears in different columns?

I have a table that looks like this:
Column1 | Column2 | Column3| Column4
4 | 3 | 2 | 1
2 | 1
3 | 2 | 1
I want to flip the columns so that 1 always start in column 1 and then the rest of the values follow to the right. Like this:
Column1 | Column2 | Column3 | Column4
1 | 2 | 3 | 4
1 | 2
1 | 2 | 3
This is an example table. The real table is a hierarchy of a company so 1 = CEO and 2 = SVP for example. 1 is always the same name but as the number gets higher (lower in chain of command) the more names that are in that level. I'm hoping for an automated solution that looks for 1, makes that the first column and then populates the columns. I am struggling because the value that 1 represents is in different columns so I can't just change the order of the columns.
I was able to accomplish this using VBA but I would prefer to keep it in SQL.
I don't have any useful code that I have tried so far.
You can use a CASE expression:
WITH CTE1 AS
(SELECT 4 AS COL1, 3 AS COL2 , 2 AS COL3, 1 AS COL4 FROM DUAL
UNION ALL
SELECT 2, 1, NULL, NULL FROM DUAL
UNION ALL
SELECT 3, 2, 1, NULL FROM DUAL
)
SELECT CASE WHEN COL1 <> 1 THEN 1 ELSE COL1 END AS COL1,
CASE WHEN COL2 <> 2 THEN 2 ELSE COL2 END AS COL2,
CASE WHEN COL3 <> 3 THEN 3 ELSE COL3 END AS COL3,
CASE WHEN COL4 <> 4 THEN 4 ELSE COL4 END AS COL4
FROM CTE1;
You can apply some CASE expressions that check all possibilities, assuming NULLs for missing data:
COALESCE(col4,col3,col2,col1) AS c1,
CASE
WHEN col4 IS NOT NULL THEN col3
WHEN col3 IS NOT NULL THEN col2
WHEN col2 IS NOT NULL THEN col1
END AS c2,
CASE
WHEN col4 IS NOT NULL THEN col2
WHEN col3 IS NOT NULL THEN col1
END AS c3,
CASE
WHEN col4 IS NOT NULL THEN col1
END AS c4
You want to sort the values. A generic SQL solution would use:
select max(case when seqnum = 1 then col end) as col1,
max(case when seqnum = 2 then col end) as col2,
max(case when seqnum = 3 then col end) as col3,
max(case when seqnum = 4 then col end) as col4
from (select col1, col2, col3, col4, col,
row_number() over (partition by col1, col2, col3, col4 order by col) as seqnum
from ((select col1 as col, 1 as which, col1, col2, col3, col4 from t) union all
(select col2 as col, 2 as which, col1, col2, col3, col4 from t) union all
(select col3 as col, 3 as which, col1, col2, col3, col4 from t) union all
(select col4 as col, 4 as which, col1, col2, col3, col4 from t)
) t
where col is not null
) t
group by col1, col2, col3, col4;
This would be simpler in a database that supports lateral joins. And a unique id on each row would also help.
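To illustrate the remark about a unique id, here is a sketch that assumes a hypothetical id column on each row. The values are unpivoted, numbered within each id, and pivoted back in sorted order. It runs on SQLite through Python's sqlite3 module and needs SQLite 3.25 or later for window functions; the table name t follows the answer above, everything else is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is a hypothetical unique key added for this sketch.
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER, "
    "col3 INTEGER, col4 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, 4, 3, 2, 1), (2, 2, 1, None, None), (3, 3, 2, 1, None)],
)

sort_sql = """
WITH unpivoted AS (
    SELECT id, col1 AS col FROM t
    UNION ALL SELECT id, col2 FROM t
    UNION ALL SELECT id, col3 FROM t
    UNION ALL SELECT id, col4 FROM t
),
ranked AS (
    -- Number the non-null values of each row from smallest to largest.
    SELECT id, col,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY col) AS seqnum
    FROM unpivoted
    WHERE col IS NOT NULL
)
SELECT MAX(CASE WHEN seqnum = 1 THEN col END) AS col1,
       MAX(CASE WHEN seqnum = 2 THEN col END) AS col2,
       MAX(CASE WHEN seqnum = 3 THEN col END) AS col3,
       MAX(CASE WHEN seqnum = 4 THEN col END) AS col4
FROM ranked
GROUP BY id
ORDER BY id
"""
sorted_rows = conn.execute(sort_sql).fetchall()
for row in sorted_rows:
    print(row)
```

Partitioning by id instead of by all four columns also keeps two rows with identical values from being merged into one group.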

Finding partial and exact duplicate from a SQL table

I am trying to read duplicates from a table. There are some partial duplicates based on values of Col1 and Col2 and there are some full duplicates based on Col1, Col2 and Col3 as in below table.
Col1 Col2 Col3
1 John 100
1 John 200
2 Tom 150
3 Bob 100
3 Bob 100
4 Sam 500
I want to capture partial and exact duplicates in two separate outputs and ignore the non-repeated rows like 2 and 4 e.g.
Partial Duplicates
Col1 Col2 Col3
1 John 100
1 John 200
Full Duplicate
Col1 Col2 Col3
3 Bob 100
3 Bob 100
What is the best way to achieve this with SQL?
I tried a self join with spark-sql but got an error:
val source_df = sql("select col1, col2, col3 from sample_table")
source_df.as("df1").join(inter_df.as("df2"), $"df1.Col3" === $"df2.Col3" and $"df1.Col2" === $"df2.Col2" and $"df1.Col1" === $"df2.Col1").select($"df1.Col1",$"df1.Col2",$"df1.Col3",$"df2.Col3").show()
Error
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange hashpartitioning(Col3#1957, 200)
For partial duplicates:
SELECT *
FROM tbl
WHERE EXISTS (
SELECT *
FROM tbl t2
WHERE tbl.col1 = t2.col1 AND tbl.col2 = t2.col2 AND tbl.col3 <> t2.col3
)
Returns:
col1 col2 col3
1 John 100
1 John 200
For full duplicates, add a unique identifier per combination of col1, col2, and col3, and look for cases where there's another record with the same col1, col2, and col3, but a different unique identifier:
;WITH cte AS (
SELECT ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY col1, col2, col3) AS uniqueid, col1, col2, col3
FROM tbl
)
SELECT col1, col2, col3
FROM cte
WHERE EXISTS (
SELECT *
FROM cte t2
WHERE cte.col1 = t2.col1 AND cte.col2 = t2.col2 AND cte.col3 = t2.col3 AND cte.uniqueid <> t2.uniqueid
)
Returns:
col1 col2 col3
3 Bob 100
3 Bob 100
http://sqlfiddle.com/#!18/f1d78/2
CREATE TABLE tbl (col1 INT, col2 VARCHAR(5), col3 INT)
INSERT INTO tbl VALUES
(1, 'John', 100),
(1, 'John', 200),
(2, 'Tom', 150),
(3, 'Bob', 100),
(3, 'Bob', 100),
(4, 'Sam', 500)
Partial duplicates: use EXISTS.
select
*
from myTable m1
where exists (
select
*
from myTable m2
where m1.Col1 = m2.Col1
and m1.Col2 = m2.Col2
and m1.Col3 <> m2.Col3
)
output:
----------------------
Col1 Col2 Col3
----------------------
1 John 100
1 John 200
----------------------
Full duplicates: you can use count(*) as a window function.
with cte as
(
select
Col1,
Col2,
Col3,
count(*) over (partition by Col1, Col2, Col3) as rn
from myTable
)
select
Col1,
Col2,
Col3
from cte
where rn > 1
output:
----------------------
Col1 Col2 Col3
----------------------
3 Bob 100
3 Bob 100
----------------------
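Both queries can be checked together with a small harness. The sketch below loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module (an assumption made for portability; the answers above target SQL Server) and runs the EXISTS query and the count(*) window query. The window function needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "John", 100), (1, "John", 200), (2, "Tom", 150),
     (3, "Bob", 100), (3, "Bob", 100), (4, "Sam", 500)],
)

# Partial duplicates: same col1 and col2, but a different col3 exists.
partial_sql = """
SELECT col1, col2, col3
FROM tbl AS t1
WHERE EXISTS (
    SELECT 1 FROM tbl AS t2
    WHERE t1.col1 = t2.col1 AND t1.col2 = t2.col2 AND t1.col3 <> t2.col3
)
ORDER BY col1, col3
"""

# Full duplicates: the whole (col1, col2, col3) combination occurs more than once.
full_sql = """
WITH counted AS (
    SELECT col1, col2, col3,
           COUNT(*) OVER (PARTITION BY col1, col2, col3) AS cnt
    FROM tbl
)
SELECT col1, col2, col3 FROM counted WHERE cnt > 1
"""
partial_dups = conn.execute(partial_sql).fetchall()
full_dups = conn.execute(full_sql).fetchall()
print(partial_dups)
print(full_dups)
```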

Select row from group which satisfies condition A. If not, give me a row that satisfies condition B

I bring forth an interesting problem that has been bothering me for the past few days. Let's say you have the following data structure:
Col1 | Col2 | Col3 | Col4
100 | "Val1" | 0 | 100
100 | "Val2" | 1 | null
100 | "Val 3" | 0 | null
101 | "Val4" | 0 | null
101 | "Val5" | 1 | null
102 | "Val6" | 0 | null
For each Col1 group, I need the one row where Col4 is not null. If Col4 is null in every row of the group, return the row where Col3 = 1; if Col4 is null and Col3 = 0 in every row, return any one row.
So the result set for the above data will look like,
Col1 | Col2 | Col3 | Col4
100 | "Val1" | 0 | 100
101 | "Val5" | 1 | null
102 | "Val6" | 0 | null
I know this could be done by ordering the rows by Col1, Col4 and Col3 and using an analytic function to get the first row in each group, but our in-house ORM doesn't support analytic functions.
Please let me know if this can be done using simple SQL (JOIN, Case, etc).
Edit:
There will only be one row per group where Col4 has non-null value and one row per group where col3 is 1. Also, a single row in the group can satisfy both conditions of having Col4 not null and Col3=1.
How about this? Every CONDx CTE solves one condition.
COND1 returns rows whose COL4 is not null
COND2 returns rows whose COL1 doesn't exist in COND1 result set and has NULLs for COL4 (in that case, count of distinct values = 0) and COL3 = 1
COND3 is everything that's left
The final result is union of all those.
with test (col1, col2, col3, col4) as
(select 100, 'val1', 0, 100 from dual union all
select 100, 'val2', 1, null from dual union all
select 100, 'val3', 0, null from dual union all
select 101, 'val4', 0, null from dual union all
select 101, 'val5', 1, null from dual union all
select 102, 'val6', 0, null from dual
),
cond1 as
(select col1, col2, col3, col4
from test
where col4 is not null
),
cond2 as
(select col1, col2, col3, col4
from test t
where t.col1 not in (select col1 from cond1)
and col1 in (select col1
from test
group by col1
having count(distinct col4) = 0
)
and col3 = 1
),
cond3 as
(select col1, col2, col3, col4
from test t
where t.col1 not in (select col1 from cond1
union all
select col1 from cond2
)
)
select col1, col2, col3, col4 from cond1
union all
select col1, col2, col3, col4 from cond2
union all
select col1, col2, col3, col4 from cond3
order by col1;
COL1 COL2 COL3 COL4
---------- ---- ---------- ----------
100 val1 0 100
101 val5 1
102 val6 0
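The three-condition approach can be tried outside Oracle as well. The sketch below is an adaptation, not the answer's exact query: it runs on SQLite through Python's sqlite3 module, uses a real table instead of DUAL, and drops the count(distinct col4) = 0 check from cond2 because excluding the groups found by cond1 already implies it. Only joins, subqueries and UNION ALL are used, with no analytic functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 INTEGER, col2 TEXT, col3 INTEGER, col4 INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [(100, "val1", 0, 100), (100, "val2", 1, None), (100, "val3", 0, None),
     (101, "val4", 0, None), (101, "val5", 1, None), (102, "val6", 0, None)],
)

pick_sql = """
WITH cond1 AS (
    -- Highest priority: rows with a non-null col4.
    SELECT col1, col2, col3, col4 FROM test WHERE col4 IS NOT NULL
),
cond2 AS (
    -- Groups not covered by cond1: take the row with col3 = 1.
    SELECT col1, col2, col3, col4 FROM test
    WHERE col1 NOT IN (SELECT col1 FROM cond1) AND col3 = 1
),
cond3 AS (
    -- Everything left: groups matched by neither cond1 nor cond2.
    SELECT col1, col2, col3, col4 FROM test
    WHERE col1 NOT IN (SELECT col1 FROM cond1
                       UNION ALL
                       SELECT col1 FROM cond2)
)
SELECT col1, col2, col3, col4 FROM cond1
UNION ALL
SELECT col1, col2, col3, col4 FROM cond2
UNION ALL
SELECT col1, col2, col3, col4 FROM cond3
ORDER BY col1
"""
picked = conn.execute(pick_sql).fetchall()
for row in picked:
    print(row)
```

Note that cond3 returns every row of a leftover group, so a group with several rows and no match for the first two conditions would need an extra GROUP BY or MIN to reduce it to one row.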

Select records where all rows have same value in two columns

Here is my sample table
Col1 Col2
A 1
B 1
A 1
B 2
C 3
I want to be able to select distinct records where all rows have the same value in Col1 and Col2. So my answer should be
Col1 Col2
A 1
C 3
I tried
SELECT Col1, Col2 FROM Table GROUP BY Col1, Col2
This gives me
Col1 Col2
A 1
B 1
B 2
C 3
which is not the result I am looking for. Any tips would be appreciated.
Try this out:
SELECT col1, MAX(col2) aCol2 FROM t
GROUP BY col1
HAVING COUNT(DISTINCT col2) = 1
Output:
| COL1 | ACOL2 |
|------|-------|
| A | 1 |
| C | 3 |
Basically, this makes sure that there is exactly one distinct value of col2 for a given col1.
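A quick way to check the HAVING COUNT(DISTINCT ...) filter is to run it on the sample rows from the question. The sketch below uses an in-memory SQLite database through Python's sqlite3 module; the table name t follows the answer, and the choice of SQLite is an assumption made for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 1), ("B", 1), ("A", 1), ("B", 2), ("C", 3)],
)

# Keep only the col1 groups whose rows all share a single col2 value.
# MAX(col2) simply returns that one value for the surviving groups.
same_value_sql = """
SELECT col1, MAX(col2) AS col2
FROM t
GROUP BY col1
HAVING COUNT(DISTINCT col2) = 1
ORDER BY col1
"""
consistent = conn.execute(same_value_sql).fetchall()
print(consistent)
```

B is dropped because it appears with both 1 and 2, while A (two rows, both 1) and C (a single row) are kept.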
Try this:
SELECT * FROM MYTABLE
GROUP BY Col1, Col2
HAVING COUNT(*)>1
You can try either of the following:
select col1, col2 from
(
select 'A' Col1 , 1 Col2
from dual
union all
select 'B' , 1
from dual
union all
select 'A' ,1
from dual
union all
select 'B' ,2
from dual
)
group by col1, col2
having count(*) >1;
OR
select col1, col2
from
(
select col1, col2, row_number() over (partition by col1, col2 order by col1, col2) cnt
from
(
select 'A' Col1 , 1 Col2
from dual
union all
select 'B' , 1
from dual
union all
select 'A' ,1
from dual
union all
select 'B' ,2
from dual
)
)
where cnt>1;