Flatten data in SQL based on a fixed set of columns

I am stuck on a specific data-flattening scenario and need help with it. I need flattened output, but the values that would become columns are not fixed, so I want to restrict the output to a fixed set of columns.
Given table 'test_table':
ID  Name  Property
--  ----  --------
1   C1    xxx
2   C2    xyz
2   C3    zz
The scenario is that the Name column can have any number of values for a given ID. I need to flatten the data so that there is one row per ID. Since the Name values vary with each ID, I want to flatten them into a fixed set of 3 columns, like Co1, Co2, Co3. The output should look like:
ID  Co1  Co1_Property  Co2   Co2_Property  Co3  Co3_Property
--  ---  ------------  ----  ------------  ---  ------------
1   C1   xxx           null  null
2   C2   xyz           C3    zz
Could not think of a solution using Pivot or aggregation. Any help would be appreciated.

You can use arrays:
select id,
array_agg(name order by name)[safe_ordinal(1)] as name_1,
array_agg(property order by name)[safe_ordinal(1)] as property_1,
array_agg(name order by name)[safe_ordinal(2)] as name_2,
array_agg(property order by name)[safe_ordinal(2)] as property_2,
array_agg(name order by name)[safe_ordinal(3)] as name_3,
array_agg(property order by name)[safe_ordinal(3)] as property_3
from t
group by id;

All current answers are too verbose and repeat the same fragments of code over and over. If you need to account for more columns, you have to copy, paste, and add more lines, which makes them even more verbose!
My preference is to avoid that style of coding and use something more generic, as in the example below:
select * from (
select *, row_number() over(partition by id) col
from `project.dataset.table`)
pivot (max(name) as name, max(property) as property for col in (1, 2, 3))
If applied to sample data in your question - output is
If you want to change the number of output columns, simply modify the for col in (1, 2, 3) part of the query.
For example, if you wanted 5 columns, you would use for col in (1, 2, 3, 4, 5). That simple!
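The same "change one number" idea can be tried outside BigQuery by generating the column list in a script. The sketch below is only an illustration under stated assumptions: it uses Python's bundled sqlite3 module (SQLite 3.25 or later, for window functions), and because SQLite has no PIVOT operator it swaps in MAX(CASE ...) expressions. The helper name build_flatten_sql and the name_N / property_N output column names are hypothetical, not part of any library.

```python
import sqlite3


def build_flatten_sql(n_cols: int, table: str = "test_table") -> str:
    """Build a query that spreads up to n_cols (name, property) pairs per id.

    Hypothetical helper: it only assembles SQL text for the demo below.
    """
    pieces = []
    for i in range(1, n_cols + 1):
        pieces.append(f"MAX(CASE WHEN seq_num = {i} THEN name END) AS name_{i}")
        pieces.append(f"MAX(CASE WHEN seq_num = {i} THEN property END) AS property_{i}")
    select_list = ",\n       ".join(pieces)
    return (
        f"SELECT id,\n       {select_list}\n"
        f"FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seq_num\n"
        f"      FROM {table})\n"
        f"GROUP BY id\n"
        f"ORDER BY id"
    )


# Sample data from the question, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (id INTEGER, name TEXT, property TEXT)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [(1, "C1", "xxx"), (2, "C2", "xyz"), (2, "C3", "zz")],
)

rows_3 = conn.execute(build_flatten_sql(3)).fetchall()  # id + 3 pairs
rows_5 = conn.execute(build_flatten_sql(5)).fetchall()  # id + 5 pairs

for row in rows_3:
    print(row)
```

Changing the argument of build_flatten_sql is the only edit needed to widen or narrow the output; the table name is interpolated into the SQL text, so it should only ever come from trusted code, not user input.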

The standard practice is to use conditional aggregation. That is, to use CASE expressions to pick which row goes to which column, then MAX() to collapse multiple rows into individual rows...
SELECT
id,
MAX(CASE WHEN name = 'C1' THEN name END) AS co1,
MAX(CASE WHEN name = 'C1' THEN property END) AS co1_property,
MAX(CASE WHEN name = 'C2' THEN name END) AS co2,
MAX(CASE WHEN name = 'C2' THEN property END) AS co2_property,
MAX(CASE WHEN name = 'C3' THEN name END) AS co3,
MAX(CASE WHEN name = 'C3' THEN property END) AS co3_property
FROM
yourTable
GROUP BY
id
Background info:
Not having an ELSE in the CASE expression implicitly means ELSE NULL
The intention is therefore for each column to receive NULL from every input row, except for the row being pivoted into that column
Aggregates, such as MAX() essentially skip NULL values
MAX( {NULL,NULL,'xxx',NULL,NULL} ) therefore equals 'xxx'
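The background points above can be checked with a small script. This is a minimal sketch, assuming Python's bundled sqlite3 module as a stand-in engine (the original answer does not name one) and the three sample rows from the question. It first confirms that MAX() ignores NULL inputs, then runs the name-based conditional aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, name TEXT, property TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, "C1", "xxx"), (2, "C2", "xyz"), (2, "C3", "zz")],
)

# MAX() over a set that is mostly NULL returns the one non-NULL value.
max_over_nulls = conn.execute(
    "SELECT MAX(v) FROM (SELECT NULL AS v UNION ALL SELECT 'xxx' UNION ALL SELECT NULL)"
).fetchone()[0]

# A CASE without ELSE yields NULL for non-matching rows, so each output
# column sees at most one non-NULL value per id.
pivoted = conn.execute(
    """
    SELECT id,
           MAX(CASE WHEN name = 'C1' THEN name END)     AS co1,
           MAX(CASE WHEN name = 'C1' THEN property END) AS co1_property,
           MAX(CASE WHEN name = 'C2' THEN name END)     AS co2,
           MAX(CASE WHEN name = 'C2' THEN property END) AS co2_property,
           MAX(CASE WHEN name = 'C3' THEN name END)     AS co3,
           MAX(CASE WHEN name = 'C3' THEN property END) AS co3_property
    FROM yourTable
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(max_over_nulls)
for row in pivoted:
    print(row)
```

Note that with name-based conditions, id 2 gets NULL in co1 and its values land in co2 and co3, because each column is tied to a specific name rather than to a position.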
A similar approach "bunches" the values to the left (so that NULL values only ever appear on the right).
That approach first uses row_number() to give each row a value corresponding to the column you want to put that row into:
WITH
sorted AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seq_num
FROM
yourTable
)
SELECT
id,
MAX(CASE WHEN seq_num = 1 THEN name END) AS co1,
MAX(CASE WHEN seq_num = 1 THEN property END) AS co1_property,
MAX(CASE WHEN seq_num = 2 THEN name END) AS co2,
MAX(CASE WHEN seq_num = 2 THEN property END) AS co2_property,
MAX(CASE WHEN seq_num = 3 THEN name END) AS co3,
MAX(CASE WHEN seq_num = 3 THEN property END) AS co3_property
FROM
sorted
GROUP BY
id
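A runnable sketch of the row-number approach, reading from the sorted CTE, is shown here. It assumes Python's bundled sqlite3 module (SQLite 3.25 or later for ROW_NUMBER) as a stand-in engine. The rows for id 3 are hypothetical and are added only to show what happens when an id has more names than there are output columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, name TEXT, property TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        # Sample rows from the question.
        (1, "C1", "xxx"),
        (2, "C2", "xyz"),
        (2, "C3", "zz"),
        # Hypothetical id with four names, one more than the three output slots.
        (3, "A1", "p1"),
        (3, "A2", "p2"),
        (3, "A3", "p3"),
        (3, "A4", "p4"),
    ],
)

flattened = conn.execute(
    """
    WITH sorted AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seq_num
        FROM yourTable
    )
    SELECT id,
           MAX(CASE WHEN seq_num = 1 THEN name END)     AS co1,
           MAX(CASE WHEN seq_num = 1 THEN property END) AS co1_property,
           MAX(CASE WHEN seq_num = 2 THEN name END)     AS co2,
           MAX(CASE WHEN seq_num = 2 THEN property END) AS co2_property,
           MAX(CASE WHEN seq_num = 3 THEN name END)     AS co3,
           MAX(CASE WHEN seq_num = 3 THEN property END) AS co3_property
    FROM sorted
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in flattened:
    print(row)
```

Values are packed into the leftmost columns for every id, and a fourth name (A4 here) is silently dropped because no output column tests for seq_num = 4.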

Related

Pivoting rows to columns with custom column names in SQL Server

I'm having some difficulty with pivoting rows into columns as I also want to name the columns. Here is my current code, modified:
SELECT Message, value
FROM Table1
CROSS APPLY
(
SELECT value FROM STRING_SPLIT(Message,'"')
WHERE value LIKE '%.%'
)
AS SourceTable
And my current output:
Message value
------------ -----
longmessage1 hello
longmessage1 hi
longmessage1 hey
longmessage1 hola
Just for the sake of shortness, I replaced the actual Message with longmessage1 above. My desired output:
Message greeting1 greeting2 greeting3 greeting4
------------ --------- --------- --------- ---------
longmessage1 hello hi hey hola
The maximum number of greetings is six, and if a Message doesn't have six, I'm fine with the missing ones (say, greeting 4 and 5) being NULL.
FYI- I am using SQL Server. I think I could somehow use PIVOT to do this but I'm stuck on the custom column name part and if CROSS APPLY was even the right idea. If anyone could offer some suggestions, that'd be terrific. Thank you!

You can use row_number() and conditional aggregation:
SELECT t1.Message, s.*
FROM Table1 t1 CROSS APPLY
(SELECT MAX(CASE WHEN seqnum = 1 THEN value END) as greeting1,
MAX(CASE WHEN seqnum = 2 THEN value END) as greeting2,
MAX(CASE WHEN seqnum = 3 THEN value END) as greeting3,
MAX(CASE WHEN seqnum = 4 THEN value END) as greeting4,
MAX(CASE WHEN seqnum = 5 THEN value END) as greeting5,
MAX(CASE WHEN seqnum = 6 THEN value END) as greeting6
FROM (SELECT s.value,
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as seqnum
FROM STRING_SPLIT(t1.Message,'"')
WHERE value LIKE '%.%'
) s
) s;
Note: In practice, this will probably preserve the ordering of the values. However that is not guaranteed -- based on the documentation.

SQL - Transpose different rows in columns

I have one table, loaded by a bulk insert (I cannot modify it beforehand), and I need to transpose all the rows in the "VALUES" column into separate columns. Apart from the "YEAR" column (going from 0 to 50) and the "VALUES" column, the other 3 columns each hold a single constant value (for example, in year 0 size is 'L', color is 'red' and price is '10', and these size, color and price values are constant for all my 50 years).
The input table is like this:
I would like to have as output all the "VALUES" in columns called for example "VALUE_0" "VALUE_1" "VALUE_2" "VALUE_3" etc, where the numbers 0,1,2,3 stand for the year considered.
CREATE TABLE #CURVE(
YEAR INT,
SIZE VARCHAR(100),
COLOR VARCHAR(100),
PRICE VARCHAR(100),
[VALUES] FLOAT -- bracketed because VALUES is a reserved keyword in T-SQL
)
The Output should be:

One option uses conditional aggregation:
select
size,
color,
price,
max(case when year = 0 then value end) value_0,
max(case when year = 1 then value end) value_1,
max(case when year = 2 then value end) value_2,
max(case when year = 3 then value end) value_3,
max(case when year = 4 then value end) value_4,
max(case when year = 5 then value end) value_5
from #curve
group by size, color, price
You can easily extend the select clause to handle more years.
This is a cross-database solution that usually performs as well as or better than vendor-specific implementations of pivot. On top of that (and for what it's worth), I find it easier to understand.

We need to know how many values you expect in the Values column for each year.
Otherwise the query looks like this:
SELECT [YEAR], [SIZE], [COLOR], [PRICE], [1] as [Value1], [2] as [Value2], [3] as [Value3]
FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY [YEAR], [SIZE], [COLOR], [PRICE] ORDER BY (SELECT NULL)) ID
FROM #CURVE c
) t
PIVOT(MAX([Value]) FOR ID IN ([1], [2], [3])) p

How to use pivot to find the max value in a row across three columns

I have the following table, named 'Table':
I want a result like the following table. For example, taking the first row and its last three columns, I want the value to be 56.
I want SQL Server code that takes the table 'Table' above and produces the second table. Here MaxV-1 and MaxV-2 depend on the 'Number' column: MaxV-1 is the max value out of FirstV, SecondV and ThirdV when Number is equal to 1, and the same logic applies to MaxV-2.

One method is an unpivot and conditional aggregation:
select t.model,
max(case when t.number = 1 then t.pro_code end) as pro_code_1,
max(case when t.number = 2 then t.pro_code end) as pro_code_2,
max(case when t.number = 1 then v.v end) as max_val_1,
max(case when t.number = 2 then v.v end) as max_val_2
from t cross apply
(select max(v.v) as v
from (values (t.firstv), (t.secondv), (t.thirdv)) v(v)
) v
group by t.model;

T-SQL query Fetching data in a single line where multiple rows have similar filters

We have a table with 2 lines that share the same document no. and line no. I want the values in a single line, so that I can also see some values from the other line.
Original dataset
As shown in the attached screenshot, the doc no. and line no. are the same in 2 lines. If I need values from the 2nd line, the result should look like the second screenshot.
Result image should be:

You can assign row numbers using the row_number() function, ordered by GST component or Entry_no, because there are three types of GST (CGST, SGST, IGST), plus UGST, which applies to union territories.
select max(case when seq = 1 then entryno end) as Entry_no, doc,
max(case when seq = 1 then GSTComp end) as GSTComp1,
max(case when seq = 1 then [GST%] end) as [GST%1],
max(case when seq = 1 then GSTAmt end) as GSTAmt1,
. . .
max(case when seq = 3 then GSTAmt end) as GSTAmt3
from (select *, row_number() over (partition by doc order by entryno) as seq
from table
) t
group by doc;

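Try this query:
SELECT
t1.Entry_no
, t1.doc
, t1.GSTComp
, t1.[GST%]
, t1.GSTAmt
, t2.GSTComp AS GSTComp2
, t2.GSTAmt AS GSTAmt2
FROM table t1 INNER JOIN table t2
ON t1.doc = t2.doc AND t1.lin = t2.lin
AND t1.Entry_no < t2.Entry_no -- assumption: Entry_no differs between the two lines; this stops a row from pairing with itself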

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) -- originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data each PersonID has 2 separate rows, because each row has its own srcID, which separates the values. I have now pivoted it, and it is still giving me 2 separate rows for the ID even though I grouped by it and used aggregation on the pivot columns. Any idea what's wrong with the code?
I have all my possible detailCode values listed in the IN clause, so NULL is returned when there is no value, but I want it all summarised in 1 row. See image below.

If those are all the possible values of detailCode, you can use conditional aggregation with a CASE expression instead of PIVOT:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
