Transpose row value to additional pre-defined field when value is less than specified value - sql

I have a table that contains two identifying columns, a date, and a value. The value can be up to 100. Where [ID] and [DATE] are the same across subsequent rows and the values are less than 100 (which also means [ID_SECONDARY] is always different), I want a query that places each of these values in a column '[VALUE_1]...[VALUE_N]' along with the value description ([ID_SECONDARY]-->[VALUE_1_DESC]...[VALUE_N_DESC]). Ultimately each row should contain a unique [ID] and [DATE], plus the different [ID_SECONDARY] descriptions alongside their values [VALUE_1]...[VALUE_N]. The number of unique [ID_SECONDARY] values per row ranges from 1 to 4 and will never exceed 4.
My initial inclination is to approach this using a cursor, but I'm hopeful there is a better alternative.
The first image is a sample of the information provided in the table, the second image is the output I'm looking for. Any help is greatly appreciated.
As far as I can tell this is different from the various dynamic pivot posts out there because the columns are independent of the secondary ID and are fully dependent on the VALUE column to determine if the value itself belongs in columns 1-4.

Try this
WITH a AS (
    SELECT
        ID
        , [DATE]
        , ID_SECONDARY
        , VALUE
        -- Number the rows within each ID/DATE group; ordering by ID_SECONDARY keeps the numbering deterministic
        , ROW_NUMBER() OVER (PARTITION BY ID, [DATE] ORDER BY ID_SECONDARY) AS RNUM
    FROM YourTable -- placeholder: substitute your own table name
)
SELECT
    a.ID
    , a.[DATE]
    -- A CASE without ELSE yields NULL, and MAX ignores NULLs, so each column keeps only its own row
    , MAX(CASE a.RNUM WHEN 1 THEN a.VALUE END) AS VALUE_1
    , MAX(CASE a.RNUM WHEN 1 THEN a.ID_SECONDARY END) AS VALUE_1_DESC
    , MAX(CASE a.RNUM WHEN 2 THEN a.VALUE END) AS VALUE_2
    , MAX(CASE a.RNUM WHEN 2 THEN a.ID_SECONDARY END) AS VALUE_2_DESC
    , MAX(CASE a.RNUM WHEN 3 THEN a.VALUE END) AS VALUE_3
    , MAX(CASE a.RNUM WHEN 3 THEN a.ID_SECONDARY END) AS VALUE_3_DESC
    , MAX(CASE a.RNUM WHEN 4 THEN a.VALUE END) AS VALUE_4
    , MAX(CASE a.RNUM WHEN 4 THEN a.ID_SECONDARY END) AS VALUE_4_DESC
FROM a
GROUP BY a.ID, a.[DATE]
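For anyone who wants to try the row-number-plus-MAX(CASE) idea without a SQL Server instance, here is a minimal sketch run through Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table name `readings` and the sample rows are hypothetical, only two of the four slots are shown, and the `VALUE < 100` filter is my reading of the question rather than something the answer above includes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, shaped like the columns named in the question.
conn.execute("CREATE TABLE readings (ID INT, DATE TEXT, ID_SECONDARY TEXT, VALUE INT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01", "A", 40),
        (1, "2020-01-01", "B", 60),
        (2, "2020-01-01", "C", 100),  # not below 100, so the filter drops it
        (2, "2020-01-02", "D", 25),
    ],
)

query = """
WITH a AS (
    SELECT ID, DATE, ID_SECONDARY, VALUE,
           ROW_NUMBER() OVER (PARTITION BY ID, DATE ORDER BY ID_SECONDARY) AS RNUM
    FROM readings
    WHERE VALUE < 100          -- assumption: only values below 100 are transposed
)
SELECT ID, DATE,
       MAX(CASE WHEN RNUM = 1 THEN VALUE END)        AS VALUE_1,
       MAX(CASE WHEN RNUM = 1 THEN ID_SECONDARY END) AS VALUE_1_DESC,
       MAX(CASE WHEN RNUM = 2 THEN VALUE END)        AS VALUE_2,
       MAX(CASE WHEN RNUM = 2 THEN ID_SECONDARY END) AS VALUE_2_DESC
FROM a
GROUP BY ID, DATE
ORDER BY ID, DATE
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Slots with no matching row simply come back as NULL (None in Python), which is why a fixed set of four column pairs can cover anywhere from one to four secondary IDs.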

Related

How to split values for the same id into columns

How to split values for the same id into columns?
Example Table
I want to achieve this:
I tried:
SUM (CASE WHEN Type = 1 THEN Price END) AS Type_1,
SUM (CASE WHEN Type = 2 THEN Price END) AS Type_2,
SUM (CASE WHEN Type = 3 THEN Price END) AS Type_3
--, SUM (CASE WHEN Type = 1 THEN CarsID END) AS CarsID_Type1
--, SUM (CASE WHEN Type = 2 THEN CarsID END) AS CarsID_Type2
--, SUM (CASE WHEN Type = 3 THEN CarsID END) AS CarsID_Type3
GROUP BY ID
Three additional columns (Type_1, Type_2, Type_3) are created correctly and everything ends up on one row. Unfortunately, the commented-out part at the end causes an error:
Operand data type uniqueidentifier is invalid for sum operator.
What should I replace SUM with to make the query run correctly?
I will be grateful for your help.
This query is more complex than I thought, but it will do what you want for any kind of id. There is one limitation: it was constructed to work with a maximum of 3 different "types". If your column "Type" has more than 3 different values for the same "id", you will need to adapt the code below.
/*Shifting the value to another column*/
with first_lag as (
select
id
, type_
, case when count(type_)over(partition by id) > 1 then lag(type_)over(order by type_ desc)
end as lag_type_1
, CarsID
, case when count(CarsID)over(partition by id) > 1 then lag(CarsID)over(order by CarsID desc )
end as lag_cars_1
, price::int
from stack_overflow so
)
/*Shifting again the value to another column*/
, second_lag as(
select
id
, type_
, lag_type_1
, lag(lag_type_1)over(order by lag_type_1 desc) lag_type_2
, CarsID
, lag_cars_1
, lag(lag_cars_1)over(order by lag_cars_1 desc) lag_cars_2
, sum(price) over(partition by id) as price_by_id
from first_lag
)
/*Counting how many "types" the same id have (preparing to filter)*/
, counting_rows as (
select
*
, count(type_)over(partition by type_) +count(lag_type_1) over(partition by type_) +count(lag_type_2) over(partition by type_) as counting
from second_lag
)
/*Knowing which row have the max number of "types" (preparing to filter)*/
, selecting_max as (
select
*
,max(counting)over(partition by id) as max_flag
from counting_rows
)
/*Selecting the columns and filtering just the row with the max number of "types"*/
select id,type_,lag_type_1,lag_type_2,carsid,lag_cars_1,lag_cars_2,price_by_id
from selecting_max
where counting = max_flag
--group by id
Note that it will work for different ids, but only if each id has 3 or fewer different "types".
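A simpler route worth considering is to keep the conditional aggregation from the question and swap SUM for MAX on the non-numeric column, since MAX only has to pick the single non-NULL value. The sketch below runs in SQLite via Python's sqlite3 module, with a hypothetical table `cars`, made-up rows, and plain text standing in for the uniqueidentifier column. On SQL Server, MAX may reject uniqueidentifier as well, in which case casting the column to a character type inside the CASE is a commonly suggested workaround (not verified here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; CarsID is stored as text in place of a uniqueidentifier.
conn.execute("CREATE TABLE cars (ID INT, Type INT, Price INT, CarsID TEXT)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?, ?)",
    [
        (1, 1, 100, "aaaa-0001"),
        (1, 2, 250, "bbbb-0002"),
        (2, 3, 75, "cccc-0003"),
    ],
)

query = """
SELECT ID,
       SUM(CASE WHEN Type = 1 THEN Price END)  AS Type_1,
       SUM(CASE WHEN Type = 2 THEN Price END)  AS Type_2,
       SUM(CASE WHEN Type = 3 THEN Price END)  AS Type_3,
       -- MAX instead of SUM: it only needs to pick the one non-NULL identifier
       MAX(CASE WHEN Type = 1 THEN CarsID END) AS CarsID_Type1,
       MAX(CASE WHEN Type = 2 THEN CarsID END) AS CarsID_Type2,
       MAX(CASE WHEN Type = 3 THEN CarsID END) AS CarsID_Type3
FROM cars
GROUP BY ID
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```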

flatten data in SQL based on fixed set of column

I am stuck on a specific data-flattening scenario and need help with it. I need flattened output, but the column values are not fixed, so I want to restrict the output to a fixed set of columns.
Given table 'test_table':
ID  Name  Property
1   C1    xxx
2   C2    xyz
2   C3    zz
The scenario is that the column Name can have any number of values for a given ID. I need to flatten the data so that there is one row per ID. Since the Name values vary with each ID, I want to flatten them into a fixed set of 3 columns, like Co1, Co2, Co3. The output should look like:
ID  Co1  Co1_Property  Co2   Co2_Property  Co3  Co3_Property
1   C1   xxx           null  null
2   C2   xyz           C3    zz
Could not think of a solution using Pivot or aggregation. Any help would be appreciated.
You can use arrays:
select id,
array_agg(name order by name)[safe_ordinal(1)] as name_1,
array_agg(property order by name)[safe_ordinal(1)] as property_1,
array_agg(name order by name)[safe_ordinal(2)] as name_2,
array_agg(property order by name)[safe_ordinal(2)] as property_2,
array_agg(name order by name)[safe_ordinal(3)] as name_3,
array_agg(property order by name)[safe_ordinal(3)] as property_3
from t
group by id;
All current answers are too verbose and repeat the same fragments of code again and again. If you need to account for more columns, you have to copy, paste, and add more lines, which makes them even more verbose!
My preference is to avoid that type of coding and use something more generic, as in the example below:
select * from (
select *, row_number() over(partition by id) col
from `project.dataset.table`)
pivot (max(name) as name, max(property) as property for col in (1, 2, 3))
If applied to sample data in your question - output is
If you want to change the number of output columns, you simply modify the for col in (1, 2, 3) part of the query.
For example, if you wanted 5 columns, you would use for col in (1, 2, 3, 4, 5) - that simple!
The standard practice is to use conditional aggregation. That is, to use CASE expressions to pick which row goes to which column, then MAX() to collapse multiple rows into individual rows...
SELECT
id,
MAX(CASE WHEN name = 'C1' THEN name END) AS co1,
MAX(CASE WHEN name = 'C1' THEN property END) AS co1_property,
MAX(CASE WHEN name = 'C2' THEN name END) AS co2,
MAX(CASE WHEN name = 'C2' THEN property END) AS co2_property,
MAX(CASE WHEN name = 'C3' THEN name END) AS co3,
MAX(CASE WHEN name = 'C3' THEN property END) AS co3_property
FROM
yourTable
GROUP BY
id
Background info:
Not having an ELSE in the CASE expression implicitly means ELSE NULL
The intention is therefore for each column to receive NULL from every input row, except for the row being pivoted into that column
Aggregates, such as MAX() essentially skip NULL values
MAX( {NULL,NULL,'xxx',NULL,NULL} ) therefore equals 'xxx'
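The NULL-skipping behaviour described above is easy to check directly. This is a small sketch using Python's built-in sqlite3 module, with a hypothetical one-column table `vals` holding the same five values as the MAX example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-column table: four NULLs and one real value.
conn.execute("CREATE TABLE vals (v TEXT)")
conn.executemany(
    "INSERT INTO vals VALUES (?)",
    [(None,), (None,), ("xxx",), (None,), (None,)],
)

# MAX(v) and COUNT(v) ignore NULLs; COUNT(*) counts every row.
max_value, row_count, non_null_count = conn.execute(
    "SELECT MAX(v), COUNT(*), COUNT(v) FROM vals"
).fetchone()

# If every input is NULL, MAX has nothing to pick from and returns NULL.
all_null_max = conn.execute("SELECT MAX(v) FROM vals WHERE v IS NULL").fetchone()[0]

print(max_value, row_count, non_null_count, all_null_max)
```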
A similar approach "bunches" the values to the left, so that NULL values only ever appear to the right.
That approach first uses row_number() to give each row a value corresponding to the column you want to put that row into:
WITH
sorted AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seq_num
FROM
yourTable
)
SELECT
id,
MAX(CASE WHEN seq_num = 1 THEN name END) AS co1,
MAX(CASE WHEN seq_num = 1 THEN property END) AS co1_property,
MAX(CASE WHEN seq_num = 2 THEN name END) AS co2,
MAX(CASE WHEN seq_num = 2 THEN property END) AS co2_property,
MAX(CASE WHEN seq_num = 3 THEN name END) AS co3,
MAX(CASE WHEN seq_num = 3 THEN property END) AS co3_property
FROM
sorted
GROUP BY
id
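Since the MAX(CASE ...) pairs are purely repetitive, one option is to generate them rather than hand-write them. The sketch below is only an illustration: `build_flatten_query` is a hypothetical helper, the SQL runs in SQLite through Python's sqlite3 module (3.25+ for ROW_NUMBER), and the rows are the three sample rows from the question's test_table.

```python
import sqlite3


def build_flatten_query(slots: int) -> str:
    """Build a ROW_NUMBER + MAX(CASE ...) query with `slots` name/property pairs."""
    columns = []
    for i in range(1, slots + 1):
        # `i` is an integer produced by range(), so interpolating it is safe.
        columns.append(f"MAX(CASE WHEN seq_num = {i} THEN name END) AS co{i}")
        columns.append(f"MAX(CASE WHEN seq_num = {i} THEN property END) AS co{i}_property")
    column_sql = ",\n       ".join(columns)
    return f"""
WITH sorted AS (
    SELECT id, name, property,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seq_num
    FROM test_table
)
SELECT id,
       {column_sql}
FROM sorted
GROUP BY id
ORDER BY id
"""


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (id INT, name TEXT, property TEXT)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [(1, "C1", "xxx"), (2, "C2", "xyz"), (2, "C3", "zz")],
)

rows = conn.execute(build_flatten_query(3)).fetchall()
for row in rows:
    print(row)
```

Changing the number of output columns is then a matter of calling `build_flatten_query(5)` instead of editing the SQL by hand.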

Is there a way to return all unique values within a given row

I am working in Oracle SQL and I have a data set with these headers
In this data set card number is unique, however, the customer can have duplicates in their Suggestion fields. Rather than going through and writing case statements, is there a way to keep only the unique values within the given row?
Please note, some customers will be left with more "unique" suggestions than others
For example:
My goal would be for my final output to look like this
As mentioned, I would previously just write CASE statements like this:
SELECT DISTINCT CARD_NUMBER
,SUGGESTION_1
,CASE
WHEN SUGGESTION_2 != SUGGESTION_1
THEN SUGGESTION_2
WHEN SUGGESTION_3 != SUGGESTION_1
THEN SUGGESTION_3
WHEN SUGGESTION_4 != SUGGESTION_1
THEN SUGGESTION_4
WHEN SUGGESTION_5 != SUGGESTION_1
THEN SUGGESTION_5
END AS SUGGESTION_2
,CASE
WHEN SUGGESTION_2 != SUGGESTION_1
AND SUGGESTION_3 != SUGGESTION_1
AND SUGGESTION_3 != SUGGESTION_2
THEN SUGGESTION_3
I would do this until all unique values are left, and there just has to be an easier way
Any help would be EXTREMELY appreciated, thank you!
You can use union all and conditional aggregation. Here is the idea that puts the results in a single column:
select card, listagg(suggestion, ', ') within group (order by which) as suggestions
from (select card, suggestion, min(which) as which
from ((select card, 1 as which, suggestion_1 as suggestion from t) union all
(select card, 2, suggestion_2 from t) union all
(select card, 3, suggestion_3 from t) union all
(select card, 4, suggestion_4 from t) union all
(select card, 5, suggestion_5 from t)
) t
group by card, suggestion
) t
group by card;
You can do something similar with conditional aggregation if you want the values in separate columns.
I would try to pivot the table to long, and then back to wide
Setup:
create table testtbl
(
CARD_NUMBER int
,SUGGESTION_1 varchar2(100)
,SUGGESTION_2 varchar2(100)
,SUGGESTION_3 varchar2(100)
,SUGGESTION_4 varchar2(100)
,SUGGESTION_5 varchar2(100)
);
insert into testtbl values (1234,'G11','G4','G3','G2','G6');
insert into testtbl values (5678,'G4','G6','G6','G11','G6');
insert into testtbl values (9101,'G1','G3','G11','G4','G11');
Then the Query itself, first the pivoting to long. Here I use a function just to return the numbers from 1 to 5 - this is instead of joining the table 5 times to itself, this way it should only pass through the test table once.
I then use the analytical function row_number to sort the unique values according to their first placement.
The second select uses MAX to pivot back to wide
with cte AS
(
SELECT
CARD_NUMBER
,MIN(n.column_value ) n
,CASE n.column_value
WHEN 1 THEN f.SUGGESTION_1
WHEN 2 THEN f.SUGGESTION_2
WHEN 3 THEN f.SUGGESTION_3
WHEN 4 THEN f.SUGGESTION_4
WHEN 5 THEN f.SUGGESTION_5
END Suggestion
,ROW_NUMBER() OVER (PARTITION BY f.CARD_NUMBER ORDER BY MIN(n.column_value)) rn
FROM testtbl f
CROSS JOIN table(sys.odcinumberlist(1,2,3,4,5)) n
GROUP BY f.CARD_NUMBER,CASE n.column_value
WHEN 1 THEN f.SUGGESTION_1
WHEN 2 THEN f.SUGGESTION_2
WHEN 3 THEN f.SUGGESTION_3
WHEN 4 THEN f.SUGGESTION_4
WHEN 5 THEN f.SUGGESTION_5
END
)
SELECT
CARD_NUMBER
,MAX(CASE WHEN rn=1 THEN Suggestion ELSE '' end)SUGGESTION_1
,MAX(CASE WHEN rn=2 THEN Suggestion ELSE '' end)SUGGESTION_2
,MAX(CASE WHEN rn=3 THEN Suggestion ELSE '' end)SUGGESTION_3
,MAX(CASE WHEN rn=4 THEN Suggestion ELSE '' end)SUGGESTION_4
,MAX(CASE WHEN rn=5 THEN Suggestion ELSE '' end)SUGGESTION_5
FROM cte
GROUP BY CARD_NUMBER
ORDER BY CARD_NUMBER
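Here is the same long-then-wide idea as a runnable sketch, for anyone without an Oracle instance at hand. It uses SQLite through Python's sqlite3 module (3.25+ for ROW_NUMBER), swaps the Oracle-specific sys.odcinumberlist cross join for a plain UNION ALL unpivot, and loads the three sample rows from the setup above. The CTE names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testtbl (CARD_NUMBER INT, SUGGESTION_1 TEXT, SUGGESTION_2 TEXT,"
    " SUGGESTION_3 TEXT, SUGGESTION_4 TEXT, SUGGESTION_5 TEXT)"
)
conn.executemany(
    "INSERT INTO testtbl VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1234, "G11", "G4", "G3", "G2", "G6"),
        (5678, "G4", "G6", "G6", "G11", "G6"),
        (9101, "G1", "G3", "G11", "G4", "G11"),
    ],
)

slots = range(1, 6)
# Wide -> long: one SELECT per suggestion column, tagged with its position.
unpivot_sql = "\n    UNION ALL ".join(
    f"SELECT CARD_NUMBER, {i} AS which, SUGGESTION_{i} AS suggestion FROM testtbl"
    for i in slots
)
# Long -> wide: one MAX(CASE ...) per output slot.
pivot_sql = ",\n       ".join(
    f"MAX(CASE WHEN rn = {i} THEN suggestion END) AS SUGGESTION_{i}" for i in slots
)

query = f"""
WITH unpivoted AS (
    {unpivot_sql}
),
first_seen AS (          -- keep each distinct suggestion at its earliest position
    SELECT CARD_NUMBER, suggestion, MIN(which) AS which
    FROM unpivoted
    GROUP BY CARD_NUMBER, suggestion
),
ranked AS (              -- renumber the survivors 1..n per card
    SELECT CARD_NUMBER, suggestion,
           ROW_NUMBER() OVER (PARTITION BY CARD_NUMBER ORDER BY which) AS rn
    FROM first_seen
)
SELECT CARD_NUMBER,
       {pivot_sql}
FROM ranked
GROUP BY CARD_NUMBER
ORDER BY CARD_NUMBER
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Cards with fewer unique suggestions end up with NULL in the trailing columns.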

Array contains NULL value

I'm trying to select the last enddate per nr. In case the nr contains an enddate with value NULL, it means this nr is still active. In short I cannot use MAX(enddate) because out of 2013-09-25 and NULL it would select the date whereas I need NULL.
I tried the following query, but NULL IN (enddate) does not return what I expected, namely: 'if the array contains at least one NULL value...'. In other words, NULL should outrank MAX().
SELECT nr,
CASE WHEN NULL IN (enddate) THEN NULL ELSE MAX(enddate) END
FROM myTable
GROUP BY nr
Does someone know how to replace this expression?
You can use the query below. It replaces NULL with a sentinel date so that NULL ranks above every other date (provided the sentinel date is late enough) and then converts the sentinel back to NULL.
SELECT nr, CASE d WHEN '20990101' THEN NULL ELSE d END d
FROM (
SELECT nr,
MAX(ISNULL(enddate, '20990101')) d
FROM myTable
GROUP BY nr
) t
I couldn't check the syntax so there may be small typos.
You could use this query. The CTE calculates the maximum date ignoring any nulls, this is then left joined back to the table to see if there is a null value for each nr value. The case statement returns a null if it exists or the maximum date from the CTE.
WITH CTE1 AS
(SELECT nr, MAX(enddate) MaxEnddate
FROM myTable
GROUP BY nr)
SELECT CTE1.nr,
CASE WHEN MyTable.enddate IS NULL AND MyTable.NR IS NOT NULL THEN NULL ELSE CTE1.MaxEndDate END AS EndDate
FROM CTE1
LEFT JOIN MyTable
ON MyTable.nr=CTE1.nr
AND MyTable.enddate IS NULL
Just to build off of @Szymon's answer a little bit:
drop table #temp
create table #temp (MyDate date)
insert into #temp (MyDate) values ('1/1/2010'),('1/1/2011'),('1/1/2012'),(NULL)
select * from #temp
SELECT
(CASE WHEN MAX(ISNULL(MyDate, '2099-01-01')) = '2099-01-01' THEN NULL ELSE MAX(ISNULL(MyDate, '2099-01-01')) END) as Max_Date
FROM
#temp
The query replaces a NULL value with '2099-01-01'. Then, it looks to see if the Max is equal to '2099-01-01' and if so, returns NULL, and otherwise returns the actual Max.
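To compare approaches, here is a sketch in SQLite via Python's sqlite3 module with a few hypothetical rows in myTable. The `via_sentinel` column is a portable spelling of the sentinel trick (COALESCE and NULLIF instead of ISNULL and CASE). The `via_count` column is a different technique that needs no sentinel: COUNT(*) counts all rows while COUNT(enddate) skips NULLs, so a mismatch means at least one NULL exists. Dates are stored as ISO-8601 text here, which is an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: nr 1 is fully closed, nr 2 still has an open (NULL) enddate.
conn.execute("CREATE TABLE myTable (nr INT, enddate TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [
        (1, "2013-01-10"),
        (1, "2013-09-25"),
        (2, "2013-09-25"),
        (2, None),
    ],
)

query = """
SELECT nr,
       -- Sentinel trick: NULL -> far-future date, take MAX, map the sentinel back to NULL
       NULLIF(MAX(COALESCE(enddate, '9999-12-31')), '9999-12-31') AS via_sentinel,
       -- Sentinel-free variant: a NULL exists when COUNT(*) exceeds COUNT(enddate)
       CASE WHEN COUNT(*) > COUNT(enddate) THEN NULL ELSE MAX(enddate) END AS via_count
FROM myTable
GROUP BY nr
ORDER BY nr
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The count-based form avoids having to pick a date that is guaranteed to be later than any real value.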

Can I get the minimum of 2 columns which is greater than a given value using only one scan of a table

This is my example data (there are no indexes and I do not want to create any):
CREATE TABLE tblTest ( a INT , b INT );
INSERT INTO tblTest ( a, b ) VALUES
( 1 , 2 ),
( 5 , 1 ),
( 1 , 4 ),
( 3 , 2 )
I want the minimum value across both column a and column b that is greater than a given value. E.g. if the given value is 3 then I want 4 to be returned.
This is my current solution:
SELECT MIN (subMin) FROM
(
SELECT MIN (a) as subMin FROM tblTest
WHERE a > 3 -- Returns 5
UNION
SELECT MIN (b) as subMin FROM tblTest
WHERE b > 3 -- Returns 4
)
This searches the table twice - once to get min(a) once to get min(b).
I believe it should be faster to do this with just one pass. Is this possible?
You want to use conditional aggregation for this:
select min(case when a > 3 then a end) as minA,
min(case when b > 3 then b end) as minB
from tblTest;
To get the minimum of both values, you can use a SQLite extension, which handles multiple values for min():
select min(min(case when a > 3 then a end),
min(case when b > 3 then b end)
)
from tblTest
The only issue is that the min will return NULL if either argument is NULL. You can fix this by doing:
select coalesce(min(min(case when a > 3 then a end),
min(case when b > 3 then b end)
),
min(case when a > 3 then a end),
min(case when b > 3 then b end)
)
from tblTest
This version will return the minimum value, subject to your conditions. If one of the conditions has no rows, it will still return the minimum of the other value.
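The COALESCE version can be checked against the sample data from the question with Python's built-in sqlite3 module. This is a sketch only: the helper `min_above` and the named parameter `:threshold` are my own additions so the threshold is not hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTest (a INT, b INT)")
conn.executemany(
    "INSERT INTO tblTest (a, b) VALUES (?, ?)",
    [(1, 2), (5, 1), (1, 4), (3, 2)],
)

# One SELECT over tblTest; the two-argument MIN is SQLite's scalar min(), which
# returns NULL if either argument is NULL, hence the COALESCE fallbacks.
QUERY = """
SELECT COALESCE(
           MIN(MIN(CASE WHEN a > :threshold THEN a END),
               MIN(CASE WHEN b > :threshold THEN b END)),
           MIN(CASE WHEN a > :threshold THEN a END),
           MIN(CASE WHEN b > :threshold THEN b END))
FROM tblTest
"""


def min_above(threshold):
    """Smallest value in column a or b that is strictly greater than threshold."""
    return conn.execute(QUERY, {"threshold": threshold}).fetchone()[0]


print(min_above(3))  # both columns have a candidate (5 and 4)
print(min_above(4))  # only column a has a candidate
print(min_above(5))  # no candidates at all
```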
Off the top of my head, you could modify the table and add a column that stores the minimum of the two columns, then query that column.
Or you can do this:
select min(val)
from
(
select min(col1, col2) as val
from table1
)
where
val > 3
The outer SELECT queries the in-memory result of the subquery, not the table itself.