PIVOT one numeric column into two columns - sql

Raw data:
ID Age Date Value
-------------------------
1 5 01/01/2023 10
1 5 01/04/2023 15
2 7 01/02/2023 17
3 9 10/02/2022 7
3 9 12/20/2022 9
Desired output:
One ID/Age per row (Age will always be the same per ID)
Use latest Date partitioned by ID if has multiple dates
Pivot the Value column into two separate columns, Value_1 and Value_2
If an ID does not have 2nd value, in the output, leave Value_2 blank
Value_1 is the highest, Value_2 is second highest
This is what the output should look like:
ID Age Date Value_1 Value_2
-------------------------------------
1 5 01/04/2023 15 10
2 7 01/02/2023 17
3 9 12/20/2022 9 7
I couldn't figure it out even after reading the PIVOT reference. My case is a little different from the example I read: the column I am pivoting is numeric, not categorical. I need some help here.
https://www.techonthenet.com/sql_server/pivot.php
My attempt/some ideas:
select *
from
(select
ID, Age, Date,
Value,
row_number() over (partition by ID order by Date desc) row_num
from
table) a
where
a.row_num = 1
SELECT
ID, Age, Date, 'Value_1', 'Value_2'
FROM
(SELECT
ID, Age, Date, Value
FROM
table) AS SourceTable
PIVOT
(SUM(Value)
FOR Date IN ('day1', 'day2')
) AS PivotTable;
Update: both #dale-k's and #t-n's solutions work. If you have more than 2 values, try #t-n's PIVOT approach. Thanks for your help!

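The following query returns your desired results using standard aggregation with a case expression. Note that it relies on each ID having at most two rows, because min([Value]) is the second-highest value only in that case.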
select
ID
, Age
, max([Date]) [Date]
, max([Value]) Value_1
, case when min([Value]) <> max([Value]) then min([Value]) else null end Value_2
from #MyTable
group by ID, Age;
Returns:
ID Age Date Value_1 Value_2
-------------------------------------
1 5 2023-01-04 15 10
2 7 2023-01-02 17 NULL
3 9 2022-12-20 9 7
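To check the aggregation query against the sample data without a SQL Server instance, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. SQLite as the engine, the table name my_table, and the ISO-formatted dates are my assumptions, not part of the original answer.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; my_table is a placeholder name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, age INTEGER, date TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 5, "2023-01-01", 10),
        (1, 5, "2023-01-04", 15),
        (2, 7, "2023-01-02", 17),
        (3, 9, "2022-10-02", 7),
        (3, 9, "2022-12-20", 9),
    ],
)

# Same logic as the answer: MAX is Value_1, MIN is Value_2 when it differs.
query = """
SELECT id,
       age,
       MAX(date)  AS date,
       MAX(value) AS value_1,
       CASE WHEN MIN(value) <> MAX(value) THEN MIN(value) END AS value_2
FROM my_table
GROUP BY id, age
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

ISO dates are used so that MAX on the text column picks the latest date; with mm/dd/yyyy strings the comparison would be lexical and could pick the wrong row.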

For a solution that uses ROW_NUMBER() and PIVOT and can be expanded to more than two Value columns, try:
SELECT PVT.ID, PVT.Age, PVT.Date,
[1] AS Value_1, [2] AS Value_2
FROM (
SELECT
ID, Age, Value,
MAX(Date) OVER(PARTITION BY ID, Age) AS Date,
ROW_NUMBER() OVER(PARTITION BY ID, Age ORDER BY Value DESC) AS RN
FROM #Data D
) Source
PIVOT (
MAX(Source.Value)
FOR Source.RN IN ([1], [2])
) PVT
ORDER BY PVT.ID, PVT.Age
See this db<>fiddle.
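If your engine has no PIVOT operator, the same ROW_NUMBER() idea can be combined with conditional aggregation instead. The sketch below is not the answer's query but a swapped-in equivalent, run in SQLite (3.25 or later for window functions) through Python's sqlite3 module. The table name data and the extra third row for ID 1 are hypothetical, added only to show that Value_2 is the true second-highest value.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; "data" is a placeholder name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (id INTEGER, age INTEGER, date TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, 5, "2023-01-01", 10),
        (1, 5, "2023-01-02", 12),  # hypothetical extra row, not in the question
        (1, 5, "2023-01-04", 15),
        (2, 7, "2023-01-02", 17),
        (3, 9, "2022-10-02", 7),
        (3, 9, "2022-12-20", 9),
    ],
)

# Rank values per ID from highest to lowest, then spread ranks 1 and 2
# into columns with MAX(CASE ...) in place of PIVOT.
query = """
SELECT id,
       age,
       MAX(date) AS date,
       MAX(CASE WHEN rn = 1 THEN value END) AS value_1,
       MAX(CASE WHEN rn = 2 THEN value END) AS value_2
FROM (
    SELECT id, age, date, value,
           ROW_NUMBER() OVER (PARTITION BY id, age ORDER BY value DESC) AS rn
    FROM data
) AS ranked
GROUP BY id, age
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding a Value_3 column is one more MAX(CASE WHEN rn = 3 ...) line, which mirrors adding [3] to the IN list of the PIVOT version.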

Related

PostgreSQL Pivot by Last Date

I need to make a PIVOT table from Source like this table
FactID UserID Date Product QTY
1 11 01/01/2020 A 600
2 11 02/01/2020 A 400
3 11 03/01/2020 B 500
4 11 04/01/2020 B 200
6 22 06/01/2020 A 1000
7 22 07/01/2020 A 200
8 22 08/01/2020 B 300
9 22 09/01/2020 B 100
I need a pivot like this, where each product column holds the QTY from the last date:
UserID A B
11 400 200
22 200 100
My try PostgreSQL
Select
UserID,
MAX(CASE WHEN Product='A' THEN 'QTY' END) AS 'A',
MAX(CASE WHEN Product='B' THEN 'QTY' END) AS 'B'
FROM table
GROUP BY UserID
And Result
UserID A B
11 600 500
22 1000 300
I mean I get a result by the maximum QTY and not by the maximum date!
What do I need to add to get results by the maximum (last) date ??
Postgres doesn't have "first" and "last" aggregation functions. One method for doing this (without a subquery) uses arrays:
select userid,
(array_agg(qty order by date desc) filter (where product = 'A'))[1] as a,
(array_agg(qty order by date desc) filter (where product = 'B'))[1] as b
from tab
group by userid;
Another method uses select distinct with first_value():
select distinct userid,
first_value(qty) over (partition by userid order by product = 'A' desc, date desc) as a,
first_value(qty) over (partition by userid order by product = 'B' desc, date desc) as b
from tab;
With the appropriate indexes, though, distinct on might be the fastest approach:
select userid,
max(qty) filter (where product = 'A') as a,
max(qty) filter (where product = 'B') as b
from (select distinct on (userid, product) t.*
from tab t
order by userid, product, date desc
) t
group by userid;
In particular, this can use an index on (userid, product, date desc). The improvement in performance will be most notable if there are many dates for a given user.
You can use the DENSE_RANK() window function to keep only the last date per Product and UserID before applying conditional aggregation, such as
SELECT UserID,
MAX(CASE WHEN Product='A' THEN QTY END) AS "A",
MAX(CASE WHEN Product='B' THEN QTY END) AS "B"
FROM
(
SELECT t.*, DENSE_RANK() OVER (PARTITION BY Product,UserID ORDER BY Date DESC) AS rn
FROM tab t
) q
WHERE rn = 1
GROUP BY UserID
This presumes all date values are distinct (no ties occur for dates).
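The "rank by date, keep rank 1, then aggregate" pattern can be tried locally with a minimal sketch like the one below. It runs in SQLite (3.25 or later) via Python's sqlite3 module rather than PostgreSQL, and it uses ROW_NUMBER() instead of DENSE_RANK() so that a tie on the date would still yield one row per product. The ISO dates are my assumption; they only preserve the ordering of the question's dates.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; dates rewritten in ISO format.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (factid INTEGER, userid INTEGER, date TEXT,"
    " product TEXT, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?)",
    [
        (1, 11, "2020-01-01", "A", 600),
        (2, 11, "2020-01-02", "A", 400),
        (3, 11, "2020-01-03", "B", 500),
        (4, 11, "2020-01-04", "B", 200),
        (6, 22, "2020-01-06", "A", 1000),
        (7, 22, "2020-01-07", "A", 200),
        (8, 22, "2020-01-08", "B", 300),
        (9, 22, "2020-01-09", "B", 100),
    ],
)

# Keep only the latest row per (userid, product), then pivot products
# into columns with conditional aggregation.
query = """
SELECT userid,
       MAX(CASE WHEN product = 'A' THEN qty END) AS a,
       MAX(CASE WHEN product = 'B' THEN qty END) AS b
FROM (
    SELECT userid, product, qty,
           ROW_NUMBER() OVER (
               PARTITION BY userid, product ORDER BY date DESC
           ) AS rn
    FROM tab
) AS q
WHERE rn = 1
GROUP BY userid
ORDER BY userid
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because only one row per product survives the rn = 1 filter, MAX no longer picks the largest QTY; it just collapses the single remaining value into the output row.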

How to select top 2 values for each id

I have a table with values
id sales date
1 5 "2015-01-04"
1 3 "2015-01-03"
1 1 "2015-01-01"
1 1 "2015-01-01"
2 7 "2015-01-05"
2 6 "2015-01-04"
2 4 "2015-01-03"
3 11 "2015-01-08"
3 10 "2015-01-07"
3 9 "2015-01-06"
3 8 "2015-01-05"
I want to select top two values of each id as shown in desired output.
Desired output:
id sales date
1 5 "2015-01-04"
1 3 "2015-01-03"
2 7 "2015-01-05"
2 6 "2015-01-04"
3 11 "2015-01-08"
3 10 "2015-01-07"
My attempt:
can someone help me with this. Thank you in advance!
select transactions.salesperson_id, transactions.id, transactions.date
from transactions
ORDER BY transactions.salesperson_id ASC, transactions.date DESC;
This can be done using window functions:
select id, sales, "date"
from (
select id, sales, "date",
dense_rank() over (partition by id order by "date" desc) as rnk
from transactions
) t
where rnk <= 2;
If there are multiple rows on the same date this might return more than two rows for the same ID. If you don't want that, use row_number() instead of dense_rank()
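row_number() will get what you want. Order by date descending so that the latest rows are numbered first, and include id in the output:
select id, sales, date from
(select row_number() over (partition by id order by date desc) as rn, id, sales, date from transactions) t1
where t1.rn <= 2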
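To see the difference between dense_rank() and row_number() when two rows share a date, here is a small sketch run in SQLite (3.25 or later) through Python's sqlite3 module. The data is hypothetical: I added a second row on the latest date, which the question's sample does not contain, and the helper name top_two is mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, sales INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 5, "2015-01-04"),
        (1, 4, "2015-01-04"),  # hypothetical tie on the latest date
        (1, 3, "2015-01-03"),
        (1, 1, "2015-01-01"),
    ],
)


def top_two(rank_expr):
    # rank_expr is one of the two fixed strings below, never user input.
    sql = f"""
        SELECT id, sales, date
        FROM (
            SELECT id, sales, date, {rank_expr} AS rnk
            FROM transactions
        ) AS t
        WHERE rnk <= 2
        ORDER BY id, date DESC, sales DESC
    """
    return conn.execute(sql).fetchall()


# dense_rank gives tied dates the same rank, so three rows pass the filter.
with_dense_rank = top_two(
    "DENSE_RANK() OVER (PARTITION BY id ORDER BY date DESC)"
)
# row_number numbers every row; sales DESC is added as a tie-breaker.
with_row_number = top_two(
    "ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC, sales DESC)"
)
print(with_dense_rank)
print(with_row_number)
```

Without a tie-breaker in the ORDER BY, row_number() would still return two rows, but which of the tied rows is kept would be arbitrary.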

How to write Oracle query to group like below?

Id value
1 5
1 6
1 8
1 9
1 10
Result should be like below
Id minValue maxValue
1 5 6
1 8 10
Values that differ by 1 from the previous value should fall into the same row; otherwise a new row should start.
This is the famous GAPS and ISLANDS problem. You can read this wonderful article from Lalit Kumar B for a detailed description. You can try the query below:
SELECT id, MIN(value), MAX(value)
FROM (SELECT id, value, value - ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) rn
FROM test) T
GROUP BY id, rn;
If you subtract the position of each value within the list of values for that ID (which you can get with an analytic function) from the value itself:
value - dense_rank() over (partition by id order by value)
then consecutive (or duplicate) values will get the same result:
select id, value,
value - dense_rank() over (partition by id order by value) as grp
from your_table;
ID VALUE GRP
---------- ---------- ----------
1 5 4
1 6 4
1 8 5
1 9 5
1 10 5
You can then aggregate using those differences:
select id, min(value) as minvalue, max(value) as maxvalue
from (
select id, value,
value - dense_rank() over (partition by id order by value) as grp
from your_table
)
group by id, grp
order by id, minvalue;
ID MINVALUE MAXVALUE
---------- ---------- ----------
1 5 6
1 8 10
You could use row_number() instead of dense_rank() if there are no duplicates.
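The value-minus-rank trick is easy to verify outside Oracle. A minimal sketch, assuming SQLite 3.25 or later via Python's sqlite3 module as a stand-in engine, with the table name your_table taken from the answer above:

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the window functions behave the same here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, 5), (1, 6), (1, 8), (1, 9), (1, 10)],
)

# Step 1: consecutive values share the same (value - rank) difference.
groups = conn.execute("""
    SELECT id, value,
           value - DENSE_RANK() OVER (PARTITION BY id ORDER BY value) AS grp
    FROM your_table
    ORDER BY id, value
""").fetchall()

# Step 2: aggregate each island by that difference.
islands = conn.execute("""
    SELECT id, MIN(value) AS minvalue, MAX(value) AS maxvalue
    FROM (
        SELECT id, value,
               value - DENSE_RANK() OVER (PARTITION BY id ORDER BY value) AS grp
        FROM your_table
    ) AS t
    GROUP BY id, grp
    ORDER BY id, minvalue
""").fetchall()

print(groups)
print(islands)
```

The grp value itself carries no meaning; it only has to stay constant inside a run and change whenever a gap appears.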

How to calculate the number of a day in series of consecutive dates?

I have a table
id name created_at
1 name 1 08/01/2017
2 name 2 08/02/2017
3 name 3 08/03/2017
4 name 4 08/05/2017
5 name 5 08/06/2017
6 name 6 08/07/2017
7 name 7 08/10/2017
8 name 8 08/12/2017
I need to add a column that numbers the rows within each run of consecutive days (rows created day after day). Rows that are not part of such a run should get null.
The result should be like below
id name created_at days_on
1 name 1 08/01/2017 1
2 name 2 08/02/2017 2
3 name 3 08/03/2017 3
4 name 4 08/05/2017 1
5 name 5 08/06/2017 2
6 name 6 08/07/2017 3
7 name 7 08/10/2017 null
8 name 8 08/12/2017 null
There are many answers describing typical approaches to similar problems, where you can also find an explanation of the techniques used below.
select
id, name, created_at,
case when count(*) over wa > 1 then row_number() over wo end as rank
from (
select
id, name, created_at,
sum(first) over w as part
from (
select *, (lag(created_at) over w+ 1 is distinct from created_at)::int as first
from my_table
window w as (order by id)
) s
window w as (order by id)
) s
window
wa as (partition by part),
wo as (partition by part order by id);
This is a variation of the gaps-and-islands problem. Let me show a solution using lag() to define the groups:
lag() to get the previous day
cumulative sum to get the groups
row_number() to assign the final values
This works as:
select id, name, created_at,
(case when count(*) over (partition by grp) > 1
then row_number() over (partition by grp order by id)
end) as days_on
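from (select t.*,
-- running sum of "a new run starts here" flags; is distinct from also flags the first row, whose prev_ca is null
sum( (prev_ca is distinct from created_at - interval '1 day')::int ) over (order by id) as grp
from (select t.*,
lag(created_at) over (order by id) as prev_ca
from t
) t
) t;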
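The three steps (lag, cumulative sum, row_number) can be traced on the question's data with the sketch below. It is an adaptation, not the PostgreSQL query itself: it runs in SQLite (3.25 or later) via Python's sqlite3 module, uses julianday() for the one-day comparison, and assumes the dates are mm/dd/yyyy, rewritten here in ISO format. The CTE names flagged and grouped are mine.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; dates rewritten as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "name 1", "2017-08-01"),
        (2, "name 2", "2017-08-02"),
        (3, "name 3", "2017-08-03"),
        (4, "name 4", "2017-08-05"),
        (5, "name 5", "2017-08-06"),
        (6, "name 6", "2017-08-07"),
        (7, "name 7", "2017-08-10"),
        (8, "name 8", "2017-08-12"),
    ],
)

query = """
WITH flagged AS (
    -- 1 when the row does not follow the previous day (or has no previous row)
    SELECT id, name, created_at,
           CASE WHEN julianday(created_at)
                     - julianday(LAG(created_at) OVER (ORDER BY id)) = 1
                THEN 0 ELSE 1 END AS is_start
    FROM my_table
),
grouped AS (
    -- running sum of the flags gives one group number per run
    SELECT id, name, created_at,
           SUM(is_start) OVER (ORDER BY id) AS grp
    FROM flagged
)
SELECT id, name, created_at,
       CASE WHEN COUNT(*) OVER (PARTITION BY grp) > 1
            THEN ROW_NUMBER() OVER (PARTITION BY grp ORDER BY id)
       END AS days_on
FROM grouped
ORDER BY id
"""
rows = conn.execute(query).fetchall()
days_on = [row[3] for row in rows]
print(days_on)
```

Runs of length one get NULL (None in Python) because of the COUNT(*) > 1 guard, which matches the last two rows of the desired output.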

Add a column with the max value of the group

I want to add an extra column, where the max values of each group (ID) will appear.
Here is how the table looks:
select ID, VALUE from mytable
ID VALUE
1 4
1 1
1 7
2 2
2 5
3 7
3 3
Here is the result I want to get:
ID VALUE max_values
1 4 7
1 1 7
1 7 7
2 2 5
2 5 5
3 7 7
3 3 7
Thank you for your help in advance!
Your previous questions indicate that you are using SQL Server, in which case you can use window functions:
SELECT ID,
Value,
MaxValue = MAX(Value) OVER(PARTITION BY ID)
FROM mytable;
Based on your comment on another answer about first summing value, you may need to use a subquery to actually get this:
SELECT ID,
Date,
Value,
MaxValue = MAX(Value) OVER(PARTITION BY ID)
FROM ( SELECT ID, Date, Value = SUM(Value)
FROM mytable
GROUP BY ID, Date
) AS t;
There is no need to use GROUP BY in subselect.
select ID, VALUE,
(select MAX(VALUE) from mytable where ID = t.ID) as MaxValue
from mytable t
Use this query.
SELECT ID
,value
,(
SELECT MAX(VALUE)
FROM GetMaxValue gmv
WHERE gmv.ID = gmv1.ID
GROUP BY ID
) as max_value
FROM GetMaxValue gmv1
ORDER BY ID
Try it with a sub select and group by, then grab the MAX of this group:
select
ID,
VALUE,
(select MAX(VALUE)
from mytable
group by ID
having ID = t.ID
) as max_values
from mytable t
Edit:
I built a SQL fiddle, which shows that my solution works, but also VDohnal is correct and doesn't need the group by, so I'll upvote his answer.
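For comparison, the window-function and correlated-subquery approaches can be run side by side on the sample data. This is a minimal sketch in SQLite (3.25 or later) through Python's sqlite3 module, which is my stand-in for SQL Server; the alias syntax is changed from MaxValue = ... to ... AS max_values because SQLite does not accept the former.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; mytable matches the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 4), (1, 1), (1, 7), (2, 2), (2, 5), (3, 7), (3, 3)],
)

# Window function: one pass, the max is computed per partition.
window_rows = conn.execute("""
    SELECT id, value,
           MAX(value) OVER (PARTITION BY id) AS max_values
    FROM mytable
    ORDER BY id, value
""").fetchall()

# Correlated subquery: no GROUP BY needed inside, as noted above.
subquery_rows = conn.execute("""
    SELECT id, value,
           (SELECT MAX(value) FROM mytable WHERE id = t.id) AS max_values
    FROM mytable AS t
    ORDER BY id, value
""").fetchall()

print(window_rows)
print(window_rows == subquery_rows)
```

Both forms return the same rows here; the ORDER BY id, value is only there to make the output order deterministic.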