I have a table that looks like the following:
Accountno | MonthPeriod | YearPeriod | Value | StartYear | StartMonth | EndYear | Endmonth
---------- ------------- ------------ ------- ----------- ------------ --------- --------
210 4 2015 100 2015 1 2016 5
210 2 2016 200 2015 1 2016 5
300 4 2015 50 2015 1 2016 5
300 2 2016 100 2015 1 2016 5
I am looking for a way to fill in the missing months and years between the startmonth/startyear and endmonth/endyear, where the values for the newly populated entries will be of the previously populated values - populating the values backwards if that makes any sense? And for the entries towards the endyear/endmonth the accountno's last value (As the desired output table below)
Accountno | MonthPeriod | YearPeriod | Value | StartYear | StartMonth | EndYear | Endmonth
---------- ------------- ------------ ------- ----------- ------------ --------- --------
210 1 2015 100 2015 1 2016 5
210 2 2015 100 2015 1 2016 5
210 3 2015 100 2015 1 2016 5
210 4 2015 200 2015 1 2016 5
210 5 2015 200 2015 1 2016 5
210 6 2015 200 2015 1 2016 5
210 7 2015 200 2015 1 2016 5
210 8 2015 200 2015 1 2016 5
210 9 2015 200 2015 1 2016 5
210 10 2015 200 2015 1 2016 5
210 11 2015 200 2015 1 2016 5
210 12 2015 200 2015 1 2016 5
210 1 2016 200 2015 1 2016 5
210 2 2016 200 2015 1 2016 5
210 3 2016 200 2015 1 2016 5
210 4 2016 200 2015 1 2016 5
210 5 2016 200 2015 1 2016 5
300 1 2015 50 2015 1 2016 5
300 2 2015 50 2015 1 2016 5
300 3 2015 50 2015 1 2016 5
300 4 2015 50 2015 1 2016 5
300 5 2015 100 2015 1 2016 5
300 6 2015 100 2015 1 2016 5
300 7 2015 100 2015 1 2016 5
300 8 2015 100 2015 1 2016 5
300 9 2015 100 2015 1 2016 5
300 10 2015 100 2015 1 2016 5
300 11 2015 100 2015 1 2016 5
300 12 2015 100 2015 1 2016 5
300 1 2016 100 2015 1 2016 5
300 2 2016 100 2015 1 2016 5
300 3 2016 100 2015 1 2016 5
300 4 2016 100 2015 1 2016 5
300 5 2016 100 2015 1 2016 5
I tried manipulating the following code for my situation but did not exceed:
CREATE TABLE TEST( Month tinyint, Year int, Value int)
INSERT INTO TEST(Month, Year, Value)
VALUES
(1,2013,100),
(4,2013,101),
(8,2013,102),
(2,2014,103),
(4,2014,104)
DECLARE #Months Table(Month tinyint)
Insert into #Months(Month)Values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12);
With tblValues as (
select Rank() Over (ORDER BY y.Year, m.Month) as [Rank],
m.Month,
y.Year,
t.Value
from #Months m
CROSS JOIN ( Select Distinct Year from Test ) y
LEFT JOIN Test t on t.Month = m.Month and t.Year = y.Year
)
Select t.Month, t.Year, COALESCE(t.Value, t1.Value) as Value
from tblValues t
left join tblValues t1 on t1.Rank = (
Select Max(tmax.Rank)
From tblValues tmax
Where tmax.Rank < t.Rank AND tmax.Value is not null)
Order by t.Year, t.Month
Any assistance will be greatly appreciated!
Related
I have a Table called TaxAmount. It has 3 columns(ID, Year, Amount). refer the below image.
I want to divide each row into 12 months. I attached a sample image below.
I'm new in Oracle side. please help me to write a Oracle Query to display the above result.
I tried ROWNUM. But No luck.
Here's one option:
SQL> select id, year, column_value as month, amount
2 from taxamount cross join
3 table(cast(multiset(select level from dual
4 connect by level <= 12
5 ) as sys.odcinumberlist))
6 order by id, year, month;
ID YEAR MONTH AMOUNT
---------- ---------- ---------- ----------
1 2022 1 100
1 2022 2 100
1 2022 3 100
1 2022 4 100
1 2022 5 100
1 2022 6 100
1 2022 7 100
1 2022 8 100
1 2022 9 100
1 2022 10 100
1 2022 11 100
1 2022 12 100
2 2022 1 200
2 2022 2 200
2 2022 3 200
2 2022 4 200
2 2022 5 200
2 2022 6 200
2 2022 7 200
2 2022 8 200
2 2022 9 200
2 2022 10 200
2 2022 11 200
2 2022 12 200
3 2022 1 150
3 2022 2 150
3 2022 3 150
3 2022 4 150
3 2022 5 150
3 2022 6 150
3 2022 7 150
3 2022 8 150
3 2022 9 150
3 2022 10 150
3 2022 11 150
3 2022 12 150
36 rows selected.
SQL>
While transposing single columns is pretty straight forward I need to transpose a large amount of data with 3 sets of , 10+ related columns needed to be transposed.
create table test
(month int,year int,po1 int,po2 int,ro1 int,ro2 int,mo1 int,mo2 int, mo3 int);
insert into test
values
(5,2013,100,20,10,1,3,4,5),(4,2014,200,30,20,2,4,5,6),(6,2015,200,80,30,3,5,6,7) ;
select * FROM test;
gives
month
year
po1
po2
ro1
ro2
mo1
mo2
mo3
5
2013
100
20
10
1
3
4
5
4
2014
200
30
20
2
4
5
6
6
2015
200
80
30
3
5
6
7
Transposing using UNPIVOT
select
month, year,
PO, RO, MO
from ( SELECT * from test) src
unpivot
( PO for Description in (po1, po2))unpiv1
unpivot
(RO for Description1 in (ro1, ro2)) unpiv2
unpivot
(MO for Description2 in (mo1, mo2, mo3)) unpiv3
order by year
Gives me this
month
year
PO
RO
MO
5
2013
100
10
3
5
2013
100
10
4
5
2013
100
10
5
5
2013
100
1
3
5
2013
100
1
4
5
2013
100
1
5
5
2013
20
10
3
5
2013
20
10
4
5
2013
20
10
5
5
2013
20
1
3
5
2013
20
1
4
5
2013
20
1
5
4
2014
200
20
4
4
2014
200
20
5
4
2014
200
20
6
4
2014
200
2
4
4
2014
200
2
5
4
2014
200
2
6
4
2014
30
20
4
4
2014
30
20
5
4
2014
30
20
6
4
2014
30
2
4
4
2014
30
2
5
4
2014
30
2
6
6
2015
200
30
5
6
2015
200
30
6
6
2015
200
30
7
6
2015
200
3
5
6
2015
200
3
6
6
2015
200
3
7
6
2015
80
30
5
6
2015
80
30
6
6
2015
80
30
7
6
2015
80
3
5
6
2015
80
3
6
6
2015
80
3
7
I will like to turn it to something like this. Is that possible?
month
year
PO
RO
MO
5
2013
100
10
3
5
2013
20
1
4
5
2013
0
0
5
4
2014
200
20
4
4
2014
30
2
5
4
2014
0
0
6
6
2015
200
30
5
6
2015
80
3
6
6
2015
0
0
7
Maybe use a query like below which creates rows as per your design using CROSS APPLY
select month,year,po,ro,mo from
test cross apply
(values (po1,ro1,mo1), (po2,ro2,mo2),(0,0,mo3))v(po,ro,mo)
see demo here
Unpivot acts similar as union,Use union all in your case
SELECT month,
year,
po1 AS PO,
ro1 AS RO,
mo1 AS MO
FROM test
UNION ALL
SELECT month,
year,
po2,
ro2,
mo2
FROM test
UNION ALL
SELECT month,
year,
0,
0,
mo2
FROM test
I have a data frame as shown below. which is a sales data of two health care product starting from December 2016 to November 2018.
product price sale_date discount
A 50 2016-12-01 5
A 50 2017-01-03 4
B 200 2016-12-24 10
A 50 2017-01-18 3
B 200 2017-01-28 15
A 50 2017-01-18 6
B 200 2017-01-28 20
A 50 2017-04-18 6
B 200 2017-12-08 25
A 50 2017-11-18 6
B 200 2017-08-21 20
B 200 2017-12-28 30
A 50 2018-03-18 10
B 300 2018-06-08 45
B 300 2018-09-20 50
A 50 2018-11-18 8
B 300 2018-11-28 35
From the above I would like to prepare below data frame
Expected Output:
product year number_of_months total_price total_discount number_of_sales
A 2016 1 50 5 1
B 2016 1 200 10 1
A 2017 12 250 25 5
B 2017 12 1000 110 5
A 2018 11 100 18 2
B 2018 11 900 130 3
Note: Please note that the data starts from Dec 2016 to Nov 2018.
So number of months in 2016 is 1, in 2017 we have full data so 12 months and 2018 we have 11 months.
First aggregate sum by years and product and then create new column for counts by months by DataFrame.insert and Series.map:
df1 =(df.groupby(['product',df['sale_date'].dt.year], sort=False).sum().add_prefix('total_')
.reset_index())
df1.insert(2,'number_of_months', df1['sale_date'].map({2016:1, 2017:12, 2018:11}))
print (df1)
product sale_date number_of_months total_price total_discount
0 A 2016 1 50 5
1 A 2017 12 250 25
2 B 2016 1 200 10
3 B 2017 12 1000 110
4 A 2018 11 100 18
5 B 2018 11 900 130
If want dynamic dictionary by minumal and maximal datetimes use:
s = pd.date_range(df['sale_date'].min(), df['sale_date'].max(), freq='MS')
d = s.year.value_counts().to_dict()
print (d)
{2017: 12, 2018: 11, 2016: 1}
df1 = (df.groupby(['product',df['sale_date'].dt.year], sort=False).sum().add_prefix('total_')
.reset_index())
df1.insert(2,'number_of_months', df1['sale_date'].map(d))
print (df1)
product sale_date number_of_months total_price total_discount
0 A 2016 1 50 5
1 A 2017 12 250 25
2 B 2016 1 200 10
3 B 2017 12 1000 110
4 A 2018 11 100 18
5 B 2018 11 900 130
For ploting is used DataFrame.set_index with DataFrame.unstack:
df2 = (df1.set_index(['sale_date','product'])[['total_price','total_discount']]
.unstack(fill_value=0))
df2.columns = df2.columns.map('_'.join)
print (df2)
total_price_A total_price_B total_discount_A total_discount_B
sale_date
2016 50 200 5 10
2017 250 1000 25 110
2018 100 900 18 130
df2.plot()
EDIT:
df1 = (df.groupby(['product',df['sale_date'].dt.year], sort=False)
.agg( total_price=('price','sum'),
total_discount=('discount','sum'),
number_of_sales=('discount','size'))
.reset_index())
df1.insert(2,'number_of_months', df1['sale_date'].map({2016:1, 2017:12, 2018:11}))
print (df1)
product sale_date number_of_months total_price total_discount \
0 A 2016 NaN 50 5
1 A 2017 NaN 250 25
2 B 2016 NaN 200 10
3 B 2017 NaN 1000 110
4 A 2018 NaN 100 18
5 B 2018 NaN 900 130
number_of_sales
0 1
1 5
2 1
3 5
4 2
5 3
Table 1:
ID Year Month
-----------------
1 2018 1
2 2018 1
3 2018 1
1 2018 2
2 2018 2
3 2018 2
Table 2:
ID Year Jan Feb Mar
------------------------
1 2018 100 200 300
2 2018 200 400 300
3 2018 200 500 700
How can I join these two tables even though they are laid out differently?
I was exploring a case join but that doesn't seem to be exactly what I need.
I'd like my output to be:
ID Year Month Data
1 2018 1 100
2 2018 1 200
3 2018 1 200
1 2018 2 200
2 2018 2 400
3 2018 2 500
1 2018 3 300
2 2018 3 300
3 2018 3 700
So, firstly we get TableB in the right format:
SELECT B.ID, B.Year, B.MonthValue
INTO TableB_New
FROM TableB T
UNPIVOT
(
MonthValue FOR Month IN (Jan, Feb, Mar)
) AS B
And then you do the join. Good Luck!
How can I select data from this table
Yr Month El1 Value
---- ------ ------- ------
2017 2 AT010 100
2017 3 AT010 100
2017 4 AT010 50
2017 5 AT010 150
2017 3 BE020 10
.......
and insert it into another table in the following way
Yr Month El1 Value
---- ------ ------- ------
2017 0 AT010 0
2017 1 AT010 0
2017 2 AT010 100
2017 3 AT010 200
2017 4 AT010 250
2017 5 AT010 400
2017 6 AT010 400
2017 7 AT010 400
2017 8 AT010 400
2017 9 AT010 400
2017 10 AT010 400
2017 11 AT010 400
2017 12 AT010 400
2017 0 BE020 0
2017 1 BE020 0
2017 2 BE020 0
2017 3 BE020 10
2017 4 BE020 10
2017 5 BE020 10
2017 6 BE020 10
2017 7 BE020 10
2017 8 BE020 10
2017 9 BE020 10
2017 10 BE020 10
2017 11 BE020 10
2017 12 BE020 10
.......
I am trying to insert missing months from 0 to 12 and calculate running total at the same time.
I used this suggestion for running total calculation; however, I can't figure out how to enter missing month.
This code will be used in the stored procedure for a daily ETL job.
Hmmm . . . Generate the rows and then use left join or outer apply to bring in the values. Here is one way:
with yyyymm as (
select 2017 as yr, 1 as mom
union all
select yr, mon + 1
from yyyymm
where mon + 1 <= 12
)
select yyyymm.yr, yyyymm.mon, coalesce(e.el1, 0) as el1
from yyyymm cross join
(select distinct el1 from t) e outer apply
(select sum(t.value)
from t
where t.el1 = e.el1 and
t.yr = yyyy.yr and
t.month <= yyyy.mon
) tt
order by e.el1, yyyy.yr, yyyy.month;