While transposing a single column is pretty straightforward, I need to transpose a large amount of data with 3 sets of 10+ related columns each.
create table test
(month int,year int,po1 int,po2 int,ro1 int,ro2 int,mo1 int,mo2 int, mo3 int);
insert into test
values
(5,2013,100,20,10,1,3,4,5),(4,2014,200,30,20,2,4,5,6),(6,2015,200,80,30,3,5,6,7) ;
select * FROM test;
gives
month  year  po1  po2  ro1  ro2  mo1  mo2  mo3
5      2013  100  20   10   1    3    4    5
4      2014  200  30   20   2    4    5    6
6      2015  200  80   30   3    5    6    7
Transposing using UNPIVOT
select
month, year,
PO, RO, MO
from ( SELECT * from test) src
unpivot
( PO for Description in (po1, po2))unpiv1
unpivot
(RO for Description1 in (ro1, ro2)) unpiv2
unpivot
(MO for Description2 in (mo1, mo2, mo3)) unpiv3
order by year
Gives me this
month  year  PO   RO  MO
5      2013  100  10  3
5      2013  100  10  4
5      2013  100  10  5
5      2013  100  1   3
5      2013  100  1   4
5      2013  100  1   5
5      2013  20   10  3
5      2013  20   10  4
5      2013  20   10  5
5      2013  20   1   3
5      2013  20   1   4
5      2013  20   1   5
4      2014  200  20  4
4      2014  200  20  5
4      2014  200  20  6
4      2014  200  2   4
4      2014  200  2   5
4      2014  200  2   6
4      2014  30   20  4
4      2014  30   20  5
4      2014  30   20  6
4      2014  30   2   4
4      2014  30   2   5
4      2014  30   2   6
6      2015  200  30  5
6      2015  200  30  6
6      2015  200  30  7
6      2015  200  3   5
6      2015  200  3   6
6      2015  200  3   7
6      2015  80   30  5
6      2015  80   30  6
6      2015  80   30  7
6      2015  80   3   5
6      2015  80   3   6
6      2015  80   3   7
I would like to turn it into something like this. Is that possible?
month  year  PO   RO  MO
5      2013  100  10  3
5      2013  20   1   4
5      2013  0    0   5
4      2014  200  20  4
4      2014  30   2   5
4      2014  0    0   6
6      2015  200  30  5
6      2015  80   3   6
6      2015  0    0   7
You could use a query like the one below, which builds the rows per your design using CROSS APPLY:
select month,year,po,ro,mo from
test cross apply
(values (po1,ro1,mo1), (po2,ro2,mo2),(0,0,mo3))v(po,ro,mo)
UNPIVOT acts much like a UNION, so use UNION ALL in your case:
SELECT month,
year,
po1 AS PO,
ro1 AS RO,
mo1 AS MO
FROM test
UNION ALL
SELECT month,
year,
po2,
ro2,
mo2
FROM test
UNION ALL
SELECT month,
year,
0,
0,
mo3
FROM test
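To check that the UNION ALL shape yields the requested layout, here is a minimal sketch that runs it from Python against an in-memory SQLite database. SQLite is only a stand-in here (an assumption; the question targets a database that supports UNPIVOT and CROSS APPLY), and the third branch reads mo3 so that it matches the desired output. The ORDER BY is added just to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in engine for the portable
# UNION ALL query; UNPIVOT and CROSS APPLY are not available in SQLite.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (month int, year int, po1 int, po2 int, "
    "ro1 int, ro2 int, mo1 int, mo2 int, mo3 int)"
)
conn.executemany(
    "insert into test values (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (5, 2013, 100, 20, 10, 1, 3, 4, 5),
        (4, 2014, 200, 30, 20, 2, 4, 5, 6),
        (6, 2015, 200, 80, 30, 3, 5, 6, 7),
    ],
)

# One SELECT per "slot": (po1, ro1, mo1), (po2, ro2, mo2), (0, 0, mo3).
query = """
SELECT month, year, po1 AS PO, ro1 AS RO, mo1 AS MO FROM test
UNION ALL
SELECT month, year, po2, ro2, mo2 FROM test
UNION ALL
SELECT month, year, 0, 0, mo3 FROM test
ORDER BY year, MO
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each source row contributes exactly three output rows, instead of the 2 x 2 x 3 = 12 combinations produced by chaining three UNPIVOTs.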
I have a pandas DataFrame which has the following columns:
Day, Month, Year, City, Temperature.
I would like to add a new column that holds the average (mean) temperature on the same date (day/month) across all previous years.
Can someone please assist?
Thanks :-)
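Try:
import numpy as np
import pandas as pd

# Sample data: one random temperature per day for a single city
dti = pd.date_range('2000-1-1', '2021-12-1', freq='D')
temp = np.random.randint(10, 20, len(dti))
df = pd.DataFrame({'Day': dti.day, 'Month': dti.month, 'Year': dti.year,
                   'City': 'Nice', 'Temperature': temp})
# Expanding mean per (City, Month, Day), accumulated across years
out = df.set_index('Year').groupby(['City', 'Month', 'Day']) \
        .expanding()['Temperature'].mean().reset_index()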
Output:
>>> out
Day Month Year City Temperature
0 1 1 2000 Nice 12.000000
1 1 1 2001 Nice 12.000000
2 1 1 2002 Nice 11.333333
3 1 1 2003 Nice 12.250000
4 1 1 2004 Nice 11.800000
... ... ... ... ... ...
8001 31 12 2016 Nice 15.647059
8002 31 12 2017 Nice 15.555556
8003 31 12 2018 Nice 15.631579
8004 31 12 2019 Nice 15.750000
8005 31 12 2020 Nice 15.666667
[8006 rows x 5 columns]
Focus on 1st January of the dataset:
>>> df[df['Day'].eq(1) & df['Month'].eq(1)]
Day Month Year City Temperature # Mean
0 1 1 2000 Nice 12 # 12
366 1 1 2001 Nice 12 # 12
731 1 1 2002 Nice 10 # 11.33
1096 1 1 2003 Nice 15 # 12.25
1461 1 1 2004 Nice 10 # 11.80
1827 1 1 2005 Nice 12 # and so on
2192 1 1 2006 Nice 17
2557 1 1 2007 Nice 16
2922 1 1 2008 Nice 19
3288 1 1 2009 Nice 12
3653 1 1 2010 Nice 10
4018 1 1 2011 Nice 16
4383 1 1 2012 Nice 13
4749 1 1 2013 Nice 15
5114 1 1 2014 Nice 14
5479 1 1 2015 Nice 13
5844 1 1 2016 Nice 15
6210 1 1 2017 Nice 13
6575 1 1 2018 Nice 15
6940 1 1 2019 Nice 18
7305 1 1 2020 Nice 11
7671 1 1 2021 Nice 14
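Note that the expanding mean above includes the current year's value. If the current year should be left out, so that only previous years count, one possible variant is to shift each group by one row before expanding. This is a sketch under that assumption, using the first five 1st-January temperatures from the listing above; the column name PrevYearsMean is made up for illustration.

```python
import pandas as pd

# First five 1 January readings from the listing above.
df = pd.DataFrame({
    'Day': [1] * 5,
    'Month': [1] * 5,
    'Year': [2000, 2001, 2002, 2003, 2004],
    'City': ['Nice'] * 5,
    'Temperature': [12, 12, 10, 15, 10],
})

# Sort by year so that "previous rows" within a group means "previous years".
df = df.sort_values('Year')

# shift() drops the current year from each window; expanding().mean() then
# averages everything before it. The first year of each group has no history
# and stays NaN.
df['PrevYearsMean'] = (
    df.groupby(['City', 'Month', 'Day'])['Temperature']
      .transform(lambda s: s.shift().expanding().mean())
)
print(df)
```

Here 2003 gets the mean of 12, 12 and 10 (about 11.33) rather than 12.25, because its own reading of 15 is excluded.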
I need to create a custom quarter calculator that always starts from the previous month, no matter which month or year we are in, and counts back to get quarters. Previous years' quarters are to be numbered 5, 6, etc.
So the goal is to move quarter grouping one month back.
Assume we run query on December 11th, result should be:
YEAR MNTH QTR QTR_ALT
2017 1 1 12
2017 2 1 12
2017 3 1 11
2017 4 2 11
2017 5 2 11
2017 6 2 10
2017 7 3 10
2017 8 3 10
2017 9 3 9
2017 10 4 9
2017 11 4 9
2017 12 4 8
2018 1 1 8
2018 2 1 8
2018 3 1 7
2018 4 2 7
2018 5 2 7
2018 6 2 6
2018 7 3 6
2018 8 3 6
2018 9 3 5
2018 10 4 5
2018 11 4 5
2018 12 4 1
2019 1 1 1
2019 2 1 1
2019 3 1 2
2019 4 2 2
2019 5 2 2
2019 6 2 3
2019 7 3 3
2019 8 3 3
2019 9 3 4
2019 10 4 4
2019 11 4 4
2019 12 4 THIS IS SKIPPED
The starting point is to exclude the current month, so that the data ends on the previous month's last day:
SELECT DISTINCT
YEAR,
MNTH,
QTR
FROM TABLE
WHERE DATA BETWEEN
(SELECT DATE_TRUNC(YEAR,ADD_MONTHS(CURRENT_DATE, -24))) AND
(SELECT DATE_TRUNC(MONTH,CURRENT_DATE)-1)
ORDER BY YEAR, MNTH, QTR
The following gets you all the dates you need, with the extra columns.
select to_char(add_months(a.dt, -b.y), 'YYYY') as year,
to_char(add_months(a.dt, -b.y), 'MM') as month,
ceil(to_number(to_char(add_months(a.dt, -b.y), 'MM')) / 3) as qtr,
ceil(b.y/3) as alt_qtr
from
(select trunc(sysdate, 'MONTH') as dt from dual) a,
(select rownum as y from dual connect by level <= 24) b;
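The same counting logic can be sketched outside the database. The helper below is hypothetical (the name months_back and its parameters are made up for illustration) and mirrors the query above: step back k = 1..24 months from the run date's month, take the calendar quarter from the month number, and number the alternative quarter as ceil(k / 3), so the three most recent complete months get 1. This follows the answer's numbering, not necessarily the exact QTR_ALT numbering in the question's table.

```python
from datetime import date


def months_back(run_date, n_months=24):
    """Return (year, month, qtr, alt_qtr) for the n_months before run_date's month.

    Hypothetical helper: qtr is the calendar quarter, alt_qtr counts groups of
    three months backwards starting from the previous month.
    """
    rows = []
    for k in range(1, n_months + 1):
        # Month index counted from year 0, moved back by k months.
        idx = run_date.year * 12 + (run_date.month - 1) - k
        year, month0 = divmod(idx, 12)
        month = month0 + 1
        qtr = (month + 2) // 3   # integer form of ceil(month / 3)
        alt_qtr = (k + 2) // 3   # integer form of ceil(k / 3)
        rows.append((year, month, qtr, alt_qtr))
    return rows


rows = months_back(date(2019, 12, 11))
for row in rows[:6]:
    print(row)
```

For a run on 11 December 2019, the first row is November 2019 with alt_qtr 1, and December 2019 itself never appears.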
I have the DataFrame below and I am calculating the difference from the previous value using diff(periods=1), but that leaves the first value of each group as NaN. Is there any way to fill that value?
example:
df['cal_val'] = df.groupby('year')['val'].diff(periods=1)
current output:
date year val cal_val
1/3/10 2010 12 NaN
1/6/10 2010 15 3
1/9/10 2010 18 3
1/12/10 2010 20 2
1/3/11 2011 10 NaN
1/6/11 2011 12 2
1/9/11 2011 15 3
1/12/11 2011 18 3
expected output:
date year val cal_val
1/3/10 2010 12 12
1/6/10 2010 15 3
1/9/10 2010 18 3
1/12/10 2010 20 2
1/3/11 2011 10 10
1/6/11 2011 12 2
1/9/11 2011 15 3
1/12/11 2011 18 3
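One possible way to get the expected output is to fill the NaN left by diff with the row's own val. This is a minimal sketch, assuming (as the expected output suggests) that the first row of each year should simply carry its val; the final astype(int) is optional and only restores integer display.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['1/3/10', '1/6/10', '1/9/10', '1/12/10',
             '1/3/11', '1/6/11', '1/9/11', '1/12/11'],
    'year': [2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011],
    'val': [12, 15, 18, 20, 10, 12, 15, 18],
})

# diff() leaves NaN on the first row of each year; fillna(df['val']) replaces
# those NaNs with the value from the same row, aligned on the index.
df['cal_val'] = (
    df.groupby('year')['val']
      .diff(periods=1)
      .fillna(df['val'])
      .astype(int)
)
print(df)
```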
If I have a DataFrame (df) such as:
Year Rate
2001 10
2001 3
2001 5
2001 3
2001 6
2002 2
2002 7
2002 4
2002 9
2002 8
... ...
2018 8
2018 6
2018 4
2018 6
2018 5
How do I get a DataFrame that shows only the first 2 rows of each year, like:
Year Rate
2001 10
2001 3
2002 2
2002 7
... ...
2018 8
2018 6
Thanks
Use GroupBy.head:
df1 = df.groupby('Year').head(2)
print (df1)
Year Rate
0 2001 10
1 2001 3
5 2002 2
6 2002 7
10 2018 8
11 2018 6