Sustainability : multi-index (column) report using pandas

multi-index-09-03-2020 <!-- Custom stylesheet, it must be in the same directory as the html file

In [1]:

import pandas as pd
import numpy as np


#1 input file
input_file = r"D:\Python Scripts\multi_index_3.xlsx"

df1=pd.read_excel(input_file)
df1['Focus_area']=df1['Focus_area'].astype('category')
table0=df1
table0
#table0 is shown below.

Out[1]:

	Focus_area	City	As_on	Spend
0	Education	Mumbai	2019-01-01	1000000
1	Healthcare	Pune	2019-03-31	20000
2	Environment	Chennai	2018-03-31	100000
3	Disaster Relief	Bengaluru	2017-12-01	220000
4	Vocational Skills	Varanasi	2018-09-30	540000
5	Education	Pune	2019-06-30	3000000
6	Environment	Pune	2019-03-31	666000

In [67]:

df1['As_on']=pd.to_datetime(df1.As_on)

In [68]:

df1.dtypes

Out[68]:

Focus_area          category
City                  object
As_on         datetime64[ns]
Spend                  int64
dtype: object

In [69]:

df1 = df1.set_index(['Focus_area','City'])

In [70]:

df1.dtypes

Out[70]:

As_on    datetime64[ns]
Spend             int64
dtype: object

In [4]:

df1.groupby(df1['As_on'])['Spend'].sum()

Out[4]:

As_on
2017-12-01     220000
2018-03-31     100000
2018-09-30     540000
2019-01-01    1000000
2019-03-31     686000
2019-06-30    3000000
Name: Spend, dtype: int64

In [5]:

df1['Quarter'] = pd.PeriodIndex(df1['As_on'], freq='Q-MAR').strftime('Q%q')#Adding quarter to the dataframe.

Q-DEC specifies quarterly periods whose last quarter ends on the last day in December.¶

Q-MAR specifies quarterly periods whose last quarter ends on the last day in March.¶

In [ ]:

df1

In [7]:

df1['Year']=df1.As_on.dt.year#Adding Year to the dataframe.
df1

Out[7]:

	Focus_area	City	As_on	Spend	Quarter	Year
0	Education	Mumbai	2019-01-01	1000000	Q4	2019
1	Healthcare	Pune	2019-03-31	20000	Q4	2019
2	Environment	Chennai	2018-03-31	100000	Q4	2018
3	Disaster Relief	Bengaluru	2017-12-01	220000	Q3	2017
4	Vocational Skills	Varanasi	2018-09-30	540000	Q2	2018
5	Education	Pune	2019-06-30	3000000	Q1	2019
6	Environment	Pune	2019-03-31	666000	Q4	2019

In [12]:

df2 = pd.pivot_table(df1,index=['City','Focus_area'],columns=['Year','Quarter'],values=['Spend'],fill_value=0)#works
#df2.query('City==["Pune","Mumbai"]')#works

In [13]:

df2

Out[13]:

		Spend
	Year	2017	2018	2019
	Quarter	Q3	Q2	Q4	Q1	Q4
City	Focus_area
Bengaluru	Disaster Relief	220000	0	0	0	0
Chennai	Environment	0	0	100000	0	0
Mumbai	Education	0	0	0	0	1000000
Pune	Education	0	0	0	3000000	0
Environment	0	0	0	0	666000
Healthcare	0	0	0	0	20000
Varanasi	Vocational Skills	0	540000	0	0	0

In [14]:

df2 = pd.pivot_table(df1,index=['Year','Quarter'],columns=['City','Focus_area'],values=['Spend'],fill_value=0)#works
df2

Out[14]:

		Spend
	City	Bengaluru	Chennai	Mumbai	Pune	Varanasi
	Focus_area	Disaster Relief	Environment	Education	Education	Environment	Healthcare	Vocational Skills
Year	Quarter
2017	Q3	220000	0	0	0	0	0	0
2018	Q2	0	0	0	0	0	0	540000
Q4	0	100000	0	0	0	0	0
2019	Q1	0	0	0	3000000	0	0	0
Q4	0	0	1000000	0	666000	20000	0

In [ ]:

df2.stack(level=0)#works
#level=0 is the outermost level i.e. spend
#level=1 is the next level i.e. city
#level=2 is the innermost level i.e. Focus_area

In [22]:

df2.unstack(level=1)
#level=0 is the outermost level i.e. Year
#level=1 is the next level i.e. quarter

Out[22]:

	Spend
City	Bengaluru	Chennai	Mumbai	...	Pune	Varanasi
Focus_area	Disaster Relief	Environment	Education	...	Environment	Healthcare	Vocational Skills
Quarter	Q1	Q2	Q3	Q4	Q1	Q2	Q3	Q4	Q1	Q2	...	Q3	Q4	Q1	Q2	Q3	Q4	Q1	Q2	Q3	Q4
Year
2017	NaN	NaN	220000.0	NaN	NaN	NaN	0.0	NaN	NaN	NaN	...	0.0	NaN	NaN	NaN	0.0	NaN	NaN	NaN	0.0	NaN
2018	NaN	0.0	NaN	0.0	NaN	0.0	NaN	100000.0	NaN	0.0	...	NaN	0.0	NaN	0.0	NaN	0.0	NaN	540000.0	NaN	0.0
2019	0.0	NaN	NaN	0.0	0.0	NaN	NaN	0.0	0.0	NaN	...	NaN	666000.0	0.0	NaN	NaN	20000.0	0.0	NaN	NaN	0.0

3 rows × 28 columns

In [ ]:

-->

Sustainability

Pages

Thursday, March 12, 2020

multi-index (column) report using pandas

Q-DEC specifies quarterly periods whose last quarter ends on the last day in December.¶

Q-MAR specifies quarterly periods whose last quarter ends on the last day in March.¶

No comments: