Sunday, January 19, 2020

ET500-Raw data to pandas dataframe to charts

Recently, the Economic Times published the ET500, its list of the top 500 companies in India.

I copied the data for the top 25 companies from the website. It came out as continuous text and looked like this.

There were no delimiters. Using Python, I imported the file and used the re module to clean and separate the elements in each line. I then loaded these into a pandas DataFrame. It looked like this.
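
As a rough illustration of this step, here is a minimal sketch of splitting undelimited lines with re and loading them into pandas. The raw layout is not reproduced in this post, so the format below (rank, company name, revenue and PAT run together, each figure with two decimals) is an assumption, and the company names and numbers are placeholders, not ET500 data. The helper parse_line and the column names are hypothetical too.

```python
import re

import pandas as pd

# Hypothetical raw lines: rank, name, revenue and PAT with no delimiters.
# Names and figures are placeholders, not actual ET500 data.
raw_lines = [
    "1Company A5,000.00300.00",
    "2Company B & Sons4,200.50250.25",
    "10Company J1,200.0090.50",
]

# Assumed layout: digits (rank), non-digits (name), then two figures
# that each end in exactly two decimal places.
LINE_RE = re.compile(r"^(\d+)(\D+?)(\d[\d,]*\.\d{2})(\d[\d,]*\.\d{2})$")


def parse_line(line):
    """Split one undelimited line into a dict of typed fields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Line does not fit the assumed layout: {line!r}")
    rank, name, revenue, pat = match.groups()
    return {
        "rank": int(rank),
        "company": name.strip(),
        "revenue": float(revenue.replace(",", "")),  # drop thousands separators
        "pat": float(pat.replace(",", "")),
    }


df = pd.DataFrame([parse_line(line) for line in raw_lines])
print(df)
```

The two-decimal assumption is what makes the split between the two figures unambiguous. If the real text had no such cue, the pattern would need some other anchor, such as fixed widths.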


The next step was to plot it. The plot below was made with the seaborn library.
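
For readers who want to try something similar, a sketch of a seaborn scatter plot of revenue against PAT follows. The plot type, column names, axis labels and output file name are my assumptions for illustration, and the values are placeholders rather than the ET500 figures.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the script also runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder data frame with the assumed columns; not actual ET500 figures.
df = pd.DataFrame(
    {
        "company": ["Company A", "Company B", "Company C", "Company D"],
        "revenue": [500000.0, 120000.0, 90000.0, 300000.0],
        "pat": [32000.0, 21000.0, 9000.0, 16000.0],
    }
)

fig, ax = plt.subplots(figsize=(8, 5))
sns.scatterplot(data=df, x="pat", y="revenue", ax=ax)

# Label each point with its company name.
for row in df.itertuples():
    ax.annotate(row.company, (row.pat, row.revenue),
                textcoords="offset points", xytext=(5, 5))

ax.set_xlabel("PAT (Rs Cr)")
ax.set_ylabel("Revenue (Rs Cr)")
ax.set_title("Revenue vs PAT")
fig.savefig("revenue_vs_pat.png", dpi=150)
```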


Among these 25 companies, the top profit-making ones (PAT of Rs 30,000 Cr and above) are Reliance, ONGC and TCS. The next bracket, Rs 20,000 Cr and above but below Rs 30,000 Cr, has HDFC Bank. Indian Oil, HDFC and Infosys fall in the bracket of Rs 15,000 Cr and above but below Rs 20,000 Cr.
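
The PAT brackets above can also be assigned in pandas with pd.cut. This is only a sketch: the column names and bracket labels are my own, and the companies and values are placeholders, not the ET500 figures.

```python
import pandas as pd

# Placeholder PAT values in Rs Cr; not actual ET500 figures.
df = pd.DataFrame(
    {
        "company": ["Company A", "Company B", "Company C", "Company D", "Company E"],
        "pat": [32000.0, 30000.0, 21000.0, 15000.0, 9000.0],
    }
)

# right=False makes each bin include its lower edge and exclude its upper edge,
# which matches "Rs 20,000 Cr and above but below Rs 30,000 Cr".
bins = [float("-inf"), 15000, 20000, 30000, float("inf")]
labels = ["Below 15,000", "15,000 to 20,000", "20,000 to 30,000", "30,000 and above"]
df["pat_bracket"] = pd.cut(df["pat"], bins=bins, labels=labels, right=False)

print(df)
print(df["pat_bracket"].value_counts().sort_index())
```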

Up to about Rs 10,000 Cr of PAT, revenue and PAT appear to move together. Above that PAT level, revenue levels deviate a lot.

In my next post I will do more analysis and visualization.

1 comment:

Makarand Karkare said...

Cool work.
How do you make time for this?