Monday, August 15, 2022

Ramsar Sites in India

India, in its 75th year of Independence (15th August), achieved a milestone of 75 Ramsar sites.

It recently added 11 more wetlands to the list, bringing the total to 75 Ramsar sites covering an area of 13,26,677 hectares (13 lakh 26 thousand 677 hectares) in the country.

I have created an interactive map. Kindly visit the map at https://andybandra.github.io/ramsar_india/. You can also see and click the map shown below.

On the map you may click an icon to get the name and state of a site, and its area in hectares.

Ramsar sites are wetlands deemed to be of "international importance" under the Ramsar Convention. India is one of the Contracting Parties to the Convention, which was signed in Ramsar, Iran, in 1971. India signed it on 1st February 1982.

What are Ramsar Wetland Sites

The Ramsar Convention is an international treaty for the conservation and sustainable utilization of wetlands, recognizing the fundamental ecological functions of wetlands and their economic, cultural, scientific, and recreational value.

Tamil Nadu has the highest number of Ramsar sites in India, with 14 sites. The largest site is marked with a blue icon. It is the Sundarbans National Park, a national park, tiger reserve and biosphere reserve in West Bengal.
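Figures like "the state with the most sites" and "the largest site" fall out of a pandas dataframe quite easily. The snippet below is only a rough sketch of that idea. The site names, states, areas and the green default colour are made-up placeholders, not the real Ramsar list, and the column names are my own assumption.

```python
import pandas as pd

# Placeholder rows only; these are NOT real Ramsar sites or areas.
sites = pd.DataFrame({
    "name": ["Site A", "Site B", "Site C", "Site D"],
    "state": ["State X", "State X", "State Y", "State Z"],
    "area_ha": [1200.0, 450.0, 98000.0, 3100.0],
})

# Number of sites per state, and the state with the most sites.
sites_per_state = sites["state"].value_counts()
top_state = sites_per_state.idxmax()

# Row label of the largest site by area.
largest_idx = sites["area_ha"].idxmax()
largest_site = sites.loc[largest_idx, "name"]

# Give every site a default icon colour (assumed) and mark the largest in blue.
sites["icon_color"] = "green"
sites.loc[largest_idx, "icon_color"] = "blue"

total_area_ha = sites["area_ha"].sum()
print(top_state, largest_site, total_area_ha)
```

With the real list loaded into the same columns, the icon_color column could then drive the marker colours on the map.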

The list is available at https://en.wikipedia.org/wiki/List_of_Ramsar_sites_in_India.

75 Ramsar India Sites

Saturday, December 11, 2021

Kerala Day 3

Periyar Tiger Reserve

We started from Munnar. We came down one mountain range and climbed up another. I did not note down the geo locations while coming down, and hence those are missing. While we did not see tigers or elephants, we did see the following: Great cormorant, Woolly-necked stork, Sambar deer, Bengal monitor, White-throated kingfisher, Brahminy kite and Gaur.
Please see the photos on my Facebook.

The path taken

The above interactive visualization was made using Python and Folium. I had wanted to make a visualization with Folium for the last 2 years!
The folium library in turn uses the Leaflet library to render the web-page visualisation. Using Folium I do not have to worry about the HTML, CSS and JavaScript code! Plus, Python allows me to hold my data in pandas, so manipulating the data becomes convenient. Customizing markers and icons becomes easy using the pandas and folium libraries together.

Folium was not part of the standard Anaconda distribution. It had to be installed by running the following command at the Anaconda prompt.
$ conda install folium -c conda-forge
But before that I created a separate environment for folium (folium_project) by cloning the base environment. The command was
$ conda create --name folium_project --clone base

Folium allows one to leverage the power of Leaflet by writing just a few lines of code. The web-page visualization, including its code, is on my GitHub account.
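To give a feel for how few lines are needed, here is a minimal sketch of drawing a path with markers from a pandas dataframe. It is not the code from my GitHub repository. The waypoints are rough, illustrative coordinates (the middle one is entirely hypothetical), and the column names, colours and the helper names to_points and build_map are assumptions made for this example.

```python
import pandas as pd

# Illustrative waypoints only; rough coordinates, not the recorded track.
track = pd.DataFrame({
    "place": ["Munnar", "Waypoint (hypothetical)", "Periyar Tiger Reserve"],
    "lat": [10.09, 9.85, 9.60],
    "lon": [77.06, 77.15, 77.17],
})


def to_points(frame):
    """Return the track as [[lat, lon], ...], the shape folium expects."""
    return frame[["lat", "lon"]].values.tolist()


def build_map(frame):
    """Build a folium map with the path as a line and one marker per row."""
    import folium  # imported here so to_points() can be used without folium

    points = to_points(frame)
    fmap = folium.Map(location=points[0], zoom_start=9)
    # The path taken, drawn as a single line through all points.
    folium.PolyLine(points, color="blue", weight=3).add_to(fmap)
    # One clickable marker per waypoint, with the place name as popup.
    for row in frame.itertuples():
        folium.Marker(
            location=[row.lat, row.lon],
            popup=row.place,
            icon=folium.Icon(color="green", icon="info-sign"),
        ).add_to(fmap)
    return fmap
```

Calling build_map(track).save("path.html") would write a standalone web page that can be opened in a browser or hosted as is.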

Tuesday, March 10, 2020

CSR Multi-index report in python using pandas

I wanted to make a report with multi-index columns and rows using Python and pandas.

The starting point was data in an Excel file. The data was loaded into a pandas dataframe.

I wanted an output report that looked like the report shown below.

But there was one condition: the quarters of the year had to follow the Indian financial year, which starts on 1st April and ends on 31st March of the following year.

What helped me was the quarterly frequency syntax below.

Q-DEC specifies quarterly periods whose last quarter ends on the last day in December.

Q-MAR specifies quarterly periods whose last quarter ends on the last day in March.

So, I chose Q-MAR.
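The difference between the two is easy to see on a few dates. The snippet below is a minimal sketch, not the code behind the report above. The spend rows and the column names (date, category, project, amount) are imaginary and chosen only for illustration.

```python
import pandas as pd

# Imaginary spend records; amounts and projects are made up.
spend = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-04-15", "2019-08-01", "2019-12-20", "2020-03-10", "2020-04-02"]
    ),
    "category": ["Social", "Environmental", "Social", "Environmental", "Social"],
    "project": ["Education", "Tree planting", "Health camp", "Tree planting", "Education"],
    "amount": [10.0, 5.0, 7.5, 2.5, 4.0],
})

# The same date falls in different quarters under the two conventions.
calendar_q = pd.Timestamp("2020-03-10").to_period("Q-DEC")  # calendar year
fiscal_q = pd.Timestamp("2020-03-10").to_period("Q-MAR")    # April to March year

# Tag every row with its financial year and quarter (Q-MAR).
quarters = spend["date"].dt.to_period("Q-MAR")
spend["fy"] = quarters.dt.qyear  # year in which the financial year ends
spend["quarter"] = "Q" + quarters.dt.quarter.astype(str)

# Multi-index rows (category, project) and columns (fy, quarter).
report = spend.pivot_table(
    index=["category", "project"],
    columns=["fy", "quarter"],
    values="amount",
    aggfunc="sum",
    fill_value=0,
)
print(calendar_q, fiscal_q)
print(report)
```

Note that pandas labels a Q-MAR financial year by the calendar year in which it ends, so April 2019 to March 2020 shows up as 2020.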

The data shown is about spend on Social and Environmental efforts of a company in India. A similar report has to be published by the company, as applicable, as per Indian laws.

The report format can also be extended to report on financial numbers. The data itself is imaginary.

Monday, January 20, 2020

ET top 25 - further analysis

Further to the blog post below, I have added a few more plots.

A) I have added a column to the dataframe to indicate whether the company belongs to the public (pub) sector or the private (pvt) sector. The plot below (similar to the one in the earlier blog post) shows the companies by type.

B) I also wanted to study the distribution, so I have added a histogram. From it, we can see that most of the companies are below a revenue level of Rs 2,00,000 Cr (USD 28.5 billion).

C) I added one more calculated column to the dataframe to show profit as a percentage of revenue, plotted in the graph below. The companies with a high profit percentage are in the private sector: the IT companies TCS (the only company with a profit ratio above 20%) and Infosys, and the banking firms HDFC Bank and HDFC. The top five companies by revenue are not that profitable by this ratio, and most of them are public sector companies.

The data was analyzed using Python and pandas and plotted using the seaborn library.
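The two added columns can be sketched roughly as follows. This is not the original notebook. The company names and numbers are invented placeholders, not the ET figures, and the column names are my assumption.

```python
import pandas as pd

# Invented companies and figures (Rs Cr); not the actual ET top 25 data.
companies = pd.DataFrame({
    "company": ["Alpha Oil", "Beta Bank", "Gamma IT", "Delta Power"],
    "revenue_cr": [500000, 120000, 150000, 90000],
    "profit_cr": [20000, 21000, 33000, 4500],
})

# A) Sector column: pub for public sector, pvt for private sector.
public_sector = ["Alpha Oil", "Delta Power"]  # assumed classification
companies["sector"] = companies["company"].isin(public_sector).map(
    {True: "pub", False: "pvt"}
)

# C) Profit as a percentage of revenue.
companies["profit_pct"] = (
    companies["profit_cr"] / companies["revenue_cr"] * 100
).round(1)

# Companies with a profit ratio above 20%.
above_20 = companies.loc[companies["profit_pct"] > 20, "company"].tolist()
print(companies)
print(above_20)
```

A dataframe in this shape can be passed to seaborn, for example to histplot for the revenue distribution or to barplot with the sector column as hue.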

Thursday, December 19, 2019

Cleaned category list using Python3

I analyzed an Excel file containing a list of 300+ #unicorns using #Python and #Pandas. I also made some nice charts.

Later I realized that the column containing the classification values of unicorns, such as TravelTech, EdTech and Ecommerce, had not been written consistently.

Values that meant the same thing were written differently.

Ecommerce was written as eCommerce, ecommerce, e-commerce and so on. Because grouping treats each spelling as a separate category, the grouping on classification values had given me an incorrect analysis. These kinds of errors are common when no data validation is in place.

So I started all over again. To describe the problem in this post, I have taken the values and created a list.

The existing values are given below.

['Auto Tech', 'AutoTech', 'Digital health', 'Digital Health', 'EdTech', 'Edtech', 'Ed Tech', 'e-commerce', 'eCommerce', 'ecommerce', 'Food & Beverage', 'Food & Beverages', 'Food and Beverage', 'Health & Wellnes', 'Health & Wellness', 'IoT', 'Internet of Things', 'Sales Tech', 'SalesTech', 'On Demand', 'On-Demand', 'On-demand', 'Supply Chain & Logistics', 'Supply chain & Logistics', 'Travel Tech', 'TravelTech']

Using Python, I cleaned the list. I used #Spyder 4.0 which is beautiful. I used good old loops in the logic. I am comfortable with loops.

The new list is given below.

['Autotech', 'Autotech', 'Digitalhealth', 'Digitalhealth', 'Edtech', 'Edtech', 'Edtech', 'Ecommerce', 'Ecommerce', 'Ecommerce', 'Food&Beverages', 'Food&Beverages', 'Food&Beverages', 'Health&Wellness', 'Health&Wellness', 'Iot', 'Internetofthings', 'Salestech', 'Salestech', 'Ondemand', 'Ondemand', 'Ondemand', 'Supplychain&Logistics', 'Supplychain&Logistics', 'Traveltech', 'Traveltech']

The new cleaned list is now ready for analysis. All the classification values are written consistently.

However, there is one more iteration I have to do. IoT and ‘Internet of Things’ are shown separately.

I hope to take care of that as well shortly.
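One simple way to handle that last step would be a small lookup of known synonyms applied in the same kind of loop. This is only a sketch of the idea. Choosing Iot as the name to keep, and the short sample list, are assumptions for illustration.

```python
from collections import Counter

# Known synonyms mapped to the name to keep (assumed choice).
ALIASES = {'Internetofthings': 'Iot'}

# A short sample of already cleaned values.
cleaned = ['Iot', 'Internetofthings', 'Edtech', 'Iot']

merged = []
for label in cleaned:
    # Fall back to the label itself when it has no alias.
    merged.append(ALIASES.get(label, label))

counts = Counter(merged)
print(merged)
print(counts)
```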
