Thursday, December 19, 2019

Cleaned category list using Python3

I analyzed an Excel file containing a list of 300+ #unicorns using #Python and #Pandas. I also made some nice charts. 

Later I realized that the column containing the classification of each unicorn, such as TravelTech, EdTech or Ecommerce, had not been written consistently. 

Classification values that meant the same thing were written differently. 

Ecommerce was written as eCommerce, ecommerce, e-commerce and so on. Grouping on these values split a single category across several groups, so my analysis was incorrect. These kinds of errors are common when no data validation is in place.

So I started all over again. For this post, I have taken the values and put them in a list. 

The existing values are given below. 

['Auto Tech', 'AutoTech', 'Digital health', 'Digital Health', 'EdTech', 'Edtech', 'Ed Tech', 'e-commerce', 'eCommerce', 'ecommerce', 'Food & Beverage', 'Food & Beverages', 'Food and Beverage', 'Health & Wellnes', 'Health & Wellness', 'IoT', 'Internet of Things', 'Sales Tech', 'SalesTech', 'On Demand', 'On-Demand', 'On-demand', 'Supply Chain & Logistics', 'Supply chain & Logistics', 'Travel Tech', 'TravelTech']

Using Python, I cleaned the list. I used #Spyder 4.0 which is beautiful. I used good old loops in the logic. I am comfortable with loops. 

The new list is given below.

['Autotech', 'Autotech', 'Digitalhealth', 'Digitalhealth', 'Edtech', 'Edtech', 'Edtech', 'Ecommerce', 'Ecommerce', 'Ecommerce', 'Food&Beverages', 'Food&Beverages', 'Food&Beverages', 'Health&Wellness', 'Health&Wellness', 'Iot', 'Internetofthings', 'Salestech', 'Salestech', 'Ondemand', 'Ondemand', 'Ondemand', 'Supplychain&Logistics', 'Supplychain&Logistics', 'Traveltech', 'Traveltech']
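
The post does not show the loop logic itself, so the following is only a sketch of one way plain loops could turn the first list into the second. The rules (drop spaces and hyphens, treat "and" as "&", capitalize each part) and the small corrections table are my assumptions, chosen because they reproduce the cleaned list above.

```python
# Hypothetical reconstruction; the original loop logic is not shown in the post.
raw_values = ['Auto Tech', 'AutoTech', 'Digital health', 'Digital Health',
              'EdTech', 'Edtech', 'Ed Tech', 'e-commerce', 'eCommerce',
              'ecommerce', 'Food & Beverage', 'Food & Beverages',
              'Food and Beverage', 'Health & Wellnes', 'Health & Wellness',
              'IoT', 'Internet of Things', 'Sales Tech', 'SalesTech',
              'On Demand', 'On-Demand', 'On-demand',
              'Supply Chain & Logistics', 'Supply chain & Logistics',
              'Travel Tech', 'TravelTech']

# Manual fixes for a misspelling and a singular/plural mismatch that
# formatting rules alone cannot catch (assumed, not from the post).
corrections = {
    'Food&Beverage': 'Food&Beverages',
    'Health&Wellnes': 'Health&Wellness',
}

cleaned_values = []
for value in raw_values:
    # Treat the word "and" the same as "&".
    value = value.replace(' and ', ' & ')
    parts = []
    for part in value.split('&'):
        # Drop spaces and hyphens, then upper-case only the first letter.
        part = part.replace(' ', '').replace('-', '')
        parts.append(part.capitalize())
    cleaned = '&'.join(parts)
    # Apply a manual correction if one exists, else keep the value.
    cleaned_values.append(corrections.get(cleaned, cleaned))

print(cleaned_values)
```

The corrections table is kept separate from the formatting rules so that each manual decision stays visible and easy to review.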

The new cleaned list is now ready for analysis. The classification values are now written consistently. 

However, there is one more iteration I have to do. IoT and ‘Internet of Things’ are shown separately.

I hope to take care of that as well shortly. 
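
One possible way to handle that remaining iteration is a small alias table that maps a long form to its short form. This is only a sketch; the choice of 'Iot' as the label to keep is my assumption.

```python
# A few values from the cleaned list, enough to show the idea.
cleaned_values = ['Iot', 'Internetofthings', 'Edtech']

# Hypothetical alias table: long form -> preferred label.
aliases = {'Internetofthings': 'Iot'}

merged_values = []
for value in cleaned_values:
    # Replace a known alias, otherwise keep the value unchanged.
    merged_values.append(aliases.get(value, value))

print(merged_values)
```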

Saturday, November 30, 2019

Mutual Funds Performance

The diagram shows two subplots. The left subplot shows 5 large-cap funds by name and assets under management (AUM) in rupees crore.
One way to judge a fund is by its AUM. The AUM has grown because investors have invested money in it. A large AUM may help a fund withstand sudden withdrawals by investors. 
But this is not the only way to evaluate the performance of a fund. 
So on the right-hand side I have plotted another chart showing the performance (returns) of each fund over the last 10 years for the regular scheme. All funds have given a return in excess of 10 per cent. 
A 10-year period may be a good indicator for judging performance. 
However, the two parameters together may not be sufficient to evaluate the performance of funds. There are other factors as well that are not considered in this post.

The data has been analyzed and plotted using Python3 and pandas. 
This time I imported an .xls file into pandas for analysis. 
The source of the data is https://www.amfiindia.com

Disclaimer: This post is not a suggestion or advice to invest in any particular mutual fund. Please contact your investment advisor for that. 

Tuesday, November 26, 2019

Sankey Diagram

Sankey diagrams are a type of flow diagram in which the width of the arrows or bands is proportional to the flow rate. 

Above is the diagram of three mutual fund houses, their types of mutual funds, and where they invest, namely large-cap, mid-cap and small-cap equities, and debt. 

Compared with a table of numbers, a Sankey diagram makes the data easier to understand. 

This Sankey diagram is my first plot made using plotly in a Jupyter notebook. 

#Sankey diagrams emphasize the major transfers or flows within a system. 

Sunday, November 24, 2019

6 largest charitable foundations worldwide

List of wealthiest charitable foundations (From Wikipedia)

This is a list of wealthiest charitable foundations worldwide. It consists of the 6 largest charitable foundations, private foundations engaged in philanthropy, and other charitable organizations that have disclosed their assets. In many countries such disclosure is not legally required, and often not done.

Only nonprofit foundations are included in this list. Organisations that are part of a larger company are excluded, such as holding companies.

The entries are ordered by the size of the organisation's financial endowment (that is, the value of assets net of liabilities, or invested donations). The endowment value is an estimate measured in United States dollars, based on the exchange rates on December 31, 2016.

Due to fluctuations in holdings, currency exchange and asset values, this list only represents the valuation of each foundation on a single day.

[Chart: 6 largest and wealthiest charitable foundations worldwide]

Tuesday, November 19, 2019

A Trillion and Indian Mutual Fund Industry


Lately the word trillion has come up in conversations in India. 

Indians are used to writing numbers in lakhs and crores. A lakh is a hundred thousand. A crore is a hundred lakhs, or ten million. In the West, numbers are grouped in sets of three zeros, such as a thousand or a million. A trillion has 12 zeros. 

A trillion rupees in the Indian way of counting is Rs 1 lakh crore. 
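
The conversion can be checked with a few lines of Python. This is just the arithmetic from the paragraph above written out, with constant names of my own choosing.

```python
LAKH = 10**5          # a hundred thousand
CRORE = 100 * LAKH    # a hundred lakhs = 10**7
TRILLION = 10**12     # 1 followed by 12 zeros

one_lakh_crore = LAKH * CRORE

# 10**5 * 10**7 = 10**12, so 1 lakh crore equals 1 trillion.
print(one_lakh_crore == TRILLION)
print(f"{one_lakh_crore:,}")
```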

How big is this amount?

Let us put it in perspective by looking at the Indian mutual fund industry. 

Each of the top 9 mutual fund houses has assets under management (AUM) in excess of 1 trillion Indian rupees. 

Incidentally the top 10 mutual fund houses account for 80% of the AUM of the entire mutual fund industry. 

The total AUM of all the mutual fund houses was 25 trillion Indian rupees in September 2019. 

If I compare this number, converted to US dollars, with India's GDP, which was USD 2.6 trillion in 2017, it is less than 15%. 
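
The comparison needs an exchange rate, which the post does not state. The sketch below assumes roughly 70 rupees per US dollar; that rate is my assumption, and a different rate shifts the result somewhat.

```python
aum_inr = 25e12       # total AUM in rupees (from the post)
gdp_usd = 2.6e12      # Indian GDP in US dollars, 2017 (from the post)
inr_per_usd = 70.0    # ASSUMED exchange rate, not stated in the post

# Convert the AUM to dollars, then express it as a share of GDP.
aum_usd = aum_inr / inr_per_usd
ratio_percent = 100 * aum_usd / gdp_usd

print(f"AUM is about {ratio_percent:.1f}% of GDP")
```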

Worldwide, the average ratio of AUM to GDP is 13.62%. The ratio of AUM to GDP for the USA is over 100%. 

The future
The AUM of Indian mutual funds is expected to grow to 100 trillion rupees by 2025. It is indeed great news! 

The future for the Indian mutual fund industry is definitely bright.

Sunday, October 6, 2019

World Unicorns 2019


The USA is clearly ahead with 172 unicorns. China comes second with 89 unicorns. The UK has 17, with India close behind with 16 unicorns.
South Korea and Germany are next with 8 each. 9 countries have between 2 and 4 unicorns. 11 countries have 1 unicorn each. In all, 26 countries have unicorns.
How did I do it?
To map unicorns on the world map, I started by downloading the Excel file from CB Insights.
The first challenge was that the valuation was in currency format, which pandas read as a string. I wrote a function to convert the valuation into a number.
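
The function itself is not shown in the post. A minimal sketch of such a conversion, assuming the strings look like '$1,250.5' (the exact format in the file is my assumption), could be:

```python
def valuation_to_number(text):
    """Convert a currency string such as '$1,250.5' to a float.

    Hypothetical helper; assumes a leading '$' and optional
    thousands separators, which may differ from the real file.
    """
    cleaned = text.strip().replace('$', '').replace(',', '')
    return float(cleaned)


print(valuation_to_number('$1,250.5'))
print(valuation_to_number(' $75 '))
```

With pandas, a function like this can be applied to a whole column with the Series apply method.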
Then the built-in groupby function was used to group the data and count the unicorns by country.
At this point I did not have country codes in my data frame. The country codes were needed later to plot the map with the pygal library. So, I downloaded the country codes file from pygal, mapped it to my data, and added the country codes to my original data frame as an additional column. However, the mapping did not recognize South Korea, because its official name is different. So, I had to correct it.
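
The idea of the lookup and the South Korea fix can be sketched in plain Python. The code table below is a small hand-written excerpt in the style of pygal's country list, and the official name used for South Korea is my assumption; check both against the installed library.

```python
# Illustrative excerpt of a code -> name table (assumed, not the full list).
code_to_name = {
    'us': 'United States',
    'cn': 'China',
    'gb': 'United Kingdom',
    'in': 'India',
    'kr': 'Korea, Republic of',
    'de': 'Germany',
}

# Invert the table so that a country name can be looked up.
name_to_code = {}
for code, name in code_to_name.items():
    name_to_code[name] = code

# Names in the data that differ from the official names.
name_fixes = {'South Korea': 'Korea, Republic of'}

countries = ['United States', 'China', 'South Korea', 'Germany']
codes = []
for country in countries:
    official = name_fixes.get(country, country)
    # None marks a name that still needs a manual fix.
    codes.append(name_to_code.get(official))

print(codes)
```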
Then the countries were put in five buckets as per the count of unicorns in a country.
The final challenge was to convert these five data frames into dictionaries. It repeatedly gave a type error. So, I had to go back and read again about subsetting in pandas and converting to dictionaries. Finally, after a lot of tries, I could convert the five data frames into five dictionaries.
These five dictionaries were used with pygal to plot the world map.
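
The bucketing step can be sketched without pandas. The counts below come from this post, while the five bucket boundaries and labels are hypothetical, since the post does not say where the cuts were made.

```python
# Unicorn counts by country code (counts taken from this post).
counts_by_code = {'us': 172, 'cn': 89, 'gb': 17, 'in': 16, 'kr': 8, 'de': 8}

# Hypothetical buckets: (label, lowest count that belongs in the bucket),
# ordered from the largest lower bound to the smallest.
bucket_bounds = [
    ('100 or more', 100),
    ('50 to 99', 50),
    ('10 to 49', 10),
    ('5 to 9', 5),
    ('1 to 4', 1),
]

# One dictionary of {country code: count} per bucket.
buckets = {}
for label, lower in bucket_bounds:
    buckets[label] = {}

for code, count in counts_by_code.items():
    for label, lower in bucket_bounds:
        if count >= lower:
            # The first bucket whose lower bound fits is the right one.
            buckets[label][code] = count
            break

for label, data in buckets.items():
    print(label, data)
```

Each of these dictionaries maps country codes to numbers, which is the shape a pygal world map accepts when a series is added with a label.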

Friday, August 17, 2018

Topic-Sub-total in Excel


Topic-Sub-total in Excel

Situation-
Do you calculate subtotals for numbers grouped into categories? Please see the example below.

Example-
Let us say you have two stores, namely store A and store B, and you want the sum of sales for each store and also the grand total. Please see the screen clip below. If you use the SUM formula for the subtotals and also for the grand total, the result is not correct. In fact, every sale is counted twice, once directly and once inside its subtotal. 

Solution-
The problem can be overcome by using the SUBTOTAL function to do the summing. When you enter it, you are asked to choose a function code. Since we want a sum, we use code 9. Please see the screen clip below. 

For the grand total (the total of the subtotals), use SUBTOTAL as well. SUBTOTAL ignores other SUBTOTAL results in its range, so you now get the correct total. It works even if you add more entries. It also works in Google Sheets and in LibreOffice Calc.
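
The double counting can be illustrated with a small Python simulation. The sales figures are made up for illustration, and the code only mimics the spreadsheet behaviour; it is not how the spreadsheet works internally.

```python
# Made-up sales figures for two stores.
store_a = [100, 200]
store_b = [300, 400]

subtotal_a = sum(store_a)
subtotal_b = sum(store_b)

# A plain SUM over the whole column also adds the subtotal cells.
column = store_a + [subtotal_a] + store_b + [subtotal_b]
sum_total = sum(column)

# SUBTOTAL with code 9 skips cells that are themselves SUBTOTAL results,
# so only the original sales are added.
subtotal_total = sum(store_a) + sum(store_b)

print(sum_total, subtotal_total)
```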