NLPx

Tales of Data Science

Top 10 countries on StackOverflow and GitHub


Here are the results of an analysis I conducted in late October 2014 to find and outline statistical trends related to users from different countries.

How does the number of users from different countries change over time? Which countries have the most users? Which countries' citizens commit more on GitHub? These are the questions I wanted to answer while working on this analysis. Please note that the analysis is fairly shallow: I did not examine the StackOverflow reputation system thoroughly, only in a nutshell. Please keep this in mind.

The study was conducted on 24 October 2014. If you are interested, I may update it to bring it up to date; just let me know 🙂

You may also read this post in Russian if you like. It contains more information about the presence of Russian citizens on StackOverflow and GitHub.

In late October 2014 there were 3 580 212 registered users on StackOverflow and 3 955 191 registered users on GitHub. For StackOverflow I analyzed data for 2013 and 2014; for GitHub I worked through data for 2012, 2013 and 2014.

Please note that this data is incomplete: more than 25% of StackOverflow users did not specify their citizenship (or country of residence). For GitHub the share is much higher, 70%. Still, the data I collected should be enough to compare countries.

StackOverflow

Here you can see top-10 countries by absolute number of users (late October, 2014):

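| Rank | Country | Number of users |
|---|---|---|
| 1 | USA | 253452 |
| 2 | India | 67297 |
| 3 | Great Britain | 33395 |
| 4 | Germany | 19706 |
| 5 | Canada | 16685 |
| 6 | China | 14234 |
| 7 | Australia | 12592 |
| 8 | Brazil | 12325 |
| 9 | France | 12217 |
| 10 | Russia | 11319 |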

The map below shows the ratio of registered users to the total population of each country, that is, the number of registered users divided by the number of citizens.

I think the results are pretty obvious, aren’t they? English-speaking countries have a higher ratio than others. Also remember that China has the Great Firewall, which may be why its ratio is so low.

As you may know, Stack Overflow has a so-called “reputation” system. Here we treat users with a reputation of 1000 or more as experienced users. There are 68 044 such users on Stack Overflow, which is 1.9% of all users.
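The 1.9% figure follows directly from the two counts quoted in this post. Here is a minimal sketch of that arithmetic; the helper name `share_percent` is introduced here purely for illustration and is not part of the original analysis.

```python
def share_percent(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Counts quoted in the post (late October 2014).
experienced_users = 68_044   # users with reputation of 1000 or more
all_users = 3_580_212        # all registered Stack Overflow users

experienced_share = share_percent(experienced_users, all_users)
print(f"{experienced_share:.1f}%")  # prints 1.9%
```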

Below is the number of experienced users per country for the top 10 countries in 2013 and 2014:

| Rank | Country (2013) | Experienced users (2013) | Country (2014) | Experienced users (2014) |
|---|---|---|---|---|
| 1 | USA | 9592 | USA | ~18000 |
| 2 | Great Britain | 2906 | Great Britain | 4182 |
| 3 | India | 2005 | India | 3460 |
| 4 | Germany | 1461 | Germany | 2434 |
| 5 | Canada | 1397 | Canada | 1995 |
| 6 | Australia | 1260 | Australia | 1612 |
| 7 | France | 724 | France | 1178 |
| 8 | Sweden | 637 | Netherlands | 1075 |
| 9 | Russia | 447 | Sweden | 880 |
| 10 | Poland | 439 | Russia | 800 |

 

The map below shows the gain in experienced users from 2013 to 2014 per country, that is, the number of experienced users in 2014 minus the number in 2013.

I’m a bit amused that China is not in the top 10.
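For the countries that appear in both years of the table above, the gain can be recomputed with a few lines of Python. This is only a sketch: the dictionary and function names are my own, and the 2014 USA value is the approximate figure (~18000) from the table, so its gain is approximate too.

```python
# Experienced users (reputation of 1000 or more) per country, from the table above.
experienced_2013 = {
    "USA": 9592, "Great Britain": 2906, "India": 2005, "Germany": 1461,
    "Canada": 1397, "Australia": 1260, "France": 724, "Sweden": 637,
    "Russia": 447, "Poland": 439,
}
experienced_2014 = {
    "USA": 18000,  # given only approximately (~18000) in the table
    "Great Britain": 4182, "India": 3460, "Germany": 2434,
    "Canada": 1995, "Australia": 1612, "France": 1178, "Netherlands": 1075,
    "Sweden": 880, "Russia": 800,
}


def yearly_gain(before: dict, after: dict) -> dict:
    """Gain per country, restricted to countries listed in both years."""
    return {c: after[c] - before[c] for c in before if c in after}


gains = yearly_gain(experienced_2013, experienced_2014)
for country, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country:15} +{gain}")
```

Poland and the Netherlands each appear in only one of the two top-10 lists, so the sketch skips them rather than guessing the missing value.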

GitHub

As you know, GitHub is a web service for posting and sharing code written in various programming languages. More formally, GitHub is a web-based Git repository hosting service that offers all of the distributed revision control and source code management functionality of Git.

By July 2012, all projects on GitHub had 6 826 827 commits in total. In late 2013, users who specified their country of residence (about 26% of all users) had made 28 500 000 commits (about 44% of all commits).

Below is a comparison of the top 10 countries by share of commits in 2012 and 2013, that is, the percentage of all commits made by citizens of each country in each year.

| Rank | Country (2012) | Percent of all commits (2012) | Country (2013) | Percent of all commits (2013) |
|---|---|---|---|---|
| 1 | USA | 38.6 | USA | 35 |
| 2 | Great Britain | 6.3 | Great Britain | 7 |
| 3 | Germany | 6 | Germany | 6 |
| 4 | Canada | 4 | China | 5 |
| 5 | Japan | 3.8 | France | 4.5 |
| 6 | China | 3.6 | Canada | 4.2 |
| 7 | France | 2.7 | Japan | 4 |
| 8 | Netherlands | 2 | Russia | 3.5 |
| 9 | Brazil | 1.9 | Brazil | 3 |
| 10 | Russia | 1.8 | Australia | 2.5 |

 

The map below shows the change in the percentage of commits per country (based on the table above). Note that the USA had a decrease, France had the biggest increase at 1.8 percentage points, and Russia came second with an increase of 1.7 percentage points.
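These changes can be checked against the 2012 and 2013 commit shares from the table above. A small sketch, with variable names of my own choosing, covering only the countries listed in both years:

```python
# Percent of all commits per country, from the 2012 and 2013 columns of the table above.
share_2012 = {
    "USA": 38.6, "Great Britain": 6.3, "Germany": 6.0, "Canada": 4.0,
    "Japan": 3.8, "China": 3.6, "France": 2.7, "Netherlands": 2.0,
    "Brazil": 1.9, "Russia": 1.8,
}
share_2013 = {
    "USA": 35.0, "Great Britain": 7.0, "Germany": 6.0, "China": 5.0,
    "France": 4.5, "Canada": 4.2, "Japan": 4.0, "Russia": 3.5,
    "Brazil": 3.0, "Australia": 2.5,
}

# Change in percentage points, rounded to one decimal to hide float noise.
changes = {
    country: round(share_2013[country] - share_2012[country], 1)
    for country in share_2012
    if country in share_2013
}

ranked = sorted(changes.items(), key=lambda kv: kv[1], reverse=True)
for country, change in ranked:
    print(f"{country:15} {change:+.1f}")
```

Sorting by the change puts France (+1.8) and Russia (+1.7) at the top and the USA (-3.6) at the bottom.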

The table below shows the top 10 countries by the ratio of GitHub users to the overall number of internet users in the country for 2013, that is, the number of GitHub users divided by the number of internet users in the country, multiplied by 100.

| Rank | Country | Percent of GitHub users relative to the overall number of internet users in the country |
|---|---|---|
| 1 | USA | 31 |
| 2 | Great Britain | 6.5 |
| 3 | China | 6 |
| 4 | Germany | 5 |
| 5 | France | 4.5 |
| 6 | Brazil | 4 |
| 7 | Canada | 3.5 |
| 8 | India | 3.3 |
| 9 | Russia | 3 |
| 10 | Japan | 2.5 |

 

In 2014 the situation was as follows: more than 1 050 000 active GitHub users specified their country of residence (out of more than 3 955 000 users overall). For open personal GitHub repositories there may be a rule: the more interesting the code a user writes, the more followers he or she gains. We considered a GitHub user successful if he or she had at least 10 followers.

Below are the top 10 countries by number of active GitHub users and by number of active successful users:

| Rank | Country | Active users (of 1 050 000 overall) | Country | Active successful users (of 78460 overall) |
|---|---|---|---|---|
| 1 | USA | 176910 | USA | 14675 |
| 2 | Great Britain | 34628 | Great Britain | 2659 |
| 3 | China | 32009 | China | 2548 |
| 4 | Germany | 28341 | Germany | 2226 |
| 5 | India | 25761 | Japan | 1708 |
| 6 | France | 18549 | France | 1257 |
| 7 | Canada | 16539 | Brazil | 1160 |
| 8 | Russia | 16319 | Canada | 1068 |
| 9 | Japan | 16020 | Australia | 977 |
| 10 | Australia | 14565 | Russia | 719 |

The map below shows the ratio of active successful GitHub users to all active GitHub users per country, that is, the number of active successful users divided by the number of all active users in each country, multiplied by 100.

I found it interesting that this ratio is highest in Japan and lowest in India.
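For the countries that appear in both columns of the table above, this ratio can be recomputed directly. The sketch below uses names I made up for illustration. India and Brazil each appear in only one of the two top-10 lists, so their ratios cannot be derived from the table and are left out.

```python
# Active users and active successful users (at least 10 followers), from the table above.
active_users = {
    "USA": 176910, "Great Britain": 34628, "China": 32009, "Germany": 28341,
    "India": 25761, "France": 18549, "Canada": 16539, "Russia": 16319,
    "Japan": 16020, "Australia": 14565,
}
successful_users = {
    "USA": 14675, "Great Britain": 2659, "China": 2548, "Germany": 2226,
    "Japan": 1708, "France": 1257, "Brazil": 1160, "Canada": 1068,
    "Australia": 977, "Russia": 719,
}


def success_rate(active: dict, successful: dict) -> dict:
    """Percent of active users who are successful, for countries in both tables."""
    return {c: 100.0 * successful[c] / active[c] for c in successful if c in active}


rates = success_rate(active_users, successful_users)
for country, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country:15} {rate:.1f}%")
```

Among the countries covered by this sketch, Japan comes out on top at roughly 10.7%.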

Notes

There is a large gap between the number of users from English-speaking and non-English-speaking countries on GitHub and StackOverflow, which is to be expected. It is interesting that the share of users from non-English-speaking countries is higher on GitHub than on StackOverflow. For example, about 1.5% of GitHub users are from Russia, while only 0.5% of StackOverflow users are. This suggests that basic English (A0-A2) is enough to use GitHub, while actively participating on StackOverflow requires much better English (about B1-B2).

 



6 Comments on "Top 10 countries on StackOverflow and GitHub"

sai krishna gaddipati
Guest

Hmm, India is not actively using GitHub. So what are we using? Bitbucket? Own repos?

Sheryl
Guest

Nice. We needed something like this for our project. We would like to quote a few of your statistics, if that is OK. Just undergrads doing a project on GitHub use.

Hitesh
Guest

Interesting read. How come no one else read this post?
