#WebRTC 101: 1st assignment

Now that we understand the basics of libwebrtc code management, we can start answering otherwise problematic questions. This week I was at CommConUK, discussing the number of contributors to libwebrtc and pointing my interlocutor to the AUTHORS file as a starting point. “Fewer than 100” was the other party’s position and, to be honest, I had never checked. So, who would risk a guess as to how many contributors to the WebRTC stack there have been in the past three years and, more importantly, how would one check?

Let’s detail the rules of engagement here. The comparison we are trying to make is with other projects hosted on GitHub, over a three-year period. GitHub counts every individual committer, even when two people from the same company commit to a project. For the numbers to be comparable, we need to count users the same way.

The AUTHORS file in the libwebrtc repository is a good start. It is actually 97 lines long. Ha ha! I was wrong, and there were indeed fewer than 100 contributors, or so I first thought. Great, I learned something.

Except, well, it’s slightly more complicated than that.

First, I was using revision 73, while the latest version has just barely more than 100 entries. OK, not a big difference.

More importantly, 30+ of those entries were corporate entries, so that 100+ number was not directly comparable to the number of contributors in a GitHub project. So I decided to extract all the commits of the past three years and count the number of unique emails used by authors.

This is not very difficult if you have access to a shell:

git log --pretty=short | git shortlog -s | wc -l

And the magic number is … 574. OK, it must be slightly less because of the buildbot and duplicated entries, but not by much. That’s since 2011, the beginning of the project. Adding --after="2017-06-30" to the log command leads to 281 entries.
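For those who prefer a script to a one-liner, here is a minimal Python sketch of the same count. It is an illustration, not the exact method used for the numbers above: it counts distinct author emails (via `git log --format=%ae`) rather than the names that `git shortlog -s` groups by, so the totals may differ somewhat. The helper names `author_emails` and `count_unique` are made up for this example.

```python
import subprocess


def author_emails(repo_dir, after=None):
    """Return one author email per commit reachable from HEAD in repo_dir."""
    cmd = ["git", "-C", repo_dir, "log", "--format=%ae"]
    if after:
        # Same date filter as the --after option mentioned above.
        cmd.append(f"--after={after}")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def count_unique(emails):
    """Count distinct emails, ignoring case, whitespace and empty lines."""
    return len({e.strip().lower() for e in emails if e.strip()})
```

Assuming the main checkout lives in `src`, `count_unique(author_emails("src"))` would give the all-time figure and `count_unique(author_emails("src", after="2017-06-30"))` the filtered one.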

But, wait, didn’t we learn in WebRTC 101 that the DEPS mechanism allows libwebrtc to be composed of several git repositories? So if we want to compare the number of authors of the entire stack to another full stack, we need to iterate over the dependencies and repeat the operation.

As shown in 101, one just has to look at the content of the .gclient_entries file to get a flat list of the subdirectories where each submodule’s git root is. I’m duplicating the list here, without the git hashes, to keep it concise.

entries = {
  'src': 'https://chromium.googlesource.com/external/webrtc.git',
  'src/base': 'https://chromium.googlesource.com/chromium/src/base',
  'src/build': 'https://chromium.googlesource.com/chromium/src/build',
  'src/buildtools': 'https://chromium.googlesource.com/chromium/buildtools.git',
  'src/testing': 'https://chromium.googlesource.com/chromium/src/testing',
  ...
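Since .gclient_entries is written in Python syntax, it can be read without executing it. The sketch below is one possible way to do that, under two assumptions that are mine and not something the file format guarantees: the file contains a single `entries = {...}` assignment made of literals only, and the pinned revision is appended to each URL as `@<revision>`. The function name `parse_gclient_entries` is hypothetical.

```python
import ast


def parse_gclient_entries(text):
    """Return {checkout path: URL with any trailing @revision removed}."""
    tree = ast.parse(text)
    for node in tree.body:
        if isinstance(node, ast.Assign) and any(
            isinstance(t, ast.Name) and t.id == "entries" for t in node.targets
        ):
            # literal_eval only accepts literals, so no code gets executed.
            raw = ast.literal_eval(node.value)
            break
    else:
        raise ValueError("no 'entries' assignment found")

    repos = {}
    for path, url in raw.items():
        if isinstance(url, str):
            # Assumption: the pin, if present, follows the last "@".
            base, sep, _revision = url.rpartition("@")
            if sep:
                url = base
        repos[path] = url
    return repos
```

The keys of the returned dictionary are the directories in which to repeat the git log operation.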

So here is the assignment: write a script that computes the number of commits per name across the dependencies. Make sure to include the commits from the main repository. Bonus points for those who deduplicate users who committed under different names/emails. The first one to find a close enough number will get an A0 poster of all the main classes of libwebrtc, signed and sent to him/her. Best of luck.
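As a starting point only, and certainly not the reference answer, here is a hedged Python sketch of the aggregation step. It merges people who used different names with the same email (compared case-insensitively) and does nothing for people who used several emails, which is the harder half of the bonus. The function names `commit_authors` and `commits_per_person` are invented for this example.

```python
import subprocess
from collections import Counter, defaultdict


def commit_authors(repo_dir):
    """Return a (name, email) pair for every commit reachable from HEAD."""
    out = subprocess.run(
        # %an = author name, %x09 = tab, %ae = author email
        ["git", "-C", repo_dir, "log", "--format=%an%x09%ae"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [tuple(line.split("\t", 1)) for line in out.splitlines() if "\t" in line]


def commits_per_person(pairs):
    """Return (name, email, commits) tuples, most active first.

    Commits are grouped by lowercased email; each group is labelled
    with the name that appears most often for that email.
    """
    by_email = Counter()
    names = defaultdict(Counter)
    for name, email in pairs:
        key = email.strip().lower()
        by_email[key] += 1
        names[key][name.strip()] += 1
    return [
        (names[key].most_common(1)[0][0], key, count)
        for key, count in by_email.most_common()
    ]
```

One would call `commit_authors` once per directory listed in .gclient_entries, concatenate the resulting lists, and pass the combined list to `commits_per_person`; the length of the result is then the number of distinct emails across the whole stack.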

In any case, it’s way more than 100.
