The value of GitHub’s Network Graph

January 1st, 2016 by bostjan

While some online posts rant about GitHub’s network graph, suggest it is broken, or claim that it outright sucks, I must admit I find it extremely useful!

As a maintainer of some open source projects, like Snoopy, the command logging library, I use the network graph to get a glimpse of what “my” users are up to. Primarily, I can obtain two valuable pieces of information:

  1. What features are missing so badly that users are already trying to implement them themselves?
  2. How many changes to “my” software exist without corresponding pull requests?
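The second question can also be approximated outside the graph view. Below is a minimal sketch, assuming you have already collected, for each fork, how many commits it is ahead of upstream, plus the list of accounts that have opened pull requests (GitHub's REST API can supply both, but the fetching is deliberately left out here). The `Fork` class, the function name, and the sample data are all hypothetical and serve illustration only.

```python
from dataclasses import dataclass


@dataclass
class Fork:
    owner: str      # account that owns the fork
    ahead_by: int   # commits in the fork that upstream does not have


def forks_without_pull_requests(forks, pr_authors):
    """Return forks that carry their own commits but never opened a pull request."""
    # GitHub logins are case-insensitive, so compare them in lower case.
    authors = {name.lower() for name in pr_authors}
    silent = [f for f in forks if f.ahead_by > 0 and f.owner.lower() not in authors]
    # Largest divergence first: these forks hold the most unmerged work.
    return sorted(silent, key=lambda f: f.ahead_by, reverse=True)


# Hypothetical sample data, not taken from any real project.
forks = [
    Fork("alice", 12),
    Fork("bob", 0),
    Fork("carol", 3),
    Fork("dave", 7),
]
pr_authors = ["Carol"]

for fork in forks_without_pull_requests(forks, pr_authors):
    print(f"{fork.owner}: {fork.ahead_by} commits without a pull request")
```

With this sample data the sketch lists the forks of alice and dave: bob has no commits of his own and carol has already opened a pull request. A fork owner's pull request may of course cover only part of their changes, so treat the result as a starting point for a closer look, not as a verdict.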

Ad 1: Missing features

If you want your software to be actually used by others, you have to provide users with what they need. Otherwise someone will fork your software and do that instead of you.

Ad 2: Missing pull requests

Face it: as a FOSS user, if you need that nice piece of software, which is almost perfect for all your needs, to do something that it currently doesn’t do, you can implement the necessary changes yourself. From there on, the easiest way forward is to submit your changes upstream and get them merged into the original project. That relieves you of maintaining your own patchset, a chore that would otherwise recur on almost every version upgrade (hunk offsets, conflicts, etc.). Achieving this basically amounts to handing the maintenance of your work over to someone else, sans payment.

In order to actually be able to do that, you must be prepared to play by their rules. Maintenance is not an easy task. A maintainer needs to touch many parts of the code, and is mainly concerned with the big picture of the project. Therefore she usually can NOT hold every detail in her head. Furthermore, if the software comes with a test suite, it is there to make maintenance and quality assurance simpler, and to give developers instant feedback. This is the reason many projects require “corresponding tests for all submitted changes, else rejected”.

There is another reason why you should always be thorough with your test cases: when a future change causes test suite failures, even in parts of the code you have contributed, the blame goes directly to the author of that change and not to you first. That spares you exchanges like “hey, your code does not work, again!”, “well, my code is OK, but this new change did this and that, which causes failures in my parts of the code”. If your change has already been accepted upstream and the CI test suite is passing, then whenever future work breaks the test cases covering your parts of the code, it is up to the new contributor and the maintainer to resolve the issue.

Back to GitHub’s network graph:

If there are many changes to your software floating around GitHub, without users even trying to merge changes upstream, everybody is losing. You are losing valuable contributions (alternatively spelled as “free work on your project goes to waste”), and non-contributing users are setting themselves up for additional work in the future (merge conflicts, updates behind schedule).

If this is the case, it would be best to find reason(s) why upstream contributions are not happening. Here are a few possibilities:

  1. your project’s structure is too complex and users who implement changes believe those changes will not be accepted upstream;
  2. your pull request acceptance rules are too harsh (if you require pull requests to contain code on properly named branches, accompanied by relevant test cases, and a box of lager being sent to your home address – well, maybe you should drop the “test cases” requirement :-) );
  3. you are generally an asshole to interact with;
  4. you are coldly rejecting pull requests for not following your rules, instead of reminding contributors about that, insisting on what is required while positively encouraging them, and working nicely with them to get there;
  5. you have valid technical requirements, but your users are not automatically notified when their contributions break things, because there is no test suite and/or no automatic test run on every pull request;
  6. you do not have contributor guidelines clearly specified and publicly available where relevant (see GitHub’s method for displaying contribution guidelines, for example).

Social coding is fun, and riding the wave of it is far better than struggling to swim behind while trying to catch up. If you are stuck behind, maybe investing your time in an all-nighter of refactoring (code and/or process) can make your experience nice once more.

Happy contributing!
