How does contributing to open source work (for me)?
Recently, we have been onboarding some new staff at Astun (where I work), and one of the questions that seems to keep coming up is “how does open source work?” So I’ve been giving a bit more thought than I usually do to how and why I contribute to open source projects, and whether open source is sustainable in this way.
You have to remember that I’m old (really old in computing terms), such that when I started out, if you wanted software on your Unix computer you compiled it yourself from open sources or wrote it yourself. There was of course some provided code on the machine; after all, you needed to use Sun’s cc to compile gcc so you could compile everything else you needed. Linux was just starting to be available when I started my first job, but most of our work was done on Sun workstations. As I was working at a university (during the Thatcher years) the big draw of open source code was the cost (free), but over time I also came to appreciate the value of being able to look at the source code to see how a program worked, so I could either write something similar or fix the problem I had found. I found my first compiler bug while I was working on my PhD and fortunately could print out that piece of code to take to the support team to show them why it was wrong.
When we completed the program I was developing in that first job I released it under the GNU General Public License (GPL), partly because it was too hard to work out who owned it - the university (my employer) who claimed to own anything I wrote or the ESRC (the funder) who also claimed to own anything I wrote when funded by them. I could foresee spending time in small offices talking to lawyers if we tried to commercialise it, and that didn’t appeal.
During the next few years, James Macgill and I started writing and collecting useful bits of Java code into what became GeoTools version 1. Again we gave the code away under the GPL, partly for the same reasons as before but mostly so we could help other people produce repeatable science. We likened computational geography to (say) chemistry before commercial glassware became available - your experiment might fail to replicate someone’s results not because their results were wrong but because you had built the apparatus wrong. We felt (and still do) that if the only way to check someone’s paper was to drop thousands of pounds (or dollars) on a copy of Arc/Info then that was a problem (and how could we know there were no bugs in Arc/Info?). So we encouraged others to take our code and check it, extend it and pass it on to other users, in the hope that better science would be produced.
Along the way we came across the Open Geospatial Consortium (OGC) and the web map test bed, which fitted nicely with some work I was involved in on public participation in planning, which needed maps on the web (though in those days we still had to take computers out to the public to get useful numbers of users). We also met Chris Holmes and Rob Hranac of the Open Planning Project, who were also working on this sort of thing. From those meetings GeoServer was born. Again we all benefited from access to other open source code, and again we released the project under the GPL to encourage reuse and contributions.
Over time we pulled in code such as Seagis or built on top of other open source code such as JTS. We also added support for a variety of new data sources in a pluggable way that allowed others to add just the bits of code they needed.
Next year sees the 20th anniversary of GeoTools version 2 and the project is still progressing nicely.
A lot of my job is now providing training and support to users of GeoServer (along with QGIS and other open source geospatial projects). Some of that “support” time is used to maintain the bits of GeoServer and GeoTools that are not actually fun (security updates and other technical debt). In my other working day of the week (when I freelance) I help out people who would like some custom GeoTools code written. For example, the ability to specify which transform you’d like to use when converting to and from a coordinate reference system (GEOT-6920) was funded by the Tanzanian Ministry of Home Affairs. And finally, in my spare time I write code that interests me. There are many sources of ideas:
- The GeoTools and GeoServer issue trackers are often fertile places to search.
- GeoTools questions on gis.stackexchange.com are always likely too.
- Other questions I see around on Twitter.
Basically though, something has to sound interesting for me to spend my limited free time on it. After all, I don’t have a lot of free time to program and I really want to spend that time having fun (or I could be doing something else that is fun instead). The only other thing that motivates me is that some bug (or lack of feature) is embarrassing. For example, I recently finished off Jody’s work that allows GeoServer administrators to see which modules and extensions are loaded in their version. Every time we went to this page when I was running a GeoServer course I would have to explain why this page didn’t actually show all the extensions (including the one we had just added), and there was really no good reason other than lack of time (and funding), so I just sat down and fixed the remaining modules (except the community ones, you are still on your own there). This means that the next GeoServer course will be slightly more fun for me to do (provided I’ve updated to 2.20.x).
So recent “projects” have been adding some functionality to GeoTools that I was interested in, such as contouring point layers. I had always wondered how it was done, GeoTools currently couldn’t do it, and I had the whole of the Christmas holiday to play. I’ve also recently had a look at how to maintain topologically correct borders between polygons when simplifying them. This came from a question on gis.stackexchange and again turned out to be an interesting problem that is probably useful for many users. The latest project is looking at adding a label line along the centre of a polygon. In asking for more help on this problem I’ve found that MapServer is trying to add this functionality too (so now it’s a race). In the time this blog post has been languishing part written I’ve managed to finish this up (the answer is to use Dijkstra to find the longest shortest path between nodes of index 1).
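To make that last parenthetical a little more concrete, here is a minimal sketch of the "longest shortest path" idea. It is not the GeoTools code (that is Java and built on JTS); it is a plain Python illustration under a few assumptions: that the polygon has already been reduced to a centre-line graph with weighted edges, and that "nodes of index 1" means nodes with exactly one neighbour (the loose ends of the graph). The function names and the toy graph are made up for the example.

```python
import heapq


def build_graph(edges):
    """Build an undirected weighted graph as {node: {neighbour: length}}."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, {})[b] = length
        graph.setdefault(b, {})[a] = length
    return graph


def dijkstra(graph, source):
    """Return shortest distances from source and the predecessor of each node."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, length in graph[node].items():
            nd = d + length
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(queue, (nd, neighbour))
    return dist, prev


def longest_shortest_path(graph):
    """Among all pairs of end nodes, find the pair whose shortest path is longest."""
    # Assumption: "nodes of index 1" are nodes with a single neighbour.
    ends = [n for n, nbrs in graph.items() if len(nbrs) == 1]
    best_length, best_path = -1.0, []
    for start in ends:
        dist, prev = dijkstra(graph, start)
        for end in ends:
            if end == start or end not in dist:
                continue
            if dist[end] > best_length:
                # Walk the predecessors back from end to start.
                path = [end]
                while path[-1] != start:
                    path.append(prev[path[-1]])
                best_length, best_path = dist[end], path[::-1]
    return best_length, best_path


# A toy centre-line graph: two branches meeting at C, a spine C-D, two more at D.
skeleton = build_graph([
    ("A", "C", 1.0),
    ("B", "C", 2.0),
    ("C", "D", 3.0),
    ("D", "E", 4.0),
    ("D", "F", 1.0),
])
length, path = longest_shortest_path(skeleton)
print(length, path)
```

For the toy graph the chosen line runs from B through C and D to E, which is the longest route you can take between two loose ends without doubling back. In a real labelling tool the nodes would be coordinates and the resulting path would become the line the label follows.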
So, I now have a much better handle on how JTS works (for some functions anyway) and more respect for topology than I used to. I think that next I might try to look at some problems that don’t require me to read up on graphs. The obvious next step is to package up the code with some unit tests, push it into the GeoTools code base and then add something to the GeoServer manual so that other people can use it. I promise I will get around to that some time (if you would like me to do it urgently then please get in touch and I can talk you through the process or give you a price).
If you would like to know more about how I (and Andrea) spend our time, please come and watch our talk at FOSS4G 2021 (it’s this Wednesday at 13:00 UTC).