The Great Relicensing

If you were around the Rust GitHub community in early 2016, you may have seen or even received an issue that looks like this:

github search results for author:cmr is:issue Relicense

If you never saw one of these issues or want a reminder, the issue template is here and there are plenty of issues to look at!

One of my motivations for the "Great Relicensing", as I have dubbed it, was to enable easy uplift and free transfer of code between community and rust-lang repositories. Plenty of code out there was MIT, but Rust is dual-licensed Apache-2.0/MIT (the "Rust license"). It's annoying to make COPYRIGHT longer, and that doesn't help when you want to put code into std.

In numbers, the relicensing as of 2019-08-23 (source):

- 870 issues 
- 870 participants
- 430 repository owners
- 62% closed (540)

Some Python scripts drove the Great Relicensing. The process was built around files containing lists of jobs and was in practice very interactive: tight loops reprocessing a few elements at a time, with new scripts and files introduced as needed. As a conservative first step (and ultimately the only step), I targeted all packages licensed Apache-2.0 XOR MIT for "upgrade" to the Rust license. Ultimately 752 repositories went through the scripts in the first round. According to my emails with GitHub, 1579 repositories were in my crosshairs, but I can't corroborate that with the contents of the repo.

While they were running, the scripts would scan each issue's comments for ones indicating consent and update the checklist of contributors in the issue body. Once all contributors had signed off, the scripts would open a pull request.

In theory there is a timing attack possible if there is a substantial delay between when I created the issue and when the PR was created, since anyone who contributed in that window would be missing from the checklist. I did not think to check for this at the time, although it was relevant when clippy later relicensed from MPL-2.0. The list of repositories processed in this manner is here, although I haven't checked them for this problem.
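To make the consent-tracking loop described above concrete, here is a minimal sketch in Python. It is not the original scripts: the consent phrase, the `- [ ] @login` checklist format, and the comment shape (`user`, `body` keys) are all assumptions chosen for illustration, and the network side (fetching comments, opening the PR) is left out.

```python
import re

# Assumed consent wording; the real issue template's phrase is not reproduced here.
CONSENT_RE = re.compile(r"\bI (?:license|consent|agree)\b", re.IGNORECASE)
# Assumed checklist format: one "- [ ] @login" line per contributor.
CHECKBOX_RE = re.compile(r"^- \[( |x)\] @(\S+)[ \t]*$", re.MULTILINE)


def is_upgrade_target(licenses):
    """True when a package has exactly one of Apache-2.0 or MIT (the XOR case)."""
    licenses = set(licenses)
    return ("Apache-2.0" in licenses) != ("MIT" in licenses)


def consenting_users(comments):
    """Collect the logins of commenters whose text matches the assumed phrase."""
    return {c["user"] for c in comments if CONSENT_RE.search(c["body"])}


def update_checklist(body, consented):
    """Tick the checkbox of every listed contributor who has consented."""
    def tick(match):
        state, login = match.group(1), match.group(2)
        if login in consented:
            state = "x"
        return f"- [{state}] @{login}"
    return CHECKBOX_RE.sub(tick, body)


def all_signed_off(body):
    """True when the checklist is non-empty and every box is ticked."""
    boxes = CHECKBOX_RE.findall(body)
    return bool(boxes) and all(state == "x" for state, _ in boxes)
```

Only logins already on the checklist can be ticked, so a consent comment from someone who is not a listed contributor changes nothing. A real version would also need to re-derive the contributor list just before opening the PR to close the timing gap mentioned above.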

I managed to annoy some people, and GitHub (very kindly‡) asked me to stop, so I turned off all my scripts. Clicking into a few of the still-open issues, I see several where all contributors checked off but no robot came to do the PR. If you are feeling adventurous, perhaps you can write a script to find these projects and help them along.

One of my takeaways from the relicensing is the idea of ecosystem-wide, automation-assisted coordinated action. GitHub monetizes this niche today, and you can buy access to automation through their Marketplace. Instead of just making a new relicense assistant, a Rust Ecosystem Friendly Robot of some kind (think highfive, triagebot, dependabot, all rolled together and more) could help extend crate-quality practices across more of the ecosystem. It's fun to imagine large-scale open source coordination!
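For anyone tempted to go looking for those stalled issues, the core check is small. The sketch below assumes the issues have already been fetched as dictionaries with `state`, `body`, and `html_url` keys (names modeled on GitHub's issue objects, but treat them as assumptions here), and that the checklist uses Markdown task-list syntax. Fetching and authentication are left out.

```python
import re

# Matches Markdown task-list items such as "- [ ] @user" or "* [x] @user".
CHECKBOX_RE = re.compile(r"^\s*[-*] \[([ xX])\]", re.MULTILINE)


def is_stalled(issue):
    """An open issue whose checklist is non-empty and fully ticked."""
    if issue.get("state") != "open":
        return False
    boxes = CHECKBOX_RE.findall(issue.get("body") or "")
    return bool(boxes) and all(mark in "xX" for mark in boxes)


def stalled_issues(issues):
    """Return the URLs of issues that look ready for a relicensing PR."""
    return [issue["html_url"] for issue in issues if is_stalled(issue)]
```

A fully ticked checklist only says that everyone listed when the issue was created has agreed, so each hit still deserves a manual look at the commit history before anyone opens a PR.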

I wrote some code to look at how licenses have changed over time. I'm not sure my methodology was bulletproof, but I ended up with a spreadsheet and a chart.

stacked line plot of licenses over time

Here's the same chart, but instead of plotting the number of packages using the license, plotting the percentage of packages using the license:

same data but normalized to show percent of crates using a license
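The step from the first chart to the second is a per-date normalization. As a rough sketch, assuming the counts are held as a mapping from a date label to a mapping of license string to package count (a hypothetical layout, not necessarily how the spreadsheet is organized):

```python
def to_percentages(counts_by_date):
    """Convert per-date license counts into per-date percentages."""
    result = {}
    for date, counts in counts_by_date.items():
        total = sum(counts.values())
        result[date] = {
            # Guard against empty dates so we never divide by zero.
            license_name: (100.0 * n / total if total else 0.0)
            for license_name, n in counts.items()
        }
    return result
```

Each date's percentages sum to 100, which is what lets the second chart show a shift in proportion even while the total number of packages keeps growing.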

There is indeed a noticeable increase in the proportion of packages using the "Rust license" (or a superset) over the course of 2016. While putting together this chart I noticed that people use lots of weird license combinations. It's hard to imagine what motivates someone to use the "AGPL-3.0/GPL-3.0/MIT/Apache-2.0" license.
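Counting "the Rust license (or a superset)" means deciding when a license string includes both MIT and Apache-2.0 as alternatives. One way to approximate that, assuming alternatives are separated by `/` or by `OR` and ignoring more complex expressions, is sketched below; it is an illustration, not the code behind the charts.

```python
import re

RUST_LICENSE = {"MIT", "Apache-2.0"}


def license_parts(expr):
    """Split a license string on "/" or "OR" into its alternatives."""
    return {p.strip() for p in re.split(r"/|\s+OR\s+", expr) if p.strip()}


def is_rust_license_or_superset(expr):
    """True when both MIT and Apache-2.0 are among the alternatives."""
    return RUST_LICENSE <= license_parts(expr)
```

Under this check, "AGPL-3.0/GPL-3.0/MIT/Apache-2.0" counts as a superset, while plain "MIT" does not.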

emails from github

(email text).
