Changelogs: To write or to generate?
The first rule of automation: Automate as much as you can, but not more.
In the last two posts, we’ve established why changelogs are important and how they should look. In this post, we’re going to take a look at the practicalities.
Let’s start with where to put it. If your project is on GitHub, you have two good places to write a changelog. One is simply creating a file called CHANGELOG.md (analogous to the infamous README.md) in your project root folder. Unfortunately, there’s no standard for the actual file name, but CHANGELOG.md is a good default and quite common (and this is not exactly a good opportunity to show your creative side).
The other one is to use GitHub releases. GitHub releases allow you to associate a description and release downloads with a git tag. The benefit is that this makes the changelog somewhat machine readable via the GitHub API. For example, Depfu can embed the release information into the pull request. The drawbacks are that releases are not really first-class citizens in GitHub’s UI and can be a bit hard to find if you don’t know where they are, and of course you tie the project to GitHub in a way you may or may not be comfortable with.
The Chalk project is a pretty good example of a well maintained releases page.
The description field in the GitHub releases takes Markdown, which is a pretty great format for your changelog. It’s easy to insert headers that can be directly linked to, it’s easy to create ordered or unordered lists and even if you’re on a system that does not allow you to render Markdown as HTML, you can read it pretty well in its plain text source format.
Markdown is of course also great for writing traditional changelog files. One thing to watch out for is consistency. You should always use the same header level for the version headers (some people use bigger headers for major releases, a convention that I personally find distracting), and the same construct for dividing up the different change types. This could be a two-level unordered list or subheaders.
Process vs. tools
So, how can we, especially in larger open source projects with countless contributors, ensure a mostly complete changelog? Let’s start simple: A very low tech but proven way is to make writing changelogs part of the normal development process. For that, the changelog should have an “unreleased” section at the top, where changelog entries would be collected until a new version is released. You can then make it mandatory to submit a changelog entry along with your pull request. This is also helpful because it shows a list of upcoming changes for the next release. It also allows for an additional step of moderation and editing as part of the release process.
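To make the “unreleased” section a bit more concrete, here is a minimal sketch of the two operations it needs: listing what is queued up for the next release, and turning the section into a versioned one at release time. It assumes a Markdown changelog whose sections start with `## ` and whose top section is literally titled `## Unreleased`; the function names and the header text are my own choices for illustration, not a standard.

```python
UNRELEASED_HEADER = "## Unreleased"  # assumed header text; adapt to your changelog


def unreleased_entries(changelog):
    """Return the bullet entries currently collected under the Unreleased header."""
    entries = []
    in_section = False
    for line in changelog.splitlines():
        if line.startswith("## "):
            # Every second-level header starts a new section.
            in_section = line.strip() == UNRELEASED_HEADER
        elif in_section and line.startswith("- "):
            entries.append(line[2:].strip())
    return entries


def cut_release(changelog, version, date):
    """Rename the Unreleased section to the new version and open a fresh, empty one."""
    lines = changelog.splitlines()
    if UNRELEASED_HEADER not in lines:
        raise ValueError("changelog has no Unreleased section")
    index = lines.index(UNRELEASED_HEADER)
    # The collected entries stay where they are; only the header above them changes.
    lines[index:index + 1] = [UNRELEASED_HEADER, "", f"## {version} ({date})"]
    return "\n".join(lines) + "\n"
```

A check in your pull request pipeline could call `unreleased_entries` before and after the change and complain if nothing was added. The editing and moderation step still happens by hand before `cut_release` is run.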
The next question is whether we can automate parts of the process. Automation here fulfills two goals: minimizing the workload for the maintainers and ensuring consistency (and completeness) of the resulting changelog.
The most obvious idea is to somehow parse the commits that went into a release. If you are willing to use (and enforce) a specific format for your commit messages, the results can be pretty decent. The conventional-changelog project, for example, uses the conventional commit message syntax (a formalisation of the Angular project’s conventions), which allows it to automatically group the changes into a number of different categories and also makes it really easy to mark breaking changes. On top of this, you could even go ahead and automate the full release process, including choosing the correct version number change based on the semantic versioning conventions.
Here’s an example from Angular itself. The commit message from commit d7e5bbf looks like this:
feat(compiler-cli): add support to extend `angularCompilerOptions` (#22717)
`TypeScript` only supports merging and extending of `compilerOptions`. This is an implementation to support extending and inheriting of `angularCompilerOptions` from multiple files.
Closes: #22684
And the resulting changelog entry looks like this:
7.0.0-beta.7 (2018-09-26)
Features
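The parsing idea described above can be sketched in a few lines. This is not how conventional-changelog is implemented; it is a simplified illustration that assumes a `type(scope): subject` header, an optional `!` or a `BREAKING CHANGE:` footer for breaking changes, and plain `major.minor.patch` version numbers. The section titles in `SECTIONS` and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Header shape: type(scope): subject, with an optional "!" marking a breaking change.
HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?(?P<bang>!)?: (?P<subject>.+)$"
)

# Assumed mapping from commit type to changelog section title.
SECTIONS = {"feat": "Features", "fix": "Bug Fixes", "perf": "Performance Improvements"}


def parse_commit(message):
    """Return a dict describing the commit, or None if the header does not match."""
    lines = message.strip().splitlines()
    if not lines:
        return None
    match = HEADER.match(lines[0])
    if match is None:
        return None
    body = "\n".join(lines[1:])
    return {
        "type": match["type"],
        "scope": match["scope"],
        "subject": match["subject"],
        "breaking": bool(match["bang"]) or "BREAKING CHANGE:" in body,
    }


def group_by_section(messages):
    """Group parsed commits under section titles, skipping types without a section."""
    sections = defaultdict(list)
    for message in messages:
        commit = parse_commit(message)
        if commit and commit["type"] in SECTIONS:
            sections[SECTIONS[commit["type"]]].append(commit)
    return dict(sections)


def next_version(current, commits):
    """Pick the semantic version bump implied by a list of parsed commits."""
    # Only plain "major.minor.patch" strings are handled; pre-release tags are not.
    major, minor, patch = (int(part) for part in current.split("."))
    if any(commit["breaking"] for commit in commits):
        return f"{major + 1}.0.0"
    if any(commit["type"] == "feat" for commit in commits):
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

Fed the Angular commit message shown above, `parse_commit` yields the type `feat` and the scope `compiler-cli`, which is all the information needed to file it under “Features”. Commits that don’t match the header format are silently ignored here, which is exactly why the format has to be enforced for this approach to work.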
You’ve got issues (and pull requests)
For many, these conventions for commit messages are a bit too strict, and individual commits are probably also too fine-grained a measure of progress. In this case, if the project is built around GitHub issues and pull requests, these are probably better units. The github-changelog-generator project looks at issues and PRs and their corresponding labels to generate the changelog.
Here’s an example (from the project itself): The PR #530 has an “enhancement” label and thus is listed in the changelog as follows (excerpt from the 1.15.0.beta release):
Implemented enhancements:
- add breaking-changes section to changelog #530 (bastelfreak)
- […]
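The label-based grouping can be illustrated with a small sketch. It is not the code of github-changelog-generator (which is a separate tool with its own configuration); it only mimics the idea on data you would already have fetched from the GitHub API. The dictionary keys, the `bug` mapping and the fallback section title are assumptions made for this example; only the “enhancement” section and the entry format are taken from the excerpt above.

```python
# Label-to-section mapping; only "enhancement" comes from the excerpt, the rest is assumed.
SECTION_FOR_LABEL = {
    "enhancement": "Implemented enhancements:",
    "bug": "Fixed bugs:",
}
FALLBACK_SECTION = "Merged pull requests:"  # assumed title for unlabeled PRs


def render_changelog(pull_requests):
    """Render merged PRs as changelog text, grouped by their first known label."""
    sections = {}
    for pr in pull_requests:
        title = next(
            (SECTION_FOR_LABEL[label] for label in pr["labels"] if label in SECTION_FOR_LABEL),
            FALLBACK_SECTION,
        )
        entry = f"- {pr['title']} #{pr['number']} ({pr['author']})"
        sections.setdefault(title, []).append(entry)

    lines = []
    for title, entries in sections.items():
        lines.append(title)
        lines.extend(entries)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"
```

The quality of the result depends entirely on the PR titles and on labels being applied consistently, so the editing effort moves from the changelog file to the pull requests themselves.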
To make the changelog generation a bit more explicit, Changelox, a project of my friend and ex-colleague Benjamin, takes a slightly different approach: You include the changelog entries in the PR description text, using a simple format. By marking PRs that don’t have a changelog entry as “Failed Builds” on GitHub, you can enforce the process (and the app itself supports you in various ways to write good changelog entries). The video on the homepage should give you a good impression of how that would work.
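To give a rough idea of what such a check boils down to, here is a sketch that looks for a changelog entry in a PR description and derives a pass/fail status from it. The `Changelog:` line used here is a format I made up for illustration; it is not Changelox’s actual syntax, and the function names and status strings are equally hypothetical.

```python
import re

# Hypothetical convention: a single "Changelog: <entry>" line in the PR description.
ENTRY = re.compile(r"^Changelog:[ \t]*(?P<entry>\S.*)$", re.MULTILINE)


def changelog_entry(description):
    """Return the changelog entry found in a PR description, or None if there is none."""
    match = ENTRY.search(description)
    return match["entry"].strip() if match else None


def check_status(description):
    """Map a PR description to the status a CI-style check would report."""
    return "success" if changelog_entry(description) else "failure"
```

Reporting the result back to GitHub as a status on the pull request is what makes a missing entry show up next to the test results.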
Of course this is only a small selection. The github-changelog-generator team maintains a long list of similar projects that might be worth a look.
Keep the changes close by
One thing that we can safely say is that the further you move the changelog writing away from the actual code changes, the higher the chances of forgetting things or getting details wrong. It therefore makes sense to move these two processes as close together as possible. Whether you do this by simply setting conventions or by automation is up to you. But if you choose an automated approach, you need to make sure the result is good enough. Nobody needs a dump of git log with sha1 hashes and useless commit titles in a separate text file. GitHub has a page for that. Also, if your system relies on conventions, remember that these need to be followed quite strictly by everyone involved, which probably means having a bunch of automated guards that ensure consistency but can also be quite a hassle. If in doubt, use the simple, handwritten approach.
Some final words
As I mentioned in the first part of this series, we’re quite unhappy with the percentage of projects that don’t have any changelog-type information, and we think that this needs to change.
Now, I know that maintaining open source projects can be tough and mentally draining and everything, but this is not a good place to be in. If you don’t feel like doing it yourself, remember that there are many people out there who would love to help. Create a GitHub issue saying “We really would love to have a changelog” and mark it as beginner friendly. Research whether one of the automation approaches would work for you. Maybe you don’t feel like writing a changelog, but adhering to a commit message format is no big deal for you.
Of course, in the end, it’s still your decision whether you want to and can maintain a changelog. But if we go back to the beginning of the series, where I assumed that a good changelog can be an indicator of a project’s governance, having no changelog at all nowadays makes me assume that your governance is actually not that great, and I would probably not use your project unless there are no alternatives.
And if you, as a user of a project, see that change information is missing, don’t immediately start screaming at the maintainers. Instead, if you can, offer your help.
I hope that with this small series of posts, we were able to show the importance of a well-constructed changelog and that creating and maintaining one is not rocket science either.