This page is primarily intended for Mercurial's developers.
Why Mercurial tags work the way they do.
- Basic desirable tag properties
- Additional concerns specific to Mercurial
- Properties of the Mercurial tags implementation
- Exploring the alternate solution space
- Examining the downsides
There are a number of possible ways to implement tags. This page discusses the various design trade-offs and why the design of tags is as it is, to avoid rehashing the topic regularly on the mailing list.
2. Basic desirable tag properties
Tags are used and abused for many purposes, but their primary purpose is to mark significant points in project history such as releases.
2.1. Mutable
It should be possible to change or delete tags after they're created. This usually isn't necessary, but mistakes happen.
2.2. In history
Tags should be 'in history'. This means that it should be possible to do the same sorts of operations on tags that you do on normal changes in history:
- review who made the tag
- see when the tag was introduced
- see why the tag was introduced
Together with the mutable property, we'll also be able to see when and why tags are deleted or changed. Without history, changed or deleted tags would disappear without any possibility of recovery.
2.3. After the fact
It will often be desirable to tag a commit after the commit has occurred. For instance, release tagging may happen on revisions after they've gone through a test cycle, weeks after the relevant commit has occurred.
3. Additional concerns specific to Mercurial
History is append-only and elements in the past cannot be changed or deleted. This is central to how Mercurial operates both locally and when transmitting history between repositories. This rules out many approaches to addressing the 'mutable' property of tags.
Because Mercurial is a distributed system, any user can perform any operation on their local copy of the repository. This means that any user can tag and multiple users can simultaneously create conflicting tags.
Historically, everything that Mercurial shares is also in history, so the Mercurial protocol only deals with changesets. The new pushkey concept allows sharing key/value pairs that are not tracked in history to implement shared bookmarks.
Mercurial history may have multiple named or anonymous branches. Solutions need to account for this.
4. Properties of the Mercurial tags implementation
Mercurial implements tags as a single .hgtags text file in the working directory. The file is append-only. When a tag is moved to another revision, Mercurial first appends a line re-tagging the old value and then appends the new value, so that divergent changes to the same tag produce merge conflicts. The 'effective' tags are taken from the .hgtags files on the heads of all branches, and tags closest to tip take precedence.
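The append-only behavior described above can be illustrated with a small sketch. This is not Mercurial's code: the function names are hypothetical, and the sketch assumes each .hgtags line has the form "<40-hex-digit node> <tag name>" and that an all-zero node marks a deleted tag. It also looks at a single file only and ignores the combination of .hgtags files across heads.

```python
# Illustrative sketch only, not Mercurial's implementation.
# Assumed: one "<node> <tag name>" entry per line, all-zero node = deleted.
NULL_ID = "0" * 40


def parse_hgtags(text):
    """Return {tag: [node, ...]} with each tag's values in file order."""
    history = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, name = line.split(" ", 1)
        history.setdefault(name, []).append(node)
    return history


def effective_tags(text):
    """Within one file the last entry wins; a null node removes the tag."""
    result = {}
    for name, nodes in parse_hgtags(text).items():
        if nodes[-1] != NULL_ID:
            result[name] = nodes[-1]
    return result


def move_tag(text, name, new_node):
    """Append the tag's old value (if any), then its new value."""
    old = effective_tags(text).get(name)
    lines = []
    if old is not None:
        lines.append(f"{old} {name}")  # re-tag the old value first
    lines.append(f"{new_node} {name}")
    if text and not text.endswith("\n"):
        text += "\n"
    return text + "\n".join(lines) + "\n"
```

Because nothing is ever rewritten, the full list returned by parse_hgtags doubles as a record of every value a tag has had, and two users who move the same tag to different revisions both append after the same old value, which is what lets the merge machinery notice the divergence.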
This solution has the basic properties above and has the following benefits:
- uses the existing history mechanism
- uses the existing working directory tracking mechanism
- uses the existing merge and conflict mechanisms
- uses the existing synchronization mechanisms
Perceived downsides include:
- tag and source history are mixed (each tag requires a new commit)
- some trivial merges require manual intervention
- tags refer to earlier commits ('hg clone -r tag' will not contain the tag)
- there's a non-project file in the working directory
5. Exploring the alternate solution space
Due to our backwards-compatibility rules, changing tags in any significant way will not be considered. These alternatives are all moot, and are described here as background.
5.1. "What if we put tags in commits?"
Tags could be implemented as a special field in the changeset. This would make them "in history", but would not satisfy the "mutable" or "after the fact" properties. We could also require that tag commits be special commits that are children of the commit they're tagging, but this would create new heads and new merge issues, and would still not satisfy the "mutable" requirement.
5.2. "What if the tags lived in a tracked file in .hg/?"
All files that are tracked by Mercurial are in the working directory. This gives an unambiguous dividing line between what is and isn't tracked. Much like .hgignore, .hgtags is a user-editable text file that is tracked and shared with other users.
If parts of the .hg/ space were tracked, they'd need to be very carefully delineated and documented so that no one could accidentally commit or overwrite configuration or metadata that was not intended to be shared. This would significantly complicate Mercurial's handling of a very small set of files without providing an unambiguous win in terms of usability.
5.3. "What if we had a branch of history with a separate root to put tags in?"
It's possible for users to do this today simply by committing all their tags to a separate 'tags' branch, but it's not advisable. This will make merging tag conflicts more challenging, as you'll need to do a separate step of checking out the tags branch to merge it. It also doesn't address any of the perceived downsides, except removing one of the special files from the working directory.
5.4. "What if we had separate revlogs for tags that weren't part of normal history?"
One can imagine implementing tags as a separate parallel repository with its own changesets. This would satisfy "in history" without cluttering the main history. It would require parallel commands or options to review this history, a substantially more complex wire protocol, and so on. While this would successfully hide tag history from normal operations, it would probably do so well enough that most users would be unaware of the separate tag history and the implications of its existence.
For instance, 'pull -r' would still serve the same purpose and thus still work in a similar way, but now its behavior would be mysterious to people who were not aware of the separate tags history. We would also require additional options to specify "revisions in the tag history".
All the downsides of having tags in a separate branch also apply, including the complexity of merging heads and resolving conflicts in the tag history.
5.5. "What if we used something like bookmarks as tags?"
This doesn't meet the "in history" requirement. Bookmarks are so named to remind people that they're ephemeral (if you shake the book, they may fall out and you'll lose your place), and thus are not suited to a lot of the tasks that tags are used for.
6. Examining the downsides
6.1. Tag and source history are mixed
As discussed above, the alternatives are significantly more confusing, and many users consider the current behavior here to be useful. Given that tagging will generally be an infrequent operation, the impact here should be fairly low. To ignore commits touching tags, you can use:
hg log -r 'not file(.hgtags)'
6.2. Merge conflicts
Like any merge, it is possible for conflicts to exist while merging the tags data. These happen more often than necessary with .hgtags because if two different tags are added to the end of the file, a typical merge tool will not know that their order doesn't matter.
We are open to improved automation of tag merging, but this will likely take us from 90% automatic to only 95% automatic, so the fairly marginal incremental benefit will need to be weighed against the cost of complex new code.
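As a rough illustration of what order-insensitive tag merging could look like, here is a minimal sketch. It is hypothetical and not an existing Mercurial feature: the function names are invented, each version of the file is passed in as a list of lines, and lines are assumed to have the form "<node> <tag name>". It merges automatically only when both sides purely appended lines for different tags, and gives up in every other case.

```python
# Hypothetical sketch of order-insensitive .hgtags merging.
# Assumed line layout: "<node> <tag name>".


def appended(base, side):
    """Lines a side added after the base, or None if it is not a pure append."""
    if side[:len(base)] != base:
        return None
    return side[len(base):]


def tag_names(lines):
    """Set of tag names mentioned in the given lines."""
    return {line.split(" ", 1)[1] for line in lines if " " in line}


def merge_tag_lines(base, local, other):
    """Return merged lines, or None when manual resolution is needed."""
    added_local = appended(base, local)
    added_other = appended(base, other)
    if added_local is None or added_other is None:
        return None  # a side rewrote earlier lines
    if tag_names(added_local) & tag_names(added_other):
        return None  # both sides touched the same tag
    # Disjoint tags: the relative order of the two groups does not matter.
    return base + added_local + added_other
```

The remaining None cases are the genuinely ambiguous ones, such as two users moving the same tag, which is where a person still has to decide.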
6.3. Working directory pollution
Just about all version control systems will have some pollution of the working directory; traditional systems have a private subdirectory in every directory. Core Mercurial restricts this to a small set of files named '.hg*' in the project root:
- .hgsub and .hgsubstate
6.4. hg clone -r
The '-r' option to clone is not intended as a shorthand for 'hg clone; hg update <rev>'. It is intended as an advanced method for truncating history. The proper shorthand is 'hg clone -u'.