Note:

This page is primarily intended for developers of Mercurial.

Tag Design

Why Mercurial tags work the way they do.

1. Overview

There are a number of possible ways to implement tags. This page discusses the design trade-offs and explains why tags work the way they do, so that the topic does not need to be rehashed regularly on the mailing list.

2. Basic desirable tag properties

Tags are used and abused for many purposes, but their primary purpose is to mark significant points in project history such as releases.

2.1. Mutable

It should be possible to change or delete tags after they're created. This usually isn't necessary, but mistakes happen.

2.2. In history

Tags should be 'in history'. This means that it should be possible to perform on tags the same sorts of operations you perform on normal changes in history.

Together with the mutable property, we'll also be able to see when and why tags are deleted or changed. Without history, changed or deleted tags would disappear without any possibility of recovery.

2.3. After the fact

It will often be desirable to tag a commit after the commit has occurred. For instance, release tagging may happen on revisions after they've gone through a test cycle, weeks after the relevant commit has occurred.

3. Additional concerns specific to Mercurial

3.1. Append-only

History is append-only and elements in the past cannot be changed or deleted. This is central to how Mercurial operates both locally and when transmitting history between repositories. This rules out many approaches to addressing the 'mutable' property of tags.

3.2. Distributed

Because Mercurial is a distributed system, any user can perform any operation on their local copy of the repository. This means that any user can tag and multiple users can simultaneously create conflicting tags.

3.3. Protocol

Historically, everything that Mercurial shares is also in history, so the Mercurial protocol only deals with changesets. The new pushkey concept allows sharing key/value pairs that are not tracked in history to implement shared bookmarks.

3.4. Branches

Mercurial history may have multiple named or anonymous branches. Solutions need to account for this.

4. Properties of the Mercurial tags implementation

Mercurial implements tags as a single .hgtags text file in the working directory. The file is append-only. When a tag is moved to another revision, Mercurial first appends a line re-tagging the old value and then appends the new value, so that divergent changes to the same tag produce merge conflicts. The 'effective' tags are taken from the .hgtags files on the heads of all branches, and tags closest to tip take precedence.
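The resolution rules above can be illustrated with a minimal Python sketch. It assumes each .hgtags line has the form "<40-digit hex node> <tag name>" and that a tag pointing at the all-zero node counts as deleted. The function names are hypothetical, and the head-combining rule is deliberately simplified to "the head closest to tip wins"; it is not Mercurial's actual implementation.

```python
NULL_NODE = "0" * 40  # assumption: a tag pointing at the null node is deleted


def parse_hgtags(text):
    """Return {tag: [nodes...]} with each tag's values in file order."""
    history = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, name = line.split(" ", 1)
        history.setdefault(name.strip(), []).append(node)
    return history


def effective_tags(hgtags_by_head):
    """Combine .hgtags contents from several heads (simplified).

    `hgtags_by_head` is ordered from the head furthest from tip to the
    head closest to tip, so later entries override earlier ones.
    """
    tags = {}
    for text in hgtags_by_head:
        for name, nodes in parse_hgtags(text).items():
            tags[name] = nodes[-1]  # the last line for a tag wins
    return {name: node for name, node in tags.items() if node != NULL_NODE}
```

Because the file is only ever appended to, the full list of values returned by parse_hgtags doubles as each tag's history within that file.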

This solution has the basic properties above and has the following benefits:

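Perceived downsides, each examined in section 6, include mixed tag and source history, merge conflicts, working directory pollution, and the behavior of 'hg clone -r'.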

5. Exploring the alternate solution space

Note: Due to our backwards-compatibility rules, changing tags in any significant way will not be considered. These alternatives are all moot, and are described here as background.

5.1. "What if we put tags in commits?"

Tags could be implemented as a special field in the changeset. This would make them "in history", but would not satisfy the "mutable" or "after the fact" properties. We could also require that tag commits be special commits that are children of the commit they're tagging, but this would create new heads and new merge issues, and would still not satisfy the "mutable" requirement.
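A small Python sketch shows why a tag field inside the changeset cannot be added or changed later. It assumes, as a simplification, that a changeset's identifier is a SHA-1 hash over its sorted parent identifiers and its content. The "tag:" field and the text layout are made up for illustration.

```python
import hashlib

NULL_NODE = b"\0" * 20  # stand-in for "no parent"


def node_id(parent1, parent2, text):
    """Hash a changeset from its parents and content (simplified)."""
    first, second = sorted([parent1, parent2])
    return hashlib.sha1(first + second + text).hexdigest()


# The same commit with and without a hypothetical embedded tag field.
untagged = node_id(NULL_NODE, NULL_NODE, b"user: alice\n\nFix the parser")
tagged = node_id(NULL_NODE, NULL_NODE, b"user: alice\ntag: v1.0\n\nFix the parser")

# The identifiers differ, so tagging after the fact would produce a
# different changeset instead of labelling the existing one.
print(untagged != tagged)
```

Under this assumption, any edit to an embedded tag changes the identifier of the changeset and of every descendant, which conflicts with append-only history.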

5.2. "What if the tags lived in a tracked file in .hg/?"

All files that are tracked by Mercurial are in the working directory. This gives an unambiguous dividing line between what is and isn't tracked. Much like .hgignore, .hgtags is a user-editable text file that is tracked and shared with other users.

If parts of the .hg/ space were tracked, they'd need to be very carefully delineated and documented so that no one could accidentally commit or overwrite configuration or metadata that was not intended to be shared. This would significantly complicate Mercurial's handling of a very small set of files without providing an unambiguous win in terms of usability.

5.3. "What if we had a branch of history with a separate root to put tags in?"

It's possible for users to do this today simply by committing all their tags to a separate 'tags' branch, but it's not advisable. This will make merging tag conflicts more challenging, as you'll need to do a separate step of checking out the tags branch to merge it. It also doesn't address any of the perceived downsides, except removing one of the special files from the working directory.

5.4. "What if we had separate revlogs for tags that weren't part of normal history?"

One can imagine implementing tags as a separate parallel repository with its own changesets. This would satisfy "in history" without cluttering the main history. It would require parallel commands or options to review this history, a substantially more complex wire protocol, and so on. While this would successfully hide tag history from normal operations, it would probably do so well enough that most users would be unaware of the separate tag history and the implications of its existence.

For instance, 'pull -r' would still serve the same purpose and thus still work in a similar way, but now its behavior would be mysterious to people who were not aware of the separate tags history. We would also require additional options to specify "revisions in the tag history".

All the downsides of having tags in a separate branch also apply, including the complexity of merging heads and resolving conflicts in the tag history.

5.5. "What if we used something like bookmarks as tags?"

This doesn't meet the "in history" requirement. Bookmarks are so named to remind people that they're ephemeral (if you shake the book, they may fall out and you'll lose your place), and thus are not suited to a lot of the tasks that tags are used for.

6. Examining the downsides

6.1. Tag and source history are mixed

As discussed above, the alternatives are significantly more confusing, and many users consider the current behavior here to be useful. Given that tagging will generally be an infrequent operation, the impact here should be fairly low. To ignore commits touching tags, you can use:

hg log -r 'not file(.hgtags)'

6.2. Merge conflicts

As with any merge, conflicts can arise when merging tag data. These happen more often than necessary with .hgtags because if two different tags are added to the end of the file, a typical merge tool will not know that their order doesn't matter.

We are open to improved automation of tag merging, but this will likely take us from 90% automatic to only 95% automatic, so the fairly marginal incremental benefit will need to be weighed against the cost of complex new code.
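As a rough illustration of what such automation could look like, the following Python sketch resolves only the easy case: both sides appended lines to a common base and touched disjoint sets of tag names. The function name and the "<node> <tag name>" line format are assumptions for the example, and this is not code from Mercurial.

```python
def merge_appended_tags(base, ours, theirs):
    """Merge two .hgtags versions that both extend `base`.

    Each argument is a list of "<node> <tag name>" lines. Returns the
    merged list, or None when the case is not safe to resolve
    automatically.
    """
    n = len(base)
    if ours[:n] != base or theirs[:n] != base:
        return None  # one side rewrote existing lines; leave it to the user
    our_new, their_new = ours[n:], theirs[n:]
    our_names = {line.split(" ", 1)[1] for line in our_new}
    their_names = {line.split(" ", 1)[1] for line in their_new}
    if our_names & their_names:
        return None  # both sides touched the same tag: a real conflict
    # Order between unrelated tags does not matter, so concatenate.
    return base + our_new + their_new
```

Cases where both sides change the same tag still fall through to manual resolution, which is the residue the paragraph above expects to remain.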

6.3. Working directory pollution

Just about all version control systems will have some pollution of the working directory; traditional systems have a private subdirectory in every directory. Core Mercurial restricts this to a small set of files named '.hg*' in the project root:

6.4. hg clone -r

The '-r' option to clone is not intended as a shorthand for 'hg clone; hg update <rev>'; it is intended as an advanced method for truncating history. The proper shorthand is 'hg clone -u'.



TagDesign (last edited 2014-02-22 16:11:40 by MadsKiilerich)