Fixing .hg file open ordering

Bryan O'Sullivan bos at serpentine.com
Wed Nov 2 18:20:00 EDT 2016


There's long been a well-defined order for accessing historical data:
changelog first, then manifest, then revlog, and the reverse for writes.

I think that what has happened with bookmarks is that we literally forgot
about the next necessary ordering constraint: you must read bookmarks
before the changelog, otherwise you run into the race we see here. This
same constraint should apply to any other file that contains commit hashes.

It's unfortunate that the bmstore API is "backwards" in this way, but
what's the sequence of operations by which bookmarks seem to disappear?
There's something missing from your description to help piece the sequence
of events together, because a bookmark that points to a commit that is not
yet known will be explicitly ignored in the bmstore constructor.

Here's what I *think* must be the sequence:

I read the changelog.
I'm asked to look up the foo bookmark.
Someone else writes out some commits and a new bookmarks file.
I open the bookmarks file, which was written *after* the last changelog
write.
The newly written bookmarks file contains a version of foo that points at a
commit that I haven't yet seen, and because the bmstore constructor ignores
it, it appears to have vanished.

I think this will be a little harder to fix than you may be anticipating.

You can fix the read path by reading the bookmarks before the changelog,
but the write path is subtle. Here's why.

Bookmarks are stored in memory. If I'm being asked to write out a new
bookmarks file as part of a transaction, what happens in this interleaving?

I read bookmarks
You write bookmarks
I write bookmarks, based off my stale understanding

In other words, I'll potentially cause your write to go missing unless I
validate during the transaction that the bookmarks haven't changed since I
read them before I write them.


On Wed, Nov 2, 2016 at 2:38 PM, Durham Goode <durham at fb.com> wrote:

> On 11/2/16 2:09 PM, Durham Goode wrote:
>
>> There's currently no defined order in which Mercurial should open files
>> in the .hg directory.  For instance, it's possible to read the changelog
>> first, then several seconds later read the bookmark file. If during those
>> several seconds the repo receives new commits and a bookmark moves, then it
>> will read a bookmark that points at a node that doesn't exist in the
>> inmemory changelog.
>>
>> We're seeing this on our server, which causes certain bookmarks to
>> disappear for users in some situations (until they pull again).
>>
> To clarify (as Jun has brought up outside of email), we could also solve
> this by special casing our various dependencies.  Like make
> localrepo.changelog manually load the bookmarks first (we can't do it right
> now because the bookmarkstore constructor requires the changelog so it can
> check if each bookmark exists).  So we wouldn't necessarily need a new
> layer for it, but it would require more diligence to make sure all reads
> happen in the right order across the code base.
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20161102/0131338d/attachment.html>


More information about the Mercurial-devel mailing list