[PATCH] transaction: move changelog finalizer to be before bookmark finalizer

Durham Goode durham at fb.com
Sat Mar 19 13:04:23 EDT 2016


On 3/18/16 6:07 PM, Pierre-Yves David wrote:
>
> On 03/16/2016 11:31 PM, Durham Goode wrote:
>> # HG changeset patch
>> # User Durham Goode <durham at fb.com>
>> # Date 1458196059 25200
>> #      Wed Mar 16 23:27:39 2016 -0700
>> # Node ID cdf4a94fdc46d27196f5411fc7a4008690834fba
>> # Parent  ed75909c4c670a7d9db4a2bef9817a0d5f0b4d9c
>> transaction: move changelog finalizer to be before bookmark finalizer
>>
>> Previously, transaction close would run the file generators before 
>> running the
>> finalizers (see the list below for what is in each). Since file 
>> generators
>> contain the bookmarks and the dirstate, this meant we made the 
>> dirstate and
>> bookmarks visible to external readers before we actually wrote the 
>> commits into
>> the changelog, which could result in missing bookmarks and missing 
>> working copy
>> parents (especially on servers with high commit throughput, since 
>> pulls might
>> fail to see certain bookmarks in this situation).
>>
>> By moving the changelog writing to be before the bookmark/dirstate 
>> writing, we
>> ensure the commits are present before they are referenced.
>>
>> For reference, file generators currently consist of: bookmarks, 
>> dirstate, and
>> phases. Finalizers currently consist of: changelog, revbranchcache, 
>> and fncache.
>> All of the former reference the latter, so therefore the latter 
>> should be
>> written first.
>>
>> Technically there's still plenty of race conditions (can the order of 
>> finalizers
>> affect how external readers see the repo?), but this is a step 
>> forward at least.
>
> Can we get a more complete analyse on the race condition we have here 
> and which ones we are trading for which others one?
>
> For example, this changes could result in phase root being visible 
> after the changesets, making public being seen public while they are not.
>
> I would like to have a clear idea of where we are walking there and 
> what tradeoff we are performing
K, given that file generators are "bookmarks, dirstate, and phases" and 
finalizers are "revlogs, revbranchcache, fncache", the old race 
conditions are:

- bookmarks written before revlogs - causes bookmarks to appear missing 
during read
- dirstate written before revlogs - causes missing working copy parent
- phases written before revlogs - should be benign, since any phase on a 
commit that is missing only affects commits that are also missing
- bookmarks/dirstate/phases written before revbranch cache - no issues
- bookmarks/dirstate/phases written before fncache - no issues

and the new race conditions are:

- revlogs written before bookmarks - may see commits where the bookmark 
hasn't moved on top of them yet; odd but not an invalid state
- revlogs written before dirstate - no issues
- revlogs written before phases - some commits may show up with the 
wrong phase (potentially public)
- revbranch cache written before bookmarks/dirstate/phases - no issues
- fncache written before bookmarks/dirstate/phases - no issues

Race conditions that existed before (within the two categories):
- fncache before revlog - fncache could point to file that doesn't exist
- revlog before fncache - fncache could be missing a file


So in summary, my change removes the invalid bookmarks and invalid 
dirstate case, and replaces it with an out-of-date phase case.  The out 
of date phase case could potentially be bad, since if we send public 
commits to the client, it may be hard to make them draft later. On the 
flip side, most central-server setups are publishing servers, so this 
wouldn't actually be an issue for them. This could be fixed by moving 
the phase writes to be before the revlogs (or by redoing the entire 
storage system to be more widely transactional).


More information about the Mercurial-devel mailing list