[PATCH 1 of 5] transaction: introduce scope

Durham Goode durham at fb.com
Tue Feb 28 17:34:29 EST 2017



On 2/27/17 9:35 AM, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark at fb.com>
> # Date 1488185788 28800
> #      Mon Feb 27 00:56:28 2017 -0800
> # Node ID 5dac612ec9a87553fcd329100846bcb01bae9d80
> # Parent  b4cb86ab4c719eb615a4308eafd8b1386a511eeb
> # Available At https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=nuarHzhP1wi1T9iURRCj1A&m=32KRbLZXwqv-6cN9kkgbRvm2CXLXQaqOiiVERyotxSQ&s=QQPzFUNWmG02Bz8xilA5LPRAfljxDn-s2-q1qSGuXps&e=
> #              hg pull https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=nuarHzhP1wi1T9iURRCj1A&m=32KRbLZXwqv-6cN9kkgbRvm2CXLXQaqOiiVERyotxSQ&s=QQPzFUNWmG02Bz8xilA5LPRAfljxDn-s2-q1qSGuXps&e=  -r 5dac612ec9a8
> transaction: introduce scope
>
> Previously, transaction tries to cover everything: bookmarks, dirstate,
> phaseroots, and branch. It backs them up unconditionally.
>
> That could be inefficient if we still back up the dirstate in a giant repo,
> where the transaction is only intended to change bookmarks.
>
> This patch introduces "scope" to transactions, which is a set of things that
> the transaction should cover. For bookmark-only update, it could avoid
> specifying the "dirstate" scope to be more efficient.

I was talking with Jun about this in person and a few things came up.  I 
proposed we just reenable hardlink support for transaction backups of 
atomic-rewrite-files like the dirstate. That would remove the IO of 
copying that I assumed was the problem. However, it turns out the 
dirstate doesn't even use the tr.addbackup() mechanism that everything 
else uses.  Instead, it writes out the *current* contents of the 
inmemory dirstate and treats that as the backup, which seems extra wrong 
since the current inmemory data may not match the data on disk that 
we're about to overwrite, and extra wrong since it means CPU time spent 
serializing the dirstate whenever a transaction opens.

Seems like we should refactor the dirstate/dirstateguard business to use 
the real transaction logic, then we could benefit from the hardlink 
backups optimization.

Not that such a refactor should block this series.


More information about the Mercurial-devel mailing list