[PATCH 7 of 7] context: avoid writing outdated dirstate out (issue5584)

Siddharth Agarwal sid at less-broken.com
Fri Jun 9 14:38:56 EDT 2017


On 6/8/17 9:08 PM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1496981269 -32400
> #      Fri Jun 09 13:07:49 2017 +0900
> # Node ID 664e3e012535c9d161f0ecb95c6dc0bfdfe9c672
> # Parent  9b4ff10f9db00ae7d91a28067d6b50655f5f957d
> context: avoid writing outdated dirstate out (issue5584)

This series looks great, thanks.

>
> Before this patch, workingctx.status() may cause writing outdated
> dirstate out, if:
>
>    - .hg/dirstate is changed simultaneously after last loading it,
>    - there is any file, which should be dirstate.normal()-ed
>
> Typical issue case is:
>
>    - the working directory is updated by "hg update"
>    - .hg/dirstate is updated in background (e.g. fsmonitor)
>
> This patch compares identities of dirstate before and after
> acquisition of wlock, and avoids writing outdated dirstate out, if
> change of .hg/dirstate is detected.
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -1763,19 +1763,30 @@ class workingctx(committablectx):
>           # update dirstate for files that are actually clean
>           if fixup:
>               try:
> +                oldid = self._repo.dirstate.identity()
> +
>                   # updating the dirstate is optional
>                   # so we don't wait on the lock
>                   # wlock can invalidate the dirstate, so cache normal _after_
>                   # taking the lock
>                   with self._repo.wlock(False):
> -                    normal = self._repo.dirstate.normal
> -                    for f in fixup:
> -                        normal(f)
> -                    # write changes out explicitly, because nesting
> -                    # wlock at runtime may prevent 'wlock.release()'
> -                    # after this block from doing so for subsequent
> -                    # changing files
> -                    self._repo.dirstate.write(self._repo.currenttransaction())
> +                    if self._repo.dirstate.identity() == oldid:
> +                        normal = self._repo.dirstate.normal
> +                        for f in fixup:
> +                            normal(f)
> +                        # write changes out explicitly, because nesting
> +                        # wlock at runtime may prevent 'wlock.release()'
> +                        # after this block from doing so for subsequent
> +                        # changing files
> +                        tr = self._repo.currenttransaction()
> +                        self._repo.dirstate.write(tr)
> +                    else:
> +                        # in this case, writing changes out breaks
> +                        # consistency, because .hg/dirstate was
> +                        # already changed simultaneously after last
> +                        # caching (see also issue5584 for detail)
> +                        self._repo.ui.debug('skip updating dirstate: '
> +                                            'identity mismatch\n')
>               except error.LockError:
>                   pass
>           return modified, deleted, fixup
> diff --git a/tests/test-dirstate-race.t b/tests/test-dirstate-race.t
> --- a/tests/test-dirstate-race.t
> +++ b/tests/test-dirstate-race.t
> @@ -95,3 +95,65 @@ anyway.
>     ! d
>     ! dir1/c
>     ! e
> +
> +  $ rmdir d e
> +  $ hg update -C -q .
> +
> +Test that dirstate changes aren't written out at the end of "hg
> +status", if .hg/dirstate is already changed simultaneously before
> +acquisition of wlock in workingctx._checklookup().
> +
> +This avoidance is important to keep consistency of dirstate in race
> +condition (see issue5584 for detail).
> +
> +  $ hg parents -q
> +  1:* (glob)
> +
> +  $ hg debugrebuilddirstate
> +  $ hg debugdirstate
> +  n   0         -1 unset               a
> +  n   0         -1 unset               b
> +  n   0         -1 unset               d
> +  n   0         -1 unset               dir1/c
> +  n   0         -1 unset               e
> +
> +  $ cat > $TESTTMP/dirstaterace.sh <<EOF
> +  > # This script assumes timetable of typical issue5584 case below:
> +  > #
> +  > # 1. "hg status" loads .hg/dirstate
> +  > # 2. "hg status" confirms clean-ness of FILE
> +  > # 3. "hg update -C 0" updates the working directory simultaneously
> +  > #    (FILE is removed, and FILE is dropped from .hg/dirstate)
> +  > # 4. "hg status" acquires wlock
> +  > #    (.hg/dirstate is re-loaded = no FILE entry in dirstate)
> +  > # 5. "hg status" marks FILE in dirstate as clean
> +  > #    (FILE entry is added to in-memory dirstate)
> +  > # 6. "hg status" writes dirstate changes into .hg/dirstate
> +  > #    (FILE entry is written into .hg/dirstate)
> +  > #
> +  > # To reproduce similar situation easily and certainly, #2 and #3
> +  > # are swapped.  "hg cat" below ensures #2 on "hg status" side.
> +  >
> +  > hg update -q -C 0
> +  > hg cat -r 1 b > b
> +  > EOF
> +
> +"hg status" below should excludes "e", of which exec flag is set, for
> +portability of test scenario, because unsure but missing "e" is
> +treated differently in _checklookup() according to runtime platform.
> +
> +- "missing(!)" on POSIX, "pctx[f].cmp(self[f])" raises ENOENT
> +- "modified(M)" on Windows, "self.flags(f) != pctx.flags(f)" is True
> +
> +  $ hg status --config extensions.dirstaterace=$TESTTMP/dirstaterace.py --debug -X path:e
> +  skip updating dirstate: identity mismatch
> +  M a
> +  ! d
> +  ! dir1/c
> +
> +  $ hg parents -q
> +  0:* (glob)
> +  $ hg files
> +  a
> +  $ hg debugdirstate
> +  n * * * a (glob)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel




More information about the Mercurial-devel mailing list