[PATCH 2 of 5] localrepo: invoke dirstate.unsureifambig in wwrite for safety

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Mon Jun 1 13:07:12 CDT 2015


At Mon, 01 Jun 2015 17:02:10 +0000,
Martin von Zweigbergk wrote:
> 
> [1  <text/plain; UTF-8 (7bit)>]
> Thanks. I think I knew most of that, but the subrepo case for interrupted
> update was a good example I had not thought about. My question remains,
> however. In the sequence in this test case, you do this:
> 
> $ hg update -q -C 3
> $ touch -t 200001010000 b
> $ hg status -A b
> $ hg --config extensions.abort=$TESTTMP/abort.py merge 5
> $ touch -t 200001010000 b
> 
> What I'm wondering is what real-world scenarios could lead to file b's
> timestamp being the same in those two cases (where you artificially touch
> them here). Since we said that 'hg status' is run strictly after (at
> whatever granularity is recorded by the FS, perhaps second-level), how can
> the 'hg merge' produce a timestamp for file b that's the same as what 'hg
> update' does?

`hg merge` can be aborted by same reason at `hg update` before
updating dirstate:

  (0) cleanup working directory for examination of `hg merge`
      (let's ignore this while focusing on examination of `hg merge`)
> $ hg update -q -C 3

  (1) ensure forcibly that `repo.dirstate['b']` has valid timestamp
> $ touch -t 200001010000 b
> $ hg status -A b

  (2) execute `hg merge`, but it is aborted before "update dirstate"
      for example:
      - invalid updating in subrepositories
      - keyboard interruption by user, file I/O error and so on
> $ hg --config extensions.abort=$TESTTMP/abort.py merge 5

  (3) (re-)set timestamp of 'b' on the filesystem for mis-leading
      this emulates that `hg merge` update 'b' before 200001010000(00)+1sec
> $ touch -t 200001010000 b


In addition to it, '.hgsubstate' may have to be focused on.

Regardless of `hg update`/`hg merge` result, size of it on the
filesystem is often kept, because '.hgsubstate' consists of "hashid
subrepopath": "hashid" of hg subrepo is always 40 letters :-)


(Do I understand your question correctly ?)


> On Sun, May 31, 2015 at 10:24 PM FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> wrote:
> 
> > At Fri, 29 May 2015 18:05:18 +0000,
> > Martin von Zweigbergk wrote:
> > >
> > > [1  <text/plain; UTF-8 (7bit)>]
> > > On Wed, May 27, 2015 at 10:06 AM FUJIWARA Katsunori <
> > foozy at lares.dti.ne.jp>
> > > wrote:
> > >
> > > > # HG changeset patch
> > > > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > > > # Date 1432745859 -32400
> > > > #      Thu May 28 01:57:39 2015 +0900
> > > > # Node ID b82f32b8734b109b84f533dc62dd3f23195d1e0a
> > > > # Parent  ab9f120295b59933d1acd72771f01b5fac8d221d
> > > > localrepo: invoke dirstate.unsureifambig in wwrite for safety
> > > >
> > > > Modified file may be mis-recognized as clean by dirstate, if mode,
> > > > size and timestamp of it on the filesystem aren't changed.
> > > >
> > > > To avoid such ambiguous situation, this patch invokes
> > > > `dirstate.unsureifambig()` at the end of `localrepository.wwrite()`.
> > > >
> > > > When file is cleanly reverted or updated, subsequent
> > > > `dirstate.lookup()` makes `dirstate.unsureifambig()` a little
> > > > redundant. But it is enough cheap for keeping consistency.
> > > >
> > > > diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> > > > --- a/mercurial/localrepo.py
> > > > +++ b/mercurial/localrepo.py
> > > > @@ -927,7 +927,9 @@
> > > >              self.wvfs.write(filename, data)
> > > >              if 'x' in flags:
> > > >                  self.wvfs.setflags(filename, False, True)
> > > > -        return len(data)
> > > > +        wsize = len(data)
> > > > +        self.dirstate.unsureifambig(filename, wsize)
> > > > +        return wsize
> > > >
> > > >      def wwritedata(self, filename, data):
> > > >          return self._filter(self._decodefilterpats, filename, data)
> > > > diff --git a/tests/test-merge1.t b/tests/test-merge1.t
> > > > --- a/tests/test-merge1.t
> > > > +++ b/tests/test-merge1.t
> > > > @@ -206,4 +206,60 @@
> > > >    $ hg revert -r -2 b
> > > >    $ hg up -q -- -2
> > > >
> > > > +Test for ambiguity from same size, timestamp and mode
> > > > +
> > > > +  $ cat > $TESTTMP/abort.py <<EOF
> > > > +  > from mercurial import extensions, merge, util
> > > > +  > def applyupdates(orig, *args, **kwargs):
> > > > +  >     orig(*args, **kwargs)
> > > > +  >     # emulate aborting before "recordupdates()"
> > > > +  >     # => files are changed without updating dirstate
> > > > +  >     raise util.Abort('intentional aborting')
> > > > +  > def extsetup(ui):
> > > > +  >     extensions.wrapfunction(merge, "applyupdates", applyupdates)
> > > > +  > EOF
> > > > +
> > > > +(file gotten from other revision)
> > > > +
> > > > +  $ hg update -q -C 2
> > > > +  $ echo 'THIS IS FILE B5' > b
> > > > +  $ hg commit -m 'commit #5'
> > > > +
> > > > +  $ hg update -q -C 3
> > > > +  $ touch -t 200001010000 b
> > > > +  $ hg status -A b
> > > > +  C b
> > > > +  $ cat b
> > > > +  This is file b1
> > > > +
> > > > +  $ hg --config extensions.abort=$TESTTMP/abort.py merge 5
> > > > +  abort: intentional aborting
> > > > +  [255]
> > > > +  $ touch -t 200001010000 b
> > > >
> > >
> > > I'm sorry I'm slow, but I'm still trying to understand this series. This
> > is
> > > of course an artificial way of reproducing the problem you were seeing,
> > but
> > > how does it sometimes happen in real life? Since the dirstate is written
> > on
> > > the 'hg status' call above, and that command is run after (not within the
> > > same second, as we discussed on path 1/5) the file was modified on disk,
> > > how can the 'hg merge' command be run at the earlier time again (which
> > you
> > > seem to be simulating here by setting the same timestamp)?
> >
> > Sorry for not enough explanation. I'll add more detailed explanation
> > in revised version.
> >
> > Core of `hg update` and `hg merge` consists of steps below:
> >
> >   1. calculate result of updating and examine some additional points below
> >
> >      - `_checkunknownfiles()` (implied by `calculateupdates()`)
> >      - `_checkcollision()`
> >
> >   2. update files in the working directory in `merge.applyupdates()`
> >
> >      this also implies updating files in subrepositories recursively
> >      (= `preupdate`/`update` hooks in them are also executed)
> >
> >   3. update dirstate in `merge.recordupdates()`
> >
> >   4. write in-memory dirstate changes out at releasing wlock
> >
> >
> > Tests in this patch emulate that `hg update` and `hg merge` (at top of
> > subrepo tree) are aborted between (2) and (3) by:
> >
> >   - invalid updating in subrepositories
> >
> >     for example:
> >     - uncommitted changes
> >     - collision against unknown files in the working directory
> >     - case-insensitive collision between files in new revision (and
> >       current one at merging)
> >     - failure of `preupdate` or `update` hooks
> >
> >   - keyboard interruption by user, file I/O error and so on
> >
> > `test-subrepo.t` actually treats the former situation, and
> > 6becb9dbca25 tried to fix (part of) this issue. Tests in this patch
> > can be said as "generalized version of test in 6becb9dbca25".
> >
> >
> > Even though some existing tests are enough as a white box test
> > examining whether `repo.wwrite()` invokes `dirstate.unsureifambig()`
> > correctly, testing in multiple situations should be useful to avoid
> > unexpected regression in the future, IMHO.
> >
> >
> >
> > > > +  $ cat b
> > > > +  THIS IS FILE B5
> > > > +  $ hg status -A b
> > > > +  M b
> > > > +
> > > > +(file merged from other revision)
> > > > +
> > > > +  $ hg update -q -C 3
> > > > +  $ echo 'this is file b6' > b
> > > > +  $ hg commit -m 'commit #6'
> > > > +  created new head
> > > > +  $ touch -t 200001010000 b
> > > > +  $ hg status -A b
> > > > +  C b
> > > > +  $ cat b
> > > > +  this is file b6
> > > > +
> > > > +  $ hg --config extensions.abort=$TESTTMP/abort.py merge --tool
> > > > internal:other 5
> > > > +  abort: intentional aborting
> > > > +  [255]
> > > > +  $ touch -t 200001010000 b
> > > > +  $ cat b
> > > > +  THIS IS FILE B5
> > > > +  $ hg status -A b
> > > > +  M b
> > > > +
> > > >    $ cd ..
> > > > _______________________________________________
> > > > Mercurial-devel mailing list
> > > > Mercurial-devel at selenic.com
> > > > https://selenic.com/mailman/listinfo/mercurial-devel
> > > >
> > > [2  <text/html; UTF-8 (quoted-printable)>]
> > >
> >
> > ----------------------------------------------------------------------
> > [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
> >
> [2  <text/html; UTF-8 (quoted-printable)>]
> 

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp


More information about the Mercurial-devel mailing list