[PATCH 8 of 8 V2] cmdutil: apply dirstate.normallookup on (maybe partially) committed files

Martin von Zweigbergk martinvonz at google.com
Thu Jul 9 11:58:52 CDT 2015


On Thu, Jul 9, 2015 at 8:45 AM FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
wrote:

> At Wed, 08 Jul 2015 20:28:36 +0000,
> Martin von Zweigbergk wrote:
>
> > On Wed, Jul 8, 2015 at 1:15 AM FUJIWARA Katsunori <foozy at lares.dti.ne.jp
> >
> > wrote:
> >
> > > # HG changeset patch
> > > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > > # Date 1436342865 -32400
> > > #      Wed Jul 08 17:07:45 2015 +0900
> > > # Node ID 17ca03748ee914f9696dcec623bc2bc3977d8c38
> > > # Parent  a3f47d16a56c9c64f45b928749de85386b8a5dc9
> > > cmdutil: apply dirstate.normallookup on (maybe partially) committed
> files
> > >
> > > To detect change of a file without redundant comparison of file
> > > content, dirstate recognizes a file as certainly clean, if:
> > >
> > >   (1) it is already known as "normal",
> > >   (2) dirstate entry for it has valid (= not "-1") timestamp, and
> > >   (3) mode, size and timestamp of it on the filesystem are as same as
> > >       ones expected in dirstate
> > >
> > > This works as expected in many cases, but doesn't in the corner case
> > > that changing a file keeps mode, size and timestamp of it on the
> > > filesystem.
> > >
> > > The timetable below shows steps in one of typical such situations:
> > >
> > >   ---- ----------------------------------- ----------------
> > >                                            timestamp of "f"
> > >                                            ----------------
> > >                                            dirstate   file-
> > >   time          action                     mem  file  system
> > >   ---- ----------------------------------- ---- ----- -----
> > >   N                                              ***   ***
> > >        - change "f"                                     N
> > >
> > >        - execute `hg commit -i`
> > >          - backup "f" with timestamp N
> > >          - revert "f" by `merge.update()`               N
> > >            with `partially`
> > >          - apply selected hunks                         N
> > >            by `patch.patch()`
> > >
> > >          - `repo.commit()`
> > >            - `dirstate.normal("f")`         N
> > >   N+1
> > >            - `dirstate.write()`             N    N
> > >
> > >          - restore "f"                                  N+1
> > >          - restore timestamp of "f"                     N
> > >
> > >        - `hg status` shows "f" as "clean"   N    N      N
> > >   ---- ----------------------------------- ---- ----- -----
> > >
> > > The most important point is that `dirstate.write()` is executed at N+1
> > > or later. This causes writing dirstate timestamp N of "f" out
> > > successfully. If it is executed at N, `parsers.pack_dirstate()`
> > > replaces timestamp N with "-1" before actual writing dirstate out.
> > >
> > > This issue can occur when `hg commit -i` satisfies conditions below:
> > >
> > >   - the file is committed partially, and
> > >   - mode and size of the file aren't changed before and after
> committing
> > >
> > > The root cause of this issue is that (maybe partially changed) files
> > > are restored with original timestamp but dirstate isn't updated for
> > > them.
> > >
> > > To detect changes of files correctly, this patch applies
> > > `dirstate.normallookup()` on restored files. Status check is needed
> > > before `dirstate.normallookup()`, because status other than "n(ormal)"
> > > should be kept at failure of committing.
> > >
> > > This patch doesn't examine whether each files are committed fully or
> > > partially, because interactive hunk selection makes it difficult.
> > >
> > > After this change, timetable is changed as below:
> > >
> > >   ---- ----------------------------------- ----------------
> > >                                            timestamp of "f"
> > >                                            ----------------
> > >                                            dirstate   file-
> > >   time          action                     mem  file  system
> > >   ---- ----------------------------------- ---- ----- -----
> > >   N                                              ***   ***
> > >        - change "f"                                     N
> > >
> > >        - execute `hg commit -i`
> > >          - backup "f" with timestamp N
> > >          - revert "f" by `merge.update()`               N
> > >            with `partially`
> > >          - apply selected hunks                         N
> > >            by `patch.internalpatch()`
> > >
> > >          - `repo.commit()`
> > >            - `dirstate.normal("f")`         N
> > >   N+1
> > >            - `dirstate.write()`             N    N
> > >
> > >          - restore "f"                                  N+1
> > >          - restore timestamp of "f"                     N
> > >        ----------------------------------- ---- ----- -----
> > >          - normallookup("f")                -1
> > >          - release wlock
> > >            - `dirstate.write()`             -1   -1     N
> > >        ----------------------------------- ---- ----- -----
> > >
> > >        - `hg status` shows "f" as "clean"   -1   -1     N
> > >   ---- ----------------------------------- ---- ----- -----
>
> (snip)
>
> > >
> > > diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> > > --- a/mercurial/cmdutil.py
> > > +++ b/mercurial/cmdutil.py
> > > @@ -203,8 +203,16 @@
> > >          finally:
> > >              # 5. finally restore backed-up files
> > >              try:
> > > +                dirstate = repo.dirstate
> > >                  for realname, tmpname in backups.iteritems():
> > >                      ui.debug('restoring %r to %r\n' % (tmpname,
> realname))
> > > +
> > > +                    if dirstate[realname] == 'n':
> > > +                        # without normallookup, restoring timestamp
> > > +                        # may cause that partially committed files
> > > +                        # aren't treated as modified
> > > +                        dirstate.normallookup(realname)
> > > +
> > >
> >
> > Should this be done after writing the file as we do elsewhere? Maybe it
> > doesn't matter since the file will get an old timestamp anyway (because
> of
> > shutil.copystat)?
>
> Yes, exactly speaking, 'lookupnormal()' should be invoked after
> writing the file out.
>

IIUC, the only effect it might have is when lookupnormal() would give it
timestamp N and the file got timestamp N+1, in which case the file would
have to be unnecessarily checked for dirtiness. But since the timestamp
gets replaced anyway, it doesn't matter. So leave it as is, I think that's
fine.

Btw, I suppose this hack (the existing copystat() hack) means that all
involved files would have to be checked for dirtiness.


>
> On the other hand, IMHO, splitting below code block to place
> 'lookupnormal()' after 'util.copyfile()' seemed to weaken logical
> meaning "restore file with original timestamp" of it.
>
>                     util.copyfile(tmpname, repo.wjoin(realname))
>                     # Our calls to copystat() here and above are a
>                     # hack to trick any editors that have f open that
>                     # we haven't modified them.
>                     #
>                     # Also note that this racy as an editor could
>                     # notice the file's mtime before we've finished
>                     # writing it.
>                     shutil.copystat(tmpname, repo.wjoin(realname))
>                     os.unlink(tmpname)
>
> Should I move 'lookupnormal()' invocation ?
>
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150709/197945ee/attachment.html>


More information about the Mercurial-devel mailing list