[PATCH 2 of 5] localrepo: invoke dirstate.unsureifambig in wwrite for safety

Wed Jun 3 13:32:39 CDT 2015

On Wed, Jun 3, 2015 at 10:32 AM FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
wrote:

>
> At Tue, 02 Jun 2015 19:40:26 +0000,
> Martin von Zweigbergk wrote:
>
> > On Tue, Jun 2, 2015 at 9:27 AM FUJIWARA Katsunori <foozy at lares.dti.ne.jp
> >
> > wrote:
>
> > > At Mon, 01 Jun 2015 18:57:16 +0000,
> > > Martin von Zweigbergk wrote:
>
> > > > On Mon, Jun 1, 2015 at 11:07 AM FUJIWARA Katsunori <
> > > foozy at lares.dti.ne.jp>
> > > > wrote:
> > > >
> > > > > At Mon, 01 Jun 2015 17:02:10 +0000,
> > > > > Martin von Zweigbergk wrote:
>
> (snip)
>
> > > > >   (2) execute `hg merge`, but it is aborted before "update
> dirstate"
> > > > >       for example:
> > > > >       - invalid updating in subrepositories
> > > > >       - keyboard interruption by user, file I/O error and so on
> > > > > > $ hg --config extensions.abort=$TESTTMP/abort.py merge 5
> > > > >
> > > > >   (3) (re-)set timestamp of 'b' on the filesystem for mis-leading
> > > > >       this emulates that `hg merge` update 'b' before
> > > 200001010000(00)+1sec
> > > > > > $ touch -t 200001010000 b
> > > > >
> > > >
> > > > Since 'hg status' above was run at 200001010000(00)+1sec (or later),
> how
> > > > can this (interrupted) merge be run at 200001010000(00)?
> > >
> > > Oh, your wondering is natural. My explanation was wrong :-<
> > >
> > > In real world, steps below reproduce this issue:
> > >
> >
> > Great explanation! Thanks!
> >
> > So the problem is that we mark the file clean in dirstate as a result of
> > checking its status, then we update it on disk, abort the update, write
> the
> > clean dirstate to disk and stay on the old revision. It seems a little
> > strange to me that we write the dirstate even on failure. Do you know if
> > there is a good reason for that?
>
> I don't know actual reason. It may be:
>
>   Invalidating in-memory changes at aborting has subsequent hg
>   commands pay cost of "checking dirty-ness" again.
>
>   The more "checking dirty-ness" costs, the more and larger possibly
>   dirty files there are.
>
> On the other hand, Pierre-Yves suggested me to discard all dirstate
> changes inside transaction at failure in the article about making
> in-memory dirstate changes visible to external process (in this case,
> important point is "dirstate may refer rollback-ed revisions").
>
> So, I'll re-examine what we should treat in-memory dirstate changes at
> failure widely (maybe after fixing this issue :-))
>

I was hoping that doing "discard dirstate on abort" would make this series
unnecessary, but perhaps it's not generic enough for a real solution. More
about that further down.

>
>
> > Perhaps another option is to mark all files in the merge normallookup()
> > before applyupdates()? That should mean that an abort later on would be
> > safe, no?
>
> `dirstate.normallookup()` always discards all of mode, size and
> timestamp. This forces subsequent status examinations to compare file
> contents until it marked as clean, because lack of "mode and size"
> marks such files as "unsure" until it is marked as "clean".
>
> This costs so much with large "possibly dirty" files.
>

Sorry, I wasn't clear (and maybe I'm also missing something). What I meant
was to mark each file in the dict of actions as normallookup() before
applyupdates() (or at least the actions that lead to writing of a file).
Then the dirstate entry would be overwritten in recordupdates(), so I was
hoping that that would put the dirstate in the same end state. Does that
make sense?

>
>
> BTW, I hit an idea of dirstate functionality to discard timestamp but
> keep expected mode/size (current `normallookup` discards all of them).
>
> When at least one of mode/size of file is changed, this functionality
> can avoid comparison of file content at subsequent status checking.
>

Even when we discard all of them like we currently do, can we not already
avoid comparison of file content if the mode/size is changed? We are
comparing to the file content to current revision (from filelog), right? So
can't we compare mode/size to the manifest/filelog data before comparing
the contents too? Maybe we already do?

>
>
> > >   == @200001010000(00)
> > >
> > >   (A) 'b' should be modified at this point, but
> > >
> > >   (B) it is assumed that dirstate timestamp of 'b' is -1 at this point
> > >
> > >   (C) invoke `hg merge`
> > >
> > >     (C-1) get wlock
> > >
> > >     (C-2) check uncommitted changes in `merge.update()` by
> `wc.files()`:
> > >
> > >           this indirectly sets dirstate timestamp of 'b' as
> > > 200001010000(00)
> > >
> > >             wc.files() => wc._status() => repo.status() => ... =>
> > >             wc._dirstatestatus() => wc._checklookup() =>
> dirstate.normal()
> > >
> > >
> > Would an alternative fix be to write the dirstate here (and after
> > recordupdates(), of course)? Would that be unacceptably slow, perhaps?
>
> It should be cheaper than merge/update itself.
>
> But we may have to put explicit `dirstate.write()` into many code
> paths for safety. Today, we know that putting it before
> `applyupdates()` avoids this issue. Then, tomorrow, we may have to put
> it into other places to avoid similar issues.
>
> So, I want to find generalized fixing.
>

Good point. IIUC, the general problem is that dirstate gets updated but not
written to disk before working copy files get written to disk. Just for
fun, I added a check in localrepo.wwrite() checking self.dirstate._dirty.
There seem to be a dozen or so direct callers that run into this problem. I
suspect there are fewer "real" problematic code paths. For example, there
are a few callers in merge.py that may all be taken care of if we add a
call to write dirstate before applyupdates(). Once we have addressed the
callers we know about, we could put the check in there and print a warning
to the user to report a bug? What do you think? Maybe there's a better
solution? Or maybe your solution is all we need.

>
> > >     (C-3) `merge.applyupates()` overwrites 'b' by the data in same size
> > >
> > >           FS timestamp of it is still 200001010000(00)
> > >
> > >     (C-3) fail before `merge.recordupdates()`
> > >
> > >   == @200001010000(00) + 1
> > >
> > >     (C-4) release wlock, and this causes written dirty dirstate out
> > >
> > >           this also writes out 200001010000(00) of 'b' successfully,
> > >           because FS timestamp of dirstate is 200001010000(00) + 1
> > >
> > >   (D) `hg status` treats 'b' as CLEAN, because FS timestamp/size of it
> > >       aren't changed
> > >
> > > Then, steps below (= without 'hg status -A b') can reproduce this
> > > issue almost always:
> > >
> > >   $ hg update -q -C 3
> > >   $ touch -t 200001010000 b
> > >   $ hg --config extensions.abort=$TESTTMP/abort.py merge 5
> > >   $ touch -t 200001010000 b
> > >
> > > But timestamp of dirstate may be set accidentally at `hg update`, and
> > > this breaks assumption (B) above.
> > >
> > > This is timing critical problem and we can't fully control it :-<
> > >
> > > On the other hand, tests in my patch execute steps below to make
> > > similar (but a little different in detail) situation:
> > >
> > >   == @200001010000(00)
> > >
> > >   (A) `touch -t 200001010000 b` emulates that 'b' is modified at this
> point
> > >
> > >   (B') `hg status` can write 200001010000(00) of 'b' into dirstate
> > > successfully
> > >
> > >       because `hg status` isn't actually executed at 200001010000(00)
> :-)
> > >
> > >   (C) invoke `hg merge`
> > >
> > >     (C-1) get wlock
> > >
> > >     (C-2) check uncommitted changes in `merge.update()` by `wc.files()`
> > >
> > >           'b' is already known as CLEAN by 200001010000(00)
> > >
> > >     (C-3) `merge.applyupates()` overwrites 'b' by the data in same size
> > >
> > >           FS timestamp of it isn't 200001010000(00), because `hg
> > >           merge` isn't actually executed at 200001010000(00)
> > >
> > >           but `touch -t 200001010000 b` after `hg merge` emulates it.
> > >
> > >     (C-3) fail before `merge.recordupdates()`
> > >
> > >   == @200001010000(00) + 1
> > >
> > >     (C-4) release wlock, and this causes written dirty dirstate out
> > >
> > >           dirstate timestamp of 'b' is still 200001010000(00)
> > >
> > >   (D) `hg status` treats 'b' as CLEAN, because FS timestamp/size of it
> > >       aren't changed (only if this issue isn't fixed well)
> > >
>
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150603/6214a2c5/attachment.html>