[PATCH 2 of 5] localrepo: invoke dirstate.unsureifambig in wwrite for safety

Martin von Zweigbergk martinvonz at google.com
Mon Jun 22 13:23:29 CDT 2015


On Mon, Jun 22, 2015 at 9:19 AM FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
wrote:

>
> At Fri, 12 Jun 2015 02:55:04 +0900,
> FUJIWARA Katsunori wrote:
> >
> > At Wed, 10 Jun 2015 11:32:25 -0700,
> > Pierre-Yves David wrote:
> > >
> > >
> > >
> > > On 06/10/2015 09:39 AM, Martin von Zweigbergk wrote:
> > > >
> > > >
> > > > On Tue, Jun 9, 2015 at 12:58 PM Pierre-Yves David
> > > > <pierre-yves.david at ens-lyon.org <mailto:
> pierre-yves.david at ens-lyon.org>>
> > > > wrote:
>
> [snip]
>
> > > >     Wait, is this series directly related with the dirstate
> > > >     guard/transaction thing ? (as in, the next step we must do) or
> is it an
> > > >     unrelated fix to dirstate.
> > > >
> > > >
> > > > I think it is related, but I'll let Fujiwara confirm.
> > > >
> > > >     If it is unrelated, we should probably focus on fixing the
> dirstate
> > > >     write scheduling before working on this issue.
> > > >
> > > > And what do you recommend if it is related?
> > >
> > > I've to admit I stopped following your discussion closely after roudn
> > > trip. If this is on the critical path for dirstateguard, we should
> > > probably start building a solid repro before taking care of it.
> >
> > At first, I thought that this issue can be fixed fully separately from
> > fixing issue4378 or so.
> >
> > But for total consistency of behavior at failure, we should make
> > consensus about handling dirstate changes generally, before working
> > more about issue4378. At least, deciding direction can avoid
> > meaningless side trip or so for me :-)
> >
> > I'll post summary of issue4378 and this issue from the same point of
> > view as a basis for discussion.
>
> Sorry for late response.
>
> I summarized about "dirstate changes at failure" as a basis for
> discussion. Please comment !
>
> BTW, it is assumed that `dirstate.normallookup` is invoked at
> `repo.wwrite` or so, because, without these changes, problems like
> issue4583 ("changed files are treated as clean") appear occasionally,
> even if processing terminates successfully (e.g, at `hg revert`, `hg
> import` and so on):
>

As I wrote before, I think the general problem is "that dirstate gets
updated but not written to disk before working copy files get written to
disk". I hope that is a correct assessment. I'm not convinced that calling
dirstate.normallookup() is the right solution (but I'm also not convinced
it's not). It seems like an alternative would be to not write working copy
files while the dirstate is dirty. Hopefully someone else can have an
opinion on this. It took me a long time to understand the issue even just
somewhat well, so I understand that it's a lot to ask, but I think we need
more than just two people thinking about this.

I haven't yet read the remainder of the email.


>
>   https://selenic.com/pipermail/mercurial-devel/2015-May/070384.html
>   https://selenic.com/pipermail/mercurial-devel/2015-May/070386.html
>
>
> --------------
> 1. `hg commit`
> --------------
>
> Let's see typical case `hg commit`.
>
> Lines ending with "#N" means "dirstate is changed here". Scope of
> wlock/transaction is described in "with xxxx as yyyy:" style for
> convenience.
>
>     ================
>     commands.commit():
>       if opts['addremove']:
>         scmutil.addremove():
>           dirstate.update('files {added|removed|copied}') #1
>           # involving dirstate.write() at wlock.release()
>
>       repo.commit():
>         with repo.wlock() as wlock:
>           repo.status('to get current status') #2
>
>           editor('for commit message')
>           wctx.sub(s).commit('recursively') for subs
>           hook('precommit')
>
>           with repo.transaction('commit') as tr:
>             node = createcommit()
>             hook('pretxncommit')
>             dirstate.update(node) #3
>             hook('pretxnclose')
>     ================
>
> Changes at #3 above should be discarded for consistency of subsequent
> `hg status` and so on, because:
>
>   - `dirstate.parents` refers rolled back (= not-existing) revision
>   - files are already `dirstate.normal`-ed
>
> This is reason why (1) dirstate should be saved just before starting
> transaction, and (2) restored if transaction is rolled back.
>
> This was already discussed for "making in-memory dirstate changes
> visible to external process" to fix issue4378.
>
>
> http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/80003/focus=80151
>
> Then, how should we treat #1 and #2 ?
>
>   1. restore dirstate at failure of command
>
>      This policy isn't suitable for `hg commit --addremove` case,
>      because dirstate changes at #1 above should be kept for backward
>      compatibility, even if `hg commit` itself fails.
>
>      At least, the way to avoid restoring dirstate at failure (a kind
>      of the inverse of `dirstateguard` ?) is required for this policy.
>
>      For example, commands below expect that dirstate changes are kept
>      even if command itself is terminated with non-0 exit code
>      (detailed flows are shown in subsequent section):
>
>        - aborting by exception
>          `hg graft`, `hg rebase` and so on
>
>        - terminating normally with non-0 exit code
>          conflicts at `hg backout` without --merge
>
>   2. restore dirstate at failure in wlock scope
>
>      Changes at #2 (just `normal`-ing) can be discarded at failure in
>      wlock scope.
>
>      This policy is suitable at least for `hg commit` case.
>
>      But some other commands expect that dirstate changes are kept
>      even if command itself is aborted by exception, as described
>      above.
>
>      So, the way to avoid restoring dirstate at failure is required
>      for this policy, too. In addition to it, code path enclosed by
>      `repo.wlock()` + `wlock.release` may be nested at runtime, and
>      this may increase complexity.
>
>   3. keep dirstate changes (and write them out) even at failure
>
>      This is also suitable for `hg commit` case, because this is as
>      same as current behavior :-)
>
> Let's see other cases, too.
>
>
> --------------
> 2. `hg update`
> --------------
>
>     ================
>     commands.update():
>       merge.update():
>         with repo.wlock() as wlock:
>           wctx.files() / wctx.dirty() #1
>           hook('preupdate')
>
>           calculateupdates()
>
>           # critical region >>>>
>           repo.wwrite('all files to be updated')
>           wctx.sub(s).get('update recursively') for subs
>           dirstate.change('all updated files') #2
>           # critical region <<<<
>
>           hook('postupdate')
>     ================
>
> Changes at #1 above can be treated as similarly as ones at #2 in `hg
> commit` case. So, let's ignore it here :-)
>
>
> Aborting in "critical region" above causes the issue that changed
> files are treated as clean, as I described in previous post.
>
>
> http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/80342/focus=80673
>
> There are some solutions to fix this problem:
>
>   1. restore dirstate at failure of command
>
>   2. restore dirstate at failure in wlock scope
>
>   3. enclose "critical region" by `dirstateguard` to restore dirstate
>      at failure (but keep other dirstate changes even at failure)
>
>   4. apply `dirstate.normallookup` or so explicitly on files to be
>      updated before entering "critical region" (and keep dirstate
>      changes even at failure)
>
>      This is needed, even if `repo.wwrite` involves
>      `dirstate.normallookup`, because `repo.wwrite` itself may be
>      invoked in FORKED processes.
>
> In fact, when `partial` argument of `merge.update` is specified
> explicitly, (4) is needed, even if one of other than (4) is applied,
> because updating dirstate (at #2 above) is omitted in such case (this
> is used as a kind of "revert").
>
> Then, (4) seems suitable for "critical region" in `hg update`.
>
>
> --------------------------
> 3. Users of `merge.update`
> --------------------------
>
> There are some cases using `merge.update` directly or indirectly
> (histedit and so on, too)
>
>     ================
>     commands.backout():
>       with repo.wlock() as wlock:
>         if linearmerging:
>           # critical region >>>>
>           merge.update('for merging linearly') #1
>           repo.setparents() #2
>           # critical region <<<<
>
>           if conflicted:
>             return 1 # expecting dirstate.write
>           elif not commit:
>             return 0 # expecting dirstate.write
>         else:
>           merge.update('to the revision to be backed out') #3
>           cmduit.revert('to backout') #4
>
>         repo.commit('the revision backing out') #5
>
>          if merging:
>            merge.update('to the original parent') #6
>            merge.update('for merging with the revision backing out') #7
>     ================
>
>     ================
>     commands.graft():
>       with repo.wlock() as wlock:
>         for rev in revs:
>           merge.graft('to graft rev'):
>             # critical region >>>>
>             merge.update('for pseudo merging with graft source') #1
>             repo.setparents() #2
>             dirstate.copy('to duplicate copy information') #3
>             # critical region <<<<
>
>           if conflict:
>             throw abort() # expecting dirstate.write
>
>           repo.commit('the revision grafted') #4
>     ================
>     (BTW, `merge.update` isn't yet included into "critical region" of
>     `hg graft`)
>
>     ================
>     rebase.rebase():
>       with repo.wlock() as wlock:
>         merge.update('to destination') #1
>
>         for rev in revs:
>           # critical region >>>>
>           merge.update('for merging the revision to be rebased') #2
>           dirstate.copy('to duplicate copy information') #3
>           # critical region <<<<
>
>           if conflict:
>             raise InterventionRequired # expecting dirstate.write
>
>           with dirstateguard('rebase'):
>             repo.setparents('to drop 2nd parent of pseudo merge') #4
>             repo.commit('the revision rebased') #5
>     ================
>     (BTW, "critical region" of `hg rebase` isn't yet guarded at all)
>
> We can find something below out:
>
>   a. `merge.update` may be invoked multiple times in a wlock scope
>
>      This also means that "critical region" may be also invoked
>      multiple times in a wlock scope, and increases complexity of "a
>      kind of the inverse of `dirstateguard` or so".
>
>   b. additional processing may be needed after `merge.update` to use
>      result of it correctly
>
>      Result of each commands isn't valid, if a command is aborted in
>      "critical region". For example, lack of `repo.setparents()` at #2
>      of `hg graft` may cause committing invalid revision at subsequent
>      `hg graft --continue`.
>
>      So, dirstate changes in "critical region" should be discarded at
>      failure.
>
> Then, enclosing "critical region" by `dirstateguard` seems easier than
> "a kind of the inverse of `dirstateguard` or so" to restore dirstate
> at failure.
>
>
> -------------
> 4. Conclusion
> -------------
>
> According to cases above, "restore dirstate at failure of command (or
> in wlock scope)" policy doesn't seem suitable for current Mercurial
> implementation.
>
> Then, what about the policy below ?
>
>   1. treat dirstate changes in transaction as "all or nothing"
>
>      - save dirstate just before starting transaction, and
>      - restore it if transaction is rolled back
>
>   2. keep dirstate changes outside transaction (and write them out)
>      even at failure
>
>   3. enclose "critical region" by `dirstateguard` to discard dirstate
>      changes at failure
>
>   4. but in `merge.update`, apply `dirstate.normallookup` explicitly
>      on files to be updated before entering into "critical region"
>      instead of enclosing it by `dirstateguard` because:
>
>      - `repo.wwrite` in FORKED processes doesn't write
>        `normallookup`-ed result into `.hg/dirstate`, and
>
>      - explicit `partial` argument of `merge.update` always omits
>        updating dirstate
>
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150622/ad2e275e/attachment.html>


More information about the Mercurial-devel mailing list