[PATCH STABLE] verify: don't init a new subrepo when a missing one is referenced (issue5128)

Pierre-Yves David pierre-yves.david at ens-lyon.org
Sat Apr 16 19:49:44 EDT 2016



On 04/13/2016 06:01 PM, Matt Harbison wrote:
> On Wed, 13 Apr 2016 02:09:08 -0400, Pierre-Yves David 
> <pierre-yves.david at ens-lyon.org> wrote:
>
>>
>>
>> On 04/11/2016 08:20 PM, Matt Harbison wrote:
>>> On Mon, 04 Apr 2016 03:50:49 -0400, Pierre-Yves David 
>>> <pierre-yves.david at ens-lyon.org> wrote:
>>>> On 04/03/2016 01:15 PM, Matt Harbison wrote:
>>>>> # HG changeset patch
>>>>> # User Matt Harbison <matt_harbison at yahoo.com>
>>>>> # Date 1459708994 14400
>>>>> #      Sun Apr 03 14:43:14 2016 -0400
>>>>> # Branch stable
>>>>> # Node ID 63ce31ee0e582bc25327b27c0584445b28a90686
>>>>> # Parent  2d39f987f0bacc39a8319cb6c84b2d38c1991028
>>>>> verify: don't init a new subrepo when a missing one is referenced 
>>>>> (issue5128)
>>>>>
>>>>> Initializing a subrepo when one doesn't exist is the right thing 
>>>>> to do when the
>>>>> parent is being updated, but in few other cases. Unfortunately, 
>>>>> there isn't
>>>>> enough context in the subrepo module to distinguish this case.  
>>>>> This same issue
>>>>> can be caused with other subrepo aware commands, so there is a 
>>>>> general issue
>>>>> here beyond the scope of this fix.
>>>>>
>>>>> A simpler attempt I tried was to add an '_updating' boolean to 
>>>>> localrepo, and
>>>>> set/clear it around the call to mergemod.update() in 
>>>>> hg.updaterepo().  That
>>>>> mostly worked, but doesn't handle the case where archive will 
>>>>> clone the subrepo
>>>>> if it is missing.  (I vaguely recall that there may be other 
>>>>> commands that will
>>>>> clone if needed like this, but certainly not all do.  It seems 
>>>>> both handy, and a
>>>>> bit surprising for what should be a read only operation. It might 
>>>>> be nice if
>>>>> all commands did this consistently, but we probably need Angel's 
>>>>> subrepo caching
>>>>> first, to not make a mess of the working directory.)
>>>>>
>>>>> It was suggested in the bug discussion to skip looking at the 
>>>>> subrepo links
>>>>> unless -S is specified.  I don't really like that idea because 
>>>>> missing a subrepo
>>>>> or (less likely, but worse) a corrupt .hgsubstate is a problem of 
>>>>> the parent
>>>>> repo when checking out a revision.  The -S option seems like a 
>>>>> better fit for
>>>>> functionality that would recurse into each subrepo and do a full 
>>>>> verification.
>>>>>
>>>>> Ultimately, the default value for 'allowcreate' should probably be 
>>>>> flipped, but
>>>>> since the default behavior was to allow creation, this is less 
>>>>> risky for now.
>>>>
>>>> I think we need at least 2 extra pieces here:
>>>>
>>>> First, I don't think we should silently skip the subrepo, we want 
>>>> at least a warning to tell the user we did not verify one of them.
>>>>
>>>> Second, we probably want a way to disable this and still check all 
>>>> subrepository (at the expense of creating the directory, for now).
>>>>
>>>> (note: I think I prefered the -S approach, but I'm fine with this 
>>>> one).
>>>
>>> Any objection to something like this?  (I'll fold and resubmit of 
>>> course):
>>>
>>> diff --git a/mercurial/hg.py b/mercurial/hg.py
>>> --- a/mercurial/hg.py
>>> +++ b/mercurial/hg.py
>>> @@ -828,7 +828,10 @@
>>>              ctx = repo[rev]
>>>              try:
>>>                  for subpath in ctx.substate:
>>> -                    ret = ctx.sub(subpath, 
>>> allowcreate=False).verify() or ret
>>> +                    try:
>>> +                        ret = ctx.sub(subpath, 
>>> allowcreate=False).verify() or ret
>>> +                    except Exception as e:
>>> +                        repo.ui.warn(_('%s: %s\n') % (rev, e))
>>
>> That is super wide, any way we could narrow this?
>>
>> On the other hand, this mean we would really be prepared for all. 
>> However, we probably want to let some Mercurial specific (Abort, etc) 
>> through.
>
> That's what I was thinking.  It looks like the specific error raised 
> for the missing + no create case here is RepoError.
>
> The issue with handling Abort here specially by just rethrowing it (or 
> any other narrowing), is that it will immediately be caught by the 
> existing handler wrapping this loop.  Since this is nested in a 
> revision loop, it will simply carry on to the next revision instead of 
> bailing.  And the existing handler does need to be that wide there, 
> because subrepo.state() raises several Aborts.  I guess we could throw 
> a subclass of Abort from subrepo.state(). Handle that and Exception by 
> logging and continuing, but other Aborts by bailing.  But that seems 
> hacky, and would probably raise eyebrows on stable?

We are already using sub class of abort in multiple place (eg HookAbort) 
so having a subclass here seems fine. Could be a bit too big for stable 
not not necessary. Let's build something and we'll decide if we put in 
in 3.8 or 3.9.

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list