[PATCH STABLE] verify: don't init a new subrepo when a missing one is referenced (issue5128)

Matt Harbison mharbison72 at gmail.com
Wed Apr 13 21:01:22 EDT 2016


On Wed, 13 Apr 2016 02:09:08 -0400, Pierre-Yves David  
<pierre-yves.david at ens-lyon.org> wrote:

>
>
> On 04/11/2016 08:20 PM, Matt Harbison wrote:
>> On Mon, 04 Apr 2016 03:50:49 -0400, Pierre-Yves David  
>> <pierre-yves.david at ens-lyon.org> wrote:
>>> On 04/03/2016 01:15 PM, Matt Harbison wrote:
>>>> # HG changeset patch
>>>> # User Matt Harbison <matt_harbison at yahoo.com>
>>>> # Date 1459708994 14400
>>>> #      Sun Apr 03 14:43:14 2016 -0400
>>>> # Branch stable
>>>> # Node ID 63ce31ee0e582bc25327b27c0584445b28a90686
>>>> # Parent  2d39f987f0bacc39a8319cb6c84b2d38c1991028
>>>> verify: don't init a new subrepo when a missing one is referenced  
>>>> (issue5128)
>>>>
>>>> Initializing a subrepo when one doesn't exist is the right thing to  
>>>> do when the
>>>> parent is being updated, but in few other cases. Unfortunately, there  
>>>> isn't
>>>> enough context in the subrepo module to distinguish this case.  This  
>>>> same issue
>>>> can be caused with other subrepo aware commands, so there is a  
>>>> general issue
>>>> here beyond the scope of this fix.
>>>>
>>>> A simpler attempt I tried was to add an '_updating' boolean to  
>>>> localrepo, and
>>>> set/clear it around the call to mergemod.update() in  
>>>> hg.updaterepo().  That
>>>> mostly worked, but doesn't handle the case where archive will clone  
>>>> the subrepo
>>>> if it is missing.  (I vaguely recall that there may be other commands  
>>>> that will
>>>> clone if needed like this, but certainly not all do.  It seems both  
>>>> handy, and a
>>>> bit surprising for what should be a read only operation.  It might be  
>>>> nice if
>>>> all commands did this consistently, but we probably need Angel's  
>>>> subrepo caching
>>>> first, to not make a mess of the working directory.)
>>>>
>>>> It was suggested in the bug discussion to skip looking at the subrepo  
>>>> links
>>>> unless -S is specified.  I don't really like that idea because  
>>>> missing a subrepo
>>>> or (less likely, but worse) a corrupt .hgsubstate is a problem of the  
>>>> parent
>>>> repo when checking out a revision.  The -S option seems like a better  
>>>> fit for
>>>> functionality that would recurse into each subrepo and do a full  
>>>> verification.
>>>>
>>>> Ultimately, the default value for 'allowcreate' should probably be  
>>>> flipped, but
>>>> since the default behavior was to allow creation, this is less risky  
>>>> for now.
>>>
>>> I think we need at least 2 extra pieces here:
>>>
>>> First, I don't think we should silently skip the subrepo, we want at  
>>> least a warning to tell the user we did not verify one of them.
>>>
>>> Second, we probably want a way to disable this and still check all  
>>> subrepository (at the expense of creating the directory, for now).
>>>
>>> (note: I think I prefered the -S approach, but I'm fine with this one).
>>
>> Any objection to something like this?  (I'll fold and resubmit of  
>> course):
>>
>> diff --git a/mercurial/hg.py b/mercurial/hg.py
>> --- a/mercurial/hg.py
>> +++ b/mercurial/hg.py
>> @@ -828,7 +828,10 @@
>>              ctx = repo[rev]
>>              try:
>>                  for subpath in ctx.substate:
>> -                    ret = ctx.sub(subpath, allowcreate=False).verify()  
>> or ret
>> +                    try:
>> +                        ret = ctx.sub(subpath,  
>> allowcreate=False).verify() or ret
>> +                    except Exception as e:
>> +                        repo.ui.warn(_('%s: %s\n') % (rev, e))
>
> That is super wide, any way we could narrow this?
>
> On the other hand, this mean we would really be prepared for all.  
> However, we probably want to let some Mercurial specific (Abort, etc)  
> through.

That's what I was thinking.  It looks like the specific error raised for  
the missing + no create case here is RepoError.

The issue with handling Abort here specially by just rethrowing it (or any  
other narrowing), is that it will immediately be caught by the existing  
handler wrapping this loop.  Since this is nested in a revision loop, it  
will simply carry on to the next revision instead of bailing.  And the  
existing handler does need to be that wide there, because subrepo.state()  
raises several Aborts.  I guess we could throw a subclass of Abort from  
subrepo.state().  Handle that and Exception by logging and continuing, but  
other Aborts by bailing.  But that seems hacky, and would probably raise  
eyebrows on stable?

>>              except Exception:
>>                  repo.ui.warn(_('.hgsubstate is corrupt in revision  
>> %s\n') %
>>                               node.short(ctx.node()))
>> diff --git a/tests/test-subrepo-missing.t b/tests/test-subrepo-missing.t
>> --- a/tests/test-subrepo-missing.t
>> +++ b/tests/test-subrepo-missing.t
>> @@ -131,10 +131,10 @@
>>    checking files
>>    2 files, 5 changesets, 5 total revisions
>>    checking subrepo links
>> -  .hgsubstate is corrupt in revision ef278ff32036
>> -  .hgsubstate is corrupt in revision a66de08943b6
>> -  .hgsubstate is corrupt in revision 674d05939c1e
>> -  .hgsubstate is corrupt in revision a7d05d9055a4
>> +  0: repository $TESTTMP\repo\subrepo not found
>> +  1: repository $TESTTMP\repo\subrepo not found
>> +  3: repository $TESTTMP\repo\subrepo not found
>> +  4: repository $TESTTMP\repo\subrepo not found
>
> That's an improvement, so we can probably take that until someone have a  
> clear idea of a potential better message.
>
> (beware, your path use \ )
>
>>    $ find subrepo
>>    find: `subrepo': * (glob)
>>    [1]
>>
>> This way, the rather uninformative "corrupt" message is limited to  
>> failures parsing .hgsubstate, but failures in one subrepo don't block  
>> checking others in the same rev.  Also, no need for a flag to force the  
>> (bad) repo creation behavior.
>>
>> My concern would be can we rely on exceptions always having a useful  
>> message?  I suppose we could handle Abort separate from Exception, but  
>> I'm not sure what the text should be.
>>
>> I was going to try to fixup the "corrupt" message on default, but this  
>> seems to take care of that in a way that is suitable for stable.
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel at mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list