[PATCH STABLE] verify: don't init a new subrepo when a missing one is referenced (issue5128)

Matt Harbison mharbison72 at gmail.com
Mon Apr 11 23:20:54 EDT 2016


On Mon, 04 Apr 2016 03:50:49 -0400, Pierre-Yves David  
<pierre-yves.david at ens-lyon.org> wrote:

>
>
> On 04/03/2016 01:15 PM, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison at yahoo.com>
>> # Date 1459708994 14400
>> #      Sun Apr 03 14:43:14 2016 -0400
>> # Branch stable
>> # Node ID 63ce31ee0e582bc25327b27c0584445b28a90686
>> # Parent  2d39f987f0bacc39a8319cb6c84b2d38c1991028
>> verify: don't init a new subrepo when a missing one is referenced  
>> (issue5128)
>>
>> Initializing a subrepo when one doesn't exist is the right thing to do  
>> when the
>> parent is being updated, but in few other cases.  Unfortunately, there  
>> isn't
>> enough context in the subrepo module to distinguish this case.  This  
>> same issue
>> can be caused with other subrepo aware commands, so there is a general  
>> issue
>> here beyond the scope of this fix.
>>
>> A simpler attempt I tried was to add an '_updating' boolean to  
>> localrepo, and
>> set/clear it around the call to mergemod.update() in hg.updaterepo().   
>> That
>> mostly worked, but doesn't handle the case where archive will clone the  
>> subrepo
>> if it is missing.  (I vaguely recall that there may be other commands  
>> that will
>> clone if needed like this, but certainly not all do.  It seems both  
>> handy, and a
>> bit surprising for what should be a read only operation.  It might be  
>> nice if
>> all commands did this consistently, but we probably need Angel's  
>> subrepo caching
>> first, to not make a mess of the working directory.)
>>
>> It was suggested in the bug discussion to skip looking at the subrepo  
>> links
>> unless -S is specified.  I don't really like that idea because missing  
>> a subrepo
>> or (less likely, but worse) a corrupt .hgsubstate is a problem of the  
>> parent
>> repo when checking out a revision.  The -S option seems like a better  
>> fit for
>> functionality that would recurse into each subrepo and do a full  
>> verification.
>>
>> Ultimately, the default value for 'allowcreate' should probably be  
>> flipped, but
>> since the default behavior was to allow creation, this is less risky  
>> for now.
>
> I think we need at least 2 extra pieces here:
>
> First, I don't think we should silently skip the subrepo, we want at  
> least a warning to tell the user we did not verify one of them.
>
> Second, we probably want a way to disable this and still check all  
> subrepository (at the expense of creating the directory, for now).
>
> (note: I think I prefered the -S approach, but I'm fine with this one).

Any objection to something like this?  (I'll fold and resubmit of course):

diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -828,7 +828,10 @@
              ctx = repo[rev]
              try:
                  for subpath in ctx.substate:
-                    ret = ctx.sub(subpath, allowcreate=False).verify() or  
ret
+                    try:
+                        ret = ctx.sub(subpath,  
allowcreate=False).verify() or ret
+                    except Exception as e:
+                        repo.ui.warn(_('%s: %s\n') % (rev, e))
              except Exception:
                  repo.ui.warn(_('.hgsubstate is corrupt in revision %s\n')  
%
                               node.short(ctx.node()))
diff --git a/tests/test-subrepo-missing.t b/tests/test-subrepo-missing.t
--- a/tests/test-subrepo-missing.t
+++ b/tests/test-subrepo-missing.t
@@ -131,10 +131,10 @@
    checking files
    2 files, 5 changesets, 5 total revisions
    checking subrepo links
-  .hgsubstate is corrupt in revision ef278ff32036
-  .hgsubstate is corrupt in revision a66de08943b6
-  .hgsubstate is corrupt in revision 674d05939c1e
-  .hgsubstate is corrupt in revision a7d05d9055a4
+  0: repository $TESTTMP\repo\subrepo not found
+  1: repository $TESTTMP\repo\subrepo not found
+  3: repository $TESTTMP\repo\subrepo not found
+  4: repository $TESTTMP\repo\subrepo not found
    $ find subrepo
    find: `subrepo': * (glob)
    [1]

This way, the rather uninformative "corrupt" message is limited to  
failures parsing .hgsubstate, but failures in one subrepo don't block  
checking others in the same rev.  Also, no need for a flag to force the  
(bad) repo creation behavior.

My concern would be can we rely on exceptions always having a useful  
message?  I suppose we could handle Abort separate from Exception, but I'm  
not sure what the text should be.

I was going to try to fixup the "corrupt" message on default, but this  
seems to take care of that in a way that is suitable for stable.


More information about the Mercurial-devel mailing list