[PATCH 4 of 4] verify: also check full manifest validity during verify runs

Martin von Zweigbergk martinvonz at google.com
Wed Apr 17 09:06:47 EDT 2019


On Wed, Apr 17, 2019, 03:34 Pierre-Yves David <
pierre-yves.david at ens-lyon.org> wrote:

>
>
> On 4/17/19 4:59 AM, Martin von Zweigbergk wrote:
> >
> >
> > On Tue, Apr 16, 2019, 16:46 Pierre-Yves David
> > <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>>
>
> > wrote:
> >
> >     # HG changeset patch
> >     # User Pierre-Yves David <pierre-yves.david at octobus.net
> >     <mailto:pierre-yves.david at octobus.net>>
> >     # Date 1551881213 -3600
> >     #      Wed Mar 06 15:06:53 2019 +0100
> >     # Node ID ed796867a06764cd78a57b2ed0249353f5809319
> >     # Parent  9bec7491e9b4cabdfa4d264e5213b1f416ec2607
> >     # EXP-Topic verify
> >     # Available At https://bitbucket.org/octobus/mercurial-devel/
> >     #              hg pull
> >     https://bitbucket.org/octobus/mercurial-devel/ -r ed796867a067
> >     verify: also check full manifest validity during verify runs
> >
> >     Before this changes, `hg verify` only checked if a manifest revision
> >     existed and
> >     referenced the proper files. However it never checked the manifest
> >     revision
> >     content itself.
> >
> >     Mercurial is expecting manifest entries to be sorted and will crash
> >     otherwise.
> >     Since `hg verify` did not attempted a full restoration of manifest
> >     entry, it
> >     could ignore this kind of corruption.
> >
> >     This new check significantly increases the cost of a `hg verify`
> >     run. This
> >     especially affects large repository not using `sparse-revlog`. For
> >     now, this is
> >     hidden behind the `--full` experimental flag.
> >
> >     diff --git a/mercurial/verify.py b/mercurial/verify.py
> >     --- a/mercurial/verify.py
> >     +++ b/mercurial/verify.py
> >     @@ -337,6 +337,16 @@ class verifier(object):
> >                               filenodes.setdefault(fullpath,
> >     {}).setdefault(fn, lr)
> >                   except Exception as inst:
> >                       self._exc(lr, _("reading delta %s") % short(n),
> >     inst, label)
> >     +            if not dir and self._level >= VERIFY_FULL:
> >
> >
> > What does the "not dir" mean? I guess it's to do this check only for the
> > root directory when using tree manifests. Should we do it for all
> > directories?
>
> That is used earlier in the same function to denote "the root manifest".
> I think check the root manifest will trigger checks of the sub manifest
> but I am not sure, I am not too familiar tree manifest.


I'm pretty sure it's about tree manifests. I asked just to make sure since
it surprised me that you used that as part of the condition here.

Can we double
> check/fix this as a follow up ?
>

It should be easy to check (just add a print statement and run tests, for
example), so I don't see much reason to fix such a simple thing in a
follow-up.

Btw, it would be nice to have tests too, but I understand that that's
harder to do. Thoughts on how it could be done? Prepared bundle or some
python code for writing out a bad manifest entry should work, I guess.


> >
> >     +                try:
> >     +                    # Manifest not in sorted order are invalid and
> >     will crash
> >     +                    # Mercurial. We restore each entry to make sure
> >     they are
> >
> >
> > Nit: "restore" almost makes it sound like this is fixing broken entries.
> > Maybe something like "We read the full manifest at each revision to..."?
>
> Good point, V2, coming
>
> --
> Pierre-Yves David
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20190417/046e98f9/attachment.html>


More information about the Mercurial-devel mailing list