[PATCH 1 of 2] manifest: make revlog verification optional

Martin von Zweigbergk martinvonz at google.com
Tue Nov 15 16:02:49 EST 2016


On Tue, Nov 15, 2016 at 9:59 AM, Gregory Szorc <gregory.szorc at gmail.com> wrote:
> On Mon, Nov 14, 2016 at 3:27 PM, Durham Goode <durham at fb.com> wrote:
>>
>> # HG changeset patch
>> # User Durham Goode <durham at fb.com>
>> # Date 1479165447 28800
>> #      Mon Nov 14 15:17:27 2016 -0800
>> # Node ID 27209d52a5865422c5ef4ba05cedb28ce32919ed
>> # Parent  046a7e828ea63ec940ffae1089a33fae7954da2e
>> manifest: make revlog verification optional
>>
>> This patch adds a parameter to manifestlog.get() to disable hash
>> checking.
>> This will be used in an upcoming patch to support treemanifestctx reading
>> sub-trees without loading them from the revlog. (This is already supported
>> but does not go through the manifestlog.get() code path.)
>
>
> I could leverage this on the base revlog class because `hg debugupgraderepo`
> can spend >50% of its CPU time doing SHA-1 verification (most of that when
> converting manifests - there are tens of gigabytes of raw manifest text that
> needs to be hashed when converting the revlog). That could be a follow-up,
> of course.

I suspect you thought the "revlog verification" was about hash
verification. So did I when I read the subject line. I was confused
about why that would be relevant, but then I looked at the patch and
forgot about it: the new flag only skips the check that the node exists
in the revlog. It's queued already, so it doesn't seem worth updating
it at this point.
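
To make the distinction concrete, here is a minimal sketch of the two
kinds of check. It is not Mercurial's code: the function names are
hypothetical, and the hash layout (SHA-1 over the two sorted parent ids
followed by the text) is my understanding of how revlog nodes are
computed.

```python
import hashlib

NULLID = b"\0" * 20  # all-zero id, standing in for "no parent"


def revlog_hash(text, p1=NULLID, p2=NULLID):
    """SHA-1 over the sorted parent ids followed by the revision text."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


def node_exists(nodemap, node):
    """Existence check: an index lookup, no revision text is read."""
    return node in nodemap


def hash_matches(text, node, p1=NULLID, p2=NULLID):
    """Hash verification: rehash the full text and compare to the node."""
    return revlog_hash(text, p1, p2) == node


# Illustrative data only: a one-line manifest-like text and a tiny index.
text = b"a.txt\x00" + b"0" * 40 + b"\n"
node = revlog_hash(text)
nodemap = {node: 0}
```

The patch only makes the first kind of check optional. The second kind,
which costs time proportional to the text size, is what dominates the
`hg debugupgraderepo` case Greg describes.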

>
>>
>>
>> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
>> --- a/mercurial/manifest.py
>> +++ b/mercurial/manifest.py
>> @@ -1278,9 +1278,12 @@ class manifestlog(object):
>>          """
>>          return self.get('', node)
>>
>> -    def get(self, dir, node):
>> +    def get(self, dir, node, verify=True):
>>          """Retrieves the manifest instance for the given node. Throws a
>>          LookupError if not found.
>> +
>> +        `verify` - if True an exception will be thrown if the node is not in
>> +                   the revlog
>>          """
>>          if node in self._dirmancache.get(dir, ()):
>>              cachemf = self._dirmancache[dir][node]
>> @@ -1292,19 +1295,21 @@ class manifestlog(object):
>>
>>          if dir:
>>              if self._revlog._treeondisk:
>> -                dirlog = self._revlog.dirlog(dir)
>> -                if node not in dirlog.nodemap:
>> -                    raise LookupError(node, dirlog.indexfile,
>> -                                      _('no node'))
>> +                if verify:
>> +                    dirlog = self._revlog.dirlog(dir)
>> +                    if node not in dirlog.nodemap:
>> +                        raise LookupError(node, dirlog.indexfile,
>> +                                          _('no node'))
>>                  m = treemanifestctx(self._repo, dir, node)
>>              else:
>>                  raise error.Abort(
>>                          _("cannot ask for manifest directory '%s' in a flat "
>>                            "manifest") % dir)
>>          else:
>> -            if node not in self._revlog.nodemap:
>> -                raise LookupError(node, self._revlog.indexfile,
>> -                                  _('no node'))
>> +            if verify:
>> +                if node not in self._revlog.nodemap:
>> +                    raise LookupError(node, self._revlog.indexfile,
>> +                                      _('no node'))
>>              if self._treeinmem:
>>                  m = treemanifestctx(self._repo, '', node)
>>              else:
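
For readers skimming the diff, the control flow of the new flag can be
sketched with a toy model. The class names below are hypothetical
stand-ins, a dict of sets replaces the real revlog nodemaps, and the
built-in LookupError replaces Mercurial's own error class.

```python
class ToyManifestCtx(object):
    """Hypothetical stand-in for manifestctx/treemanifestctx."""

    def __init__(self, dir, node):
        self.dir = dir
        self.node = node


class ToyManifestLog(object):
    """Toy model of manifestlog.get() with the optional existence check."""

    def __init__(self, nodemaps):
        # dir -> set of known nodes; stands in for the per-directory revlogs
        self._nodemaps = nodemaps

    def get(self, dir, node, verify=True):
        # With verify=True a missing node fails immediately; with
        # verify=False the context is handed back without touching the index.
        if verify and node not in self._nodemaps.get(dir, ()):
            raise LookupError("no node %r in %r" % (node, dir))
        return ToyManifestCtx(dir, node)


mlog = ToyManifestLog({'': {b'known'}})
ctx = mlog.get('lib/', b'unknown', verify=False)  # no lookup, no error
```

With the default verify=True, the same call for an unknown node raises
LookupError, which matches the behaviour before the patch.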
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel at mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

