[PATCH 1 of 3 V2] revlog: raise an exception earlier if an entry is too large

Yuya Nishihara yuya at tcha.org
Fri May 29 09:16:41 CDT 2015


On Thu, 28 May 2015 16:48:58 -0400, Jordi GutiƩrrez Hermoso wrote:
> # HG changeset patch
> # User Jordi GutiƩrrez Hermoso <jordigh at octave.org>
> # Date 1432819730 14400
> #      Thu May 28 09:28:50 2015 -0400
> # Node ID 1f8879ee5b6e98b832caefc9ceb1e506b0b23ec4
> # Parent  bcb17d7dbec25088eaec5e4d34dedbd7057c5d68
> revlog: raise an exception earlier if an entry is too large
> 
> Before we were relying on _pack to error out when trying to pass an
> integer that was too large for the "i" format specifier. Now we check
> this earlier so we can form a better error message.
> 
> The error message unfortunately must exclude the filename at this
> level of the call stack. The problem is that this name is not
> available here, and the error can be triggered by a large manifest or
> by a large file itself. Although perhaps we could provide the name of
> a revlog index file (from the revlog object, instead of the revlogio
> object), this seems like too much leakage of internal data structures.
> It's not ideal already that an error message even mentions revlogs,
> but this does seem unavoidable here.
> 
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -153,6 +153,10 @@ indexformatng = ">Qiiiiii20s12x"
>  ngshaoffset = 32
>  versionformat = ">I"
>  
> +# corresponds to uncompressed length of indexformatng (2 gigs, 4-byte
> +# signed integer)
> +_maxentrysize = 0x7fffffff
> +
>  class revlogio(object):
>      def __init__(self):
>          self.size = struct.calcsize(indexformatng)
> @@ -163,6 +167,11 @@ class revlogio(object):
>          return index, getattr(index, 'nodemap', None), cache
>  
>      def packentry(self, entry, node, version, rev):
> +        # uncompressed length
> +        if entry[2] > _maxentrysize:
> +            raise RevlogError(
> +                _("maximum revlog storage (%d) exceeded" % _maxentrysize))

It should be _("... %d ...") % _maxentrysize.

BTW, I found a similar message that says "32 bits" with the actual value.

    if abs(when) > 0x7fffffff:
        raise Abort(_('date exceeds 32 bits: %d') % when)


More information about the Mercurial-devel mailing list