[PATCH 1 of 3 V3] revlog: raise an exception earlier if an entry is too large (issue4675)

Jordi Gutiérrez Hermoso jordigh at octave.org
Tue Jun 2 14:07:08 CDT 2015

# HG changeset patch
# User Jordi Gutiérrez Hermoso <jordigh at octave.org>
# Date 1433271879 14400
#      Tue Jun 02 15:04:39 2015 -0400
# Node ID b7f66f13bd44581a107f2cb8f5655ea05aa06245
# Parent  eb52de500d2a308761b65bc9efaf85272c27eca5
revlog: raise an exception earlier if an entry is too large (issue4675)

Previously we relied on _pack to raise an error when it was passed an
integer too large for the "i" format specifier. Now we check the size
earlier so that we can produce a better error message.
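
For illustration only, here is a minimal standalone sketch of the
behaviour described above, written against plain struct rather than
Mercurial itself. The names INDEX_FORMAT, MAX_ENTRY_SIZE, EntryTooLarge,
pack_entry and make_entry are hypothetical stand-ins (for indexformatng,
_maxentrysize, RevlogError and revlogio.packentry), and the sample
entry values are made up.

```python
import struct

# Same layout string as indexformatng in the patch.
INDEX_FORMAT = ">Qiiiiii20s12x"
# Largest value that fits in a 4-byte signed "i" field.
MAX_ENTRY_SIZE = 0x7fffffff


class EntryTooLarge(Exception):
    """Hypothetical stand-in for RevlogError."""


def pack_entry(entry):
    # entry[2] is the uncompressed length, as in the patch.
    uncompressed_length = entry[2]
    if uncompressed_length > MAX_ENTRY_SIZE:
        # Fail early with a readable message instead of letting
        # struct.pack raise a generic struct.error.
        raise EntryTooLarge(
            "size of %d bytes exceeds maximum revlog storage of %d"
            % (uncompressed_length, MAX_ENTRY_SIZE))
    return struct.pack(INDEX_FORMAT, *entry)


def make_entry(uncompressed_length):
    # Made-up sample values: one "Q" field, six "i" fields, and a
    # 20-byte node id, matching the eight values the format expects.
    return (0, 10, uncompressed_length, 0, 0, -1, -1, b"\0" * 20)


if __name__ == "__main__":
    too_big = make_entry(MAX_ENTRY_SIZE + 1)
    try:
        struct.pack(INDEX_FORMAT, *too_big)
    except struct.error as exc:
        print("without the check: struct.error: %s" % exc)
    try:
        pack_entry(too_big)
    except EntryTooLarge as exc:
        print("with the check: %s" % exc)
```

An entry whose uncompressed length is exactly MAX_ENTRY_SIZE still
packs normally; only larger values reach the early check.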

Unfortunately, the error message cannot include the filename at this
level of the call stack. The name is not available here, and the
error can be triggered either by a large manifest or by a large file
itself. We could perhaps provide the name of a revlog index file
(from the revlog object, instead of the revlogio object), but this
seems like too much leakage of internal data structures. It is
already not ideal that an error message mentions revlogs at all, but
this seems unavoidable here.

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -153,6 +153,10 @@ indexformatng = ">Qiiiiii20s12x"
 ngshaoffset = 32
 versionformat = ">I"
+# corresponds to uncompressed length of indexformatng (2 gigs, 4-byte
+# signed integer)
+_maxentrysize = 0x7fffffff
 class revlogio(object):
     def __init__(self):
         self.size = struct.calcsize(indexformatng)
@@ -163,6 +167,12 @@ class revlogio(object):
         return index, getattr(index, 'nodemap', None), cache
     def packentry(self, entry, node, version, rev):
+        uncompressedlength = entry[2]
+        if uncompressedlength > _maxentrysize:
+            raise RevlogError(
+                _("size of %d bytes exceeds maximum revlog storage of %d")
+                % (uncompressedlength, _maxentrysize))
         p = _pack(indexformatng, *entry)
         if rev == 0:
             p = _pack(versionformat, version) + p[4:]

