[PATCH] revlog: skeleton support for version 2 revlogs

Gregory Szorc gregory.szorc at gmail.com
Fri May 19 05:14:38 UTC 2017


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1495170858 25200
#      Thu May 18 22:14:18 2017 -0700
# Node ID 5fd154712c0ae7876c3a6427ae33acd5c2afea5e
# Parent  cf5f7044094cc79de3e80efcd0bc5d22630c90dd
revlog: skeleton support for version 2 revlogs

There are a number of improvements we want to make to revlogs
that will require a new version - version 2. It is unclear what the
full set of improvements will be or when we'll be done with them.
What I do know is that the process will likely take longer than a
single release, will require input from various stakeholders to
evaluate changes, and will involve many contentious debates and
much bikeshedding.

It is unrealistic to develop revlog version 2 up front: there
are just too many unknowns that we can't resolve until things
are implemented and experiments are run. Some changes will also
be invasive and prone to bit rot, so sitting on dozens of patches
is not practical.

This commit introduces skeleton support for version 2 revlogs in
a way that is flexible and not bound by backwards compatibility
concerns.

An experimental repo requirement for denoting revlog v2 has been
added. The requirement string has a sub-version component to it.
This will allow us to declare multiple requirements in the course
of developing revlog v2. Whenever we change the in-development
revlog v2 format, we can tweak the string, creating a new
requirement and locking out old clients. This will allow us to
make as many backwards-incompatible changes and experiments to
revlog v2 as we want. In other words, we can land code and make
meaningful progress towards revlog v2 while still maintaining
extreme format flexibility up until the point we freeze the
format and remove the experimental labels.
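
As a rough illustration of the lock-out (a minimal sketch, not
Mercurial's actual requirements code): a client refuses any repo
whose requirements it does not recognize, so bumping the sub-version
is enough to exclude older clients. The names check_requirements,
SUPPORTED and UnsupportedRequirement are hypothetical; only the
requirement string comes from this patch.

```python
# Hypothetical stand-ins for illustration; not Mercurial's real API.
REVLOGV2_REQUIREMENT = 'exp-revlogv2.0'  # string introduced by this patch

# Requirements that this (imaginary) client build understands.
SUPPORTED = {'revlogv1', 'generaldelta', 'store', REVLOGV2_REQUIREMENT}


class UnsupportedRequirement(Exception):
    """Raised when a repo needs a feature this client does not know."""


def check_requirements(repo_requirements, supported=SUPPORTED):
    """Return True if every repo requirement is supported, else raise."""
    missing = sorted(set(repo_requirements) - set(supported))
    if missing:
        raise UnsupportedRequirement(
            'repository requires unknown features: %s' % ', '.join(missing))
    return True


# A repo written with the current experimental format opens fine.
check_requirements({'store', 'exp-revlogv2.0'})

# A repo written after the string was bumped is rejected by this client.
try:
    check_requirements({'store', 'exp-revlogv2.1'})
except UnsupportedRequirement as exc:
    print(exc)
```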

To enable the new repo requirement, you must supply an experimental
and undocumented config option. But not just any boolean flag
will do: you need to explicitly use a value that no sane person
should ever type. This is an additional guard against enabling
revlog v2 on an installation it shouldn't be enabled on. The
specific scenario I'm trying to prevent is, say, a user with a
4.4 client (with a frozen format) enabling the option, then
downgrading to 4.3 and accidentally creating repos with an
outdated and unsupported repo format. Requiring a "challenge"
string should prevent this.
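
The check can be sketched roughly as follows. This is a simplified
stand-in for the newreporequirements() hunk in this patch, not the
real function: new_requirements and its base argument are
hypothetical, while the challenge value and requirement names are
the ones used in the diff.

```python
# Challenge value and requirement string as they appear in this patch.
CHALLENGE = 'enable-unstable-format-and-corrupt-my-data'
REVLOGV2_REQUIREMENT = 'exp-revlogv2.0'


def new_requirements(config_value,
                     base=('revlogv1', 'generaldelta', 'store')):
    """Compute requirements for a new repo (hypothetical helper).

    config_value stands in for the raw string read from the
    experimental.revlogv2 config option (None if unset).
    """
    requirements = set(base)
    # Only the exact challenge string opts in. Ordinary boolean
    # spellings such as 'true', 'yes' or '1' do nothing.
    if config_value == CHALLENGE:
        requirements.discard('revlogv1')
        # generaldelta is implied by revlogv2.
        requirements.discard('generaldelta')
        requirements.add(REVLOGV2_REQUIREMENT)
    return requirements


for value in (None, 'true', '1', CHALLENGE):
    print(repr(value), sorted(new_requirements(value)))
```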

Because the format is not yet finalized and I don't want to take
any chances, revlog v2's version is currently 0xDEAD. I figure
squatting on a value we're likely never to use as an actual revlog
version to mean "internal testing only" is acceptable. And
"dead" is easily recognized as something meaningful.
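
For reference, a minimal sketch of how the 32-bit header splits into
a format number and feature flags, and how the dummy version is
validated. It assumes the format sits in the low 16 bits and the
flags in the high 16 bits. The constants mirror this patch;
split_header and validate_header are hypothetical helpers that raise
ValueError instead of RevlogError and omit the v0 and v1 flag checks.

```python
REVLOGV1 = 1
REVLOGV2 = 0xDEAD  # dummy value until the format is finalized
FLAG_INLINE_DATA = 1 << 16
FLAG_GENERALDELTA = 1 << 17
REVLOGV2_FLAGS = FLAG_INLINE_DATA | FLAG_GENERALDELTA


def split_header(v):
    """Split a 32-bit header into (format, flags)."""
    return v & 0xFFFF, v & ~0xFFFF


def validate_header(v):
    """Return (format, flags), or raise ValueError for bad headers."""
    fmt, flags = split_header(v)
    if fmt == REVLOGV2:
        if flags & ~REVLOGV2_FLAGS:
            raise ValueError('unknown flags %#06x for version 2 revlogs'
                             % (flags >> 16))
    elif fmt > REVLOGV1:
        raise ValueError('unknown format %d' % fmt)
    return fmt, flags


# The header written for version 2 revlogs in this patch.
header = REVLOGV2 | FLAG_GENERALDELTA | FLAG_INLINE_DATA
fmt, flags = validate_header(header)
print(hex(fmt), hex(flags))
```

Checking for REVLOGV2 before the "fmt > REVLOGV1" bounds check is
what lets 0xDEAD through while every other unknown format is still
rejected.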

There is a bunch of cleanup that is needed before work on revlog
v2 begins in earnest. I plan on doing that work once this patch
is accepted and we're comfortable with the idea of starting down
this path.

diff --git a/mercurial/help/internals/revlogs.txt b/mercurial/help/internals/revlogs.txt
--- a/mercurial/help/internals/revlogs.txt
+++ b/mercurial/help/internals/revlogs.txt
@@ -46,6 +46,10 @@ 1
    RevlogNG (*next generation*). It replaced version 0 when it was
    implemented in 2006.
 
+57005 (0xdead)
+   Reserved for internal testing of new versions. No defined format
+   beyond 32-bit header.
+
 The feature flags short consists of bit flags. Where 0 is the least
 significant bit, the following bit offsets define flags:
 
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -245,6 +245,10 @@ class locallegacypeer(localpeer):
     def changegroupsubset(self, bases, heads, source):
         return changegroup.changegroupsubset(self._repo, bases, heads, source)
 
+# Increment the sub-version when the revlog v2 format changes to lock out old
+# clients.
+REVLOGV2_REQUIREMENT = 'exp-revlogv2.0'
+
 class localrepository(object):
 
     supportedformats = {
@@ -252,6 +256,7 @@ class localrepository(object):
         'generaldelta',
         'treemanifest',
         'manifestv2',
+        REVLOGV2_REQUIREMENT,
     }
     _basesupported = supportedformats | {
         'store',
@@ -441,6 +446,10 @@ class localrepository(object):
             if r.startswith('exp-compression-'):
                 self.svfs.options['compengine'] = r[len('exp-compression-'):]
 
+        # TODO move "revlogv2" to openerreqs once finalized.
+        if REVLOGV2_REQUIREMENT in self.requirements:
+            self.svfs.options['revlogv2'] = True
+
     def _writerequirements(self):
         scmutil.writerequires(self.vfs, self.requirements)
 
@@ -2059,4 +2068,11 @@ def newreporequirements(repo):
     if ui.configbool('experimental', 'manifestv2', False):
         requirements.add('manifestv2')
 
+    revlogv2 = ui.config('experimental', 'revlogv2')
+    if revlogv2 == 'enable-unstable-format-and-corrupt-my-data':
+        requirements.remove('revlogv1')
+        # generaldelta is implied by revlogv2.
+        requirements.discard('generaldelta')
+        requirements.add(REVLOGV2_REQUIREMENT)
+
     return requirements
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -46,12 +46,16 @@ from . import (
 # revlog header flags
 REVLOGV0 = 0
 REVLOGV1 = 1
+# Dummy value until file format is finalized.
+# Reminder: change the bounds check in revlog.__init__ when this is changed.
+REVLOGV2 = 0xDEAD
 FLAG_INLINE_DATA = (1 << 16)
 FLAG_GENERALDELTA = (1 << 17)
 REVLOG_DEFAULT_FLAGS = FLAG_INLINE_DATA
 REVLOG_DEFAULT_FORMAT = REVLOGV1
 REVLOG_DEFAULT_VERSION = REVLOG_DEFAULT_FORMAT | REVLOG_DEFAULT_FLAGS
 REVLOGV1_FLAGS = FLAG_INLINE_DATA | FLAG_GENERALDELTA
+REVLOGV2_FLAGS = REVLOGV1_FLAGS
 
 # revlog index flags
 REVIDX_ISCENSORED = (1 << 15) # revision has censor metadata, must be verified
@@ -286,7 +290,10 @@ class revlog(object):
         v = REVLOG_DEFAULT_VERSION
         opts = getattr(opener, 'options', None)
         if opts is not None:
-            if 'revlogv1' in opts:
+            if 'revlogv2' in opts:
+                # version 2 revlogs always use generaldelta.
+                v = REVLOGV2 | FLAG_GENERALDELTA | FLAG_INLINE_DATA
+            elif 'revlogv1' in opts:
                 if 'generaldelta' in opts:
                     v |= FLAG_GENERALDELTA
             else:
@@ -332,6 +339,11 @@ class revlog(object):
         elif fmt == REVLOGV1 and flags & ~REVLOGV1_FLAGS:
             raise RevlogError(_("index %s unknown flags %#04x for revlogng")
                               % (self.indexfile, flags >> 16))
+        elif fmt == REVLOGV2:
+            if flags & ~REVLOGV2_FLAGS:
+                raise RevlogError(_('index %s unknown flags %04x for version '
+                                    '2 revlogs') %
+                                  (self.indexfile, flags >> 16))
         elif fmt > REVLOGV1:
             raise RevlogError(_("index %s unknown format %d")
                               % (self.indexfile, fmt))

