[PATCH V3] revlog: skeleton support for version 2 revlogs

Augie Fackler raf at durin42.com
Tue Jun 6 19:33:00 UTC 2017


On Mon, May 22, 2017 at 11:40:25PM -0700, Jun Wu wrote:
> I think this is an obvious good first step. So marked as pre-reviewed.
> I don't see better approaches to start moving forward.
>
> The choice of using a header magic that will change when the format gets
> formalized is brilliant.

Agreed. I've queued this, so let's go ahead and start making progress
here. Sorry for the very slow reply.

>
> Excerpts from Gregory Szorc's message of 2017-05-22 15:20:12 -0700:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc at gmail.com>
> > # Date 1495250951 25200
> > #      Fri May 19 20:29:11 2017 -0700
> > # Node ID 2af0a53c6f744a55dc8effaeb57185c0db30fd88
> > # Parent  f40dc6f7c12f36fb56dbadeb32a96ee25e6bacbb
> > revlog: skeleton support for version 2 revlogs
> >
> > There are a number of improvements we want to make to revlogs
> > that will require a new version - version 2. It is unclear what the
> > full set of improvements will be or when we'll be done with them.
> > What I do know is that the process will likely take longer than a
> > single release, will require input from various stakeholders to
> > evaluate changes, and will have many contentious debates and
> > bikeshedding.
> >
> > It is unrealistic to develop revlog version 2 up front: there
> > are just too many uncertainties that we won't know until things
> > are implemented and experiments are run. Some changes will also
> > be invasive and prone to bit rot, so sitting on dozens of patches
> > is not practical.
> >
> > This commit introduces skeleton support for version 2 revlogs in
> > a way that is flexible and not bound by backwards compatibility
> > concerns.
> >
> > An experimental repo requirement for denoting revlog v2 has been
> > added. The requirement string has a sub-version component to it.
> > This will allow us to declare multiple requirements in the course
> > of developing revlog v2. Whenever we change the in-development
> > revlog v2 format, we can tweak the string, creating a new
> > requirement and locking out old clients. This will allow us to
> > make as many backwards incompatible changes and experiments to
> > revlog v2 as we want. In other words, we can land code and make
> > meaningful progress towards revlog v2 while still maintaining
> > extreme format flexibility up until the point we freeze the
> > format and remove the experimental labels.
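
To illustrate the lock-out behaviour described above, here is a minimal sketch, not Mercurial's actual code path. The `SUPPORTED` set and the `missing_requirements` helper are hypothetical names made up for this example; only the requirement strings come from the patch.

```python
# Hypothetical stand-ins for illustration; not Mercurial's real API.
REVLOGV2_REQUIREMENT = 'exp-revlogv2.0'

# Requirements this (imaginary) client understands.
SUPPORTED = {
    'revlogv1', 'generaldelta', 'store', 'fncache', 'dotencode',
    REVLOGV2_REQUIREMENT,
}


def missing_requirements(requires_text, supported=SUPPORTED):
    """Return the sorted requirement names this client does not know."""
    found = {line.strip() for line in requires_text.splitlines()
             if line.strip()}
    return sorted(found - supported)


# A repo written after the sub-version was bumped is rejected by this
# client, because 'exp-revlogv2.1' is an entirely different string.
print(missing_requirements('store\nexp-revlogv2.1\n'))
```

Because requirements are compared as whole strings, bumping the suffix is enough to make an older client refuse the repo; no version parsing is needed.
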
> >
> > To enable the new repo requirement, you must supply an experimental
> > and undocumented config option. But not just any boolean flag
> > will do: you need to explicitly use a value that no sane person
> > should ever type. This is an additional guard against enabling
> > revlog v2 on an installation it shouldn't be enabled on. The
> > specific scenario I'm trying to prevent is, say, a user with a
> > 4.4 client with a frozen format enabling the option, then
> > downgrading to 4.3 and accidentally creating repos with an
> > outdated and unsupported repo format. Requiring a "challenge"
> > string should prevent this.
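
The challenge-string gate can be sketched as a standalone function. This mirrors the logic of the `newreporequirements` hunk in the patch, but the function name `new_repo_requirements`, its single argument, and the starting requirement set are assumptions made for this example (the real code reads the value through `ui.config`).

```python
CHALLENGE = 'enable-unstable-format-and-corrupt-my-data'
REVLOGV2_REQUIREMENT = 'exp-revlogv2.0'


def new_repo_requirements(config_value):
    """Compute requirements for a new repo (simplified, hypothetical)."""
    # Assumed defaults for a freshly created repo.
    requirements = {'revlogv1', 'generaldelta', 'store', 'fncache',
                    'dotencode'}
    # Only the exact challenge string opts in; 'true', '1', 'yes' do not.
    if config_value == CHALLENGE:
        requirements.remove('revlogv1')
        # generaldelta is implied by revlogv2.
        requirements.discard('generaldelta')
        requirements.add(REVLOGV2_REQUIREMENT)
    return requirements


print(sorted(new_repo_requirements('true')))
print(sorted(new_repo_requirements(CHALLENGE)))
```

A plain boolean such as `true` leaves the defaults untouched, so a client that later treats the option as an ordinary flag cannot be switched on by a stale config value by accident.
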
> >
> > Because the format is not yet finalized and I don't want to take
> > any chances, revlog v2's version is currently 0xDEAD. I figure
> > squatting on a value we're likely never to use as an actual revlog
> > version to mean "internal testing only" is acceptable. And
> > "dead" is easily recognized as something meaningful.
> >
> > There is a bunch of cleanup that is needed before work on revlog
> > v2 begins in earnest. I plan on doing that work once this patch
> > is accepted and we're comfortable with the idea of starting down
> > this path.
> >
> > diff --git a/mercurial/help/internals/revlogs.txt b/mercurial/help/internals/revlogs.txt
> > --- a/mercurial/help/internals/revlogs.txt
> > +++ b/mercurial/help/internals/revlogs.txt
> > @@ -45,6 +45,12 @@ 0
> >  1
> >     RevlogNG (*next generation*). It replaced version 0 when it was
> >     implemented in 2006.
> > +2
> > +   In-development version incorporating accumulated knowledge and
> > +   missing features from 10+ years of revlog version 1.
> > +57005 (0xdead)
> > +   Reserved for internal testing of new versions. No defined format
> > +   beyond 32-bit header.
> >
> >  The feature flags short consists of bit flags. Where 0 is the least
> >  significant bit, the following bit offsets define flags:
> > @@ -142,6 +148,14 @@ length from bytes 8-11 define how to acc
> >  The first 4 bytes of the revlog are shared between the revlog header
> >  and the 6 byte absolute offset field from the first revlog entry.
> >
> > +Version 2 Format
> > +================
> > +
> > +(In development. Format not finalized or stable.)
> > +
> > +Version 2 is currently identical to version 1. This will obviously
> > +change.
> > +
> >  Delta Chains
> >  ============
> >
> > diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> > --- a/mercurial/localrepo.py
> > +++ b/mercurial/localrepo.py
> > @@ -245,6 +245,10 @@ class locallegacypeer(localpeer):
> >      def changegroupsubset(self, bases, heads, source):
> >          return changegroup.changegroupsubset(self._repo, bases, heads, source)
> >
> > +# Increment the sub-version when the revlog v2 format changes to lock out old
> > +# clients.
> > +REVLOGV2_REQUIREMENT = 'exp-revlogv2.0'
> > +
> >  class localrepository(object):
> >
> >      supportedformats = {
> > @@ -252,6 +256,7 @@ class localrepository(object):
> >          'generaldelta',
> >          'treemanifest',
> >          'manifestv2',
> > +        REVLOGV2_REQUIREMENT,
> >      }
> >      _basesupported = supportedformats | {
> >          'store',
> > @@ -441,6 +446,10 @@ class localrepository(object):
> >              if r.startswith('exp-compression-'):
> >                  self.svfs.options['compengine'] = r[len('exp-compression-'):]
> >
> > +        # TODO move "revlogv2" to openerreqs once finalized.
> > +        if REVLOGV2_REQUIREMENT in self.requirements:
> > +            self.svfs.options['revlogv2'] = True
> > +
> >      def _writerequirements(self):
> >          scmutil.writerequires(self.vfs, self.requirements)
> >
> > @@ -2062,4 +2071,11 @@ def newreporequirements(repo):
> >      if ui.configbool('experimental', 'manifestv2', False):
> >          requirements.add('manifestv2')
> >
> > +    revlogv2 = ui.config('experimental', 'revlogv2')
> > +    if revlogv2 == 'enable-unstable-format-and-corrupt-my-data':
> > +        requirements.remove('revlogv1')
> > +        # generaldelta is implied by revlogv2.
> > +        requirements.discard('generaldelta')
> > +        requirements.add(REVLOGV2_REQUIREMENT)
> > +
> >      return requirements
> > diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> > --- a/mercurial/revlog.py
> > +++ b/mercurial/revlog.py
> > @@ -49,12 +49,16 @@ parsers = policy.importmod(r'parsers')
> >  # revlog header flags
> >  REVLOGV0 = 0
> >  REVLOGV1 = 1
> > +# Dummy value until file format is finalized.
> > +# Reminder: change the bounds check in revlog.__init__ when this is changed.
> > +REVLOGV2 = 0xDEAD
> >  FLAG_INLINE_DATA = (1 << 16)
> >  FLAG_GENERALDELTA = (1 << 17)
> >  REVLOG_DEFAULT_FLAGS = FLAG_INLINE_DATA
> >  REVLOG_DEFAULT_FORMAT = REVLOGV1
> >  REVLOG_DEFAULT_VERSION = REVLOG_DEFAULT_FORMAT | REVLOG_DEFAULT_FLAGS
> >  REVLOGV1_FLAGS = FLAG_INLINE_DATA | FLAG_GENERALDELTA
> > +REVLOGV2_FLAGS = REVLOGV1_FLAGS
> >
> >  # revlog index flags
> >  REVIDX_ISCENSORED = (1 << 15) # revision has censor metadata, must be verified
> > @@ -289,7 +293,10 @@ class revlog(object):
> >          v = REVLOG_DEFAULT_VERSION
> >          opts = getattr(opener, 'options', None)
> >          if opts is not None:
> > -            if 'revlogv1' in opts:
> > +            if 'revlogv2' in opts:
> > +                # version 2 revlogs always use generaldelta.
> > +                v = REVLOGV2 | FLAG_GENERALDELTA | FLAG_INLINE_DATA
> > +            elif 'revlogv1' in opts:
> >                  if 'generaldelta' in opts:
> >                      v |= FLAG_GENERALDELTA
> >              else:
> > @@ -339,6 +346,11 @@ class revlog(object):
> >                  raise RevlogError(_('unknown flags (%#04x) in version %d '
> >                                      'revlog %s') %
> >                                    (flags >> 16, fmt, self.indexfile))
> > +        elif fmt == REVLOGV2:
> > +            if flags & ~REVLOGV2_FLAGS:
> > +                raise RevlogError(_('unknown flags (%#04x) in version %d '
> > +                                    'revlog %s') %
> > +                                  (flags >> 16, fmt, self.indexfile))
> >          else:
> >              raise RevlogError(_('unknown version (%d) in revlog %s') %
> >                                (fmt, self.indexfile))
> > diff --git a/tests/test-revlog-v2.t b/tests/test-revlog-v2.t
> > new file mode 100644
> > --- /dev/null
> > +++ b/tests/test-revlog-v2.t
> > @@ -0,0 +1,62 @@
> > +A repo with unknown revlogv2 requirement string cannot be opened
> > +
> > +  $ hg init invalidreq
> > +  $ cd invalidreq
> > +  $ echo exp-revlogv2.unknown >> .hg/requires
> > +  $ hg log
> > +  abort: repository requires features unknown to this Mercurial: exp-revlogv2.unknown!
> > +  (see https://mercurial-scm.org/wiki/MissingRequirement for more information)
> > +  [255]
> > +  $ cd ..
> > +
> > +Can create and open repo with revlog v2 requirement
> > +
> > +  $ cat >> $HGRCPATH << EOF
> > +  > [experimental]
> > +  > revlogv2 = enable-unstable-format-and-corrupt-my-data
> > +  > EOF
> > +
> > +  $ hg init empty-repo
> > +  $ cd empty-repo
> > +  $ cat .hg/requires
> > +  dotencode
> > +  exp-revlogv2.0
> > +  fncache
> > +  store
> > +
> > +  $ hg log
> > +
> > +Unknown flags to revlog are rejected
> > +
> > +  >>> with open('.hg/store/00changelog.i', 'wb') as fh:
> > +  ...     fh.write('\x00\x04\xde\xad')
> > +
> > +  $ hg log
> > +  abort: unknown flags (0x04) in version 57005 revlog 00changelog.i!
> > +  [255]
> > +
> > +  $ cd ..
> > +
> > +Writing a simple revlog v2 works
> > +
> > +  $ hg init simple
> > +  $ cd simple
> > +  $ touch foo
> > +  $ hg -q commit -A -m initial
> > +
> > +  $ hg log
> > +  changeset:   0:96ee1d7354c4
> > +  tag:         tip
> > +  user:        test
> > +  date:        Thu Jan 01 00:00:00 1970 +0000
> > +  summary:     initial
> > +
> > +Header written as expected (changelog always disables generaldelta)
> > +
> > +  $ f --hexdump --bytes 4 .hg/store/00changelog.i
> > +  .hg/store/00changelog.i:
> > +  0000: 00 01 de ad                                     |....|
> > +
> > +  $ f --hexdump --bytes 4 .hg/store/data/foo.i
> > +  .hg/store/data/foo.i:
> > +  0000: 00 03 de ad                                     |....|
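
For anyone reading the hexdumps above, here is a small sketch of how the 4 header bytes split into a version and feature flags. It assumes the header is a big-endian 32-bit word with the version in the low 16 bits and the flags in the high 16 bits, which is consistent with the constants in the patch and with the test output; `parse_header` and `describe` are hypothetical helper names, not functions in revlog.py.

```python
import struct

REVLOGV2 = 0xDEAD
FLAG_INLINE_DATA = 1 << 16
FLAG_GENERALDELTA = 1 << 17
REVLOGV2_FLAGS = FLAG_INLINE_DATA | FLAG_GENERALDELTA


def parse_header(first4):
    """Split the first 4 bytes of an index file into (version, flags)."""
    (word,) = struct.unpack('>I', first4)
    return word & 0xFFFF, word & ~0xFFFF


def describe(first4):
    """Validate a v2 header and report which known flags are set."""
    fmt, flags = parse_header(first4)
    if fmt != REVLOGV2:
        raise ValueError('unknown version (%d)' % fmt)
    if flags & ~REVLOGV2_FLAGS:
        raise ValueError('unknown flags (%#04x) in version %d'
                         % (flags >> 16, fmt))
    return {'inline': bool(flags & FLAG_INLINE_DATA),
            'generaldelta': bool(flags & FLAG_GENERALDELTA)}


print(describe(b'\x00\x01\xde\xad'))  # changelog header from the test
print(describe(b'\x00\x03\xde\xad'))  # filelog header from the test
```

Under these assumptions, `00 01 de ad` is inline data without generaldelta, `00 03 de ad` has both flags set, and `00 04 de ad` trips the unknown-flags check, matching the `0x04` in the expected abort message.
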

