[PATCH RFC need advice] changegroup: introduce cg3, which has support for exchanging treemanifests

Gregory Szorc gregory.szorc at gmail.com
Thu Oct 15 13:07:40 CDT 2015


On Thu, Oct 15, 2015 at 7:57 AM, Augie Fackler <raf at durin42.com> wrote:

> # HG changeset patch
> # User Augie Fackler <augie at google.com>
> # Date 1444327710 14400
> #      Thu Oct 08 14:08:30 2015 -0400
> # Node ID 890de0a21cd7cfdceaef55564d6ac010ad27c32a
> # Parent  83b738235f1bfc33a943a63a5d0f5527ac2da736
> changegroup: introduce cg3, which has support for exchanging treemanifests
>
> Open questions:
>   1) Are we happy with just a 't' or 'f' for flagging type of manifest?
>

This is a binary format. So we have 256 values to play with here. I think
there is room for expansion. Only question would be whether you are using
't' and 'f' instead of say 0x01 and 0x02. But that's cosmetic.


>   2) How can we not have the grotesque requirements hack during apply()?
>

I have a similar issue for stream clones. Currently, the "perform a stream
clone" code updates the requirements when data comes in. Even in my RFC
stream clone bundle series, this isn't done at bundle application though:
only as part of consuming a stream clone from the wire protocol. It does,
however, verify the incoming requirements are supported.

Both of us seem to be grappling with the problem that "unbundle" can occur
both during clone and during an incremental "pull." It's almost like we
need an "intent" to the unbundler that refuses to perform clone-time
actions (like requirements adjusting) on post-clone unbundles.


>
> Many thanks to adgar at google.com for helping me with various odd
> corners of the API.
>
> diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
> --- a/mercurial/changegroup.py
> +++ b/mercurial/changegroup.py
> @@ -8,6 +8,7 @@
>  from __future__ import absolute_import
>
>  import os
> +import posixpath
>  import struct
>  import tempfile
>  import weakref
> @@ -496,6 +497,53 @@ class cg2unpacker(cg1unpacker):
>          node, p1, p2, deltabase, cs = headertuple
>          return node, p1, p2, deltabase, cs
>
> +class cg3unpacker(cg2unpacker):
> +    deltaheader = _CHANGEGROUPV2_DELTA_HEADER
> +    deltaheadersize = struct.calcsize(deltaheader)
> +    version = '03'
> +
> +    def __init__(self, *args, **kwargs):
> +        super(cg3unpacker, self).__init__(*args, **kwargs)
> +        self._trees = False
> +
> +    def manifestheader(self):
> +        """Read the manifest header.
> +
> +        cg3 supports treemanifests as well as flatmanifests, so we
> +        need to read the header for manifests and find out which we
> +        should read.
> +        """
> +        flagsize = self._chunklength()
> +        assert flagsize == 1, 'flag size should be 1'
> +        flag = readexactly(self, 1)
>
+        if flag not in 'ft':
> +            raise error.Abort(
> +                'Unknown manifest format %r in cg3unpacker.' % flag)
> +        return flag
> +
> +    def _unpackmanifests(self, repo, revmap, trp, prog, numchanges):
> +        # TODO: include number of manifest nodes in header for better
> +        # progress?
> +        flag = self.manifestheader()
> +        if flag == 't':
> +            # TODO this is applying the requirements far too late :(
> +            repo.requirements.add('treemanifest')
> +            repo._applyopenerreqs()
> +            repo._writerequirements()
> +        self.callback = prog(_('manifests'), numchanges)
> +        repo.manifest.addgroup(self, revmap, trp)
> +        dirlog = repo.manifest.dirlog
> +        if flag == 't':
> +            while True:
> +                # TODO consider not (ab)using fileheader for
> +                # treemanifest headers
> +                hdr = self.filelogheader()
> +                if not hdr:
> +                    break
> +                dirname = hdr['filename']
> +                dl = dirlog(dirname)
> +                dl.addgroup(self, revmap, trp)
> +
>  class headerlessfixup(object):
>      def __init__(self, fh, h):
>          self._h = h
> @@ -593,8 +641,9 @@ class cg1packer(object):
>          rr, rl = revlog.rev, revlog.linkrev
>          return [n for n in missing if rl(rr(n)) not in commonrevs]
>
> -    def _packmanifests(self, mfnodes, lookuplinknode):
> +    def _packmanifests(self, mfnodes, tmfnodes, lookuplinknode):
>          """Pack flat manifests into a changegroup stream."""
> +        assert not tmfnodes
>          ml = self._repo.manifest
>          size = 0
>          for chunk in self.group(
> @@ -611,6 +660,7 @@ class cg1packer(object):
>
>          clrevorder = {}
>          mfs = {} # needed manifests
> +        tmfnodes = {}
>          fnodes = {} # needed file nodes
>          changedfiles = set()
>
> @@ -648,6 +698,11 @@ class cg1packer(object):
>          # simply take the slowpath, which already has the 'clrevorder'
> logic.
>          # This was also fixed in cc0ff93d0c0c.
>          fastpathlinkrev = fastpathlinkrev and not self._reorder
> +        # Treemanifests don't work correctly with fastpathlinkrev
> +        # either, because we don't discover which directory nodes to
> +        # send along with files. This could probably be fixed.
> +        fastpathlinkrev = fastpathlinkrev and (
> +            'treemanifest' not in self._repo.requirements)
>          # Callback for the manifest, used to collect linkrevs for filelog
>          # revisions.
>          # Returns the linkrev node (collected in lookupcl).
> @@ -663,10 +718,26 @@ class cg1packer(object):
>                          fclnode = fclnodes.setdefault(n, clnode)
>                          if clrevorder[clnode] < clrevorder[fclnode]:
>                              fclnodes[n] = clnode
> +                        # gather list of changed treemanifest nodes
> +                        if '/' in f and ('treemanifest' in
> +                                         self._repo.requirements):
> +                            dirs = f.split('/')[:-1]
> +                            submf = mdata
> +                            path = []
> +                            for sd in dirs:
> +                                submf = submf._dirs[sd + '/']
> +                                fullpath = posixpath.join(*(path + [sd]))
> +                                path.append(sd)
> +                                tmfclnodes = tmfnodes.setdefault(
> +                                    fullpath, {})
> +                                tmfclnodes.setdefault(submf.node(),
> clnode)
> +                                if clrevorder[clnode] <
> clrevorder[fclnode]:
> +                                    tmfclnodes[n] = clnode
>              return clnode
>
>          mfnodes = self.prune(ml, mfs, commonrevs)
> -        for x in self._packmanifests(mfnodes, lookupmflinknode):
> +        for x in self._packmanifests(
> +            mfnodes, tmfnodes, lookupmflinknode):
>              yield x
>
>          mfs.clear()
> @@ -784,9 +855,38 @@ class cg2packer(cg1packer):
>      def builddeltaheader(self, node, p1n, p2n, basenode, linknode):
>          return struct.pack(self.deltaheader, node, p1n, p2n, basenode,
> linknode)
>
> +class cg3packer(cg2packer):
> +    version = '03'
> +    deltaheader = _CHANGEGROUPV2_DELTA_HEADER
> +
> +    def _packmanifests(self, mfnodes, tmfnodes, lookuplinknode):
> +        yield chunkheader(1)
> +        if 'treemanifest' not in self._repo.requirements:
> +            yield 'f'
> +            assert not tmfnodes
> +            for x in super(cg3packer, self)._packmanifests(
> +                    mfnodes, {}, lookuplinknode):
> +                yield x
> +            return
> +        yield 't'
> +        for x in super(cg3packer, self)._packmanifests(
> +            mfnodes, {}, lookuplinknode):
> +            yield x
> +        dirlog = self._repo.manifest.dirlog
> +        for name, nodes in tmfnodes.iteritems():
> +            # TODO consider defining a separate method for this
> +            yield self.fileheader(name)
> +            dl = dirlog(name)
> +            for chunk in self.group(nodes, dl, nodes.get):
> +                yield chunk
> +        yield self.close()
> +
> +
>  packermap = {'01': (cg1packer, cg1unpacker),
>               # cg2 adds support for exchanging generaldelta
>               '02': (cg2packer, cg2unpacker),
> +             # cg3 adds support for exchanging treemanifests
> +             '03': (cg3packer, cg3unpacker),
>  }
>
>  def _changegroupinfo(repo, nodes, source):
> diff --git a/tests/test-treemanifest.t b/tests/test-treemanifest.t
> --- a/tests/test-treemanifest.t
> +++ b/tests/test-treemanifest.t
> @@ -384,3 +384,45 @@ within that.
>    $ mv oldmf2 .hg/store/meta/b/foo
>    $ mv oldmf3 .hg/store/meta/b/bar/orange
>
> +Test cloning a treemanifest repo over http.
> +  $ hg serve -p $HGPORT -d --pid-file=hg.pid --errorlog=errors.log
> +  $ cat hg.pid >> $DAEMON_PIDS
> +  $ cd ..
> +  $ hg clone http://localhost:$HGPORT deepclone
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 1 changesets with 8 changes to 8 files
> +  updating to branch default
> +  8 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +No server errors.
> +  $ cat deeprepo/errors.log
> +Tree manifest revlogs exist.
> +  $ find deepclone/.hg/store/meta | sort
> +  deepclone/.hg/store/meta
> +  deepclone/.hg/store/meta/a
> +  deepclone/.hg/store/meta/a/00manifest.i
> +  deepclone/.hg/store/meta/b
> +  deepclone/.hg/store/meta/b/00manifest.i
> +  deepclone/.hg/store/meta/b/bar
> +  deepclone/.hg/store/meta/b/bar/00manifest.i
> +  deepclone/.hg/store/meta/b/bar/orange
> +  deepclone/.hg/store/meta/b/bar/orange/00manifest.i
> +  deepclone/.hg/store/meta/b/bar/orange/fly
> +  deepclone/.hg/store/meta/b/bar/orange/fly/00manifest.i
> +  deepclone/.hg/store/meta/b/foo
> +  deepclone/.hg/store/meta/b/foo/00manifest.i
> +  deepclone/.hg/store/meta/b/foo/apple
> +  deepclone/.hg/store/meta/b/foo/apple/00manifest.i
> +  deepclone/.hg/store/meta/b/foo/apple/bees
> +  deepclone/.hg/store/meta/b/foo/apple/bees/00manifest.i
> +Verify passes.
> +  $ cd deepclone
> +  $ hg verify
> +  checking changesets
> +  checking manifests
> +  crosschecking files in changesets and manifests
> +  checking files
> +  8 files, 1 changesets, 8 total revisions
> +  $ cd ..
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20151015/7d5a8738/attachment.html>


More information about the Mercurial-devel mailing list