[PATCH 2 of 3] bundle: add config option to include phases

Martin von Zweigbergk martinvonz at google.com
Sun Jun 18 01:12:44 EDT 2017


On Sat, Jun 17, 2017 at 4:46 PM, Pierre-Yves David
<pierre-yves.david at ens-lyon.org> wrote:
>
>
> On 06/17/2017 07:19 AM, Martin von Zweigbergk via Mercurial-devel wrote:
>>
>> # HG changeset patch
>> # User Martin von Zweigbergk <martinvonz at google.com>
>> # Date 1497460519 25200
>> #      Wed Jun 14 10:15:19 2017 -0700
>> # Node ID 792544200a5a7e21bcc8646604d85b62db277d79
>> # Parent  123eb8b3b913585cc0a29c14c232ab6fc78978b3
>> bundle: add config option to include phases
>>
>> This adds an experimental.bundle-phases config option to include phase
>> information in bundles. As with the recently added support for
>> bundling obsmarkers, the support for bundling phases is hidden behind
>> the config option until we decide to make a bundle format v3 that
>> includes phases (and obsmarkers and ...).
>>
>> We cannot use the same listkeys bundle part as is used on the wire,
>> because we want different semantics. Specifically, we never want the
>> unbundled phases to override the phase of an existing changeset. This
>> is different from exchange where a phase advancement on the remote
>> will propagate to the local repo. For the non-exchange usecase, we
>> instead want to be able to bundle a draft changeset, change its parent
>> to secret and not have the unbundling reset the parent's phase back to
>> draft. For this reason, we use a new bundle part instead. The new
>> bundle part contains the phase roots that are internal to the set of
>> bundled revisions. It also contains the phase for each root of the set
>> of bundled revisions.
>
>
> `hg bundle` primary goal and usage is to allow exchange of data between
> repositories (according usage[1] and doc[2][3]). So moving away from the
> standard phase semantic seems problematic. I think we should stick to that
> usual behavior of "move to lowest phase seen".

I did note while working on this that it this was the first bundle
part that was not designed for exchange. So I did notice that it was
unusual in that way, but I didn't see anything problematic about it.
But let's put that discussion on hold until we have consensus on the
right user-visible behavior for this. If the consensus is that
unbundling should be allowed to affect the phase of changesets already
in the repo, I'll happily switch to using the listkeys part.

>
> In addition this semantic will provide challenging consistency. The unbundle
> behavior can see large changes implied by small "unrelated" details. Here is
> a quick example (I have others):
>
>  * bundle X contains changeset (A,B,C) all public.
>  * Alice repo contains (A, B, C) all draft
>  * Bob   repo contains (A, B) all draft
>
> With the proposed semantic running `hg unbundle X`:
>
>  * in Alice repository: no changes,
> * in Bob repository: C is added (as public), all of A,B,C is marked public.

No, the proposed semantics is to leave the phase on existing
changesets unchanged ("Specifically, we never want the unbundled
phases to override the phase of an existing changeset."). This is
covered by the test called "Phase of parent of stripped node doesn't
get decreased to accomodate child; instead, the child adapts by
increasing its phase".

>
> I'm not sure what is the actual usecase for "bundle; backward phase change;
> unbundle" do you have anything in mind? The "draft" → "secret" movement is
> not natural and mercurial does not provide clean ways to do it (requires
> --force) one should not expect it to be preserved.

I have no actual usecase in mind. It's more a consequence of the
principle stated above about not affecting the phase of a changeset
already in the repo.

>
> I could keep going for a while, but Martin requested a short feedback ;-)

Thanks!

>
> [1] default to compare with another repositories:
>
> https://www.mercurial-scm.org/repo/hg/file/29558247b00e/mercurial/commands.py#l1247
> [2]
> https://www.mercurial-scm.org/repo/hg/file/29558247b00e/mercurial/commands.py#l1250
> [3]
> https://www.mercurial-scm.org/repo/hg/file/29558247b00e/mercurial/commands.py#l1263
>
>
>> For now, phases are only included by "hg bundle", and not by
>> e.g. strip and rebase.
>>
>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
>> --- a/mercurial/bundle2.py
>> +++ b/mercurial/bundle2.py
>> @@ -158,6 +158,7 @@
>>      changegroup,
>>      error,
>>      obsolete,
>> +    phases,
>>      pushkey,
>>      pycompat,
>>      tags,
>> @@ -165,6 +166,11 @@
>>      util,
>>  )
>>
>> +from .node import (
>> +    bin,
>> +    hex,
>> +)
>> +
>>  urlerr = util.urlerr
>>  urlreq = util.urlreq
>>
>> @@ -1386,6 +1392,13 @@
>>          obsmarkers = repo.obsstore.relevantmarkers(outgoing.missing)
>>          buildobsmarkerspart(bundler, obsmarkers)
>>
>> +    if opts.get('phases', False):
>> +        keys = []
>> +        rootphases = phases.subsetphases(repo, outgoing.missing)
>> +        for root, phase in rootphases.iteritems():
>> +            keys.append((hex(root), '%d' % phase))
>> +        bundler.newpart('phases', data=pushkey.encodekeys(keys))
>
>
> Small note: I would prefer to not see the pushkey encoding leaks here. the
> pushkey encoding is text based and requires escaping as soon as the data
> becomes a bit complicated. Since the rest of bundle2 is using binary data I
> would prefer to see binary used here too. Bundle parts are exposed over the
> wires so we have a strong BC on them. As soon as release a version using
> pushey, we'll have to maintain it forever.

As you noted above, these bundles will not actually be sent over the
wire. However, they can of course be stored and shared outside of the
hg wire protocol, so compatibility is still required once this moves
out of experimental.

Speaking of that, I guess I should add "version" part parameter in
case we ever need to change the format.

>
> (note: having a generic utility to encode/decode mapping in bundle part
> would be useful to more than phase)
>
>
>> +
>>  def addparttagsfnodescache(repo, bundler, outgoing):
>>      # we include the tags fnode cache for the bundle changeset
>>      # (as an optional parts)
>> @@ -1730,6 +1743,18 @@
>>                  kwargs[key] = inpart.params[key]
>>          raise error.PushkeyFailed(partid=str(inpart.id), **kwargs)
>>
>> + at parthandler('phases')
>> +def handlephases(op, inpart):
>> +    """apply phases from bundle part to repo"""
>> +    phaseroots = {}
>> +    for nodehex, phasestr in
>> pushkey.decodekeys(inpart.read()).iteritems():
>> +        phaseroots[bin(nodehex)] = int(phasestr)
>> +    addednodes = []
>> +    for entry in op.records['changegroup']:
>> +        addednodes.extend(entry['addednodes'])
>> +    phases.updatephases(op.repo.unfiltered(), op.gettransaction(),
>> phaseroots,
>> +                        addednodes)
>> +
>>  @parthandler('reply:pushkey', ('return', 'in-reply-to'))
>>  def handlepushkeyreply(op, inpart):
>>      """retrieve the result of a pushkey request"""
>> diff --git a/mercurial/commands.py b/mercurial/commands.py
>> --- a/mercurial/commands.py
>> +++ b/mercurial/commands.py
>> @@ -1351,6 +1351,8 @@
>>      contentopts = {'cg.version': cgversion}
>>      if repo.ui.configbool('experimental', 'evolution.bundle-obsmarker',
>> False):
>>          contentopts['obsolescence'] = True
>> +    if repo.ui.configbool('experimental', 'bundle-phases', False):
>> +        contentopts['phases'] = True
>>      bundle2.writenewbundle(ui, repo, 'bundle', fname, bversion, outgoing,
>>                             contentopts, compression=bcompression,
>>                             compopts=compopts)
>> diff --git a/mercurial/phases.py b/mercurial/phases.py
>> --- a/mercurial/phases.py
>> +++ b/mercurial/phases.py
>> @@ -430,6 +430,62 @@
>>          else:
>>              return False
>>
>> +def subsetphases(repo, subset):
>> +    """Finds the phase roots for a subset of a history
>> +
>> +    Returns a dict with a root as key and phase as value. The keys will
>> +    be the phases root within the nodes subset, plus the roots of the
>> subset.
>> +    """
>> +    phases = {}
>> +    for phase in trackedphases:
>> +        for root in repo._phasecache.phaseroots[phase]:
>> +            if root in subset:
>> +                phases[root] = phase
>> +    for ctx in repo.set('roots(%ln)', subset):
>> +        phases[ctx.node()] = ctx.phase()
>> +    return phases
>> +
>> +def updatephases(repo, tr, phaseroots, addednodes):
>> +    """Updates the phase roots for changesets added in the transaction
>> +
>> +    Applies the given phase roots, but makes sure to only change the
>> phase
>> +    of a changeset if it's been added in the transaction
>> (tr.changes['rev']).
>> +    """
>> +    cl = repo.changelog
>> +    rootrevs = [cl.rev(n) for n in phaseroots]
>> +
>> +    # First update the phase for the roots of the set of added revisions
>> +    for rev in repo.revs('roots(%ln)', addednodes):
>> +        node = cl.node(rev)
>> +        ctx = repo[node]
>> +        if node in phaseroots:
>> +            phase = phaseroots[node]
>> +            # remove it so we don't process it in the loop below
>> +            del phaseroots[node]
>> +        else:
>> +            # Getting here means that the parent of 'rev' was already in
>> the
>> +            # repo. In this case, we find the phase to use by looking
>> +            # for the highest phase among the closest phase roots in the
>> +            # bundle. We will always find an ancestor here.
>> +            roots = cl.reachableroots(minroot=0, roots=rootrevs,
>> heads=[rev])
>> +            phase = max((phaseroots[cl.node(root)] for root in roots))
>> +        parentphases = [p.phase() for p in ctx.parents()]
>> +        phase = max(phase, *parentphases)
>> +        nodes = [cl.node(r) for r in cl.descendants([rev])]
>> +        nodes.append(node)
>> +        if phase < ctx.phase():
>> +            advanceboundary(repo, tr, phase, nodes)
>> +        elif phase > ctx.phase():
>> +            retractboundary(repo, tr, phase, nodes)
>> +
>> +    # Now update the phase for revisions internal to the set of added
>> revisions
>> +    for node, phase in phaseroots.iteritems():
>> +        if node not in addednodes:
>> +            continue
>> +        ctx = repo[node]
>> +        if phase > ctx.phase():
>> +            retractboundary(repo, tr, phase, [node])
>> +
>>  def analyzeremotephases(repo, subset, roots):
>>      """Compute phases heads and root in a subset of node from root dict
>>
>> diff --git a/tests/test-bundle-phases.t b/tests/test-bundle-phases.t
>> new file mode 100644
>> --- /dev/null
>> +++ b/tests/test-bundle-phases.t
>> @@ -0,0 +1,149 @@
>> +  $ cat >> $HGRCPATH <<EOF
>> +  > usegeneraldelta=yes
>> +  > [experimental]
>> +  > bundle-phases=yes
>> +  > [extensions]
>> +  > strip=
>> +  > drawdag=$TESTDIR/drawdag.py
>> +  > EOF
>> +
>> +Set up repo with linear history
>> +  $ hg init linear
>> +  $ cd linear
>> +  $ hg debugdrawdag <<'EOF'
>> +  > E
>> +  > |
>> +  > D
>> +  > |
>> +  > C
>> +  > |
>> +  > B
>> +  > |
>> +  > A
>> +  > EOF
>> +  $ hg phase --public A
>> +  $ hg phase --force --secret D
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C draft
>> +  |
>> +  o  B draft
>> +  |
>> +  o  A public
>> +
>> +Phases are restored when unbundling
>> +  $ hg bundle --base B -r E bundle
>> +  3 changesets found
>> +  $ hg strip --no-backup C
>> +  $ hg unbundle -q bundle
>> +  $ rm bundle
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C draft
>> +  |
>> +  o  B draft
>> +  |
>> +  o  A public
>> +
>> +Root revision's phase is preserved
>> +  $ hg bundle -a bundle
>> +  5 changesets found
>> +  $ hg strip --no-backup C
>> +  $ hg unbundle -q bundle
>> +  $ rm bundle
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C draft
>> +  |
>> +  o  B draft
>> +  |
>> +  o  A public
>> +
>> +Revisions within bundle preserve their phase even if parent changes its
>> phase
>> +  $ hg bundle --base B -r E bundle
>> +  3 changesets found
>> +  $ hg strip --no-backup C
>> +  $ hg phase --public B
>> +  $ hg unbundle -q bundle
>> +  $ rm bundle
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C draft
>> +  |
>> +  o  B public
>> +  |
>> +  o  A public
>> +
>> +Phase of parent of stripped node doesn't get decreased to accomodate
>> child;
>> +instead, the child adapts by increasing its phase
>> +  $ hg bundle --base B -r E bundle
>> +  3 changesets found
>> +  $ hg strip --no-backup C
>> +  $ hg phase --force --secret B
>> +  $ hg unbundle -q bundle
>> +  $ rm bundle
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C secret
>> +  |
>> +  o  B secret
>> +  |
>> +  o  A public
>> +
>> +Unbundling should not affect phase of changeset already in the repo (even
>> +if there's a phase root in the bundle pointing to it).
>> +  $ hg bundle --base B -r E bundle
>> +  3 changesets found
>> +  $ hg strip --no-backup D
>> +  $ hg phase --public C
>> +  $ hg unbundle -q bundle
>> +  $ rm bundle
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C public
>> +  |
>> +  o  B public
>> +  |
>> +  o  A public
>> +
>> +Here we set D to draft, bundle it and then set the phase of C to secret.
>> We
>> +then check that phase root in the bundle saying "D: draft" does not mean
>> that
>> +the draft boundary is advanced to D (since C was not unbundled and it was
>> +in the secret phase).
>> +  $ hg phase --draft D
>> +  $ hg bundle --base C -r E bundle
>> +  2 changesets found
>> +  $ hg strip --no-backup D
>> +  $ hg phase --force --secret C
>> +  $ hg unbundle -q bundle
>> +  $ rm bundle
>> +  $ hg log -G -T '{desc} {phase}\n'
>> +  o  E secret
>> +  |
>> +  o  D secret
>> +  |
>> +  o  C secret
>> +  |
>> +  o  B public
>> +  |
>> +  o  A public
>> +
>> +  $ cd ..
>
>
> --
> Pierre-Yves David


More information about the Mercurial-devel mailing list