[PATCH 2 of 3] bundle: add config option to include phases

Pierre-Yves David pierre-yves.david at ens-lyon.org
Mon Jun 19 12:06:27 EDT 2017



On 06/18/2017 07:12 AM, Martin von Zweigbergk wrote:
> On Sat, Jun 17, 2017 at 4:46 PM, Pierre-Yves David
> <pierre-yves.david at ens-lyon.org> wrote:
>>
>>
>> On 06/17/2017 07:19 AM, Martin von Zweigbergk via Mercurial-devel wrote:
>>>
>>> # HG changeset patch
>>> # User Martin von Zweigbergk <martinvonz at google.com>
>>> # Date 1497460519 25200
>>> #      Wed Jun 14 10:15:19 2017 -0700
>>> # Node ID 792544200a5a7e21bcc8646604d85b62db277d79
>>> # Parent  123eb8b3b913585cc0a29c14c232ab6fc78978b3
>>> bundle: add config option to include phases
>>>
>>> This adds an experimental.bundle-phases config option to include phase
>>> information in bundles. As with the recently added support for
>>> bundling obsmarkers, the support for bundling phases is hidden behind
>>> the config option until we decide to make a bundle format v3 that
>>> includes phases (and obsmarkers and ...).
>>>
>>> We cannot use the same listkeys bundle part as is used on the wire,
>>> because we want different semantics. Specifically, we never want the
>>> unbundled phases to override the phase of an existing changeset. This
>>> is different from exchange where a phase advancement on the remote
>>> will propagate to the local repo. For the non-exchange usecase, we
>>> instead want to be able to bundle a draft changeset, change its parent
>>> to secret and not have the unbundling reset the parent's phase back to
>>> draft. For this reason, we use a new bundle part instead. The new
>>> bundle part contains the phase roots that are internal to the set of
>>> bundled revisions. It also contains the phase for each root of the set
>>> of bundled revisions.
>>
>>
>> `hg bundle` primary goal and usage is to allow exchange of data between
>> repositories (according usage[1] and doc[2][3]). So moving away from the
>> standard phase semantic seems problematic. I think we should stick to that
>> usual behavior of "move to lowest phase seen".
>
> I did note while working on this that it this was the first bundle
> part that was not designed for exchange. So I did notice that it was
> unusual in that way, but I didn't see anything problematic about it.
> But let's put that discussion on hold until we have consensus on the
> right user-visible behavior for this. If the consensus is that
> unbundling should be allowed to affect the phase of changesets already
> in the repo, I'll happily switch to using the listkeys part.

Now that you point it out, I'm not sure why we are skipping push/pull 
usage for the new parts? We currently use pushkey-over-bundle2 to 
exchange phases, the pushkey protocol is antic and not very adequate. 
This new part should be a good opportunity to fix that.

For example: one of the problem with pushkey is that it list all remote 
phase roots on pull. Not just the relevant one. In my Mercurial 
repository this means sending about 32K of useless data over the wire 
with each pull.

>> In addition this semantic will provide challenging consistency. The unbundle
>> behavior can see large changes implied by small "unrelated" details. Here is
>> a quick example (I have others):
>>
>>  * bundle X contains changeset (A,B,C) all public.
>>  * Alice repo contains (A, B, C) all draft
>>  * Bob   repo contains (A, B) all draft
>>
>> With the proposed semantic running `hg unbundle X`:
>>
>>  * in Alice repository: no changes,
>> * in Bob repository: C is added (as public), all of A,B,C is marked public.
>
> No, the proposed semantics is to leave the phase on existing
> changesets unchanged ("Specifically, we never want the unbundled
> phases to override the phase of an existing changeset."). This is
> covered by the test called "Phase of parent of stripped node doesn't
> get decreased to accomodate child; instead, the child adapts by
> increasing its phase".

Ha, sorry for processing that part a bit too fast. That is more 
consistent in itself :-).

However this still violates the core behavior and definition of phases. 
If a "public" changeset is sent in a bundle, increasing its phase to 
"draft" on unbundling is illegal[1]. Public phase is an important 
security feature, we do not want to be less consistent in enforcing it.

[1] 
https://www.mercurial-scm.org/repo/hg/file/a3a36bcf122e/mercurial/help/phases.txt#l16

This change in phase behavior also introduces inconsistency with other 
bundle related data: "obsolescence markers". The bundle contains markers 
relevant to the bundled changesets, and these get added to the 
repository in all cases (changesets known or not).

Going further down that road, the exchange of obsolescence markers 
without propagation of the phases information will produce bad result in 
the case of obsmarkers affecting (should-be) public changesets.


Use-cases-wise, I just had one Yesterday. Boris and I were working in a 
plane (wifi banned) and we exchanged changesets with bundle over an USB 
dongle. Changeset and obsmarkers information came along, but we had to 
synchronize phases by hand (annoying and error prone).

>> I'm not sure what is the actual usecase for "bundle; backward phase change;
>> unbundle" do you have anything in mind? The "draft" → "secret" movement is
>> not natural and mercurial does not provide clean ways to do it (requires
>> --force) one should not expect it to be preserved.
>
> I have no actual usecase in mind. It's more a consequence of the
> principle stated above about not affecting the phase of a changeset
> already in the repo.
>
>>
>> I could keep going for a while, but Martin requested a short feedback ;-)
>
> Thanks!
>
>>
>> [1] default to compare with another repositories:
>>
>> https://www.mercurial-scm.org/repo/hg/file/29558247b00e/mercurial/commands.py#l1247
>> [2]
>> https://www.mercurial-scm.org/repo/hg/file/29558247b00e/mercurial/commands.py#l1250
>> [3]
>> https://www.mercurial-scm.org/repo/hg/file/29558247b00e/mercurial/commands.py#l1263
>>
>>
>>> For now, phases are only included by "hg bundle", and not by
>>> e.g. strip and rebase.
>>>
>>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
>>> --- a/mercurial/bundle2.py
>>> +++ b/mercurial/bundle2.py
>>> @@ -158,6 +158,7 @@
>>>      changegroup,
>>>      error,
>>>      obsolete,
>>> +    phases,
>>>      pushkey,
>>>      pycompat,
>>>      tags,
>>> @@ -165,6 +166,11 @@
>>>      util,
>>>  )
>>>
>>> +from .node import (
>>> +    bin,
>>> +    hex,
>>> +)
>>> +
>>>  urlerr = util.urlerr
>>>  urlreq = util.urlreq
>>>
>>> @@ -1386,6 +1392,13 @@
>>>          obsmarkers = repo.obsstore.relevantmarkers(outgoing.missing)
>>>          buildobsmarkerspart(bundler, obsmarkers)
>>>
>>> +    if opts.get('phases', False):
>>> +        keys = []
>>> +        rootphases = phases.subsetphases(repo, outgoing.missing)
>>> +        for root, phase in rootphases.iteritems():
>>> +            keys.append((hex(root), '%d' % phase))
>>> +        bundler.newpart('phases', data=pushkey.encodekeys(keys))
>>
>>
>> Small note: I would prefer to not see the pushkey encoding leaks here. the
>> pushkey encoding is text based and requires escaping as soon as the data
>> becomes a bit complicated. Since the rest of bundle2 is using binary data I
>> would prefer to see binary used here too. Bundle parts are exposed over the
>> wires so we have a strong BC on them. As soon as release a version using
>> pushey, we'll have to maintain it forever.
>
> As you noted above, these bundles will not actually be sent over the
> wire. However, they can of course be stored and shared outside of the
> hg wire protocol, so compatibility is still required once this moves
> out of experimental.

Yes, it is going on disk and is readable by other mercurial process so 
it get the same compatibility constraint that the other ones. (It seems 
like we agree here).

If you make your current series available somewhere, I'm ready to offer 
you a small patch switching the encoding to binary to fold in.

> Speaking of that, I guess I should add "version" part parameter in
> case we ever need to change the format.

You don't really needs too. The lack of "version" parameters indicate 
the first version. If we needs to make a new version, we can add a 
mandatory parameters that older client will choke on.

>> (note: having a generic utility to encode/decode mapping in bundle part
>> would be useful to more than phase)
>>
>>  […]

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list