[PATCH 3 of 6] phases: add basic pushkey support

Wed Dec 14 16:43:20 CST 2011

On Wed, 2011-12-14 at 00:38 +0100, Pierre-Yves David wrote:
> On 13 Dec. 2011, at 23:51, Matt Mackall wrote:
> 
> > On Tue, 2011-12-13 at 00:52 +0100, Pierre-Yves David wrote:
> >> +
> >> +def listphaseroots(repo):
> >> +    """List phases root for serialisation over pushkey"""
> >> +    keys = {}
> >> +    for phase in trackedphases:
> >> +        for root in repo._phaseroots[phase]:
> >> +            keys[hex(root)] = '%i' % phase
> >> +    if repo.ui.configbool('phases', 'publish', True):
> >> +        # Add an extra data to let remote know we are a publishing repo.
> >> +        # Publishing repo can't just pretend they are old repo. When pushing
> >> +        # to a publishing repo, the client still need to push phase boundary
> > 
> > Why is that? If the client sends a changegroup, the server can advance
> > the phases on its own, right?
> 
> Wrong, the server cannot do it on its own for phase boundaries related to common
> changesets without ancestor in the pushed changegroup.

Why? Every changeset in a changegroup that is pushed to a publishing
server, and all of its ancestors, are public by definition.
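
To make that concrete, here is a minimal sketch of the rule on a toy DAG.
It is not Mercurial code: `ancestors_inclusive`, `publish_on_push` and the
dict-of-parents history are hypothetical names made up for illustration.
It assumes the server only looks at the changesets it just received.

```python
def ancestors_inclusive(parents, nodes):
    """Return the given nodes plus all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return seen


def publish_on_push(parents, public, pushed):
    """Advance the public set on a (hypothetical) publishing server.

    Everything in the pushed changegroup and every ancestor of it
    becomes public; nothing else is touched.
    """
    return set(public) | ancestors_inclusive(parents, pushed)


# Toy history: a <- b <- c, plus a side branch a <- x.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
new_public = publish_on_push(parents, public={"a"}, pushed={"c"})
```

In this toy history, pushing c makes a, b and c public without any phase
data from the client. x is untouched because it is not an ancestor of
anything in the changegroup.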

Perhaps you're presupposing that I've already agreed with this case:

> > If the client /does not/ send a changegroup, does it have any business
> > moving phases on the server? That's not immediately obvious.
> 
> This is related to the "bug" which triggered our last week discussion about
> publishing repo.
> 
> Now push doesn't only push changesets. It also pushes phase data. New phase data
> may apply to common changesets which won't be pushed (as they are common).
> Here is a very simple example:
> 
> 1) repo A pushes changeset X as draft to repo B
> 2) repo B makes changeset X public
> 3) repo B pushes to repo A. X is not pushed but the data that X is now public should

Understood, but unconvinced. Let me tweak your example a bit:

1) Alice pulls draft cset X from repo A
2) Alice pushes draft cset X to publishing repo B; X is now marked public in
Alice's repo
3) Alice does "empty" push to A again - should X become public or should
this be a no-op?
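
The two possible answers to step 3 can be sketched with a toy model. This
is not Mercurial code: repos are plain dicts mapping changeset to phase,
`push` and `scenario` are hypothetical helpers, and repo A is assumed to
be non-publishing (it has to be, to hold draft X in the first place).

```python
PUBLIC, DRAFT = 0, 1  # toy phase constants; lower means "more public"


def push(src, dst, dst_publishing, propagate_phases):
    """Toy push between two repos, each a dict of changeset -> phase.

    Returns the number of changesets actually sent.
    """
    new = [cset for cset in src if cset not in dst]
    for cset in new:
        dst[cset] = PUBLIC if dst_publishing else src[cset]
        if dst_publishing:
            # Whatever reached a publishing repo is public on both sides.
            src[cset] = PUBLIC
    if propagate_phases:
        # The contested behavior: advance phases of *common* changesets
        # on the destination, even when no changeset was sent.
        for cset, phase in src.items():
            if cset in dst and phase < dst[cset]:
                dst[cset] = phase
    return len(new)


def scenario(propagate_phases):
    repo_a = {"X": DRAFT}   # assumed non-publishing
    alice = dict(repo_a)    # 1) Alice pulls draft X from A
    repo_b = {}
    push(alice, repo_b, True, propagate_phases)           # 2) push to publishing B
    sent = push(alice, repo_a, False, propagate_phases)   # 3) "empty" push to A
    return sent, repo_a["X"]
```

With `propagate_phases=True`, step 3 sends no changesets yet X becomes
public in A; with `propagate_phases=False`, step 3 is a true no-op and X
stays draft in A. Which of those is right is exactly the open question.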

Now there are four cases here that you need to argue!

Assumption:
New commits are assumed to be draft regardless of the publish
setting/default.

(active = "used for local pulls and/or commits, etc., not just pushes")

Case 1: repo A is passive and publishing (usual server case)

It is impossible here for draft changesets to appear. No need for remote
clients to tweak phases because all changesets pushed are public.

Case 2: repo A is passive and non-publishing (special server case)

Users can push and pull draft csets. Server exports phases, users tweak
them when needed on push.

Case 3: repo A is active and publishing (questionable user case)

subcase 3a: repo is used solely by Alice

This might happen for instance if Alice is pushing to a machine on a
workstation that only allows one-way firewall access or Alice simply
finds it more convenient to push than pull. This is a misconfiguration
of publish: the push should -not- be publishing her work here. 

subcase 3b: Alice pushes, Alice does local work, other users pull

If people other than Alice are pulling, arguably Alice's _local_ draft
commits should be public as soon as they're pulled by 'others'.
Otherwise, we have a weird problem where the commits Alice pushes to
this repo are public while the ones she commits directly are draft. So
this case is either also a misconfiguration or inconsistent.

(Note that 'public when pulled' implies 'public when committed' because
pullers will generally not have write access! And "public when
committed" contradicts our "draft when committed" assumption.)

subcase 3c: repo is used locally by Bob

Bob is asking for interesting times by allowing other people to push
directly to his active repo. Every time he commits, he'll potentially be
racing with Alice and creating new heads on his branch and will never
get a hint that he needs to merge unless he tries to push somewhere
else. He'll also have surprises if he attempts to use rollback, rebase,
etc. because what's being rolled back or rebased may move under his
feet. Also, the "ought to be public on commit" observation from 3b
applies here too. 

Case 4: repo is active and non-publishing

subcase 4a: repo is used solely by Alice
subcase 4b: Alice pushes, Alice does local work, other users pull
subcase 4c: repo is used locally by Bob

All of these work fine when treated like case 2.

Summary:

Cases 1, 2, and 4 are fine regardless of how we treat the "empty push to
publishing repo" case. And all subcases of case 3 are problematic in
ways that suggest we're either using it wrong, or there's something
wrong with our model. So I'm not at all convinced we have the right
answer yet.
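
As a sketch of what such an analysis could look like when kept as data,
the four cases above can be enumerated mechanically so that none is
skipped. The names `VERDICTS`, `build_table` and `format_table` are
hypothetical, and the verdict strings are only paraphrases of the cases
above, not a final ruling.

```python
from itertools import product

# Verdicts paraphrased from the cases above, keyed by (usage, publishing).
VERDICTS = {
    ("passive", True): ("1", "fine: draft csets cannot appear"),
    ("passive", False): ("2", "fine: server exports phases, users tweak on push"),
    ("active", True): ("3", "problematic: subcases 3a, 3b, 3c"),
    ("active", False): ("4", "fine: works when treated like case 2"),
}


def build_table():
    """Walk every usage/publishing combination; a missing case raises KeyError."""
    rows = []
    for usage, publishing in product(("passive", "active"), (True, False)):
        case, verdict = VERDICTS[(usage, publishing)]
        rows.append((case, usage, "yes" if publishing else "no", verdict))
    return rows


def format_table(rows):
    """Render the rows as a plain-text table suitable for a commit message."""
    header = ("case", "usage", "publishing", "verdict")
    lines = ["%-4s  %-7s  %-10s  %s" % header]
    lines += ["%-4s  %-7s  %-10s  %s" % row for row in rows]
    return "\n".join(lines)


TABLE = build_table()
print(format_table(TABLE))
```

Enumerating with `product` rather than listing rows by hand means a new
axis (say, a third usage pattern) fails loudly until every combination
has a verdict.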

Meta:

Complex problems like this with many use cases -demand- exhaustive
analysis. Please MAINTAIN A TABLE to demonstrate that you've done the
exhaustive analysis, to allow other people to AUDIT your analysis and
respond to it, and to document the behavior for posterity (i.e., in a
comment or commit description). I'm super-frustrated every time I go to
the effort of creating a table of use cases and then the discussion
regresses to sloppy informality again.

-- 
Mathematics is the supreme nostalgia of our time.