[PATCH 7 of 8 V2] revset: add origins() predicate

Wed Jun 27 16:08:10 CDT 2012

On Sat, 2012-06-23 at 01:21 -0400, Matt Harbison wrote:
> > In no case should we have a "foo" and "foos" with opposite semantics.
> > That's just way too confusing to live. And if we have origin and
> > destination that aren't opposites, we've also got a problem. At least
> > for simple grafts, I'd expect origin(destination(x)) = x and
> > destination(origin(y)) = y.
> 
> Makes sense, though I don't think we can do that for graft specifically, 
> unless "simple graft" means that graft is only done once on a node, and 
> never on the node that gets created because of that graft.

Exactly.

> Graft just has a dangling "REVISION...", but the help text references 
> "source changesets".  A very quick scan of the source tree revealed a 
> few "...copy source..." messages in verify.py that seem file related, 
> and the choose local/remote prompt for subrepos.

No strong opinion, but see below.

> I think 'source' is a more natural complement to destination, and can 
> dig further to find other uses if it seems like a useful rename.
> 
> 
> Since this probably hasn't been on anybody's radar recently, and there 
> were surprises as I implemented this, let me restate/modify the goals 
> here, both for me and to get more opinions.  Apologies in advance, 
> because this is a bit long.
> 
> I think it is useful to be able to answer:
> 
> A) Where did cset X come from?
> B) Where did X go?
> C) What duplicates of X exist? (I don't know if duplicates is an 
> acceptable synonym for copies[2])
> D) Is this the very original X? (I don't think that the opposite, "the 
> very latest X" is meaningful or even possible, since the original date 
> is preserved for each)
> 
> 
> For the sake of this, let there be a cset X, which when copied creates 
> cset Y, which when itself copied creates Z.  "Copy" can be any of the 
> three ops, and let "origin(x)" mean "the origin of x".  The 
> possible results follow, with alternate 2 being the naive/simplest result:
> 
> A) Where did X come from?
> 
>     Op        Preferred    Alternate 1    Alternate 2
> origin(X)     {}           same           same
> origin(Y)     {X}          same           same
> origin(Z)     {X, Y} *     {X}            {Y}

I tend to prefer {X} here.
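
To make the three columns of that table concrete, here is a rough Python
sketch. It is not revset code: SOURCE is a made-up map from each copy to the
cset it was copied from (the immediate-source case, as opposed to graft's
preserved 'source'), and the function names are invented for illustration.

```python
# Hypothetical model: each copy records the cset it was copied from.
SOURCE = {"Y": "X", "Z": "Y"}


def origin_all_hops(rev, source=SOURCE):
    """'Preferred' column: every cset on the trail back from rev."""
    hops = set()
    while rev in source:
        rev = source[rev]
        hops.add(rev)
    return hops


def origin_first(rev, source=SOURCE):
    """'Alternate 1' column: only the cset where the trail ends."""
    if rev not in source:
        return set()
    while rev in source:
        rev = source[rev]
    return {rev}


def origin_direct(rev, source=SOURCE):
    """'Alternate 2' column: only the immediate source."""
    return {source[rev]} if rev in source else set()
```

All three agree for X and Y and differ only for Z, which is the one row
where the table has three distinct entries.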

> * I prefer this because I think it is useful to see the entire list of 
> the preceding hops Z took, without manually taking the result of one and 
> requerying until the trail ends.  Thg uses a hyperlink to work backward 
> one hop at a time, which is much easier when trying to follow the copies 
> than reentering a command with a new rev each time.
>    The downside is if Y and Z were created by grafting, the only choice 
> is {X} because of how graft is implemented (it preserves the 'source' 
> field if it already exists).  A second pass could be made to include all 
> revs with a 'source' field that matches Z's 'source' (i.e. X for a 
> graft), but it would have to be a special case for graft alone so as to 
> not follow an unrelated path that was _transplanted_ from X.
> 
> 
> B) Where did X go?
> 
>     Op             Preferred    Alternative 1     Alternative 2
> destination(X)     {Y, Z} *        {Z}               {Y}
> destination(Y)     {Z} **          same              same
> destination(Z)     {}              same              same

But not sure about this. If I have branches dev, 1.0 and 2.0, and I
graft dev->2.0->1.0, then ask about destinations of dev cset, I want to
hear about both.
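
The dev->2.0->1.0 case can be sketched the same way. This assumes only what
was said above about graft (it keeps an existing 'source' field rather than
overwriting it); the dict, the helper names and the cset labels are all
hypothetical.

```python
def graft(source_of, src, new):
    """Record `new` as a graft of `src`.

    If `src` is itself a graft, its recorded source is carried over,
    so every copy in a chain points at the first cset.
    """
    source_of[new] = source_of.get(src, src)


def destination(rev, source_of):
    """Csets whose recorded source is exactly `rev`."""
    return {copy for copy, src in source_of.items() if src == rev}


source_of = {}
graft(source_of, "dev", "on-2.0")     # dev -> 2.0
graft(source_of, "on-2.0", "on-1.0")  # 2.0 -> 1.0

both = destination("dev", source_of)      # the 2.0 and the 1.0 copies
none = destination("on-2.0", source_of)   # empty: nothing names on-2.0
```

With graft, a plain lookup already reports both copies for the dev cset,
and reports nothing for the intermediate 2.0 copy, which is the "**" case
in the quoted table.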

> * I prefer this because it is symmetric with the preferred definition of 
> origin (i.e. showing all the hops), and it is the only result possible 
> if the copy operations were grafts (since Y and Z both point directly to 
> X).  I don't like exposing the implementation differences between copy 
> ops to the user.  The downside is "origin(destination(X))" == {X, Y}, 
> which Matt noted isn't desirable, though maybe this case isn't "simple" 
> because of the "copy of a copy" scenario.
> 
> ** Notice that if both copies were grafts, the result is {}, and there's 
> nothing that can be done about it.
> 
> 
> C) What duplicates of X exist?
> 
>     Op             Alternative 1    Alternative 2
> duplicate(X)      {Y, Z}           {X, Y, Z}
> duplicate(Y)      {X, Z}           {X, Y, Z}
> duplicate(Z)      {X, Y}           {X, Y, Z}
> 
> I don't have a preference here.  Excluding the parameter from the result 
> means that duplicate(all()) can't find all duplicates anywhere, because 
> the result is always {}.

We have some precedents here:

 ancestors(X) = {X, ...}

I don't have a strong opinion, though. But I'm really not sure about
this name. It's only going to find duplicates made by
graft/transplant/etc., not anything produced 'manually'. And that will
disappoint people. Silly people.
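
On the include-the-argument question, a small sketch of the two
alternatives applied to a set. It only models csets that are part of the
X -> Y -> Z chain; SOURCE and the function names are made up, and treating
"exclude the parameter" as "subtract the whole input set" is my reading of
the quoted text.

```python
SOURCE = {"Y": "X", "Z": "Y"}  # hypothetical copy -> immediate source map


def root(rev, source=SOURCE):
    """Follow the copy trail back to the first cset."""
    while rev in source:
        rev = source[rev]
    return rev


def family(rev, source=SOURCE):
    """rev plus every known cset that shares its first origin."""
    known = set(source) | set(source.values()) | {rev}
    return {r for r in known if root(r, source) == root(rev, source)}


def duplicate_incl(revs, source=SOURCE):
    """Alternative 2: arguments are part of the result, like ancestors()."""
    found = set()
    for rev in revs:
        found |= family(rev, source)
    return found


def duplicate_excl(revs, source=SOURCE):
    """Alternative 1: arguments are dropped from the result."""
    return duplicate_incl(revs, source) - set(revs)
```

Passing the whole chain to duplicate_excl() gives the empty set, which is
the duplicate(all()) problem noted in the quoted text.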

Also, are you aware of Pierre-Yves' evolution work? It has a rather
closely related notion of predecessors and successor changesets and will
probably want a notion of duplicates as well. So... do we want to have
two sets of predicates? You probably need to talk to him about that.

> D) Is this the very original X?
> 
> foo(X) == {}
> foo(Y) == {X}
> foo(Z) == {X}

I don't think the question matches the result here, nor is it in the
right canonical form for a revset query. Better would be:

 "the set of all changesets that are not grafts/transplants"
 origin(all()) # alternative 1

or

 "the set of all changesets that are first origins of <set>"
 origin(all()) # alternative 1

or

 "X if X is not a graft/etc."
 X and origin(all())

or

 "X if X -is- a graft/etc."
 X and not origin(all())

I forget whether all() is the proposed default here, but it seems like
it'd be useful.
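
For what it's worth, the last two forms work out as follows on the toy
X -> Y -> Z chain, using plain Python sets in place of revsets ("and" as
intersection, "and not" as difference). SOURCE, ALL and origin() here are
illustrative stand-ins using alternative 1, not the patch's code, and a
cset that was never copied is outside this model.

```python
SOURCE = {"Y": "X", "Z": "Y"}  # hypothetical copy -> immediate source map
ALL = {"X", "Y", "Z"}          # stand-in for all()


def origin(revs, source=SOURCE):
    """Alternative 1 on a set: the first origin of each copy in revs."""
    result = set()
    for rev in revs:
        if rev in source:
            while rev in source:
                rev = source[rev]
            result.add(rev)
    return result


first_origins = origin(ALL)             # origin(all())
x_if_original = {"X"} & first_origins   # X and origin(all())
y_if_copy = {"Y"} - first_origins       # Y and not origin(all())
x_if_copy = {"X"} - first_origins       # X and not origin(all())
```

Here x_if_original and y_if_copy each contain their argument, while
x_if_copy is empty.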

-- 
Mathematics is the supreme nostalgia of our time.