[PATCH 7 of 8 V2] revset: add origins() predicate

Matt Harbison matt_harbison at yahoo.com
Sat Jun 23 00:21:17 CDT 2012


Matt Mackall wrote:
> On Wed, 2012-06-20 at 22:34 -0400, Matt Harbison wrote:
>> A concrete example would be: 'hg graft 2' creates cset 3.  Then
>> origins(all()) == {2} while origin(all()) == {3}.
>
> Then we have a terminology problem.
>
> I read origin(x) as "the origin of x", not "things that have x as an
> origin". Now, granted, there's no perfectly consistent way to convert
> f(x) to English:
>

Ah, I see what happened, and completely agree with this.  I originally 
tried to propose this with examples instead of English phrases [1] to 
try to take any ambiguity out.  You replied with:

     I think you misread my proposal. "origin(2)" meant "anything that
     has 2 as its origin" much like "branch(foo)" means "anything that
     has foo as its branch". But you're right, we probably do want both
     operators.

I thought this was an awkward way to read it for origin, and I should 
have asked, but the branch explanation made sense so I ran with it. 
With this meaning, origins(X) == X was consistent with origin(X) because 
X _is_ the origin.  But you're right, it took some mental gymnastics to 
get there.

Not a big deal; the fix is just to switch the names.

> In no case should we have a "foo" and "foos" with opposite semantics.
> That's just way too confusing to live. And if we have origin and
> destination that aren't opposites, we've also got a problem. At least
> for simple grafts, I'd expect origin(destination(x)) = x and
> destination(origin(y)) = y.

Makes sense, though I don't think we can do that for graft specifically, 
unless "simple graft" means that graft is only done once on a node, and 
never on the node that gets created because of that graft.  See the 
result choice tables below.

> Lastly, we don't need any predicate of the form "baz(x)" which would be
> equivalent to "x and baz(all())". So if something takes a set, it should
> be filtering the input side, not the output side.

The point here being "why make a new predicate that can be built from 
other existing ones"?  That also makes sense.

> [2] That edge is named parent in our schema, which suggests the
> predicate here should be named "source" rather than origin. But I don't
> know if we expose the "source" terminology anywhere.
>

transplant and rebase options:

-s --source REPO       pull patches from REPO
-s --source REV        rebase from the specified changeset

Graft just has a dangling "REVISION...", but the help text references 
"source changesets".  A very quick scan of the source tree revealed a 
few "...copy source..." messages in verify.py that seem file related, 
and the choose local/remote prompt for subrepos.

I think 'source' is a more natural complement to destination, and can 
dig further to find other uses if it seems like a useful rename.


Since this probably hasn't been on anybody's radar recently, and there 
were surprises as I implemented this, let me restate/modify the goals 
here, both for me and to get more opinions.  Apologies in advance, 
because this is a bit long.

I think it is useful to be able to answer:

A) Where did cset X come from?
B) Where did X go?
C) What duplicates of X exist? (I don't know if duplicates is an 
acceptable synonym for copies[2])
D) Is this the very original X? (I don't think that the opposite, "the 
very latest X" is meaningful or even possible, since the original date 
is preserved for each)


For the sake of discussion, let there be a cset X, which when copied 
creates cset Y, which when itself copied creates Z.  "Copy" can be any of 
the three ops (graft, transplant or rebase).  And let's take "origin(x)" 
to mean "the origin of x".  The possible results follow, with alternate 2 
being the naive/simplest result:

A) Where did X come from?

Op             Preferred       Alternate 1     Alternate 2
origin(X)      {}              same            same
origin(Y)      {X}             same            same
origin(Z)      {X, Y} *        {X}             {Y}

* I prefer this because I think it is useful to see the entire list of 
the preceding hops Z took, without manually taking the result of one and 
requerying until the trail ends.  TortoiseHg uses a hyperlink to work backward 
one hop at a time, which is much easier when trying to follow the copies 
than reentering a command with a new rev each time.
   The downside is if Y and Z were created by grafting, the only choice 
is {X} because of how graft is implemented (it preserves the 'source' 
field if it already exists).  A second pass could be made to include all 
revs with a 'source' field that matches Z's 'source' (i.e. X for a 
graft), but it would have to be a special case for graft alone so as to 
not follow an unrelated path that was _transplanted_ from X.
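
The three columns of the table above can be sketched in Python.  This is 
only a toy model: the dicts stand in for the recorded 'source' of each 
copy, and the names HOP_BY_HOP, GRAFTED, origin_all_hops, origin_first and 
origin_direct are made up for illustration, not Mercurial APIs.

```python
# Toy model (an assumption, not Mercurial's data structures): each copied
# cset maps to the source that was recorded when it was created.
HOP_BY_HOP = {"Y": "X", "Z": "Y"}  # each copy records the cset it was copied from
GRAFTED = {"Y": "X", "Z": "X"}     # graft keeps an existing source, so Z records X


def origin_all_hops(recorded, cset):
    """Preferred: every cset found by following recorded sources backward."""
    found = set()
    while cset in recorded:
        cset = recorded[cset]
        found.add(cset)
    return found


def origin_first(recorded, cset):
    """Alternate 1: only the cset at the very start of the chain."""
    hops = origin_all_hops(recorded, cset)
    return {c for c in hops if c not in recorded}


def origin_direct(recorded, cset):
    """Alternate 2: only the directly recorded source."""
    return {recorded[cset]} if cset in recorded else set()
```

With HOP_BY_HOP, the three functions give {X, Y}, {X} and {Y} for Z.  With 
GRAFTED, all three collapse to {X}, which is the downside described above.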


B) Where did X go?

Op                 Preferred       Alternate 1     Alternate 2
destination(X)     {Y, Z} *        {Z}             {Y}
destination(Y)     {Z} **          same            same
destination(Z)     {}              same            same

* I prefer this because it is symmetric with the preferred definition of 
origin (i.e. showing all the hops), and it is the only result possible 
if the copy operations were grafts (since Y and Z both point directly to 
X).  I don't like exposing the implementation differences between copy 
ops to the user.  The downside is "origin(destination(X))" == {X, Y}, 
which Matt noted isn't desirable, though maybe this case isn't "simple" 
because of the "copy of a copy" scenario.

** Notice that if both copies were grafts, the result is {}, and there's 
nothing that can be done about it.
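
A minimal Python sketch of the preferred destination() and of the 
round-trip result noted above.  The dicts model the recorded 'source' of 
each copy and are assumptions for illustration only; the helper names are 
hypothetical, not Mercurial APIs.

```python
# Toy model (an assumption, not Mercurial's data structures): each copied
# cset maps to the source that was recorded when it was created.
HOP_BY_HOP = {"Y": "X", "Z": "Y"}  # each copy records the cset it was copied from
GRAFTED = {"Y": "X", "Z": "X"}     # graft keeps an existing source, so Z records X


def origin(recorded, cset):
    """All preceding hops, found by following recorded sources backward."""
    found = set()
    while cset in recorded:
        cset = recorded[cset]
        found.add(cset)
    return found


def destination(recorded, cset):
    """Preferred: every cset whose chain of origins passes through cset."""
    return {c for c in recorded if cset in origin(recorded, c)}


def origin_of_set(recorded, csets):
    """origin() applied to a set: the union of each member's origins."""
    result = set()
    for c in csets:
        result |= origin(recorded, c)
    return result
```

In this model destination(HOP_BY_HOP, "Y") is {Z} but destination(GRAFTED, 
"Y") is empty, and origin_of_set(HOP_BY_HOP, destination(HOP_BY_HOP, "X")) 
is {X, Y} rather than {X}.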


C) What duplicates of X exist?

Op                Alternate 1      Alternate 2
duplicate(X)      {Y, Z}           {X, Y, Z}
duplicate(Y)      {X, Z}           {X, Y, Z}
duplicate(Z)      {X, Y}           {X, Y, Z}

I don't have a preference here.  Excluding the parameter from the result 
means that duplicate(all()) can't find all duplicates anywhere, because 
the result is always {}.  Not excluding the parameter probably means 
duplicate(not_copied) is {not_copied} for consistency, which is just 
silly.  I haven't given much thought to how to implement this, but this 
isn't necessary if the preferred definitions of origin() and 
destination() above are taken, because this is really "origin(Y) or 
destination(Y)", with some lossiness for graft:

    origin(Z) or destination(Z) => {X} or {} => {X}

which has lost Y unless the workaround in the footnote of A above is 
implemented.
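
The lossiness can be reproduced with a small Python sketch.  Everything 
here is a toy model under stated assumptions: the dicts stand in for the 
recorded 'source' of each copy, and graft_siblings is a hypothetical take 
on the second pass described in the footnote of A, not existing code.

```python
# Toy model (an assumption, not Mercurial's data structures): each copied
# cset maps to the source that was recorded when it was created.
HOP_BY_HOP = {"Y": "X", "Z": "Y"}  # each copy records the cset it was copied from
GRAFTED = {"Y": "X", "Z": "X"}     # graft keeps an existing source, so Z records X


def origin(recorded, cset):
    """All preceding hops, found by following recorded sources backward."""
    found = set()
    while cset in recorded:
        cset = recorded[cset]
        found.add(cset)
    return found


def destination(recorded, cset):
    """Every cset whose chain of origins passes through cset."""
    return {c for c in recorded if cset in origin(recorded, c)}


def duplicate(recorded, cset):
    """origin(cset) or destination(cset); the parameter itself is excluded."""
    return origin(recorded, cset) | destination(recorded, cset)


def graft_siblings(recorded, cset):
    """Hypothetical second pass: other csets recording the same source."""
    source = recorded.get(cset)
    if source is None:
        return set()
    return {c for c in recorded if c != cset and recorded[c] == source}
```

Here duplicate(GRAFTED, "Z") is {X}, and only the union with 
graft_siblings(GRAFTED, "Z") brings Y back.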


D) Is this the very original X?

foo(X) == {}
foo(Y) == {X}
foo(Z) == {X}

This is basically the alternate 1 definition of origin() above.  I don't 
see a way to apply existing filter predicates to get this with origin().  
Since all of the dates are the same, a first(sort()) isn't going to 
cut it.  If 'source' turns out to be an acceptable name, maybe the 
predicate in 'A' above should be named source, and this origin.


[1] http://markmail.org/message/5pxoaw2ylnsx3y7r
[2] http://markmail.org/message/jlrwl63madymfgnp

