[PATCH 1 of 2] strip: add a delayedstrip method that works in a transaction

Mon Jun 26 19:33:10 EDT 2017

Excerpts from Martin von Zweigbergk's message of 2017-06-26 09:26:56 -0700:
> [...]
> > +def delayedstrip(ui, repo, nodelist, topic=None):
> > +    """like strip, but works inside transaction and won't strip irreverent revs
> 
> I think you meant s/irreverent/irrelevant/ :-) But to make it clearer,
> maybe "descendants" is even better (assuming that's what it is about)?

Oops. I will fix the spelling as long as update the docstring.

The language is a bit tricker to write, it's not stripping descendants and
it won't remove ancestors of those descendants. If nodelist contains A and B
where B is a descendant of A, it's still possible to remove B, or A+B, or
nothing.

Maybe something like:

  won't strip anything that cause nodes outside nodelist to disappear

> > +
> > +    nodelist must explicitly contain all descendants.
> 
> That seems like a safer API in general. Would it be a lot of work to
> move this into the normal strip() method?

That is also what I have been considering. It seems to me that "strip()" is
a lower level function and we might want a low level function any way.

Maybe rename "strip" to "_strip" and make "strip" do the extra safe check?
I'll do that if it's simple.

> > Otherwise a warning will
> > +    be printed that some nodes are not stripped.
> 
> Maybe this should be a ProgrammingError?

No. This is not an error intentionally. It makes the API easier to use and
more flexible. For example,

  1. When rebaseset has a "hole", rebase won't need special handling before
     sending the rebaseset to delayedstrip.
  2. Even if you know that nodelist covers all descendants at the time you
     call "delayedstrip", because it's delayed, somebody else could still
     change the repo, and at the end of transaction, the original nodelist
     may have surprising new children and shouldn't be stripped.

I think supporting "2" is important. There is a test case for that (commit
after delayedstrip).

> > +    Always do a backup.
> 
> Why? Is that more necessary for posthook strips than other strips?

All the code needing to migrate to this code path does a backup.

> >The last non-None "topic" will be used as the backup
> > +    topic name. The default backup topic name is "backup".
> 
> Why not use a separate backup for each call to delayedstrip()?

For example, given the following simple DAG:

    B D
    | |
    A C

And somebody calls:

    delayedstrip(nodelist=['A', 'D'], topic='AD')
    delayedstrip(nodelist=['B', 'C'], topic='BC')

What do you think is the best behavior with that?

> > +    # transaction postclose callbacks are called in alphabet order.
> > +    # use '\xff' as prefix so we are likely to be called last.
> 
> Why do we want them to run a late as possible? Do we think other hooks
> would want the to-be-stripped changesets to still be there when they
> run?

Just to be safe in case some other postclose callbacks do crazy things (like
os.system("hg commit -m X")). They shouldn't but there is no easy way to
detect what other postclose callbacks will do, so I'd prefer safer by
default.