[PATCH V2] revert: add support for reverting subrepos

Tue Oct 11 09:44:48 CDT 2011

On Thu, 2011-10-06 at 17:48 +0200, Angel Ezquerra wrote:
> # HG changeset patch
> # User Angel Ezquerra <angel.ezquerra at gmail.com>
> # Date 1317712886 -7200
> # Node ID 48f9111208bfbbe831bb514bc133b854c3838d2d
> # Parent  6dc67dced8c122f6139ae20ccdc03a6b11e8b765
> revert: add support for reverting subrepos
> 
> Reverting a subrepo is done by updating it to the revision that is selected on
> the parent repo .hgsubstate file.

I've been putting off reviewing this a bit. Revert is one of my
least-favorite pieces of code and has been on my
needs-rewriting-from-scratch list for years and making it more
complicated just scares me. But this is surprisingly not so bad.

I suspect this doesn't handle subrepos getting added and removed at all.

> * ISSUES/TODO: - I've refactored the revert() function in commands.py to make
> this change less intrusive. I've moved most of the original revert function
> into a function called revertfiles(), except the option checking part of the
> code which is left in the revert() function, which will also call revertfiles()
> and a new function named revertsubrepos(). These and other helps functions
> should be moved to some other file? If so, where?

Helper functions definitely shouldn't live in commands.py. cmdutil.py is
the default place for commands.py's helpers.

If your "separating file and subrepo patterns" helper is useful, then it
should probably live in subrepo.py. It should probably work something
like this:

subpats, otherpats = subrepo.splitpats(pats)

> - This version of this patch only allows reverting a subrepo if the --no-backup
> flag is used, since no backups are performed on the contents of the subrepo. It
> could be possible to add support for backing up the subrepo contents by first
> performing a "revert --all" on the subrepo, and then updating the subrepo to
> the proper revision.

> - The behavior of the --all flag has not been changed: The --all flag will not
> revert the state of the subrepos. This could be changed as well, but it is left
> for a later patch (if considered appropriate).

Seems appropriate to me? Is there a reason it wouldn't be?

> - I'm calling update() on the subrepos while a wlock is active. I don't know if
> this is correct.

That's fine.

> - I've used ui.status() to show a message while the subrepos are being
> reverted. However TortoiseHg does not properly capture this message (it shows a
> dialog box rather than showing the message on its console).

Odd. But I bet I know why...

> diff --git a/mercurial/commands.py b/mercurial/commands.py
> --- a/mercurial/commands.py
> +++ b/mercurial/commands.py
> @@ -4419,11 +4419,31 @@
>          raise util.Abort(_('uncommitted merge with no revision specified'),
>                           hint=_('use "hg update" or see "hg help revert"'))
>  
> +    # Get the context of the target revision
>      ctx = scmutil.revsingle(repo, opts.get('rev'))

Please don't add trivial comments.

> -    node = ctx.node()
> -
> -    if not pats and not opts.get('all'):
> -        msg = _("no files or directories specified")
> +
> +    # reverting files and subrepos is a very different operation
> +    # separate subrepos from regular files from the pats list
> +    [filepats, subpats] = separatefilesandsubs(pats, ctx)
> +
> +    if subpats and not opts.get('no_backup'):
> +        # we cannot revert subrepos unless the no_backup flag is set!
> +        msg = _("cannot revert subrepos unless the no-backup flag is set")

Here's another one.

> +        hint = _("there are subrepos on the revert list, "
> +            "use --no-backup to revert them")
> +        raise util.Abort(msg, hint=hint)
> +
> +    # We currently do not support reverting non mercurial subrepos, so check
> +    # that all subrepos on the subrepo list are mercurial subrepos

Hmm, this says "this patch is doing something wrong".

> +    for sname in subpats:
> +        stype = ctx.substate[sname][2]
> +        if stype != "hg":
> +            msg = _("cannot revert %s subrepo '%s'") % (stype, sname)
> +            hint = _("reverting %s repositories is not supported") % stype
> +            raise util.Abort(msg, hint=hint)
> +
> +    if not pats and not substate and not opts.get('all'):
> +        msg = _("no files, directories or subrepos specified")
>          if p2 != nullid:
>              hint = _("uncommitted merge, use --all to discard all changes,"
>                       " or 'hg update -C .' to abort the merge")
> @@ -4442,6 +4462,71 @@
>              hint = _("use --all to revert all files")
>          raise util.Abort(msg, hint=hint)
>  
> +    # now we are ready to revert the files and subrepos on the list
> +    revertfiles(ui, repo, ctx, filepats, opts)
> +    revertsubrepos(ui, repo, ctx, subpats, opts)
> +
> +def separatefilesandsubs(pats, ctx):

Use underbars here to indicate that code outside this module shouldn't
be calling this. But this functions should really probably be rolled
into the top-level function.

> +    """
> +    Get a list of files, folders and subrepo names and a repository context and
> +    return two lists:
> +        - a list of files and folder names
> +        - a list of subrepo names
> +    """
> +
> +    if not ctx.substate:
> +        return pats, []
> +
> +    filepats = []
> +    subpats = []
> +    for p in pats:

Since this isn't using match objects, I'm guessing it doesn't work with
wildcards or other patterns. That needs fixing. Search around for
'match' in this file for examples of how to do this.

But this is probably harder than it looks. If you have libfoo (subrepo)
and libbar (normal) and someone reverts lib*, you've got a problem. So I
don't think this notion of splitting patterns actually makes sense.

A better approach is probably:

- move the pattern matcher that's built lower down up
- subs = [s for s in ctx.substate() if match(s)]

> +        if p in ctx.substate:
> +            subpats.append(p)
> +        else:
> +            filepats.append(p)
> +
> +    return (filepats, subpats)
> +
> +def revertsubrepos(ui, repo, ctx, subs, opts):
> +    """
> +    Revert a list of subrepos to the revision given by a particular context
> +
> +    This function assumes that all the items on the subs list are valid
> +    mercurial subrepos on the target context
> +    """
> +    # we ignore the 'all' option when reverting subrepos
> +    # so we don't check it here
> +    if not subs:
> +        return
> +
> +    print dir(ctx)

Never use print in Mercurial. This would be caught by the test suite -
always run the test suite.

> +    wlock = repo.wlock()

This lock should be in the top-level function. Releasing and retaking a
lock in the middle of an operation is just as bad as not taking it all.

> +    try:
> +        # Revert the subrepos on the revert list
> +        # reverting a subrepo is done by updating it to the revision specified
> +        # in the corresponding substate dictionary
> +        for sname in subs:

This doesn't honor -I and -X options. Again, you'll need a matcher for
that.

> +            ui.status(_('s reverting suprepo %s\n') % sname)
> +            if not opts.get('dry_run'):
> +                substate = ctx.substate[sname]
> +                srepo = ctx.sub(sname)._repo
> +                update(ui, srepo,
> +                    node=ctx.substate[sname][1], clean=True)

And this is the wrongness I figured I'd find above. There's no reason at
all this should be Mercurial-subrepo specific. I think you should just
be able to just do:

ctx.sub(sname).get(state, overwrite=True)

Read this, it's the closest thing we have to a fully-documented API in
the entire system:

http://www.selenic.com/hg/file/c81dce8a7bb6/mercurial/subrepo.py#l269

> +    finally:
> +        wlock.release()
> +
> +def revertfiles(ui, repo, ctx, pats, opts):
> +    """
> +    Revert a list of files to the revision given by a particular context
> +
> +    Assume that the revert options are valid.
> +    """
> +    if not pats and not opts.get('all'):
> +        return

If we're assuming all the options are valid, isn't this check in the
wrong place?

> +    parent, p2 = repo.dirstate.parents()
> +    node = ctx.node()
> +
>      mf = ctx.manifest()
>      if node == parent:
>          pmf = mf

I'd like you to approach this in a few patches:

patch 1: add code that finds the subs that match the given pattern, warn
that they're not reverted
patch 2: add code that walks the affected subrepos and reverts them

I think you'll find that both of these can be quite a bit simpler than
the above and can be done without adding helpers or reindenting.

-- 
Mathematics is the supreme nostalgia of our time.