[PATCH 1 of 5] pull: add --subrev option

Angel Ezquerra angel.ezquerra at gmail.com
Thu Jan 16 01:33:58 CST 2014


On Wed, Jan 15, 2014 at 2:15 PM, Pierre-Yves David
<pierre-yves.david at ens-lyon.org> wrote:
> On Sat, Nov 23, 2013 at 09:28:00PM +0100, Angel Ezquerra wrote:
>> # HG changeset patch
>> # User Angel Ezquerra <angel.ezquerra at gmail.com>
>> # Date 1372976561 -7200
>> #      Fri Jul 05 00:22:41 2013 +0200
>> # Node ID ecd1ca99f1b3e4d0f6ea8abd606990ef24667779
>> # Parent  1c46b18b0e1c47fa4cecf21b78c083a54ae9903f
>> pull: add --subrev option
>
> Why not --subrepos as for other -S option ?

I thought it would be confusing since usually the --subrepos flag
takes no arguments. Maybe what we should do is to _not_ have a short
-S form, to make it clear that is not just another --subrepo flag?

>>
>> The purpose of this new option is to make it possible to pull subrepos at the
>> same time that you pull the parent repository, instead of having to update to
>> the newly pulled revision to pull the corresponding subrepos.
>
> I love that!
>
>> This has multiple advantages. First of all it lets you pull the subrepos from
>> a non default path (when subrepos are pulled during udpate they are always
>> pulled from the default path). Also, it lets you pull subrepos referenced in all
>> new revisions in one go, rather than potentially having to update to every new
>> revision.
>>
>> The main use case for this new flag is to be able to ensure that your
>> repositories are alwasys "self contained".
>
> This is one of the biggest pain point of subrepo so its great to have it solved.

Yes! :-D

>> That is, that you will be able to
>> update to any incoming revision without requiring any network access. This can
>> be done by adding "--subrev .:tip" to your pull command (it would be nicer if
>> there was an "incoming()" revset expression, akin to the existing
>> "outgoing(path)" expression).
>
> Wait what? your new -S argument takes an argument to expression which revision
> you want to synchronise? why don't he just check all incoming one by default ?

The original version of this patch which I sent on the previous cycle
I believe did just that. I believe that it was Matt who said that you
cannot just blindly check subrepos found on all incoming revisions
because perhaps there could be conflicts, source repository changes,
etc. Instead it seems better to give users a way to specify which
revisions should be checked for subrepos to pull or clone.

>> When the --subrev flag is enabled, pull will also pull (or clone) all subrepos
>> that are present on the revisions matching the --subrev revset(s). Pulls are
>> recursive (i.e. subrepos withing subrepos will be also pulled as needed), but
>> clones are not recursive yet (it requires adding a similar --subrev flag to
>> clone, which will be done on another patch).
>>
>> If the selected --subrev revisions refer to subrepos that are not on the working
>> directory yet they will be cloned.
>
> What is the clone destination then? somewhere in .hg/ ?

No, with this patch series they are just placed on the working
directory. This is the same that would happen if you first pull and
then update to the corresponding revision.

If we used this series in combination with my "subrepo cache" series
then it could be possible and natural to simply get them into the
subrepo cache instead, which would definitely be cleaner. It would
perhaps also get rid of the main reason why we cannot just pull or
clone the subrepos found on all the incoming revisions...

>> If any of the subrepositories changes its
>> pull source (as defined on the .hgsub file) it will be pulled from the current
>> and the new source. If a pulled subrepo also refers to one of its own subrepos
>> on any of the incoming subrepo revisions those "sub-subrepos" will be pulled
>> too.
>>
>> This first patch only supports mercurial subrepos (a NotImplementedError
>> exception will be raised for all other subrepo types). Future patches will add
>> support for other subrepo types.
>>
>> diff --git a/mercurial/commands.py b/mercurial/commands.py
>> --- a/mercurial/commands.py
>> +++ b/mercurial/commands.py
>> @@ -4563,6 +4563,7 @@
>>      ('B', 'bookmark', [], _("bookmark to pull"), _('BOOKMARK')),
>>      ('b', 'branch', [], _('a specific branch you would like to pull'),
>>       _('BRANCH')),
>> +    ('S', 'subrev', [], _('pull subrepos from these revisions')),
>>      ] + remoteopts,
>>      _('[-u] [-f] [-r REV]... [-e CMD] [--remotecmd CMD] [SOURCE]'))
>>  def pull(ui, repo, source="default", **opts):
>> @@ -4608,7 +4609,8 @@
>>                      "so a rev cannot be specified.")
>>              raise util.Abort(err)
>>
>> -    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
>> +    modheads = repo.pull(other, heads=revs, force=opts.get('force'),
>> +        subrev=opts.get('subrev'))
>>      bookmarks.updatefromremote(ui, repo, remotebookmarks, source)
>>      if checkout:
>>          checkout = str(repo.changelog.rev(other.lookup(checkout)))
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -11,6 +11,7 @@
>>  import lock as lockmod
>>  import transaction, store, encoding
>>  import scmutil, util, extensions, hook, error, revset
>> +import config, urllib2
>>  import match as matchmod
>>  import merge as mergemod
>>  import tags as tagsmod
>> @@ -1666,7 +1667,7 @@
>>
>>          return r
>>
>> -    def pull(self, remote, heads=None, force=False):
>> +    def pull(self, remote, heads=None, force=False, subrev=None):
>>          if remote.local():
>>              missing = set(remote.requirements) - self.supported
>>              if missing:
>> @@ -1757,8 +1758,49 @@
>>                  tr.release()
>>              lock.release()
>>
>> +        # update current and new subrepos
>> +        if subrev:
>> +            # pull (or clone) the subrepos that are on any of the revisions
>> +            # on the subrev list
>> +            self.getsubrepos(subrev=subrev, source=remote.url())
>> +
>>          return result
>>
>> +    def getsubrepos(self, subrev=None, source=None):
>
> 1) DO NOT ADD NEW FUNCTION IN LOCALREPOSITORY CLASS!
>
> If you function is not to be called all over the place by a lot of people it belong somewhere else. Please make it a `getsubrepos(repo, subrev=None, source=None)` function in the `mercurial.subrepo` module.

OK, good point. I'll look into it when I prepare a new version of this series.

> I also think the name in unfortunate: by "get" I would expect something just
> returning the list of subrepo object. while you don't return anything but
> change their content.
>
> I would rename that `pullsubrepos` (or something related)

The reason I named it this way is that in the subrepo.py module "get"
is used as meaning "get a subrepo" in the sense of getting a given
revision on a subrepo by either pulling or cloning it. Maybe if I move
getsubrepos to the subrepo.py module the name would not feel so
confusing?


>> +        """Get (clone or pull) the subrepos that are referenced
>> +        on any of the revisions on the given revision list
>> +        """
>> +        if not subrev:
>> +            return
>> +        try:
>> +            revs = self.revs('%lr', subrev)
>> +        except error.RepoLookupError, ex:
>> +            # No matches to the subrev revset
>> +            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))
>
> I think this should go to standard error output

I did not want to stop the pull if revs does not match any revisions.
Do you think it should? If not perhaps this should just be a warning?

>> +            return
>> +        revs.sort()
>> +        # use a sortdict to make sure that we get the subrepos
>> +        # in the order they are found
>> +        substopull = config.sortdict()
>> +        for rev in revs:
>> +            ctx = self[rev]
>> +            # read the substate items in alphabetical order to ensure
>> +            # that we always process the subrepos in the same order
>> +            for sname in sorted(ctx.substate):
>> +                sinfo = ctx.substate[sname]
>> +                substopull[(sname, sinfo[0], sinfo[2])] = ctx
>> +        try:
>> +            self._subtoppath = source
>> +            for (sname, ssource, stype), ctx in substopull.items():
>> +                try:
>> +                    ctx.sub(sname).pull(ssource)
>> +                except (error.RepoError, urllib2.HTTPError,
>> +                        subrepo.SubrepoAbort), ex:
>> +                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
>> +                                 % (sname, ssource, str(ex)))
>
> consider honoring the --traceback option here.

I'm not sure what you mean. Isn't --traceback a global option that is
automatically taken into account?


>> +    def getsubrepos(self, subrev=None, source=None):
>> +        """Get (clone or pull) the subrepos that are referenced
>> +        on any of the revisions on the given revision list
>> +        """
>> +        if not subrev:
>> +            return
>> +        try:
>> +            revs = self.revs('%lr', subrev)
>> +        except error.RepoLookupError, ex:
>> +            # No matches to the subrev revset
>> +            self.ui.note(_("no revisions found matching subrev revset '%s'\n"))
>> +            return
>> +        revs.sort()
>> +        # use a sortdict to make sure that we get the subrepos
>> +        # in the order they are found
>> +        substopull = config.sortdict()
>> +        for rev in revs:
>> +            ctx = self[rev]
>> +            # read the substate items in alphabetical order to ensure
>> +            # that we always process the subrepos in the same order
>> +            for sname in sorted(ctx.substate):
>> +                sinfo = ctx.substate[sname]
>> +                substopull[(sname, sinfo[0], sinfo[2])] = ctx
>> +        try:
>> +            self._subtoppath = source
>> +            for (sname, ssource, stype), ctx in substopull.items():
>> +                try:
>> +                    ctx.sub(sname).pull(ssource)
>> +                except (error.RepoError, urllib2.HTTPError,
>> +                        subrepo.SubrepoAbort), ex:
>> +                    self.ui.warn(_('could not pull subrepo %s from %s (%s)\n')
>> +                                 % (sname, ssource, str(ex)))
>> +        finally:
>> +            del self._subtoppath
>
> This _subtoppath stuff is weird. can't we do something cleaner ?

It is a bit weird but it is not something that this patch introduces.
The patch just works around the existing weirdness (perhaps making it
a little bit weirder in the process :-P ).

If I understood how it works correctly we must temporarily change the
_subtoppath (which is used to get an absolute path to a subrepo) to
the source of the pull, and clear it afterwards so that the next
operation does not use the wrong one to calculate the absoluate
subrepo paths.

In other words, I don't think we can make it cleaner, or if we do I
don't think it should be done in this series.

Thanks a lot for the detailed review!

Angel


More information about the Mercurial-devel mailing list