[PATCH V2] subrepo: config option to disable subrepositories

Angel Ezquerra angel.ezquerra at gmail.com
Sun Nov 5 17:40:43 EST 2017


Hi,

I'll get out of my long lurking period to agree with Yuya. I don't
think it's correct to consider subrepos a "lesser-used feature" but I
agree that "3rd party subrepos" might be. I'd prefer if rather than a
master switch we had a per subrepo type switch or perhaps a list of
enabled subrepo types (with default being "all").

Now back to lurk mode :-)

Angel


On Sun, Nov 5, 2017 at 6:25 AM, Gregory Szorc <gregory.szorc at gmail.com> wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1509855814 25200
> #      Sat Nov 04 21:23:34 2017 -0700
> # Branch stable
> # Node ID 5aa341ccc69b7d9e7e3fdf2fe6f24317a1c4658f
> # Parent  f445b10dc7fb3495d24d1c22b0996148864c77f7
> subrepo: config option to disable subrepositories
>
> Subrepositories are a lesser-used feature that has a history of security
> vulnerabilities. Previous subrepo vulnerabilities have resulted in
> arbitrary code execution during `hg clone`. This is one of the worst
> kind of vulnerabilities a version control system can have.
>
> An aspect of the security concern is that Mercurial supports
> non-Mercurial subrepositories. (Git and Subversion are supported by
> default.) While the Mercurial developers try to keep up with
> development of other version control systems, it is effectively
> impossible for us to keep tabs on all 3rd party changes and their
> security impact. Every time Mercurial attempts to call out into
> another [version control] tool, we run a higher risk of accidentally
> introducing a security vulnerability. This is what's referred to as
> "surface area for "attack" in computer security speak.
>
> Since subrepos have a history of vulnerabilities, increase our
> exposure to security issues in other tools, and aren't widely used
> or a critical feature to have enabled by default, it makes sense
> for the feature to be optional.
>
> This commit introduces a config flag to control whether subrepos are
> enabled. The default of having them enabled remains unchanged.
>
> The mechanism by which the patch works is two pronged:
>
> 1) It effectively short-circuits the parsing of .hgsubstate files.
>    Even if an .hgsubstate exists, it will be parsed as if it is
>    empty and Mercurial won't think there are subrepos to operate on.
> 2) It disables special functionality related to .hgsub and .hgsubstate
>    files. Normally, .hgsubstate is managed automatically. With the
>    subrepos feature disabled, this file is treated as a normal file.
>
> In the future, we'd like to introduce per VCS controls for subrepos.
> For example, it might be prudent to disable 3rd party subrepo types
> for security reasons but leave Mercurial subrepos enabled. As I thought
> about what these future config options would look like, I couldn't
> think of a great way to shoehorn them into [ui], where most existing
> subrepo options are (there is also [subpaths]). I think we'll
> eventually have sufficient complexity for subrepos to warrant their
> own config section. So that's what I implemented here.
>
> I also thought "enable" was too generic. The introduced flag
> essentially controls whether subrepos can be checked out. So I
> named the option "checkout." It takes a boolean value. For per-type
> options, I was thinking we could use e.g. "checkout:git = false."
>
> There are likely some corner cases with this patch. Known deficiencies
> are annotated as such in the added test. IMO the biggest deficiency
> is interaction with dirstate and status. I think subrepo paths should
> be ignored when subrepo checkout is disabled. Implementing that will
> require a different approach - one that doesn't virtually blank out
> .hgsubstate.
>
> This commit should receive extra review scrutiny because there
> are tons of random references to .hgsub, .hgsubstate, and ctx.substate
> throughout the code base. I'm not sure if I touched all the ones we
> need to touch and if the ones we did touch needed to be touched.
>
> diff --git a/hgext/mq.py b/hgext/mq.py
> --- a/hgext/mq.py
> +++ b/hgext/mq.py
> @@ -959,7 +959,8 @@ class queue(object):
>                      p1, p2 = repo.dirstate.parents()
>                      repo.setparents(p1, merge)
>
> -            if all_files and '.hgsubstate' in all_files:
> +            if (all_files and '.hgsubstate' in all_files
> +                and subrepo.checkoutenabled(repo.ui)):
>                  wctx = repo[None]
>                  pctx = repo['.']
>                  overwrite = False
> @@ -1205,8 +1206,11 @@ class queue(object):
>          if opts.get('include') or opts.get('exclude') or pats:
>              # detect missing files in pats
>              def badfn(f, msg):
> -                if f != '.hgsubstate': # .hgsubstate is auto-created
> -                    raise error.Abort('%s: %s' % (f, msg))
> +                # .hgsubstate is auto-created
> +                if f == '.hgsubstate' and subrepo.checkoutenabled(repo):
> +                    return
> +
> +                raise error.Abort('%s: %s' % (f, msg))
>              match = scmutil.match(repo[None], pats, opts, badfn=badfn)
>              changes = repo.status(match=match)
>          else:
> diff --git a/mercurial/configitems.py b/mercurial/configitems.py
> --- a/mercurial/configitems.py
> +++ b/mercurial/configitems.py
> @@ -811,6 +811,9 @@ coreconfigitem('smtp', 'username',
>  coreconfigitem('sparse', 'missingwarning',
>      default=True,
>  )
> +coreconfigitem('subrepos', 'checkout',
> +    default=True,
> +)
>  coreconfigitem('templates', '.*',
>      default=None,
>      generic=True,
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1893,6 +1893,20 @@ rewrite rules are then applied on the fu
>  doesn't match the full path, an attempt is made to apply it on the
>  relative path alone. The rules are applied in definition order.
>
> +``subrepos``
> +------------
> +
> +This section contains option that control the behavior of the
> +subrepositories feature. See also :hg:`help subrepos`.
> +
> +``checkout``
> +    Whether the checkout of subrepositories in the working directory is
> +    allowed.
> +
> +    When disabled, subrepositories will effectively be ignored by the
> +    Mercurial client.
> +    (default: True)
> +
>  ``templatealias``
>  -----------------
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1848,8 +1848,9 @@ class localrepository(object):
>              subs = []
>              commitsubs = set()
>              newstate = wctx.substate.copy()
> -            # only manage subrepos and .hgsubstate if .hgsub is present
> -            if '.hgsub' in wctx:
> +            # only manage subrepos and .hgsubstate if .hgsub is present and
> +            # feature enabled.
> +            if '.hgsub' in wctx and subrepo.checkoutenabled(self.ui):
>                  # we'll decide whether to track this ourselves, thanks
>                  for c in status.modified, status.added, status.removed:
>                      if '.hgsubstate' in c:
> @@ -1891,7 +1892,8 @@ class localrepository(object):
>                              _("can't commit subrepos without .hgsub"))
>                      status.modified.insert(0, '.hgsubstate')
>
> -            elif '.hgsub' in status.removed:
> +            elif ('.hgsub' in status.removed
> +                  and subrepo.checkoutenabled(self.ui)):
>                  # clean up .hgsubstate when .hgsub is removed
>                  if ('.hgsubstate' in wctx and
>                      '.hgsubstate' not in (status.modified + status.added +
> diff --git a/mercurial/merge.py b/mercurial/merge.py
> --- a/mercurial/merge.py
> +++ b/mercurial/merge.py
> @@ -999,7 +999,7 @@ def manifestmerge(repo, wctx, p2, pa, br
>      copied = set(copy.values())
>      copied.update(movewithdir.values())
>
> -    if '.hgsubstate' in m1:
> +    if '.hgsubstate' in m1 and subrepo.checkoutenabled(repo.ui):
>          # check whether sub state is modified
>          if any(wctx.sub(s).dirty() for s in wctx.substate):
>              m1['.hgsubstate'] = modifiednodeid
> @@ -1377,7 +1377,8 @@ def applyupdates(repo, actions, wctx, mc
>      mergeactions.extend(actions['m'])
>      for f, args, msg in mergeactions:
>          f1, f2, fa, move, anc = args
> -        if f == '.hgsubstate': # merged internally
> +        # merged internally
> +        if f == '.hgsubstate' and subrepo.checkoutenabled(repo.ui):
>              continue
>          if f1 is None:
>              fcl = filemerge.absentfilectx(wctx, fa)
> @@ -1412,8 +1413,9 @@ def applyupdates(repo, actions, wctx, mc
>      numupdates = sum(len(l) for m, l in actions.items() if m != 'k')
>      z = 0
>
> -    if [a for a in actions['r'] if a[0] == '.hgsubstate']:
> -        subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
> +    if subrepo.checkoutenabled(repo.ui):
> +        if [a for a in actions['r'] if a[0] == '.hgsubstate']:
> +            subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
>
>      # record path conflicts
>      for f, args, msg in actions['p']:
> @@ -1466,8 +1468,9 @@ def applyupdates(repo, actions, wctx, mc
>          progress(_updating, z, item=item, total=numupdates, unit=_files)
>      updated = len(actions['g'])
>
> -    if [a for a in actions['g'] if a[0] == '.hgsubstate']:
> -        subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
> +    if subrepo.checkoutenabled(repo.ui):
> +        if [a for a in actions['g'] if a[0] == '.hgsubstate']:
> +            subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
>
>      # forget (manifest only, just log it) (must come first)
>      for f, args, msg in actions['f']:
> @@ -1551,7 +1554,7 @@ def applyupdates(repo, actions, wctx, mc
>              repo.ui.debug(" %s: %s -> m (premerge)\n" % (f, msg))
>              z += 1
>              progress(_updating, z, item=f, total=numupdates, unit=_files)
> -            if f == '.hgsubstate': # subrepo states need updating
> +            if f == '.hgsubstate' and subrepo.checkoutenabled(repo.ui):
>                  subrepo.submerge(repo, wctx, mctx, wctx.ancestor(mctx),
>                                   overwrite, labels)
>                  continue
> @@ -1882,7 +1885,7 @@ def update(repo, node, branchmerge, forc
>          # Prompt and create actions. Most of this is in the resolve phase
>          # already, but we can't handle .hgsubstate in filemerge or
>          # subrepo.submerge yet so we have to keep prompting for it.
> -        if '.hgsubstate' in actionbyfile:
> +        if '.hgsubstate' in actionbyfile and subrepo.checkoutenabled(repo.ui):
>              f = '.hgsubstate'
>              m, args, msg = actionbyfile[f]
>              prompts = filemerge.partextras(labels)
> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> --- a/mercurial/subrepo.py
> +++ b/mercurial/subrepo.py
> @@ -43,6 +43,10 @@ propertycache = util.propertycache
>
>  nullstate = ('', '', 'empty')
>
> +def checkoutenabled(ui):
> +    """Whether subrepos can be checked out in the working directory."""
> +    return ui.configbool('subrepos', 'checkout')
> +
>  def _expandedabspath(path):
>      '''
>      get a path or url and if it is a path expand it and return an absolute path
> @@ -85,6 +89,14 @@ def state(ctx, ui):
>      to tuple: (source from .hgsub, revision from .hgsubstate, kind
>      (key in types dict))
>      """
> +    # Blank out knowledge about subrepos when the feature is disabled.
> +    # This effectively causes consumers of this data structure to think
> +    # there are no subrepos. This causes problems with the automatic
> +    # management of .hgsubstate (which is based on this parsed data)
> +    # and the subrepo() revset.
> +    if not checkoutenabled(ui):
> +        return {}
> +
>      p = config.config()
>      repo = ctx.repo()
>      def read(f, sections=None, remap=None):
> diff --git a/tests/test-subrepo-disable.t b/tests/test-subrepo-disable.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-subrepo-disable.t
> @@ -0,0 +1,206 @@
> +  $ hg init hgsub
> +  $ cd hgsub
> +  $ echo sub0 > foo
> +  $ hg -q commit -A -m 'subrepo initial'
> +  $ hg log -T '{node}\n'
> +  45cc468b8f18bee314935a4651bad80f9cb3b540
> +  $ cd ..
> +
> +  $ hg init testrepo0
> +  $ cd testrepo0
> +  $ cat >> .hg/hgrc << EOF
> +  > [subrepos]
> +  > checkout = false
> +  > EOF
> +
> +  $ echo 0 > foo
> +  $ hg -q commit -A -m initial
> +
> +Adding a .hgsub should result in no sign of the subrepo in the working directory
> +
> +  $ cat > .hgsub << EOF
> +  > hgsub = ../hgsub
> +  > EOF
> +
> +  $ hg add .hgsub
> +  $ hg commit -m 'add .hgsub'
> +
> +  $ hg files --subrepos
> +  .hgsub
> +  foo
> +
> +No .hgsubstate should have been created
> +
> +  $ hg files
> +  .hgsub
> +  foo
> +
> +Adding a nested hg repo will be treated like a normal nested hg repo: it will be ignored
> +
> +  $ hg init hgsub
> +  $ hg st
> +  $ rm -rf hgsub
> +
> +  $ cd ..
> +
> +Cloning this repo won't clone subrepo because no .hgsubstate
> +
> +  $ hg clone --pull testrepo0 no-substate-enabled
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  new changesets 68986213bd44:addf99df3e66
> +  updating to branch default
> +  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Cloning with subrepos disabled should be identical to above
> +
> +  $ hg --config subrepos.checkout=false clone --pull testrepo0 no-substate-disabled
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  new changesets 68986213bd44:addf99df3e66
> +  updating to branch default
> +  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Add an .hgsubstate manually to simulate a real repo with subrepos
> +
> +  $ cd testrepo0
> +  $ cat > .hgsubstate << EOF
> +  > 45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> +  > EOF
> +  $ hg add .hgsubstate
> +
> +  $ hg commit -m 'manually add .hgsubstate'
> +
> +subrepo() revset with no args works if feature is disabled
> +
> +  $ hg log -r 'subrepo()'
> +  changeset:   2:7647585deebb
> +  tag:         tip
> +  user:        test
> +  date:        Thu Jan 01 00:00:00 1970 +0000
> +  summary:     manually add .hgsubstate
> +
> +
> +But it doesn't work with patterns
> +(This is an implementation detail because of the way disabling subrepos
> +blanks out substate and could change in the future.)
> +
> +  $ hg --config subrepos.checkout=true log -r 'subrepo(hgsub)'
> +  changeset:   2:7647585deebb
> +  tag:         tip
> +  user:        test
> +  date:        Thu Jan 01 00:00:00 1970 +0000
> +  summary:     manually add .hgsubstate
> +
> +
> +  $ hg log -r 'subrepo(hgsub)'
> +
> +  $ cd ..
> +
> +Cloning with feature enabled should clone subrepos
> +
> +  $ hg clone --pull testrepo0 with-substate-enabled
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  new changesets 68986213bd44:7647585deebb
> +  updating to branch default
> +  cloning subrepo hgsub from $TESTTMP/hgsub
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Cloning with feature disabled should not clone subrepos
> +
> +  $ hg --config subrepos.checkout=false clone --pull testrepo0 with-substate-disabled
> +  requesting all changes
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 3 changesets with 3 changes to 3 files
> +  new changesets 68986213bd44:7647585deebb
> +  updating to branch default
> +  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Test a client with subrepos enabled on a clone missing them.
> +Important: the .hgsubstate file (which is normally automatically managed)
> +should be present and unmodified.
> +
> +  $ cd with-substate-disabled
> +  $ hg status
> +  $ cat .hgsubstate
> +  45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> +  $ hg up
> +  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +  $ cd ..
> +
> +Test a client with subrepos disabled on a clone with subrepos
> +
> +  $ cd with-substate-enabled
> +  $ hg --config subrepos.checkout=false status
> +  $ cat .hgsubstate
> +  45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> +  $ hg --config subrepos.checkout=false up
> +  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +  $ hg --config subrepos.checkout=false st
> +  $ cat .hgsubstate
> +  45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> +
> +  $ hg --config subrepos.checkout=false files -S
> +  .hgsub
> +  .hgsubstate
> +  foo
> +
> +  $ cd ..
> +
> +#if git
> +
> +  $ GIT_AUTHOR_NAME=author; export GIT_AUTHOR_NAME
> +  $ GIT_AUTHOR_EMAIL=someone at example.com; export GIT_AUTHOR_EMAIL
> +  $ GIT_AUTHOR_DATE='1234567891 +0000'; export GIT_AUTHOR_DATE
> +  $ GIT_COMMITTER_NAME=author; export GIT_COMMITTER_DATE
> +  $ GIT_COMMITTER_EMAIL=someone at example.com; export GIT_COMMITTER_EMAIL
> +  $ GIT_COMMITTER_DATE='1234567891 +0000'; export GIT_COMMITTER_DATE
> +
> +  $ git init -q gitsub
> +  $ cd gitsub
> +  $ echo 0 > foo
> +  $ git add foo
> +  $ git commit -q -m 'git subrepo initial'
> +  $ cd ..
> +
> +  $ hg init repo-with-git-subrepo
> +  $ cd repo-with-git-subrepo
> +  $ echo 0 > foo
> +  $ hg -q commit -A -m initial
> +
> +  $ cat > .hgsub << EOF
> +  > gitsub = [git]../gitsub
> +  > EOF
> +  $ git clone -q ../gitsub gitsub
> +  $ hg add .hgsub
> +  $ hg commit -m 'add git subrepo'
> +
> +`hg files` with feature disabled only shows Mercurial files
> +
> +  $ hg --config subrepos.checkout=false files -S
> +  .hgsub
> +  .hgsubstate
> +  foo
> +
> +`hg status` only shows Mercurial files
> +FUTURE consider ignoring paths beloning to parsed subrepo paths
> +
> +  $ hg --config subrepos.checkout=false status gitsub/foo gitsub/.git/config
> +  ? gitsub/.git/config
> +  ? gitsub/foo
> +
> +#endif
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list