[PATCH V2] subrepo: config option to disable subrepositories

Gregory Szorc gregory.szorc at gmail.com
Sun Nov 5 00:25:58 EDT 2017


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1509855814 25200
#      Sat Nov 04 21:23:34 2017 -0700
# Branch stable
# Node ID 5aa341ccc69b7d9e7e3fdf2fe6f24317a1c4658f
# Parent  f445b10dc7fb3495d24d1c22b0996148864c77f7
subrepo: config option to disable subrepositories

Subrepositories are a lesser-used feature that has a history of security
vulnerabilities. Previous subrepo vulnerabilities have resulted in
arbitrary code execution during `hg clone`. This is one of the worst
kind of vulnerabilities a version control system can have.

An aspect of the security concern is that Mercurial supports
non-Mercurial subrepositories. (Git and Subversion are supported by
default.) While the Mercurial developers try to keep up with
development of other version control systems, it is effectively
impossible for us to keep tabs on all 3rd party changes and their
security impact. Every time Mercurial attempts to call out into
another [version control] tool, we run a higher risk of accidentally
introducing a security vulnerability. This is what's referred to as
"surface area for "attack" in computer security speak.

Since subrepos have a history of vulnerabilities, increase our
exposure to security issues in other tools, and aren't widely used
or a critical feature to have enabled by default, it makes sense
for the feature to be optional.

This commit introduces a config flag to control whether subrepos are
enabled. The default of having them enabled remains unchanged.

The mechanism by which the patch works is two pronged:

1) It effectively short-circuits the parsing of .hgsubstate files.
   Even if an .hgsubstate exists, it will be parsed as if it is
   empty and Mercurial won't think there are subrepos to operate on.
2) It disables special functionality related to .hgsub and .hgsubstate
   files. Normally, .hgsubstate is managed automatically. With the
   subrepos feature disabled, this file is treated as a normal file.

In the future, we'd like to introduce per VCS controls for subrepos.
For example, it might be prudent to disable 3rd party subrepo types
for security reasons but leave Mercurial subrepos enabled. As I thought
about what these future config options would look like, I couldn't
think of a great way to shoehorn them into [ui], where most existing
subrepo options are (there is also [subpaths]). I think we'll
eventually have sufficient complexity for subrepos to warrant their
own config section. So that's what I implemented here.

I also thought "enable" was too generic. The introduced flag
essentially controls whether subrepos can be checked out. So I
named the option "checkout." It takes a boolean value. For per-type
options, I was thinking we could use e.g. "checkout:git = false."

There are likely some corner cases with this patch. Known deficiencies
are annotated as such in the added test. IMO the biggest deficiency
is interaction with dirstate and status. I think subrepo paths should
be ignored when subrepo checkout is disabled. Implementing that will
require a different approach - one that doesn't virtually blank out
.hgsubstate.

This commit should receive extra review scrutiny because there
are tons of random references to .hgsub, .hgsubstate, and ctx.substate
throughout the code base. I'm not sure if I touched all the ones we
need to touch and if the ones we did touch needed to be touched.

diff --git a/hgext/mq.py b/hgext/mq.py
--- a/hgext/mq.py
+++ b/hgext/mq.py
@@ -959,7 +959,8 @@ class queue(object):
                     p1, p2 = repo.dirstate.parents()
                     repo.setparents(p1, merge)
 
-            if all_files and '.hgsubstate' in all_files:
+            if (all_files and '.hgsubstate' in all_files
+                and subrepo.checkoutenabled(repo.ui)):
                 wctx = repo[None]
                 pctx = repo['.']
                 overwrite = False
@@ -1205,8 +1206,11 @@ class queue(object):
         if opts.get('include') or opts.get('exclude') or pats:
             # detect missing files in pats
             def badfn(f, msg):
-                if f != '.hgsubstate': # .hgsubstate is auto-created
-                    raise error.Abort('%s: %s' % (f, msg))
+                # .hgsubstate is auto-created
+                if f == '.hgsubstate' and subrepo.checkoutenabled(repo):
+                    return
+
+                raise error.Abort('%s: %s' % (f, msg))
             match = scmutil.match(repo[None], pats, opts, badfn=badfn)
             changes = repo.status(match=match)
         else:
diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -811,6 +811,9 @@ coreconfigitem('smtp', 'username',
 coreconfigitem('sparse', 'missingwarning',
     default=True,
 )
+coreconfigitem('subrepos', 'checkout',
+    default=True,
+)
 coreconfigitem('templates', '.*',
     default=None,
     generic=True,
diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
--- a/mercurial/help/config.txt
+++ b/mercurial/help/config.txt
@@ -1893,6 +1893,20 @@ rewrite rules are then applied on the fu
 doesn't match the full path, an attempt is made to apply it on the
 relative path alone. The rules are applied in definition order.
 
+``subrepos``
+------------
+
+This section contains option that control the behavior of the
+subrepositories feature. See also :hg:`help subrepos`.
+
+``checkout``
+    Whether the checkout of subrepositories in the working directory is
+    allowed.
+
+    When disabled, subrepositories will effectively be ignored by the
+    Mercurial client.
+    (default: True)
+
 ``templatealias``
 -----------------
 
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1848,8 +1848,9 @@ class localrepository(object):
             subs = []
             commitsubs = set()
             newstate = wctx.substate.copy()
-            # only manage subrepos and .hgsubstate if .hgsub is present
-            if '.hgsub' in wctx:
+            # only manage subrepos and .hgsubstate if .hgsub is present and
+            # feature enabled.
+            if '.hgsub' in wctx and subrepo.checkoutenabled(self.ui):
                 # we'll decide whether to track this ourselves, thanks
                 for c in status.modified, status.added, status.removed:
                     if '.hgsubstate' in c:
@@ -1891,7 +1892,8 @@ class localrepository(object):
                             _("can't commit subrepos without .hgsub"))
                     status.modified.insert(0, '.hgsubstate')
 
-            elif '.hgsub' in status.removed:
+            elif ('.hgsub' in status.removed
+                  and subrepo.checkoutenabled(self.ui)):
                 # clean up .hgsubstate when .hgsub is removed
                 if ('.hgsubstate' in wctx and
                     '.hgsubstate' not in (status.modified + status.added +
diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -999,7 +999,7 @@ def manifestmerge(repo, wctx, p2, pa, br
     copied = set(copy.values())
     copied.update(movewithdir.values())
 
-    if '.hgsubstate' in m1:
+    if '.hgsubstate' in m1 and subrepo.checkoutenabled(repo.ui):
         # check whether sub state is modified
         if any(wctx.sub(s).dirty() for s in wctx.substate):
             m1['.hgsubstate'] = modifiednodeid
@@ -1377,7 +1377,8 @@ def applyupdates(repo, actions, wctx, mc
     mergeactions.extend(actions['m'])
     for f, args, msg in mergeactions:
         f1, f2, fa, move, anc = args
-        if f == '.hgsubstate': # merged internally
+        # merged internally
+        if f == '.hgsubstate' and subrepo.checkoutenabled(repo.ui):
             continue
         if f1 is None:
             fcl = filemerge.absentfilectx(wctx, fa)
@@ -1412,8 +1413,9 @@ def applyupdates(repo, actions, wctx, mc
     numupdates = sum(len(l) for m, l in actions.items() if m != 'k')
     z = 0
 
-    if [a for a in actions['r'] if a[0] == '.hgsubstate']:
-        subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
+    if subrepo.checkoutenabled(repo.ui):
+        if [a for a in actions['r'] if a[0] == '.hgsubstate']:
+            subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
 
     # record path conflicts
     for f, args, msg in actions['p']:
@@ -1466,8 +1468,9 @@ def applyupdates(repo, actions, wctx, mc
         progress(_updating, z, item=item, total=numupdates, unit=_files)
     updated = len(actions['g'])
 
-    if [a for a in actions['g'] if a[0] == '.hgsubstate']:
-        subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
+    if subrepo.checkoutenabled(repo.ui):
+        if [a for a in actions['g'] if a[0] == '.hgsubstate']:
+            subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
 
     # forget (manifest only, just log it) (must come first)
     for f, args, msg in actions['f']:
@@ -1551,7 +1554,7 @@ def applyupdates(repo, actions, wctx, mc
             repo.ui.debug(" %s: %s -> m (premerge)\n" % (f, msg))
             z += 1
             progress(_updating, z, item=f, total=numupdates, unit=_files)
-            if f == '.hgsubstate': # subrepo states need updating
+            if f == '.hgsubstate' and subrepo.checkoutenabled(repo.ui):
                 subrepo.submerge(repo, wctx, mctx, wctx.ancestor(mctx),
                                  overwrite, labels)
                 continue
@@ -1882,7 +1885,7 @@ def update(repo, node, branchmerge, forc
         # Prompt and create actions. Most of this is in the resolve phase
         # already, but we can't handle .hgsubstate in filemerge or
         # subrepo.submerge yet so we have to keep prompting for it.
-        if '.hgsubstate' in actionbyfile:
+        if '.hgsubstate' in actionbyfile and subrepo.checkoutenabled(repo.ui):
             f = '.hgsubstate'
             m, args, msg = actionbyfile[f]
             prompts = filemerge.partextras(labels)
diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -43,6 +43,10 @@ propertycache = util.propertycache
 
 nullstate = ('', '', 'empty')
 
+def checkoutenabled(ui):
+    """Whether subrepos can be checked out in the working directory."""
+    return ui.configbool('subrepos', 'checkout')
+
 def _expandedabspath(path):
     '''
     get a path or url and if it is a path expand it and return an absolute path
@@ -85,6 +89,14 @@ def state(ctx, ui):
     to tuple: (source from .hgsub, revision from .hgsubstate, kind
     (key in types dict))
     """
+    # Blank out knowledge about subrepos when the feature is disabled.
+    # This effectively causes consumers of this data structure to think
+    # there are no subrepos. This causes problems with the automatic
+    # management of .hgsubstate (which is based on this parsed data)
+    # and the subrepo() revset.
+    if not checkoutenabled(ui):
+        return {}
+
     p = config.config()
     repo = ctx.repo()
     def read(f, sections=None, remap=None):
diff --git a/tests/test-subrepo-disable.t b/tests/test-subrepo-disable.t
new file mode 100644
--- /dev/null
+++ b/tests/test-subrepo-disable.t
@@ -0,0 +1,206 @@
+  $ hg init hgsub
+  $ cd hgsub
+  $ echo sub0 > foo
+  $ hg -q commit -A -m 'subrepo initial'
+  $ hg log -T '{node}\n'
+  45cc468b8f18bee314935a4651bad80f9cb3b540
+  $ cd ..
+
+  $ hg init testrepo0
+  $ cd testrepo0
+  $ cat >> .hg/hgrc << EOF
+  > [subrepos]
+  > checkout = false
+  > EOF
+
+  $ echo 0 > foo
+  $ hg -q commit -A -m initial
+
+Adding a .hgsub should result in no sign of the subrepo in the working directory
+
+  $ cat > .hgsub << EOF
+  > hgsub = ../hgsub
+  > EOF
+
+  $ hg add .hgsub
+  $ hg commit -m 'add .hgsub'
+
+  $ hg files --subrepos
+  .hgsub
+  foo
+
+No .hgsubstate should have been created
+
+  $ hg files
+  .hgsub
+  foo
+
+Adding a nested hg repo will be treated like a normal nested hg repo: it will be ignored
+
+  $ hg init hgsub
+  $ hg st
+  $ rm -rf hgsub
+
+  $ cd ..
+
+Cloning this repo won't clone subrepo because no .hgsubstate
+
+  $ hg clone --pull testrepo0 no-substate-enabled
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  new changesets 68986213bd44:addf99df3e66
+  updating to branch default
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Cloning with subrepos disabled should be identical to above
+
+  $ hg --config subrepos.checkout=false clone --pull testrepo0 no-substate-disabled
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  new changesets 68986213bd44:addf99df3e66
+  updating to branch default
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Add an .hgsubstate manually to simulate a real repo with subrepos
+
+  $ cd testrepo0
+  $ cat > .hgsubstate << EOF
+  > 45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
+  > EOF
+  $ hg add .hgsubstate
+
+  $ hg commit -m 'manually add .hgsubstate'
+
+subrepo() revset with no args works if feature is disabled
+
+  $ hg log -r 'subrepo()'
+  changeset:   2:7647585deebb
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     manually add .hgsubstate
+  
+
+But it doesn't work with patterns
+(This is an implementation detail because of the way disabling subrepos
+blanks out substate and could change in the future.)
+
+  $ hg --config subrepos.checkout=true log -r 'subrepo(hgsub)'
+  changeset:   2:7647585deebb
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     manually add .hgsubstate
+  
+
+  $ hg log -r 'subrepo(hgsub)'
+
+  $ cd ..
+
+Cloning with feature enabled should clone subrepos
+
+  $ hg clone --pull testrepo0 with-substate-enabled
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 3 files
+  new changesets 68986213bd44:7647585deebb
+  updating to branch default
+  cloning subrepo hgsub from $TESTTMP/hgsub
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Cloning with feature disabled should not clone subrepos
+
+  $ hg --config subrepos.checkout=false clone --pull testrepo0 with-substate-disabled
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 3 files
+  new changesets 68986213bd44:7647585deebb
+  updating to branch default
+  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Test a client with subrepos enabled on a clone missing them.
+Important: the .hgsubstate file (which is normally automatically managed)
+should be present and unmodified.
+
+  $ cd with-substate-disabled
+  $ hg status
+  $ cat .hgsubstate
+  45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
+  $ hg up
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ cd ..
+
+Test a client with subrepos disabled on a clone with subrepos
+
+  $ cd with-substate-enabled
+  $ hg --config subrepos.checkout=false status
+  $ cat .hgsubstate
+  45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
+  $ hg --config subrepos.checkout=false up
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ hg --config subrepos.checkout=false st
+  $ cat .hgsubstate
+  45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
+
+  $ hg --config subrepos.checkout=false files -S
+  .hgsub
+  .hgsubstate
+  foo
+
+  $ cd ..
+
+#if git
+
+  $ GIT_AUTHOR_NAME=author; export GIT_AUTHOR_NAME
+  $ GIT_AUTHOR_EMAIL=someone at example.com; export GIT_AUTHOR_EMAIL
+  $ GIT_AUTHOR_DATE='1234567891 +0000'; export GIT_AUTHOR_DATE
+  $ GIT_COMMITTER_NAME=author; export GIT_COMMITTER_DATE
+  $ GIT_COMMITTER_EMAIL=someone at example.com; export GIT_COMMITTER_EMAIL
+  $ GIT_COMMITTER_DATE='1234567891 +0000'; export GIT_COMMITTER_DATE
+
+  $ git init -q gitsub
+  $ cd gitsub
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -q -m 'git subrepo initial'
+  $ cd ..
+
+  $ hg init repo-with-git-subrepo
+  $ cd repo-with-git-subrepo
+  $ echo 0 > foo
+  $ hg -q commit -A -m initial
+
+  $ cat > .hgsub << EOF
+  > gitsub = [git]../gitsub
+  > EOF
+  $ git clone -q ../gitsub gitsub
+  $ hg add .hgsub
+  $ hg commit -m 'add git subrepo'
+
+`hg files` with feature disabled only shows Mercurial files
+
+  $ hg --config subrepos.checkout=false files -S
+  .hgsub
+  .hgsubstate
+  foo
+
+`hg status` only shows Mercurial files
+FUTURE consider ignoring paths beloning to parsed subrepo paths
+
+  $ hg --config subrepos.checkout=false status gitsub/foo gitsub/.git/config
+  ? gitsub/.git/config
+  ? gitsub/foo
+
+#endif


More information about the Mercurial-devel mailing list