[PATCH V2] subrepo: config option to disable subrepositories
Angel Ezquerra
angel.ezquerra at gmail.com
Sun Nov 5 17:40:43 EST 2017
Hi,
I'll get out of my long lurking period to agree with Yuya. I don't
think it's correct to consider subrepos a "lesser-used feature" but I
agree that "3rd party subrepos" might be. I'd prefer if rather than a
master switch we had a per subrepo type switch or perhaps a list of
enabled subrepo types (with default being "all").
Now back to lurk mode :-)
Angel
On Sun, Nov 5, 2017 at 6:25 AM, Gregory Szorc <gregory.szorc at gmail.com> wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1509855814 25200
> # Sat Nov 04 21:23:34 2017 -0700
> # Branch stable
> # Node ID 5aa341ccc69b7d9e7e3fdf2fe6f24317a1c4658f
> # Parent f445b10dc7fb3495d24d1c22b0996148864c77f7
> subrepo: config option to disable subrepositories
>
> Subrepositories are a lesser-used feature that has a history of security
> vulnerabilities. Previous subrepo vulnerabilities have resulted in
> arbitrary code execution during `hg clone`. This is one of the worst
> kind of vulnerabilities a version control system can have.
>
> An aspect of the security concern is that Mercurial supports
> non-Mercurial subrepositories. (Git and Subversion are supported by
> default.) While the Mercurial developers try to keep up with
> development of other version control systems, it is effectively
> impossible for us to keep tabs on all 3rd party changes and their
> security impact. Every time Mercurial attempts to call out into
> another [version control] tool, we run a higher risk of accidentally
> introducing a security vulnerability. This is what's referred to as
> "surface area for "attack" in computer security speak.
>
> Since subrepos have a history of vulnerabilities, increase our
> exposure to security issues in other tools, and aren't widely used
> or a critical feature to have enabled by default, it makes sense
> for the feature to be optional.
>
> This commit introduces a config flag to control whether subrepos are
> enabled. The default of having them enabled remains unchanged.
>
> The mechanism by which the patch works is two pronged:
>
> 1) It effectively short-circuits the parsing of .hgsubstate files.
> Even if an .hgsubstate exists, it will be parsed as if it is
> empty and Mercurial won't think there are subrepos to operate on.
> 2) It disables special functionality related to .hgsub and .hgsubstate
> files. Normally, .hgsubstate is managed automatically. With the
> subrepos feature disabled, this file is treated as a normal file.
>
> In the future, we'd like to introduce per VCS controls for subrepos.
> For example, it might be prudent to disable 3rd party subrepo types
> for security reasons but leave Mercurial subrepos enabled. As I thought
> about what these future config options would look like, I couldn't
> think of a great way to shoehorn them into [ui], where most existing
> subrepo options are (there is also [subpaths]). I think we'll
> eventually have sufficient complexity for subrepos to warrant their
> own config section. So that's what I implemented here.
>
> I also thought "enable" was too generic. The introduced flag
> essentially controls whether subrepos can be checked out. So I
> named the option "checkout." It takes a boolean value. For per-type
> options, I was thinking we could use e.g. "checkout:git = false."
>
> There are likely some corner cases with this patch. Known deficiencies
> are annotated as such in the added test. IMO the biggest deficiency
> is interaction with dirstate and status. I think subrepo paths should
> be ignored when subrepo checkout is disabled. Implementing that will
> require a different approach - one that doesn't virtually blank out
> .hgsubstate.
>
> This commit should receive extra review scrutiny because there
> are tons of random references to .hgsub, .hgsubstate, and ctx.substate
> throughout the code base. I'm not sure if I touched all the ones we
> need to touch and if the ones we did touch needed to be touched.
>
> diff --git a/hgext/mq.py b/hgext/mq.py
> --- a/hgext/mq.py
> +++ b/hgext/mq.py
> @@ -959,7 +959,8 @@ class queue(object):
> p1, p2 = repo.dirstate.parents()
> repo.setparents(p1, merge)
>
> - if all_files and '.hgsubstate' in all_files:
> + if (all_files and '.hgsubstate' in all_files
> + and subrepo.checkoutenabled(repo.ui)):
> wctx = repo[None]
> pctx = repo['.']
> overwrite = False
> @@ -1205,8 +1206,11 @@ class queue(object):
> if opts.get('include') or opts.get('exclude') or pats:
> # detect missing files in pats
> def badfn(f, msg):
> - if f != '.hgsubstate': # .hgsubstate is auto-created
> - raise error.Abort('%s: %s' % (f, msg))
> + # .hgsubstate is auto-created
> + if f == '.hgsubstate' and subrepo.checkoutenabled(repo):
> + return
> +
> + raise error.Abort('%s: %s' % (f, msg))
> match = scmutil.match(repo[None], pats, opts, badfn=badfn)
> changes = repo.status(match=match)
> else:
> diff --git a/mercurial/configitems.py b/mercurial/configitems.py
> --- a/mercurial/configitems.py
> +++ b/mercurial/configitems.py
> @@ -811,6 +811,9 @@ coreconfigitem('smtp', 'username',
> coreconfigitem('sparse', 'missingwarning',
> default=True,
> )
> +coreconfigitem('subrepos', 'checkout',
> + default=True,
> +)
> coreconfigitem('templates', '.*',
> default=None,
> generic=True,
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1893,6 +1893,20 @@ rewrite rules are then applied on the fu
> doesn't match the full path, an attempt is made to apply it on the
> relative path alone. The rules are applied in definition order.
>
> +``subrepos``
> +------------
> +
> +This section contains option that control the behavior of the
> +subrepositories feature. See also :hg:`help subrepos`.
> +
> +``checkout``
> + Whether the checkout of subrepositories in the working directory is
> + allowed.
> +
> + When disabled, subrepositories will effectively be ignored by the
> + Mercurial client.
> + (default: True)
> +
> ``templatealias``
> -----------------
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1848,8 +1848,9 @@ class localrepository(object):
> subs = []
> commitsubs = set()
> newstate = wctx.substate.copy()
> - # only manage subrepos and .hgsubstate if .hgsub is present
> - if '.hgsub' in wctx:
> + # only manage subrepos and .hgsubstate if .hgsub is present and
> + # feature enabled.
> + if '.hgsub' in wctx and subrepo.checkoutenabled(self.ui):
> # we'll decide whether to track this ourselves, thanks
> for c in status.modified, status.added, status.removed:
> if '.hgsubstate' in c:
> @@ -1891,7 +1892,8 @@ class localrepository(object):
> _("can't commit subrepos without .hgsub"))
> status.modified.insert(0, '.hgsubstate')
>
> - elif '.hgsub' in status.removed:
> + elif ('.hgsub' in status.removed
> + and subrepo.checkoutenabled(self.ui)):
> # clean up .hgsubstate when .hgsub is removed
> if ('.hgsubstate' in wctx and
> '.hgsubstate' not in (status.modified + status.added +
> diff --git a/mercurial/merge.py b/mercurial/merge.py
> --- a/mercurial/merge.py
> +++ b/mercurial/merge.py
> @@ -999,7 +999,7 @@ def manifestmerge(repo, wctx, p2, pa, br
> copied = set(copy.values())
> copied.update(movewithdir.values())
>
> - if '.hgsubstate' in m1:
> + if '.hgsubstate' in m1 and subrepo.checkoutenabled(repo.ui):
> # check whether sub state is modified
> if any(wctx.sub(s).dirty() for s in wctx.substate):
> m1['.hgsubstate'] = modifiednodeid
> @@ -1377,7 +1377,8 @@ def applyupdates(repo, actions, wctx, mc
> mergeactions.extend(actions['m'])
> for f, args, msg in mergeactions:
> f1, f2, fa, move, anc = args
> - if f == '.hgsubstate': # merged internally
> + # merged internally
> + if f == '.hgsubstate' and subrepo.checkoutenabled(repo.ui):
> continue
> if f1 is None:
> fcl = filemerge.absentfilectx(wctx, fa)
> @@ -1412,8 +1413,9 @@ def applyupdates(repo, actions, wctx, mc
> numupdates = sum(len(l) for m, l in actions.items() if m != 'k')
> z = 0
>
> - if [a for a in actions['r'] if a[0] == '.hgsubstate']:
> - subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
> + if subrepo.checkoutenabled(repo.ui):
> + if [a for a in actions['r'] if a[0] == '.hgsubstate']:
> + subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
>
> # record path conflicts
> for f, args, msg in actions['p']:
> @@ -1466,8 +1468,9 @@ def applyupdates(repo, actions, wctx, mc
> progress(_updating, z, item=item, total=numupdates, unit=_files)
> updated = len(actions['g'])
>
> - if [a for a in actions['g'] if a[0] == '.hgsubstate']:
> - subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
> + if subrepo.checkoutenabled(repo.ui):
> + if [a for a in actions['g'] if a[0] == '.hgsubstate']:
> + subrepo.submerge(repo, wctx, mctx, wctx, overwrite, labels)
>
> # forget (manifest only, just log it) (must come first)
> for f, args, msg in actions['f']:
> @@ -1551,7 +1554,7 @@ def applyupdates(repo, actions, wctx, mc
> repo.ui.debug(" %s: %s -> m (premerge)\n" % (f, msg))
> z += 1
> progress(_updating, z, item=f, total=numupdates, unit=_files)
> - if f == '.hgsubstate': # subrepo states need updating
> + if f == '.hgsubstate' and subrepo.checkoutenabled(repo.ui):
> subrepo.submerge(repo, wctx, mctx, wctx.ancestor(mctx),
> overwrite, labels)
> continue
> @@ -1882,7 +1885,7 @@ def update(repo, node, branchmerge, forc
> # Prompt and create actions. Most of this is in the resolve phase
> # already, but we can't handle .hgsubstate in filemerge or
> # subrepo.submerge yet so we have to keep prompting for it.
> - if '.hgsubstate' in actionbyfile:
> + if '.hgsubstate' in actionbyfile and subrepo.checkoutenabled(repo.ui):
> f = '.hgsubstate'
> m, args, msg = actionbyfile[f]
> prompts = filemerge.partextras(labels)
> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> --- a/mercurial/subrepo.py
> +++ b/mercurial/subrepo.py
> @@ -43,6 +43,10 @@ propertycache = util.propertycache
>
> nullstate = ('', '', 'empty')
>
> +def checkoutenabled(ui):
> + """Whether subrepos can be checked out in the working directory."""
> + return ui.configbool('subrepos', 'checkout')
> +
> def _expandedabspath(path):
> '''
> get a path or url and if it is a path expand it and return an absolute path
> @@ -85,6 +89,14 @@ def state(ctx, ui):
> to tuple: (source from .hgsub, revision from .hgsubstate, kind
> (key in types dict))
> """
> + # Blank out knowledge about subrepos when the feature is disabled.
> + # This effectively causes consumers of this data structure to think
> + # there are no subrepos. This causes problems with the automatic
> + # management of .hgsubstate (which is based on this parsed data)
> + # and the subrepo() revset.
> + if not checkoutenabled(ui):
> + return {}
> +
> p = config.config()
> repo = ctx.repo()
> def read(f, sections=None, remap=None):
> diff --git a/tests/test-subrepo-disable.t b/tests/test-subrepo-disable.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-subrepo-disable.t
> @@ -0,0 +1,206 @@
> + $ hg init hgsub
> + $ cd hgsub
> + $ echo sub0 > foo
> + $ hg -q commit -A -m 'subrepo initial'
> + $ hg log -T '{node}\n'
> + 45cc468b8f18bee314935a4651bad80f9cb3b540
> + $ cd ..
> +
> + $ hg init testrepo0
> + $ cd testrepo0
> + $ cat >> .hg/hgrc << EOF
> + > [subrepos]
> + > checkout = false
> + > EOF
> +
> + $ echo 0 > foo
> + $ hg -q commit -A -m initial
> +
> +Adding a .hgsub should result in no sign of the subrepo in the working directory
> +
> + $ cat > .hgsub << EOF
> + > hgsub = ../hgsub
> + > EOF
> +
> + $ hg add .hgsub
> + $ hg commit -m 'add .hgsub'
> +
> + $ hg files --subrepos
> + .hgsub
> + foo
> +
> +No .hgsubstate should have been created
> +
> + $ hg files
> + .hgsub
> + foo
> +
> +Adding a nested hg repo will be treated like a normal nested hg repo: it will be ignored
> +
> + $ hg init hgsub
> + $ hg st
> + $ rm -rf hgsub
> +
> + $ cd ..
> +
> +Cloning this repo won't clone subrepo because no .hgsubstate
> +
> + $ hg clone --pull testrepo0 no-substate-enabled
> + requesting all changes
> + adding changesets
> + adding manifests
> + adding file changes
> + added 2 changesets with 2 changes to 2 files
> + new changesets 68986213bd44:addf99df3e66
> + updating to branch default
> + 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Cloning with subrepos disabled should be identical to above
> +
> + $ hg --config subrepos.checkout=false clone --pull testrepo0 no-substate-disabled
> + requesting all changes
> + adding changesets
> + adding manifests
> + adding file changes
> + added 2 changesets with 2 changes to 2 files
> + new changesets 68986213bd44:addf99df3e66
> + updating to branch default
> + 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Add an .hgsubstate manually to simulate a real repo with subrepos
> +
> + $ cd testrepo0
> + $ cat > .hgsubstate << EOF
> + > 45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> + > EOF
> + $ hg add .hgsubstate
> +
> + $ hg commit -m 'manually add .hgsubstate'
> +
> +subrepo() revset with no args works if feature is disabled
> +
> + $ hg log -r 'subrepo()'
> + changeset: 2:7647585deebb
> + tag: tip
> + user: test
> + date: Thu Jan 01 00:00:00 1970 +0000
> + summary: manually add .hgsubstate
> +
> +
> +But it doesn't work with patterns
> +(This is an implementation detail because of the way disabling subrepos
> +blanks out substate and could change in the future.)
> +
> + $ hg --config subrepos.checkout=true log -r 'subrepo(hgsub)'
> + changeset: 2:7647585deebb
> + tag: tip
> + user: test
> + date: Thu Jan 01 00:00:00 1970 +0000
> + summary: manually add .hgsubstate
> +
> +
> + $ hg log -r 'subrepo(hgsub)'
> +
> + $ cd ..
> +
> +Cloning with feature enabled should clone subrepos
> +
> + $ hg clone --pull testrepo0 with-substate-enabled
> + requesting all changes
> + adding changesets
> + adding manifests
> + adding file changes
> + added 3 changesets with 3 changes to 3 files
> + new changesets 68986213bd44:7647585deebb
> + updating to branch default
> + cloning subrepo hgsub from $TESTTMP/hgsub
> + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Cloning with feature disabled should not clone subrepos
> +
> + $ hg --config subrepos.checkout=false clone --pull testrepo0 with-substate-disabled
> + requesting all changes
> + adding changesets
> + adding manifests
> + adding file changes
> + added 3 changesets with 3 changes to 3 files
> + new changesets 68986213bd44:7647585deebb
> + updating to branch default
> + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> +Test a client with subrepos enabled on a clone missing them.
> +Important: the .hgsubstate file (which is normally automatically managed)
> +should be present and unmodified.
> +
> + $ cd with-substate-disabled
> + $ hg status
> + $ cat .hgsubstate
> + 45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> + $ hg up
> + 0 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> + $ cd ..
> +
> +Test a client with subrepos disabled on a clone with subrepos
> +
> + $ cd with-substate-enabled
> + $ hg --config subrepos.checkout=false status
> + $ cat .hgsubstate
> + 45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> + $ hg --config subrepos.checkout=false up
> + 0 files updated, 0 files merged, 0 files removed, 0 files unresolved
> +
> + $ hg --config subrepos.checkout=false st
> + $ cat .hgsubstate
> + 45cc468b8f18bee314935a4651bad80f9cb3b540 hgsub
> +
> + $ hg --config subrepos.checkout=false files -S
> + .hgsub
> + .hgsubstate
> + foo
> +
> + $ cd ..
> +
> +#if git
> +
> + $ GIT_AUTHOR_NAME=author; export GIT_AUTHOR_NAME
> + $ GIT_AUTHOR_EMAIL=someone at example.com; export GIT_AUTHOR_EMAIL
> + $ GIT_AUTHOR_DATE='1234567891 +0000'; export GIT_AUTHOR_DATE
> + $ GIT_COMMITTER_NAME=author; export GIT_COMMITTER_DATE
> + $ GIT_COMMITTER_EMAIL=someone at example.com; export GIT_COMMITTER_EMAIL
> + $ GIT_COMMITTER_DATE='1234567891 +0000'; export GIT_COMMITTER_DATE
> +
> + $ git init -q gitsub
> + $ cd gitsub
> + $ echo 0 > foo
> + $ git add foo
> + $ git commit -q -m 'git subrepo initial'
> + $ cd ..
> +
> + $ hg init repo-with-git-subrepo
> + $ cd repo-with-git-subrepo
> + $ echo 0 > foo
> + $ hg -q commit -A -m initial
> +
> + $ cat > .hgsub << EOF
> + > gitsub = [git]../gitsub
> + > EOF
> + $ git clone -q ../gitsub gitsub
> + $ hg add .hgsub
> + $ hg commit -m 'add git subrepo'
> +
> +`hg files` with feature disabled only shows Mercurial files
> +
> + $ hg --config subrepos.checkout=false files -S
> + .hgsub
> + .hgsubstate
> + foo
> +
> +`hg status` only shows Mercurial files
> +FUTURE consider ignoring paths beloning to parsed subrepo paths
> +
> + $ hg --config subrepos.checkout=false status gitsub/foo gitsub/.git/config
> + ? gitsub/.git/config
> + ? gitsub/foo
> +
> +#endif
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
More information about the Mercurial-devel
mailing list