RFC: Managing Mercurial Repositories Remotely

Tue Feb 19 05:57:16 CST 2008

Hi all,

This is a somewhat lengthy proposal for an extension that supports
running a limited set of hg commands server-side. Comments welcome.

= Managing Mercurial Repositories Remotely =

The idea is to give committers an easy way to create and manage
server-side clones, for example to create branches for collaboration,
or just for web-based review. On servers where committers are granted
ssh access, this is a moot point. The target here is to provide such
management features over the http(s) protocol. Ideally, the command
set would be accessible both from the hg web ui (hg serve) and from
scripts.

hgfront is an attempt to provide these features by way of a full-blown
web ui. However, it also aims to provide issue tracking and a wiki.
This is far too heavy-weight for what I have in mind. And it does not
lend itself well to scripting, I think.

I propose a pair of matching hg extensions for the client to submit
remote commands, and for the server to run them. The server extension
could also expose the commands in the web ui, but that is not my
current focus. Currently, I assume that the repo server is serving a
dynamic collection so that new clones are made available
automatically. The client extension is not strictly necessary, of
course, as you could also script using wget. But it is much nicer.

A key point must be the security of the scheme. Beyond the usual
safeguards of requiring https and denying all rights by default, it
should not be possible to submit arbitrary code for execution, even if
properly authenticated. If this were not a requirement, then you might
as well grant ssh access.

== hg rdo ==

My first naive attempt was to let users submit arbitrary hg commands
to the server. This is hard to secure. Here's an example of cloning
main to temp and applying an mq queue to it (assuming we're in a repo
with a versioned patch queue and the server-side working dir is set to
the target repo's parent dir):

	hg rdo https://host.org/hg/main \
		"clone main temp" \
		"-R temp qinit -c"
	hg -R .hg/patches push https://host.org/hg/temp/.hg/patches
	hg rdo https://host.org/hg/temp \
		"-R temp/.hg/patches update" \
		"-R temp qpush -a"

I see multiple attack vectors here:

	* Use the --config option to directly enable arbitrary hooks and extensions.
	* Clone a repo into another repo's .hg folder, then update, thus
overwriting the other repo's configuration. This would allow you to
plant arbitrary hooks and extensions.
	* Clone and update a repo to a system path, thus compromising any
data writable by the repo server's account.
	* Use planted repos and `hg diff` to read any file readable by the
repo server's account.

So even just clone and update, two key commands, open up serious holes.

== hg rclone et al. ==

So let's try dedicated remote commands (assuming we're in a repo with
a versioned patch queue):

	hg rclone https://host.org/hg/main temp -r `hg id -i -r qparent`
	hg rqinit https://host.org/hg/temp
	hg -R .hg/patches push https://host.org/hg/temp/.hg/patches
	hg rupdate https://host.org/hg/temp/.hg/patches
	hg rqpush https://host.org/hg/temp -a

and later on

	hg rdrop https://host.org/hg/temp

`hg rclone SRC TGT [-r REV]` clones a remote repo SRC to a sibling
repo TGT in the same parent directory as SRC. TGT is checked to not
contain a path separator. This ensures clones always end up in a
defined, non-problematic location. rclone does not update the new
clone. rclone is only allowed if SRC contains a .hg/hgrc-rclone file.
If so, this file is symlinked to TGT/.hg/hgrc and TGT/.hg/hgrc-rclone.
(If the source files were already symlinks, they are first traced back
to their origin and the new symlinks point there. This ensures all
symlinks point to the central location where you manage the config
files.) The same is done for .hg/hgrc-rqinit.
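
To make the symlink tracing concrete, here is a minimal sketch in
Python. The function name link_config and its parameters are
hypothetical and not part of any existing extension. It assumes a
POSIX file system and that TGT/.hg already exists.

```python
import os


def link_config(src_repo, tgt_repo, name="hgrc-rclone",
                targets=("hgrc", "hgrc-rclone")):
    """Symlink SRC/.hg/<name> into TGT/.hg under each name in targets.

    Returns the resolved origin path, or None if SRC has no such file
    (in which case the caller would refuse the rclone).
    """
    src = os.path.join(src_repo, ".hg", name)
    if not os.path.exists(src):
        return None
    # realpath follows any chain of symlinks back to the original file,
    # so clones of clones still point at the central config location.
    origin = os.path.realpath(src)
    for target in targets:
        os.symlink(origin, os.path.join(tgt_repo, ".hg", target))
    return origin
```

For hgrc-rqinit the same helper could be called with
name="hgrc-rqinit" and targets=("hgrc-rqinit",).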

`hg rqinit TGT` creates a versioned patch queue in the target
repository. rqinit is only allowed if TGT/.hg/hgrc-rclone exists. If
so, it is symlinked (as above) to TGT/.hg/patches/.hg/hgrc, unless
TGT/.hg/hgrc-rqinit exists. If the latter exists, it is used instead.
You could automate the reapplication of the patch queue by adding
hooks to this file.

`hg rupdate TGT [-r REV]` updates the target repo to the desired version.

`hg rqgo TGT patch` moves the remote repo's patch queue head to the
designated patch.

`hg rqselect TGT [SELECTOR]...` selects patch guards in the remote repo.

`hg rdrop TGT` deletes the target repo in its entirety. You should
enable web.allow_rdrop only in your main repo's .hg/hgrc-rclone, not
in its .hg/hgrc, so that clones can be dropped but the main repo
cannot.

I propose implementing this by way of a protocol extension called
`rdo`, which uses a separate server-side command map to configure
available server-side commands. These will _never_ be the same as the
normal hg commands as they need to do more stringent input validation.
For the sake of consistency we should probably implement this for ssh
repos too.
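
A rough sketch of what such a separate command map could look like, in
Python. Everything here is an assumption for illustration: the names
REMOTE_COMMANDS, dispatch and RemoteCommandError are made up, the
handlers are stubs that only return a description of what they would
do, and the `allowed` set stands in for per-repo settings along the
lines of web.allow_rdrop.

```python
class RemoteCommandError(Exception):
    """Raised when a remote command is unknown, disabled, or malformed."""


def _rupdate(repo, args):
    # Stub: a real handler would validate REV and call into Mercurial.
    return ("update", repo, tuple(args))


def _rdrop(repo, args):
    if args:
        raise RemoteCommandError("rdrop takes no arguments")
    # Stub: a real handler would delete the repo here.
    return ("drop", repo, ())


# Deliberately separate from the normal hg command table: only names
# listed here can ever be run on behalf of a remote user.
REMOTE_COMMANDS = {"rupdate": _rupdate, "rdrop": _rdrop}


def dispatch(repo, name, args, allowed):
    """Run remote command `name` if it is both known and enabled."""
    handler = REMOTE_COMMANDS.get(name)
    if handler is None:
        raise RemoteCommandError("unknown remote command: %s" % name)
    if name not in allowed:
        raise RemoteCommandError("remote command not enabled: %s" % name)
    return handler(repo, list(args))
```

The point of the lookup is that a request for, say, "clone" or
"diff" fails before any argument is even looked at.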

Once this support is in place, we might add other commands:

`hg rlog TGT [OPTS]...` (as was, I think, desired by TortoiseHg). OPTS
would be limited to the options specific to `hg log`, excluding
global options like --config.
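
The option filtering for rlog could be a plain whitelist. The sketch
below is only an illustration: the function name filter_log_opts is
hypothetical, the selection of `hg log` options is a sample rather
than a proposal, and for simplicity it accepts only exact long option
names and no file arguments.

```python
# Sample of `hg log` options; True means the option takes a value.
LOG_OPTS = {
    "--rev": True,
    "--limit": True,
    "--keyword": True,
    "--user": True,
    "--date": True,
    "--patch": False,
    "--no-merges": False,
}


def filter_log_opts(args):
    """Return args unchanged if every entry is whitelisted, else raise."""
    out = []
    it = iter(args)
    for arg in it:
        if arg not in LOG_OPTS:
            # Catches --config, -R, --cwd and any other global option.
            raise ValueError("option not allowed: %r" % arg)
        out.append(arg)
        if LOG_OPTS[arg]:
            try:
                value = next(it)
            except StopIteration:
                raise ValueError("missing value for %s" % arg)
            # Conservative: never let a value look like another option.
            if value.startswith("-"):
                raise ValueError("suspicious value for %s: %r" % (arg, value))
            out.append(value)
    return out
```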

This approach has a number of benefits:

	* Fairly simple setup. Just configure the necessary extensions on
client and relevant server repos.
	* Separate configuration for main repo and clones. Separate config
for patch queues possible but not required.
	* Separate or central configuration for different main repos possible.
	* Central location of server-side config possible (by using symlinks
and tracing origin).
	* No need to interact with hgwebdir_mod.py. Can reuse existing
authentication methods in hgweb_mod.py.

Problems:

	* No symlinks on Windows.
	* Security hinges crucially on recognizing path components in the
target repo name in rclone. In particular, this will have to handle
UTF encodings and .. and . properly. Also, on Windows, it might be
necessary to forbid both / and \ in addition to :.
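
As a starting point for the target name check, here is a minimal
sketch in Python. The name is_safe_target is hypothetical, and some of
the rules are my own additions rather than settled requirements:
rejecting names that start with a dot (to keep .hg out), rejecting
control characters, and normalizing to NFKC so that compatibility
forms of the separators are caught. It is surely not a complete list
of what needs checking.

```python
import unicodedata

# Forbid both separators and the drive colon on every platform.
FORBIDDEN = set("/\\:")


def is_safe_target(raw):
    """Return True if raw is acceptable as a sibling repo name."""
    try:
        # Strict decoding rejects malformed and overlong UTF-8 sequences.
        name = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    except UnicodeDecodeError:
        return False
    if not name:
        return False
    # Fold compatibility characters (e.g. fullwidth solidus) to ASCII.
    norm = unicodedata.normalize("NFKC", name)
    if norm.startswith("."):
        # Covers ".", ".." and hidden names such as ".hg".
        return False
    if any(ch in FORBIDDEN or ord(ch) < 32 for ch in norm):
        return False
    return True
```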

Again, comments are welcome. I shall meanwhile try to hack together a
prototype based on my rdo prototype.
-peo