Authorizing Users

This document focuses on controlling access to repositories shared over an intranet / the Internet. This is a subpage of PublishingRepositories. It does not describe what the various technologies are, simply how to implement them in a Mercurial context.

Contents

1. Access Control for remote users.
2. Access Control: Checkins within a push
   2.1. Correlating checkins to pushes

1. Access Control for remote users.

Mercurial does not natively support restricting read access, though with extensions it does support restricting write access. So your access control will be split into a few different places.

Authentication and the first level of access control are the job of either the web server (its authentication setup and .htpasswd file) or shared ssh and its key files. This means that the configuration differs depending on whether you use http or ssh.
- If using hgweb, you can offload some of the access control (though not the authentication) to the [web] section of each repo's hgrc file via the allow_read and deny_read settings.
- If using shared ssh, depending on the wrapper tools that you use, there may be additional ways to restrict access beyond the presence/absence of key files.
Anybody who is granted access by your web server / ssh infrastructure has at least read-only access to your repository.
If there is no distinction between read-only and read-write users and also no distinction between access permissions for different repos or branches, then you're done.
If different repositories have different sets of read-access permissions, you will need to make further web server / ssh infrastructure changes.
- hgweb: allow_read, deny_read settings per repo.
- hgweb alternative: set up different URL paths for repos with different permissions so that each path can have its own .htaccess rules.
- different shared ssh schemes: different settings.
If different branches within a repository have different write-access permissions, then you need the AclExtension. This must be configured separately per-repo.
If different repos on your server have different write permissions, then there are a few ways to implement this:
- The AclExtension
- hgweb: the allow_push and deny_push settings per repo
- different shared ssh schemes: different settings.

This sounds somewhat complicated, and it is. But once you find a set of configurations that works for you, you just need to write a single global access configuration file. Presumably you already have some sort of global users/groups file and a user/group permissions file, so you only need to write a script that takes the users and rules files and turns them into the many different hgweb config files, ssh wrapper config files, AclExtension config files, etc. (Make sure to handle any file updates/copies in an atomic manner, of course.) Every time one of your two global files changes, you just trigger your script to run and update everything.
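As a rough illustration of such a generator script, the sketch below renders a [web] section per repo from one in-memory rules table and writes each file atomically. The RULES table, the repo and user names, the output file naming, and the helper functions are all hypothetical; only the allow_read and allow_push setting names come from hgweb. It assumes Python 3.3+ for os.replace, and it writes standalone fragments rather than merging into an existing hgrc, which a real script would have to handle.

```python
import os
import tempfile

# Hypothetical global rules: repo name -> users allowed to read / push.
RULES = {
    "projectA": {"read": ["alice", "bob"], "push": ["alice"]},
    "projectB": {"read": ["carol"], "push": ["carol"]},
}

def render_web_section(rule):
    """Render the [web] section text for one repo's rule."""
    return (
        "[web]\n"
        "allow_read = %s\n"
        "allow_push = %s\n"
    ) % (", ".join(rule["read"]), ", ".join(rule["push"]))

def write_atomically(path, text):
    """Write text to a temp file in the target directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def generate(rules, out_dir):
    """Write one <repo>.hgrc fragment per repo into out_dir."""
    for repo_name, rule in sorted(rules.items()):
        target = os.path.join(out_dir, repo_name + ".hgrc")
        write_atomically(target, render_web_section(rule))
```

Calling generate(RULES, some_directory) would leave one fragment per repo in that directory; the same pattern could be extended to emit ssh wrapper or AclExtension files from the same rules.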

If you write your own hooks, you'll need to figure out the userid of the pusher. This is unfortunately not a first class Mercurial variable. (IMHO, it should be.) We can glean it from this code from the AclExtension.

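    # Excerpt from hgext/acl.py; it needs: import getpass, urllib
    pusher = None
    if source == 'serve' and 'url' in kwargs:
        # For http(s) pushes, the fourth colon-separated field is the user
        url = kwargs['url'].split(':')
        if url[0] == 'remote' and url[1].startswith('http'):
            pusher = urllib.unquote(url[3])

    # Otherwise (e.g. ssh) fall back to the OS account running hg
    if pusher is None:
        pusher = getpass.getuser()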
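For reuse in your own hooks, the same logic can be wrapped in a small function. This is only a sketch: the name find_pusher and the example URL values are hypothetical, and the remote:<proto>:<host>:<user> layout is inferred from the snippet above rather than from a documented guarantee. The import fallback lets it run on both Python 2 and Python 3.

```python
import getpass

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2

def find_pusher(source, url=None):
    """Return the HTTP-authenticated user if present, else the local OS user."""
    if source == "serve" and url is not None:
        parts = url.split(":")
        # Assumed layout: remote:<proto>:<host>:<user>, with fields URL-quoted
        if len(parts) > 3 and parts[0] == "remote" and parts[1].startswith("http"):
            return unquote(parts[3])
    # ssh pushes and local commits: fall back to the account running hg
    return getpass.getuser()
```

Inside a hook you would call it as find_pusher(source, kwargs.get('url')).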

2. Access Control: Checkins within a push

Essentially, there is none.

A basic feature of Mercurial is that there is no real authentication of committers, only of pushers (and even then only via http/ssh). So you have to decide what your rules are going to be. May a person push somebody else’s commits? If so, must the authors of those commits all be authorized to push to your repository, or is it enough that the pusher “vouches for them” by pushing their changesets?

Here’s a hook that you can modify to do what you want with this policy.

import re, subprocess, os
import getpass, urllib
from mercurial.i18n import _
from mercurial import util, templatefilters

AUTHUSERFILE = '/path/to/apache.htaccess.for.your.repo'

def hook(ui, repo, hooktype, node=None, source=None, **kwargs):
    if hooktype not in ['pretxnchangegroup', 'pretxncommit']:
        raise util.Abort(_('config error - dts.py hook cannot be a hook type "%s".') % hooktype )

    #From hgext/acl.py
    pusher = None
    if source == 'serve' and 'url' in kwargs:
        url = kwargs['url'].split(':')
        if url[0] == 'remote' and url[1].startswith('http'):
            pusher = urllib.unquote(url[3])

    if pusher is None:
        pusher = getpass.getuser()
    #end of hgext/acl.py code

    pusher_seen = False
    # repo[node] is a changectx; xrange needs its integer revision number
    for rev in xrange(repo[node].rev(), len(repo)):
        ctxuser =  templatefilters.person(repo[rev].user())
        #ui.write("Pusher %s, committer %s for id %d\n" % (pusher, ctxuser, rev))
        if (ctxuser == pusher):
            pusher_seen = True
        else:
        #   ui.write("ERROR: You may not push changes not made by yourself.\n");
        #   return 1
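            # Look the committer up without a shell, so a crafted username cannot inject commands
            with open(AUTHUSERFILE) as f:
                authorized = any(line.startswith(ctxuser + ':') for line in f)
            if not authorized: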
                ui.write("ERROR: Changeset %d committed by %s who is not authorized to commit.\n"
                         % (rev, ctxuser))
                return 1
    if ( not pusher_seen):
        ui.write("ERROR: Push done by %s, but does not contain any commits by %s. Blocked.\n"
            % (pusher, pusher))
        return 1
    return 0
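The policy choices described above can also be kept separate from the Mercurial plumbing, which makes them easier to test. The sketch below is not the hook above rewritten; the function name check_push and the three policy names are hypothetical, and it expects you to supply the pusher, the list of changeset authors, and the set of authorized users yourself.

```python
POLICIES = ("own", "authorized", "vouch")

def check_push(pusher, authors, authorized, policy="vouch"):
    """Return None if the push is acceptable, else an error message.

    policy is one of (hypothetical names):
      "own"        - every changeset must be authored by the pusher
      "authorized" - every author must appear in the authorized set
      "vouch"      - any author is accepted; the pusher vouches for them
    """
    if policy not in POLICIES:
        raise ValueError("unknown policy: %s" % policy)
    if pusher not in authorized:
        return "%s is not authorized to push" % pusher
    for author in authors:
        if policy == "own" and author != pusher:
            return "changeset by %s pushed by %s" % (author, pusher)
        if policy == "authorized" and author not in authorized:
            return "changeset by %s, who is not authorized" % author
    return None
```

A hook would collect the authors from the incoming changesets, call check_push, and abort with the returned message if it is not None.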

2.1. Correlating checkins to pushes

The problems with the above are twofold.

- You cannot tell who "vouched for" whom because there is no out-of-the-box support for recording this information. Search for “mercurial pushlog” for options. The Mozilla project has a nice sqlite-based pushlog implementation. See hgcustom: you’ll need pieces from all of its repos: pushlog, hghooks and hg_templates. Also note that it only supports SSH access, so it does not extract the http user properly. You’ll need to use the chunk of code from hgext/acl.py that’s already quoted above.
- Anybody can edit their ~/.hgrc and say that they have any userid they want. So even with pushlogs, all you get is that "jdoe" pushed and vouched for a commit that claims it was coded by "xwong".
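To make the idea of a pushlog concrete, here is a minimal sqlite sketch of recording who pushed which changesets. It is not Mozilla's implementation; the table layout and the function names open_pushlog, record_push and who_pushed are hypothetical, and wiring it into a hook (and deciding when in the transaction to write) is left out.

```python
import sqlite3
import time

def open_pushlog(path):
    """Open (or create) the pushlog database at path."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pushlog ("
        " push_id INTEGER, pusher TEXT, pushed_at REAL,"
        " node TEXT, author TEXT)"
    )
    return conn

def record_push(conn, pusher, changesets, now=None):
    """Record one push; changesets is a list of (node_hex, author) pairs."""
    now = time.time() if now is None else now
    # Next push id is one more than the highest recorded so far.
    row = conn.execute(
        "SELECT COALESCE(MAX(push_id), 0) + 1 FROM pushlog").fetchone()
    push_id = row[0]
    conn.executemany(
        "INSERT INTO pushlog VALUES (?, ?, ?, ?, ?)",
        [(push_id, pusher, now, node, author) for node, author in changesets],
    )
    conn.commit()
    return push_id

def who_pushed(conn, node):
    """Return the pusher recorded for a changeset, or None if unknown."""
    row = conn.execute(
        "SELECT pusher FROM pushlog WHERE node = ?", (node,)).fetchone()
    return row[0] if row else None
```

With such a log, the claimed author stays in the changeset and the authenticated pusher is stored alongside it, which is as much traceability as the second point above allows.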

(This may seem ridiculously picky. But imagine you work in an FDA- or FAA-regulated industry where traceability of everything is absolutely paramount. Suppose you read in the newspaper that the company that wrote the software that made your grandfather's pacemaker malfunction, or the plane's landing gear fail to retract, or the 911 call fail to be routed could not really say who introduced that faulty line of code, because that was just the way their VCS worked. Would you accept that as OK?)
