[PATCH RESEND] hgweb: add group authorization

Mads Kiilerich mads at kiilerich.com
Thu Apr 11 18:20:45 CDT 2013


Hi Markus

That is nice work - thanks for the contribution! I am sure that it 
really solves a real problem.

But the lack of feedback might indicate that it not is something a lot 
of users have needed and asked for. The most important question might 
thus be if it is something we really need in the core. Lots of good 
features live as extensions ... and only few of them are shipped with 
Mercurial. And especially for for a server side feature like this it 
might be less of an issue if it had to fetched from some other location.

The problem can also be solved in the web server or it can be solved 
with something like RhodeCode.

If you aim high then someone (Matt) has to make the call whether we want 
it in core. You can also let it live elsewhere as an extension and let 
the user demand prove that it is something we need in core.

That uncertainty do that a detailed review isn't what is needed the 
most, but anyway, here is something. Some of it is my questionable 
opinion, some of it follows the Mercurial tradition where compliance 
with guidelines is more important than being flexible and accepting 
contributions.

On 02/07/2013 06:36 PM, Markus Zapke-Gründemann wrote:
> # HG changeset patch
> # User Markus Zapke-Gründemann <markus at keimlink.de>
> # Date 1360231888 -3600
> # Node ID d2dbfdee987a51efb6f4ad69e3b116aa22553326
> # Parent  2fefd1170bf269e26bb304553009f38e0117c342
> hgweb: add group authorization.

http://mercurial.selenic.com/wiki/ContributingChanges says no trailing '.'.

The patch is non-trivial, so the description should tell a bit about 
"what" and "why". For example something like 
http://markmail.org/message/modjr4olngtibnp7 ... or perhaps a bit shorter.

> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1286,8 +1286,12 @@ The full set of options is:
>       push is not allowed. If the special value ``*``, any remote user can
>       push, including unauthenticated users. Otherwise, the remote user
>       must have been authenticated, and the authenticated user name must
> -    be present in this list. The contents of the allow_push list are
> -    examined after the deny_push list.
> +    be present in this list. It is also possible to use groups in this
> +    list. A group name is prefixed by an ``@``. Groups can either be
> +    groups defined in the ``groups_section`` or Unix groups. If a group
> +    from the ``groups_section`` has the same name as an Unix group it
> +    is used instead. The contents of the allow_push list are examined
> +    after the deny_push list.
>   
>   ``allow_read``
>       If the user has not already been denied repository access due to
> @@ -1297,8 +1301,12 @@ The full set of options is:
>       denied for the user. If the list is empty or not set, then access
>       is permitted to all users by default. Setting allow_read to the
>       special value ``*`` is equivalent to it not being set (i.e. access
> -    is permitted to all users). The contents of the allow_read list are
> -    examined after the deny_read list.
> +    is permitted to all users). It is also possible to use groups in
> +    this list. A group name is prefixed by an ``@``. Groups can either
> +    be groups defined in the ``groups_section`` or Unix groups. If a
> +    group from the ``groups_section`` has the same name as an Unix group
> +    it is used instead. The contents of the allow_read list are examined
> +    after the deny_read list.
>   
>   ``allowzip``
>       (DEPRECATED) Whether to allow .zip downloading of repository
> @@ -1366,8 +1374,13 @@ The full set of options is:
>       Whether to deny pushing to the repository. If empty or not set,
>       push is not denied. If the special value ``*``, all remote users are
>       denied push. Otherwise, unauthenticated users are all denied, and
> -    any authenticated user name present in this list is also denied. The
> -    contents of the deny_push list are examined before the allow_push list.
> +    any authenticated user name present in this list is also denied. It
> +    is also possible to use groups in this list. A group name is
> +    prefixed by an ``@``. Groups can either be groups defined in the
> +    ``groups_section`` or Unix groups. If a group from the
> +    ``groups_section`` has the same name as an Unix group it is used
> +    instead. The contents of the deny_push list are examined before the
> +    allow_push list.
>   
>   ``deny_read``
>       Whether to deny reading/viewing of the repository. If this list is
> @@ -1380,9 +1393,12 @@ The full set of options is:
>       deny_read and allow_read are empty or not set, then access is
>       permitted to all users by default. If the repository is being
>       served via hgwebdir, denied users will not be able to see it in
> -    the list of repositories. The contents of the deny_read list have
> -    priority over (are examined before) the contents of the allow_read
> -    list.
> +    the list of repositories. It is also possible to use groups in this
> +    list. A group name is prefixed by an ``@``. Groups can either be
> +    groups defined in the ``groups_section`` or Unix groups. If a group
> +    from the ``groups_section`` has the same name as an Unix group it is
> +    used instead. The contents of the deny_read list have priority over
> +    (are examined before) the contents of the allow_read list.
>   
>   ``descend``
>       hgwebdir indexes will not descend into subdirectories. Only repositories
> @@ -1400,6 +1416,30 @@ The full set of options is:
>   ``errorlog``
>       Where to output the error log. Default is stderr.
>   
> +``groups_section``
> +    Name of hgrc section used to define groups for authorization.
> +    Default is ``web.groups``. Use the section to define the groups used
> +    by authorization.

Why is that needed? If it is needed then it seems like a separate 
feature that perhaps could be a separate patch. It also seems like this 
feature is untested.

I would just use [webgroups].

> +
> +    Example::
> +
> +        [web]
> +        allow_read = @devs
> +
> +        [web.groups]
> +        devs = alice, bob, clara, david
> +
> +    Groups can contain other groups::
> +
> +        [web]
> +        allow_read = @devs, @testers
> +        allow_push = @devs
> +
> +        [web.groups]
> +        devs = alice, bob, clara, david
> +        ci = hudson
> +        testers = @ci, lisa, mario
> +
>   ``guessmime``
>       Control MIME types for raw download of file content.
>       Set to True to let hgweb guess the content type from the file
> diff --git a/mercurial/hgweb/common.py b/mercurial/hgweb/common.py
> --- a/mercurial/hgweb/common.py
> +++ b/mercurial/hgweb/common.py
> @@ -8,6 +8,8 @@
>   
>   import errno, mimetypes, os
>   
> +from mercurial import util
> +
>   HTTP_OK = 200
>   HTTP_NOT_MODIFIED = 304
>   HTTP_BAD_REQUEST = 400
> @@ -18,6 +20,53 @@ HTTP_METHOD_NOT_ALLOWED = 405
>   HTTP_SERVER_ERROR = 500
>   
>   
> +def _get_users(ui, group, seen=None):

"For consistency and ease of reference, Mercurial uses a single style 
for all identifiers: all lowercase, with no underbars between words."

'group' is apparently a 'groupname' (in other places it is 
'groupmembers') - I would call it that to make the code more readable 
... but also more verbose.

> +    """Return the users of the group as list."""
> +    # update list of groups seen so far for detecting recursions
> +    if not seen:
> +        seen = []
> +    seen.append(group)
> +    # check which section to use to lookup groups
> +    section = ui.config('web', 'groups_section', 'web.groups')
> +    # first, try to use group definition from groups_section
> +    users = []
> +    hgrcusers = ui.configlist(section, group)

It seems like 'hgrcusers' contains both users and groups. 'groupmembers' 
or 'configmembers' or just 'members' seems more precise.

> +    if hgrcusers:
> +        for item in hgrcusers:
> +            if not item.startswith('@'):
> +                users.append(item)
> +                continue
> +            if item[1:] in seen:

'seen' should be a set where membership lookup always is cheap. It would 
probably usually be short a list, but it is better to do it right.

> +                raise ErrorResponse(HTTP_UNAUTHORIZED,
> +                    'recursion detected for group "%s" in group "%s"' %
> +                    (item[1:], group))

This would be a configuration error. I don't think the message should be 
returned to the user. An abort would be better, or it could just ignore 
the loop ... perhaps after logging an error.

> +            users += _get_users(ui, item[1:], seen)
> +    if not users:
> +        # if no users found in group definition, get users from OS-level group

I would expect that if a group has been defined then it should be used 
... even if it is empty. Silently falling back to OS groups would be 
confusing.

> +        try:
> +            users = util.groupmembers(group)
> +        except KeyError:

Relying on a util function to raise a standard exception seems a bit 
like a leaky abstraction. I would prefer a None if the group isn't found.

> +            raise ErrorResponse(HTTP_UNAUTHORIZED,
> +                    'group "%s" is undefined' % group)

Also a configuration error that shouldn't be returned.

> +    return users
> +
> +
> +def _is_member(ui, user, group):

"For consistency and ease of reference, Mercurial uses a single style 
for all identifiers: all lowercase, with no underbars between words."

> +    """Check recursively if a user is member of a group.

(There is not much recursion in this function. It just checks if the 
user is a member of the group.)

> +
> +    If the group equals * all users are members.

"If group only contains '*' ..."

> +    """
> +    if group == ['*'] or user in group:
> +        return True
> +    for item in group:

'item' is very generic. 'member' seems more descriptive.

> +        if not item.startswith('@'):
> +            continue
> +        users = _get_users(ui, item[1:])
> +        if user in users:
> +            return True
> +    return False
> +
> +
>   def checkauthz(hgweb, req, op):
>       '''Check permission for operation based on request data (including
>       authentication info). Return if op allowed, else raise an ErrorResponse
> @@ -25,18 +74,19 @@ def checkauthz(hgweb, req, op):
>   
>       user = req.env.get('REMOTE_USER')
>   
> +    # check read permission
>       deny_read = hgweb.configlist('web', 'deny_read')
> -    if deny_read and (not user or deny_read == ['*'] or user in deny_read):
> +    if deny_read and (not user or _is_member(hgweb.repo.ui, user, deny_read)):

There is a lot of lines of code here that doesn't change anything. It 
could be a separate refactoring patch (or two) that adds the is_member 
function (without group handling) and the comments.

>           raise ErrorResponse(HTTP_UNAUTHORIZED, 'read not authorized')
>   
>       allow_read = hgweb.configlist('web', 'allow_read')
> -    result = (not allow_read) or (allow_read == ['*'])
> -    if not (result or user in allow_read):
> +    if not (not allow_read or _is_member(hgweb.repo.ui, user, allow_read)):
>           raise ErrorResponse(HTTP_UNAUTHORIZED, 'read not authorized')
>   
> +    # check pull permission
>       if op == 'pull' and not hgweb.allowpull:
>           raise ErrorResponse(HTTP_UNAUTHORIZED, 'pull not authorized')
> -    elif op == 'pull' or op is None: # op is None for interface requests
> +    elif op == 'pull' or op is None:  # op is None for interface requests
>           return
>   
>       # enforce that you can only push using POST requests
> @@ -50,12 +100,13 @@ def checkauthz(hgweb, req, op):
>       if hgweb.configbool('web', 'push_ssl', True) and scheme != 'https':
>           raise ErrorResponse(HTTP_FORBIDDEN, 'ssl required')
>   
> +    # check push permission
>       deny = hgweb.configlist('web', 'deny_push')
> -    if deny and (not user or deny == ['*'] or user in deny):
> +    if deny and (not user or _is_member(hgweb.repo.ui, user, deny)):
>           raise ErrorResponse(HTTP_UNAUTHORIZED, 'push not authorized')
>   
>       allow = hgweb.configlist('web', 'allow_push')
> -    result = allow and (allow == ['*'] or user in allow)
> +    result = allow and _is_member(hgweb.repo.ui, user, allow)
>       if not result:
>           raise ErrorResponse(HTTP_UNAUTHORIZED, 'push not authorized')
>   
> diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py
> --- a/mercurial/hgweb/hgwebdir_mod.py
> +++ b/mercurial/hgweb/hgwebdir_mod.py
> @@ -10,8 +10,8 @@ import os, re, time
>   from mercurial.i18n import _
>   from mercurial import ui, hg, scmutil, util, templater
>   from mercurial import error, encoding
> -from common import ErrorResponse, get_mtime, staticfile, paritygen, \
> -                   get_contact, HTTP_OK, HTTP_NOT_FOUND, HTTP_SERVER_ERROR
> +from common import _is_member, ErrorResponse, get_mtime, staticfile, \

_is_member should not have a leading '_' when it is used 'externally'.

> +    paritygen, get_contact, HTTP_OK, HTTP_NOT_FOUND, HTTP_SERVER_ERROR
>   from hgweb_mod import hgweb, makebreadcrumb
>   from request import wsgirequest
>   import webutil
> @@ -164,12 +164,12 @@ class hgwebdir(object):
>           user = req.env.get('REMOTE_USER')
>   
>           deny_read = ui.configlist('web', 'deny_read', untrusted=True)
> -        if deny_read and (not user or deny_read == ['*'] or user in deny_read):
> +        if deny_read and (not user or _is_member(ui, user, deny_read)):
>               return False
>   
>           allow_read = ui.configlist('web', 'allow_read', untrusted=True)
>           # by default, allow reading if no allow_read option has been set
> -        if (not allow_read) or (allow_read == ['*']) or (user in allow_read):
> +        if (not allow_read) or _is_member(ui, user, allow_read):
>               return True
>   
>           return False
> diff --git a/tests/test-hgweb-authz.t b/tests/test-hgweb-authz.t
> new file mode 100644
> --- /dev/null
> +++ b/tests/test-hgweb-authz.t
> @@ -0,0 +1,111 @@
> +This test exercises the authorization functionality with a dummy script
> +
> +  $ cat <<EOF > dummywsgi
> +  > import os
> +  > import sys
> +  >
> +  > from mercurial.hgweb import hgweb
> +  >
> +  > app = hgweb(os.path.join(os.environ['TESTTMP'], 'hgweb.config'))
> +  > environ = {
> +  >     'SCRIPT_NAME': '',
> +  >     'REQUEST_METHOD': 'GET',
> +  >     'PATH_INFO': sys.argv[1],
> +  >     'SERVER_PROTOCOL': 'HTTP/1.0',
> +  >     'QUERY_STRING': '',
> +  >     'CONTENT_LENGTH': '0',
> +  >     'SERVER_NAME': 'localhost',
> +  >     'SERVER_PORT': '80',
> +  >     'REPO_NAME': sys.argv[1],
> +  >     'HTTP_HOST': 'localhost:80',
> +  >     'REMOTE_USER': sys.argv[2],
> +  >     'wsgi.input': sys.stdin,
> +  >     'wsgi.url_scheme': 'http',
> +  >     'wsgi.multithread': False,
> +  >     'wsgi.version': (1, 0),
> +  >     'wsgi.run_once': False,
> +  >     'wsgi.errors': sys.stderr,
> +  >     'wsgi.multiprocess': False,
> +  > }
> +  >
> +  > def start_response(status, headers, exc_info=None):
> +  >     def dummy_response(data):
> +  >         pass
> +  >     sys.stdout.write(status + '\n')
> +  >     return dummy_response
> +  >
> +  > app(environ, start_response)
> +  > EOF
> +
> +creating test repository
> +
> +  $ hg init r1
> +  $ cd r1
> +  $ echo c1 > f1
> +  $ echo c2 > f2
> +  $ hg ci -A -m "init" f1 f2
> +
> +writing hgweb.config
> +
> +  $ cd ..
> +  $ cat <<EOF > hgweb.config
> +  > [paths]
> +  > r1 = `pwd`/r1
> +  > EOF
> +
> +group authorization test
> +
> +  $ cat <<EOF > r1/.hg/hgrc
> +  > [web]
> +  > allow_read = @developers, cathrin
> +  >
> +  > [web.groups]
> +  > developers = alice, bob
> +  > EOF
> +
> +  $ python ./dummywsgi r1 alice
> +  200 Script output follows
> +  $ python ./dummywsgi r1 bob
> +  200 Script output follows
> +  $ python ./dummywsgi r1 cathrin
> +  200 Script output follows
> +  $ python ./dummywsgi r1 nosuchuser
> +  401 read not authorized
> +
> +groups can contain other groups
> +
> +  $ cat <<EOF > r1/.hg/hgrc
> +  > [web]
> +  > allow_read = @developers, @testers
> +  >
> +  > [web.groups]
> +  > developers = alice, bob
> +  > ci = hudson
> +  > testers = @ci, lisa, mario
> +  > EOF
> +
> +  $ python ./dummywsgi r1 hudson
> +  200 Script output follows
> +
> +using an unknown groups fails
> +
> +  $ cat <<EOF > r1/.hg/hgrc
> +  > [web]
> +  > allow_read = @quux
> +  > EOF
> +
> +  $ python ./dummywsgi r1 alice
> +  401 group "quux" is undefined
> +
> +using a recursive groups setup is not allowed
> +
> +  $ cat <<EOF > r1/.hg/hgrc
> +  > [web]
> +  > allow_read = @developers
> +  >
> +  > [web.groups]
> +  > developers = alice, bob, @developers
> +  > EOF
> +
> +  $ python ./dummywsgi r1 alice
> +  401 recursion detected for group "developers" in group "developers"
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel



More information about the Mercurial-devel mailing list