Serving Mercurial repositories with Apache and mod_wsgi

1. Introduction

mod_wsgi is an easy-to-use Apache module that can host any Python application supporting the Python WSGI interface (such as a slightly modified hgwebdir.cgi). It can run inside the Apache server processes (just like mod_python) or in "daemon mode" (similar to FastCGI). Daemon mode is strongly recommended because it gives better security and isolation from the rest of the server.
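To make "supports the Python WSGI interface" concrete, here is a minimal sketch of a WSGI application, unrelated to Mercurial itself. The helper call_app is a hypothetical name used only to exercise the application without a web server; mod_wsgi would call the application object directly.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Smallest useful WSGI application: always answers 200 with plain text."""
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings.
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app once with a fake request (hypothetical test helper)."""
    environ = {"PATH_INFO": path, "SCRIPT_NAME": ""}
    setup_testing_defaults(environ)  # fills in the keys a real server would set
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

hgwebdir provides an object with this same calling convention, which is why the script shown later in this document only has to bind it to the name application.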

Please don't follow this document blindly. At a minimum, read the official mod_wsgi documentation (it's pretty good) and PublishingRepositories. You are expected to know how to configure Apache properly :-D

1.1. Advantages

1.2. Disadvantages

2. Pre-requisites

You'll need the following software (tested versions are in parentheses; other versions should work too):

Starting with version 1.6 of Mercurial, the hgwebdir.wsgi script has been unified with the hgweb.wsgi script. Wherever hgwebdir.wsgi is referred to in these directions, you can substitute the hgweb.wsgi script instead.

3. Configuration

3.1. mod_wsgi

If you can't find a mod_wsgi package for your operating system, you'll have to compile it yourself.

For example:

$ tar xvf mod_wsgi-1.1.tar.gz
$ cd mod_wsgi-1.1
$ ./configure
$ make
$ su -c "make install"

Edit your httpd.conf file to load the wsgi module:

LoadModule wsgi_module libexec/httpd/mod_wsgi.so

3.2. Apache

/!\ The maximum-requests mod_wsgi configuration option can cause long-lived requests like pushes and pulls to be unceremoniously aborted, so it should be left at its default unlimited value. See 2595.

In this sample setup, we serve Mercurial repositories from a separate virtual host (hg.example.net). The repositories live in the htdocs directory and are served by a modified hgwebdir.cgi script (hgwebdir.wsgi).

<VirtualHost *:80>
    ServerName hg.example.net
    DocumentRoot /var/www/vhosts/hg.example.net/htdocs
    ErrorLog /var/log/httpd/hg.example.net-error_log
    CustomLog /var/log/httpd/hg.example.net-access_log common

    WSGIScriptAliasMatch ^(.*)$ /var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi$1

    # To enable "daemon" mode, uncomment the following lines. (Read the mod_wsgi docs for more info.)
    # WSGIDaemonProcess hg.example.net user=USER group=GROUP threads=1 processes=15
    # some more interesting options (tested on mod_wsgi 2.0):
    # umask=0007 display-name=wsgi-hg.example.net inactivity-timeout=300
    # WSGIProcessGroup hg.example.net

    <Directory /var/www/vhosts/hg.example.net/htdocs>
        Options FollowSymlinks
        DirectoryIndex index.html

        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>

    <Directory /var/www/vhosts/hg.example.net/cgi-bin>
        Options ExecCGI FollowSymlinks

        AddHandler wsgi-script .wsgi

        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>

PeterArrenbrecht notes: On Ubuntu 8.04 I had to replace the line containing WSGIScriptAliasMatch with

WSGIScriptAlias  /hg  /var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi

to make it work.
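When the application is mounted under a prefix such as /hg, the server splits each request path into SCRIPT_NAME (the mount point) and PATH_INFO (the rest), and a WSGI application builds its links from those two values. The following is only a sketch using the standard library's wsgiref helpers to simulate that split; mounted_environ and the repository name myrepo are hypothetical and not part of mod_wsgi or Mercurial.

```python
from wsgiref.util import application_uri, setup_testing_defaults


def mounted_environ(mount_point, path_info, host="hg.example.net"):
    """Build a fake WSGI environ for an app mounted at mount_point."""
    environ = {
        "HTTP_HOST": host,
        "SCRIPT_NAME": mount_point,  # the part matched by WSGIScriptAlias
        "PATH_INFO": path_info,      # the remainder, handed to the application
    }
    setup_testing_defaults(environ)  # fills in the remaining required keys
    return environ


# A request for /hg/myrepo (hypothetical repository name) under the /hg mount.
environ = mounted_environ("/hg", "/myrepo")
base_url = application_uri(environ)  # scheme + host + SCRIPT_NAME
```

With this kind of mount, repositories would be reached under /hg/ rather than at the root of the virtual host.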

3.3. Mercurial

You only need to remove every reference to wsgicgi from the hgwebdir.cgi script that comes with Mercurial. The contrib directory included in the Mercurial distribution already contains a script with these changes.

Here is the short version - hgwebdir.wsgi:

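# Load Mercurial modules lazily, only when they are first used.
from mercurial import demandimport; demandimport.enable()
from mercurial.hgweb.hgwebdir_mod import hgwebdir

# mod_wsgi looks for a callable named "application".
# The path to hgweb.config must be absolute.
application = hgwebdir('/var/www/vhosts/hg.example.net/cgi-bin/hgweb.config')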

Note that the wsgicgi lines have been removed; hgwebdir is already WSGI capable and the .cgi script merely uses additional code to adapt the CGI request into a WSGI request. Note also that the mod_wsgi hgwebdir.wsgi needs to be passed the full path for the hgweb.config file even if the two files are in the same directory, whereas hgwebdir.cgi does not.
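If you would rather not hard-code the full path, one possible approach is to derive it from the location of the script. This is a sketch, not what the contrib script does: config_path is a hypothetical helper, and it assumes that mod_wsgi sets __file__ for the script it loads.

```python
import os


def config_path(script_file, name="hgweb.config"):
    """Return the absolute path of `name` next to `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)


# In hgwebdir.wsgi this could replace the hard-coded string, for example:
#   application = hgwebdir(config_path(__file__))
```

The result is always absolute, which satisfies the full path requirement described above regardless of the working directory of the Apache process.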

The hgweb.config file looks like this:

[web]
style = coal

[paths]
/ = /var/www/vhosts/hg.example.net/htdocs/**

/!\ Note: this only works with Mercurial 1.0 and newer. The above example is based on Mercurial 1.2; earlier versions will be slightly different.
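To preview which directories a wildcard [paths] entry might pick up, you can look for directories that contain a .hg subdirectory. The sketch below is only a rough approximation and not Mercurial's own discovery code; find_repositories is a hypothetical helper, and it deliberately does not look for repositories nested inside other repositories.

```python
import os


def find_repositories(root):
    """Yield directories under root that contain a .hg directory."""
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".hg" in dirnames:
            yield dirpath
            # Do not descend into a repository that was already found.
            dirnames[:] = []
```

For the sample setup you would call it as find_repositories('/var/www/vhosts/hg.example.net/htdocs') and compare the result with the list that hgwebdir shows.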

3.4. Reloading

Starting with version 1.3 of Mercurial, hgwebdir automatically refreshes the repositories list, so you most likely don't need to follow the instructions below.

Every time you add a new repository to your hgwebdir collections (for example by using hg clone) you will need to restart hgwebdir (hence Apache) to see it listed and be able to browse it.

To avoid the manual restart, you can add a small hook to your .hgrc so that every push also "touches" the mod_wsgi script file and forces it to be reloaded. Something like this should work:

[hooks]
changegroup =
# reload wsgi application
changegroup.mod_wsgi = touch /var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi

Naturally, the user you push as must have permission to touch the file.
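The same effect can be had without spawning a shell by using an in-process Python hook. This is a sketch under assumptions: the function names are hypothetical, and the hook would be registered with something like changegroup.mod_wsgi = python:yourmodule.reload_wsgi, where yourmodule is whatever importable module you put the code in.

```python
import os

# Path taken from the sample setup in this document; adjust for your system.
WSGI_SCRIPT = "/var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi"


def touch(path):
    """Set the file's access and modification times to now."""
    # Unlike the touch command, this does not create a missing file.
    os.utime(path, None)


def reload_wsgi(ui, repo, hooktype, **kwargs):
    """Python changegroup hook: bump the script's mtime after a push."""
    touch(WSGI_SCRIPT)
    # A false return value tells Mercurial that the hook succeeded.
    return False
```

The permission requirement is unchanged: the user running the push still needs write access to the script file.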

For the inner workings of the mod_wsgi autoreload mechanism see this.

4. Note

It seems to me that mod_wsgi is the future of Python web applications on Apache. It is really nice, even at this early stage, and I expect it to become even better.

Bye mod_python/fastcgi, you won't be missed.



modwsgi (last edited 2020-08-13 12:07:09 by aayjaychan)