Serving Mercurial repositories with Apache and mod_wsgi
mod_wsgi is a simple-to-use Apache module that can host any Python application supporting the Python WSGI interface (such as a slightly modified hgwebdir.cgi). It can run "in server" (inside the Apache processes, just like mod_python) or in "daemon mode" (comparable to FastCGI). Daemon mode is strongly recommended for better security and separation.
Please don't blindly follow this document. At least read the official mod_wsgi documentation (it's pretty good) and PublishingRepositories. It is expected that you know how to properly configure Apache :-D
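To make "supports the Python WSGI interface" concrete, here is a minimal sketch of a WSGI application. It is not part of Mercurial; the greeting text and the `call_app` helper are made up for illustration. The only thing mod_wsgi cares about is a module-level callable, by default named `application`, which is also what hgwebdir.wsgi provides.

```python
# Minimal WSGI application: a callable taking (environ, start_response).
# mod_wsgi looks for a module-level object named "application" by default.
def application(environ, start_response):
    body = b"Hello from WSGI\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: invoke a WSGI app once, without any server,
    and return (status, body)."""
    from wsgiref.util import setup_testing_defaults

    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling `call_app(application)` runs the application in-process, which is a quick way to check a .wsgi script for import errors before pointing Apache at it.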
- Like mod_python or FastCGI, it is much faster than old-fashioned CGI.
- It is simpler, safer and faster than mod_python.
- It is easier to configure than FastCGI (e.g. for serving Mercurial repositories on Apache 2.2 you also need suexec, mod_fcgid, flup and modified hgweb* scripts).
- If your favourite OS does not have pre-built mod_wsgi packages, you might need to compile it yourself (this is quite easy if you have a compiler and the Apache dev packages installed).
For Debian etch, there are current mod_wsgi packages on http://backports.org/.
You'll need the following software (tested versions are in parentheses, other versions should work too):
- Apache (2.2.6, 2.2.3)
- mod_wsgi (1.1, 2.0)
- Python (2.5.1, 2.4.4)
- Mercurial (1.1)
Starting with version 1.6 of Mercurial, the hgwebdir.wsgi script has been unified with the hgweb.wsgi script. Wherever hgwebdir.wsgi is referred to in these directions, you can substitute the hgweb.wsgi script instead.
In case you can't find a mod_wsgi package for your operating system, you'll have to compile it yourself.

    $ tar xvf mod_wsgi-1.1.tar.gz
    $ cd mod_wsgi-1.1
    $ ./configure
    $ make
    $ su -c "make install"

Edit your httpd.conf file to load the wsgi module:
LoadModule wsgi_module libexec/httpd/mod_wsgi.so
The maximum-requests mod_wsgi configuration option can cause long-lived requests like pushes and pulls to be unceremoniously aborted, so it should be left at its default unlimited value. See 2595.
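If you inherit an existing Apache setup, a small script can flag this option for you. The following is only a sketch: the function name is made up, it does not follow backslash line continuations, and it treats `maximum-requests=0` as "unlimited", which is my reading of the mod_wsgi documentation.

```python
import re


def find_maximum_requests(conf_text):
    """Return (line_number, value) for each active WSGIDaemonProcess
    directive that sets maximum-requests to a non-zero value."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out directive
        if not stripped.lower().startswith("wsgidaemonprocess"):
            continue
        match = re.search(r"\bmaximum-requests=(\d+)", stripped)
        if match and int(match.group(1)) != 0:
            hits.append((number, int(match.group(1))))
    return hits
```

Feed it the text of your virtual host file; an empty list means no active daemon process definition limits the number of requests.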
In this sample setup, we are serving Mercurial repositories from a separate virtual host (hg.example.net). Repositories are in the htdocs directory, served by a modified hgwebdir.cgi script (hgwebdir.wsgi).

    <VirtualHost *:80>
        ServerName hg.example.net
        DocumentRoot /var/www/vhosts/hg.example.net/htdocs
        ErrorLog /var/log/httpd/hg.example.net-error_log
        CustomLog /var/log/httpd/hg.example.net-access_log common

        WSGIScriptAliasMatch ^(.*)$ /var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi$1

        # To enable "daemon" mode, uncomment following lines. (Read mod_wsgi docs for more info)
        # WSGIDaemonProcess hg.example.net user=USER group=GROUP threads=1 processes=15
        # some more interesting options (tested on mod_wsgi 2.0):
        # umask=0007 display-name=wsgi-hg.example.net inactivity-timeout=300
        # WSGIProcessGroup hg.example.net

        <Directory /var/www/vhosts/hg.example.net/htdocs>
            Options FollowSymlinks
            DirectoryIndex index.html
            AllowOverride None
            Order allow,deny
            Allow from all
        </Directory>

        <Directory /var/www/vhosts/hg.example.net/cgi-bin>
            Options ExecCGI FollowSymlinks
            AddHandler wsgi-script .wsgi
            AllowOverride None
            Order allow,deny
            Allow from all
        </Directory>
    </VirtualHost>
PeterArrenbrecht notes: On Ubuntu 8.04 I had to replace the line containing WSGIScriptAliasMatch with
WSGIScriptAlias /hg /var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi
to make it work.
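With a mount point such as /hg, the WSGI application no longer sees the whole URL path. Per the WSGI specification, the server splits it into `SCRIPT_NAME` (the mount point) and `PATH_INFO` (the rest), and the application builds its links from those two values. The sketch below only illustrates that split; the function name is made up and this is not how mod_wsgi itself is implemented.

```python
def split_request_path(mount_point, url_path):
    """Split url_path into (SCRIPT_NAME, PATH_INFO) for an application
    mounted at mount_point. Returns None if the URL is outside the mount."""
    mount = mount_point.rstrip("/")  # a root mount "/" becomes ""
    if url_path == mount or url_path.startswith(mount + "/"):
        return mount, url_path[len(mount):]
    return None
```

For example, `split_request_path("/hg", "/hg/myrepo/file/tip")` gives `("/hg", "/myrepo/file/tip")`, so repositories end up under http://hg.example.net/hg/ rather than at the root of the virtual host.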
You only need to remove any reference to wsgicgi from the hgwebdir.cgi script that comes with Mercurial. The contrib directory included with the Mercurial distribution already contains these changes.
Here is the short version - hgwebdir.wsgi:

    from mercurial import demandimport; demandimport.enable()
    from mercurial.hgweb.hgwebdir_mod import hgwebdir

    application = hgwebdir('/var/www/vhosts/hg.example.net/cgi-bin/hgweb.config')
Note that the wsgicgi lines have been removed; hgwebdir is already WSGI capable and the .cgi script merely uses additional code to adapt the CGI request into a WSGI request. Note also that the mod_wsgi hgwebdir.wsgi needs to be passed the full path for the hgweb.config file even if the two files are in the same directory, whereas hgwebdir.cgi does not.
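If you would rather not hard-code the directory, the full path can be derived from the location of the script itself. This is a sketch under two assumptions: the helper name is made up, and mod_wsgi sets `__file__` for the script it loads (it did in my setups, but check the mod_wsgi documentation for your version).

```python
import os


def config_path_next_to(script_file, name="hgweb.config"):
    """Return the absolute path of `name` in the same directory as script_file."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)


# In hgwebdir.wsgi this would be used as (not executed here):
#   application = hgwebdir(config_path_next_to(__file__))
```

The absolute path is needed because the working directory of the Apache or daemon process is generally not the directory containing the script.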
The hgweb.config file looks like this:
    [web]
    style = coal

    [paths]
    / = /var/www/vhosts/hg.example.net/htdocs/**

Note: this only works with Mercurial 1.0 and newer. The above example is based on Mercurial 1.2, and earlier versions will be slightly different.
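The `**` at the end of the path asks hgwebdir to search the directory recursively for repositories. To preview roughly what will be listed, you can walk the tree yourself and look for `.hg` directories. This is only an approximation with a made-up function name; hgwebdir's own scanning may differ in details such as symlink handling.

```python
import os


def find_repositories(root):
    """Return sorted paths (relative to root) of directories that contain
    a .hg directory, i.e. that look like Mercurial repositories."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".hg" in dirnames:
            found.append(os.path.relpath(dirpath, root))
            dirnames.remove(".hg")  # don't descend into repository metadata
    return sorted(found)
```

Running `find_repositories("/var/www/vhosts/hg.example.net/htdocs")` on the server should give a list close to what the hgwebdir index page shows.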
Starting with version 1.3 of Mercurial, hgwebdir automatically refreshes the repositories list, so you most likely don't need to follow the instructions below.
Every time you add a new repository to your hgwebdir collections (for example by using hg clone) you will need to restart hgwebdir (hence Apache) to see it listed and be able to browse it.
To avoid the manual restart, you can add a small hook to your .hgrc so that every push also "touches" the mod_wsgi script file and forces it to be reloaded. Something like this should work:

    [hooks]
    changegroup =
    # reload wsgi application
    changegroup.mod_wsgi = touch /var/www/vhosts/hg.example.net/cgi-bin/hgwebdir.wsgi
Naturally the user you push with must have the permissions to touch the file.
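On systems without a `touch` command, the same effect can be had from Python. The sketch below (the function name is my own) only updates the timestamps of an existing file and, unlike `touch`, does not create a missing one. It assumes, as I understand the mod_wsgi documentation, that the reload is triggered by a change in the script file's modification time when running in daemon mode.

```python
import os


def touch(path):
    """Set the access and modification times of an existing file to now.
    Raises OSError if the file does not exist or is not writable by us."""
    os.utime(path, None)
```

Save it as a small script and call it from the hook in place of the `touch` command.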
For the inner workings of the mod_wsgi autoreload mechanism, see the mod_wsgi documentation on reloading source code.
It seems to me that mod_wsgi is the future of Python web applications on Apache. It is really nice, even at this early stage, and I expect it to become even better.
Bye mod_python/fastcgi, you won't be missed.