Publishing Repositories with hgwebdir.cgi

The multiple repository CGI server is now described in the Publishing Repositories document together with other related information.

1. Introduction

This page explains how to make a set of repositories accessible through CGI using the hgwebdir.cgi script and a web server (Apache or lighttpd). Once the script is set up, it is very easy to publish new repositories. To quickly serve a single repository, have a look at CGIinstall.

2. Pre-requisites

The installed software is:

3. Getting proper Mercurial

If you are on Linux, you probably want to install the Mercurial version packaged by your distribution. For example, on RPM-based distributions:

$ cd download-directory
$ wget
$ sudo rpm -ihv mercurial-1.0-1.2.x86_64.rpm

If you can only get a pre-1.0 version from your distribution, you should probably acquire another version somehow. One way would be to get the latest version from the stable branch:

$ cd working-directory
$ hg clone
$ cd hg-stable
$ python setup.py build
$ sudo python setup.py install

Warning: Be sure to check for later versions; these filenames and versions might be out of date.

4. RHEL4 Specific Instructions

See RHEL4HgWebDirSetup.

5. Windows Specific Instructions

On Windows, your Python version must match the version used to compile Mercurial. Otherwise, you'll get "Invalid Magic Number" errors when trying to run the CGI.

At least one installer for 0.9.5 uses Python 2.4. (I can't verify all of them.)

The pre-compiled Windows binaries for Mercurial 1.0.x, 1.1.x, 1.2.x and 1.3.x were compiled with Python 2.5.

If you wish to use Python 2.6, you must build your own Mercurial (see WindowsInstall).
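
A quick way to see which Python version a given interpreter reports is to ask it directly. The following is a minimal sketch; the helper name `python_version_tag` is made up for illustration and is not part of Mercurial:

```python
import sys


def python_version_tag(version_info=sys.version_info):
    """Return the major.minor version of an interpreter, e.g. '2.5'."""
    return "%d.%d" % (version_info[0], version_info[1])


if __name__ == "__main__":
    # Compare this with the Python version your Mercurial binaries
    # were compiled with (see the versions listed above).
    print("Running Python " + python_version_tag())
```

Run it with the same python.exe that the first line of hgwebdir.cgi points to, so that you are checking the interpreter the CGI will actually use.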

The first line of hgwebdir.cgi must point to your Python executable. For example:

#!c:/dev/Python25/python.exe
#
# An example CGI script to export multiple hgweb repos, edit as necessary
...

Some users report that, for the Mercurial 1.1 installation, Python needs to have the pywin32 package installed. If this package is not installed, you will see an error like "ImportError: No module named pywintypes"; it can be downloaded from: http://sourceforge.net/project/platformdownload.php?group_id=78018 (I didn't need it for Mercurial 1.1.1, though.)

5.1. Mercurial Lib Files

Install the Mercurial libraries. The Windows binary puts a .zip file of the libraries in your Mercurial install directory, which I refer to as MERCURIAL_HOME.

Unzip MERCURIAL_HOME\library.zip to a path of your choosing (e.g. MERCURIAL_HOME\lib, which in my case is c:\dev\Mercurial\lib). WinZip and this file may not play nicely together, so try using a command-line unzip utility like: http://stahlforce.com/dev/index.php?tool=zipunzip

Specify the location of the library files in hgwebdir.cgi:

# adjust python path if not a system-wide install:
import sys
sys.path.insert(0, "c:/dev/Mercurial/lib")

If you compiled your own version of Mercurial, you do not need the sys.path.insert line.

5.2. Adjust Template Files

The CGI assumes templates are located inside of the lib directory you just created.

(Although you're supposed to be able to specify templates = c:/dev/Mercurial/lib in Mercurial.ini or hgweb.config, neither option worked for me.)

Move the Templates folder from MERCURIAL_HOME\Templates to MERCURIAL_HOME\lib\Templates.

5.3. Other Gotchas

The path to your Mercurial repositories cannot contain the path to the directory holding the Mercurial CGI script. Maybe this is just my weird Apache setup, but when my hgwebdir.cgi was installed in c:/webdata/repos, this hgweb.config failed for me:

[paths]
repo1 = c:/webdata/
repo2 = c:/webdata/

and instead I had to change my directory names to

[paths]
repo1 = c:/webdata/
repo2 = c:/webdata/

I don't know if this is a Mercurial/Python issue or an Apache issue. Either way, hopefully this saves you some time.

5.4. Collections of Repositories

The advised way of specifying collections is now the [paths] section, which was introduced in Mercurial 1.1. For older versions, see the Mercurial 1.0 section below.

5.4.1. Current version

[paths]
/trunk = /webdata/hg_repos/trunk/**

Warning: Browsing the available repositories will be very slow, as all files and subdirectories are scanned every time. To avoid scanning more than one level of subdirectories, use one asterisk (*) instead of two (**).

This causes all repositories under c:/webdata/hg_repos/trunk to show up (whereas a single * does not look for nested ones). Multiple collections may be specified, but each must have a unique prefix (the part before the =). The prefix may be the root, that is, just '/'.

With **, the search descends into every directory, including the working directory of each repository, which can be extremely slow. The advice here is to run `hg update null` on the served repositories so that no working directory files are left to search through.
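
The difference between * and ** can be pictured with a small directory scan. This is only a sketch of the idea, not Mercurial's actual scanning code; the function name `find_repos` and its `recursive` flag are hypothetical:

```python
import os


def find_repos(root, recursive):
    """Return sorted paths (relative to root) of directories containing '.hg'.

    recursive=False mimics a single '*': only direct children are checked.
    recursive=True mimics '**': the whole tree below root is walked.
    """
    found = []
    if not recursive:
        for name in os.listdir(root):
            if os.path.isdir(os.path.join(root, name, ".hg")):
                found.append(name)
        return sorted(found)
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".hg" in dirnames:
            found.append(os.path.relpath(dirpath, root))
            # Skip the repository metadata itself, but keep walking the
            # working directory, which is what makes '**' expensive.
            dirnames.remove(".hg")
    return sorted(found)
```

The recursive variant visits every working directory file tree it comes across, which is why emptying the working directories with `hg update null` keeps the scan cheap.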

5.4.2. Mercurial 1.0

There is a bug with [collections] not working with Windows drive letters: Mercurial misinterprets the colon as a variable separator. See this bug tracker issue for details. This bug was fixed in Mercurial 1.1.x (see the previous section).

Repository collections still work if they are on the same drive as your CGI though. Use:

[collections]
\webdata\hg_repos\trunk = /webdata/hg_repos/trunk

The backslashes and forward slashes make a difference.

This got all of my repositories to show up that were located in c:/webdata/hg_repos/trunk. (Repositories located in lower subdirectories will show up but won't work. You have to add another line for their subdirectories.)

6. Directory Structure

Create the necessary directories:

$ sudo mkdir -p /var/hg/repos
$ sudo chown -R www-data:www-data /var/hg

It's usually a good idea to keep special directories out of the tree served by Apache, but for security reasons on openSUSE the CGI scripts only work within the document root. openSUSE also uses user/group wwwrun/www instead of www-data/www-data, and does not allow write access to CGI directories for anyone but the Apache user. So for openSUSE the commands are:

$ sudo mkdir -p /srv/www/htdocs/hg/repos
$ sudo chown -R wwwrun:www /srv/www/htdocs/hg/repos
$ sudo chmod 755 /srv/www/htdocs/hg

7. Preparing the config

$ cat > /tmp/hgweb.config
[collections]
repos/ = repos/
^D
$ sudo -u www-data cp /tmp/hgweb.config /var/hg
$ rm /tmp/hgweb.config

For openSUSE, replace the sudo line above with

$ sudo -u wwwrun cp /tmp/hgweb.config /srv/www/htdocs/hg

To get a look like http://hg.kublai.com/ with subdirectories, just create subdirectories with hg repositories in them. To make the top level look like kublai's, add the following to hgweb.config:

[web]
style = gitweb

This also works in repository hgrc's, which are mentioned below.

8. Two possibilities

You can either use a separate webserver such as Apache or lighttpd, or use the webserver built into hg.

8.1. Using Apache or lighttpd with the hgwebdir.cgi script

8.1.1. Putting the right stuff in place

Put the script in place (remember, we are still in working-directory/hg-stable :)):

$ sudo -u www-data cp hgwebdir.cgi /var/hg
$ sudo -u www-data chmod +x /var/hg/hgwebdir.cgi

And again, some changes for openSUSE:

$ sudo -u wwwrun cp /usr/share/doc/packages/mercurial/hgwebdir.cgi /srv/www/htdocs/hg
$ sudo -u wwwrun chmod +x /srv/www/htdocs/hg/hgwebdir.cgi

8.1.2. Configuring apache for use with CGIs

OK, now it's time for Apache (see the end of this section for the openSUSE way of doing this).

First of all, do not change the main Apache config directly. Create a separate directory for the Mercurial config instead:

$ sudo mkdir /etc/apache2/hg

Create the config with the following contents (e.g. by using sudo vim /etc/apache2/hg/main.conf):

ScriptAliasMatch        ^/hg(.*)        /var/hg/hgwebdir.cgi$1
<Directory /var/hg>
  Options ExecCGI FollowSymLinks
  AllowOverride None
</Directory>

This config says that we are going to serve our repositories through '<yourhost>/hg/'.

Now make it really available, by changing your favourite site in /etc/apache2/sites-available. For this experiment I used /etc/apache2/sites-available/default:

  ...
  Include /etc/apache2/hg/main.conf
</VirtualHost>

Make sure that everything is OK:

$ sudo apache2ctl configtest
Syntax is OK

Restart your web server:

$ sudo apache2ctl stop
$ sudo apache2ctl start

Check if it works by directing your browser to <yourhost>/hg/.

For openSUSE, just put this in /etc/apache2/conf.d/hg.conf:

ScriptAliasMatch        ^/hg(.*)        /srv/www/htdocs/hg/hgwebdir.cgi$1
<Directory /srv/www/htdocs/hg>
  Options ExecCGI FollowSymLinks
  AllowOverride None
</Directory>

and run

sudo rcapache2 reload

If this does not complain about config errors, you should be done.

8.1.3. Configuring Apache with mod_wsgi

Using mod_wsgi is recommended over using mod_python. It's one of the faster and more efficient ways of serving hgweb(dir).

You can use the hgwebdir.wsgi script (it lives where other Mercurial scripts live, and works with Mercurial 1.0 and later), which references an hgweb.config file in the CONFIG variable:

from mercurial.hgweb.hgweb_mod import hgweb
from mercurial.hgweb.hgwebdir_mod import hgwebdir

CONFIG = '/var/hg/hgweb.config'
application = hgwebdir(CONFIG)

This is really an ordinary Python module, but it uses a .wsgi extension to make it clear what its use is.

You then need to add this to your httpd.conf (or the vhost config):

WSGIScriptAlias / /var/hg/script/hgwebdir.wsgi
<Directory /var/hg/script>
    Order deny,allow
    Allow from all
</Directory>

Since you're allowing some permissions to the directory the .wsgi script is in, you probably don't want to put the script in a directory that also contains your repositories.

NOTE: the default hgwebdir.wsgi uses 'hgweb.config' as its configuration path. Because this is a relative path, it may not point where you expect. If you find yourself with a working but empty installation, try setting it to the full path where your hgweb.config is located.
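
One way to avoid depending on the server's working directory is to compute an absolute path from the location of the script itself. This is a sketch under the assumption that hgweb.config sits in the same directory as the .wsgi script; the helper `config_path` is hypothetical, and you can just as well hard-code the full path as shown earlier:

```python
import os


def config_path(script_file, name="hgweb.config"):
    """Return an absolute path to `name` located next to `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)


# In hgwebdir.wsgi one could then write (assumption, not the stock script):
#   CONFIG = config_path(__file__)
```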

8.1.4. Configuring apache with mod_python

You can also use mod_python, though mod_wsgi is probably better. Here are three howto's:

http://www.selenic.com/pipermail/mercurial/2007-May/013222.html

http://slucas.wikidot.com/en:hgweb-mod-python

http://www.aventinesolutions.nl/mediawiki/index.php/Quick_Tip:_Getting_Started_with_Mercurial (on Fedora 6)

8.1.5. Configuring lighttpd

Ok, now it's time for lighttpd (no openSUSE specifics, because I don't use it).

You can either update the existing /etc/lighttpd/lighttpd.conf file, or create /etc/lighttpd/hg.conf and include that file from lighttpd.conf.

First, you need to check if mod_rewrite and mod_cgi are enabled in the config file, and add them to server.modules if they haven't already been added:

server.modules += ( "mod_cgi" )
server.modules += ( "mod_rewrite" )

Next, configure rewrite rules that map URLs to the hgwebdir.cgi script. With the following added to your config file, URLs starting with either hg or mercurial will map to hgwebdir.cgi:

url.rewrite-once = (
  "^/hg([/?].*)?$" => "/hgwebdir.cgi$1",
   "^/mercurial([/?].*)?$" => "/hgwebdir.cgi$1"
)
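
To see what these two patterns match, the same regular expressions can be tried out in Python. This only illustrates the matching; it is not how lighttpd evaluates its rules internally, and the `rewrite` helper is hypothetical:

```python
import re

# Same patterns as in the lighttpd rules above; the first match wins.
RULES = [
    (re.compile(r"^/hg([/?].*)?$"), "/hgwebdir.cgi"),
    (re.compile(r"^/mercurial([/?].*)?$"), "/hgwebdir.cgi"),
]


def rewrite(url):
    """Return the rewritten URL, or the URL unchanged if no rule matches."""
    for pattern, target in RULES:
        match = pattern.match(url)
        if match:
            # Group 1 is None when nothing follows the prefix.
            return target + (match.group(1) or "")
    return url
```

Note that a URL such as /hgfoo is left alone, because the optional group must start with either a slash or a question mark.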

Then, configure a URL match that invokes hgwebdir.cgi:

$HTTP["url"] =~ "^/hgwebdir.cgi([/?].*)?$" {
             server.document-root = "/var/hg/"
             cgi.assign = ( ".cgi" => "/usr/bin/python" )
}

Make sure that everything is OK:

$ sudo lighttpd -t -f /etc/lighttpd/lighttpd.conf
Syntax OK

Restart the web server:

$ sudo /etc/init.d/lighttpd restart

Check if it works by directing your browser to <yourhost>/hg/ or <yourhost>/mercurial/.

8.2. Configuring Lighttpd push support

To enable support for pushing to remote repositories, you are required to add an extra $HTTP check to your vhost. The command when pushing that requires authorization is 'unbundle', so what we do is check to see if it is within the URL:

$HTTP["querystring"] =~ "cmd=unbundle" {
  auth.require = ( "" => (
    "method"  => "basic",
    "realm"   => "Mercurial Repo",
    "require" => "valid-user"
  ) )
}
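
The condition above is a plain regular expression search on the query string. As a rough illustration in Python (the function name `needs_auth` is made up, and this is not lighttpd code):

```python
import re


def needs_auth(query_string):
    """Return True when the query string contains cmd=unbundle (a push)."""
    return re.search(r"cmd=unbundle", query_string) is not None
```

Read-only requests carry other cmd values and therefore pass through without being asked for credentials.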

This example uses an Apache2 .htpasswd file. You can add the following variables to your lighttpd.conf:

auth.backend = "htpasswd" # auth method
auth.backend.htpasswd.userfile = "/path/to/file" # passwd file. Syntax: USERNAME:ENCRYPTED_PASS

An easy way to create this file:

htpasswd -c /path/to/file USERNAME

You will be prompted for password input.

8.3. Standalone

Simply run

sudo -u www-data hg serve --webdir-conf /var/hg/hgweb.config

and enjoy this speedy method of serving multiple repos. It should be faster than using Apache.

9. You are done

Hooray!

10. Final Bits

No openSUSE specifics here. You should know the differences by now (apache user and doc path).

10.1. Create a new repository

$ sudo -u www-data hg init /var/hg/repos/<repository-name>

10.2. Provide more information about it

Add the following to the /var/hg/repos/<repository-name>/.hg/hgrc file:

[web]
# Whom to contact, plain text, no fancy stuff
contact = Bilbo Baggins
# Nice description of what this is about, you can include HTML (like <a>)
description = My precious!

10.3. Allow pushing to the repository

By default, nobody is allowed to push.

To allow everybody to push, add the following line to the /var/hg/repos/<repository-name>/.hg/hgrc file:

[web]
allow_push = *

To allow only selected users to push changes, add the following line to the /var/hg/repos/<repository-name>/.hg/hgrc file:

[web]
allow_push = frodo, sam

These are virtual users (for instance, as defined using a .htpasswd file), and not real system users.

10.4. Deny pushing to the repository

Most likely you will want to use this together with allow_push = *. If you want to allow everybody to push except a selected list of people, add the following line to the /var/hg/repos/<repository-name>/.hg/hgrc file:

[web]
deny_push = saruman

10.5. Allow pushing over a non-secure channel

(I still need to check how it works :) )

By default, pushes are allowed only over https. If you are certain that you do not want to enforce https for pushes, add the following line to the /var/hg/repos/<repository-name>/.hg/hgrc file:

[web]
push_ssl = false
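
How allow_push, deny_push and push_ssl interact can be summarized in a few lines of Python. This is a simplified model for illustration, not Mercurial's own code; the function `may_push` and its parameters are hypothetical, and the handling of anonymous users is an assumption:

```python
def may_push(user, allow_push, deny_push, is_https, push_ssl=True):
    """Simplified model of the push checks described above.

    user       -- authenticated user name, or None for anonymous
    allow_push -- list of names from web.allow_push ([] if unset)
    deny_push  -- list of names from web.deny_push ([] if unset)
    is_https   -- True if the request arrived over https
    push_ssl   -- value of web.push_ssl (pushes need https by default)
    """
    if push_ssl and not is_https:
        return False
    # deny_push is checked first, so it overrides allow_push.
    # Assumption: anonymous users are rejected whenever a deny list is set.
    if deny_push and (user is None or deny_push == ["*"] or user in deny_push):
        return False
    # With no allow_push at all, nobody may push.
    if not allow_push:
        return False
    return allow_push == ["*"] or user in allow_push
```

For example, with allow_push = * and deny_push = saruman, a call such as `may_push("frodo", ["*"], ["saruman"], True)` is allowed while the same call for "saruman" is refused.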

10.6. Customize the look

Add the following to the /var/hg/repos/<repository-name>/.hg/hgrc file:

[web]
# enable snapshot downloads
allow_archive = gz zip bz2

10.7. Change the URL using baseurl and URL rewriting

For example, we might not want the hgwebdir.fcgi in our URLs if we have a dedicated or virtual host for our repositories. In the hgweb.config, add the following:

[web]
baseurl = /

hgwebdir will now write all links as /x/y instead of /hgwebdir.fcgi/x/y. Now we only need a rewrite rule. For Lighttpd, add (if hgwebdir.fcgi resides in the server's document root):

url.rewrite-once = ("^(/
" => "$1", "^(/.*)$" => "/hgwebdir.fcgi$1" )

For Apache, add this to your .htaccess file. This example assumes hgwebdir.fcgi resides on your server at "hg.example.com/cgi-bin/hgwebdir.fcgi":

# Taken from http://www.pmwiki.org/wiki/Cookbook/CleanUrls#samedir
# Used at http://ggap.sf.net/hg/
Options +ExecCGI
RewriteEngine On
# Set the rewrite base depending on where the base URL lives
RewriteBase /cgi-bin
RewriteRule ^$ hgwebdir.fcgi  [L]
# Send requests for files that exist to those files.
RewriteCond %{REQUEST_FILENAME} !-f
# Send requests for directories that exist to those directories.
RewriteCond %{REQUEST_FILENAME} !-d
# Send requests to hgwebdir.fcgi, appending the rest of the URL.
RewriteRule (.*) /cgi-bin/hgwebdir.fcgi/$1  [QSA,L]

11. Disclaimer

Well, it works (worked) for me. Please do not hesitate to update this page to include small bits I've forgotten or just plainly am not aware of.


See also SharedSSH.

