hgwebdir configuration problem

Paul Boddie paul.boddie at biotek.uio.no
Thu Jan 7 08:11:20 CST 2010


gda wrote:
> Following your suggestion I have removed the link and now in the public_html
> folder I have the following files:
> hgweb.config
> hgwebdir.cgi
> prog_development 
>
> where the folder prog_development contains all my repositories:
> /home/myuser/public_html/prog_development/rep1
> /home/myuser/public_html/prog_development/rep2
> /home/myuser/public_html/prog_development/rep3
>   

What I meant by not having a link was that you shouldn't need anything - 
a link or a real folder - containing repositories in the public_html 
directory. Instead, the hgweb.config file records the location of the 
repositories, and the script uses this to find them in order to serve 
up the repository views.

> The hgweb.config is:
> [collections]
> /home/myuser/public_html=/home/myuser/public_html/prog_development
>
> Note that now /home/myuser/public_html/prog_development is not a link but it
> is a real folder.
>   

I have to admit that I use the paths setting, not collections, but you 
should really keep the repositories in a directory outside your 
public_html directory. I suggest something like this:

[collections]
/home/myuser/prog_development=/home/myuser/prog_development


This should be enough for the script to find your repositories.
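
As a rough illustration of how a [collections] entry could turn 
filesystem paths into URL names, here is a small Python sketch. It 
assumes that the left-hand side of the entry is stripped from the path 
of every repository found under the right-hand side. The function name 
virtual_name is made up for this example and is not part of Mercurial.

```python
def virtual_name(prefix, repo_path):
    """Return the URL name a repository would get under a [collections] entry.

    Assumption: the left-hand side of the entry (prefix) is stripped from
    the filesystem path of each repository found under the right-hand side.
    Hypothetical helper for illustration, not Mercurial code.
    """
    prefix = prefix.rstrip("/")
    if not repo_path.startswith(prefix + "/"):
        raise ValueError("%s is not below %s" % (repo_path, prefix))
    return repo_path[len(prefix) + 1:]


# Suggested entry: the repository name appears directly after the script.
print(virtual_name("/home/myuser/prog_development",
                   "/home/myuser/prog_development/rep1"))
# prints: rep1

# Entry from the quoted hgweb.config: the folder name stays in the URL.
print(virtual_name("/home/myuser/public_html",
                   "/home/myuser/public_html/prog_development/rep1"))
# prints: prog_development/rep1
```

Under that assumption, the suggested entry would list the repositories 
as rep1, rep2 and rep3, whereas the quoted entry would list them as 
prog_development/rep1 and so on.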

> Exactly this is the related part of my httpd.conf file:
>
> DirectoryIndex index.html index.php hgwebdir.cgi hgweb.cgi
>   

Right. So, when Apache sees http://mypc/~myuser, it finds the 
public_html folder, looks for one of these files and then "opens" it. 
This causes the problem with the bad links, and you can see what should 
happen instead by going to http://mypc/~myuser/hgwebdir.cgi and 
following the links to repositories; they should be the actual dynamic 
views and have these locations:

http://mypc/~myuser/hgwebdir.cgi/rep1
http://mypc/~myuser/hgwebdir.cgi/rep2
http://mypc/~myuser/hgwebdir.cgi/rep3

Of course, you probably don't want hgwebdir.cgi to appear in the URL, 
and that's why the Wiki page mentions the ScriptAlias solution - it 
gets around most of the problems with URL naming and finding suitable 
CGI scripts.
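
To show why the repository name can follow the script name in these 
URLs, the Python sketch below splits a URL path into the two parts a 
CGI script normally receives: SCRIPT_NAME (the script itself) and 
PATH_INFO (whatever follows it). The function split_cgi_path and its 
default script location are assumptions for this example only.

```python
def split_cgi_path(url_path, script_url="/~myuser/hgwebdir.cgi"):
    """Split a URL path into (SCRIPT_NAME, PATH_INFO) as a CGI script sees it.

    Hypothetical helper: script_url is assumed to be where the Web server
    finds the CGI script. Everything after it is passed on as PATH_INFO.
    """
    if url_path == script_url:
        return script_url, ""
    if url_path.startswith(script_url + "/"):
        return script_url, url_path[len(script_url):]
    raise ValueError("not handled by the script: " + url_path)


for path in ("/~myuser/hgwebdir.cgi",
             "/~myuser/hgwebdir.cgi/rep1",
             "/~myuser/hgwebdir.cgi/rep1/archive/tip.zip"):
    print(split_cgi_path(path))
```

The script can then use the trailing part (/rep1, /rep1/archive/tip.zip) 
to decide which repository and which view to serve, without any 
matching file or folder existing on disk.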

> And this is the configuration part for public_html folders:
>
> UserDir public_html
> <Directory /home/*/public_html>
>     AllowOverride FileInfo AuthConfig Limit
>     Options ExecCGI MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
>     <Limit GET POST OPTIONS PROPFIND>
>         Order allow,deny
>         Allow from all
>     </Limit>
>     <LimitExcept GET POST OPTIONS PROPFIND>
>         Order deny,allow
>         Deny from all
>     </LimitExcept>
> </Directory>
>   

Is there no way to just use a ScriptAlias as follows...?

ScriptAlias /~myuser "/home/myuser/public_html/hgwebdir.cgi"

If you're setting this up for all users on a system then I think a 
Directory-based approach is required, possibly using mod_rewrite to make 
the URLs look nice, but see my remarks at the end before tackling this 
problem.

> Unfortunately this configuration gives me exactly the same problems as
> before. Moreover I had to introduce the following link in the public_html
> folder:
>
> static -> /usr/lib/python2.5/site-packages/mercurial/templates/static/
>
> Without this link my web server cannot access the templates and gives me
> only plain text pages. 
>   

This suggests that the script isn't handling resources served via the 
dynamic (or virtual) directory whose name is "static". Although getting 
Apache to serve such resources is possible, I think the intention is 
that the script serves these resources itself.

>  
> Anyway what I cannot really understand is why using the script hgwebdir.cgi
> I have links to my archives like:
>
> http://mypc/~myuser/prog_development/rep1/archive/tip.zip 
>
> that does NOT work while using the script hgweb.cgi the links are like:
>
> http://mypc/~myuser/prog_development/rep1/?archive/tip.zip 
>
> that works fine! 
>   

Because Apache can't resolve the former - archive/tip.zip isn't a file 
in the rep1 folder hierarchy - whereas the latter is handled by the 
hgweb.cgi script you've put in the rep1 folder.
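
The difference between the two URLs can be seen by splitting them with 
Python's standard urllib.parse module. This is only an illustration of 
how the URLs divide into a path and a query string; the helper name 
path_and_query is made up for the example.

```python
from urllib.parse import urlsplit


def path_and_query(url):
    """Return the (path, query) parts of a URL."""
    parts = urlsplit(url)
    return parts.path, parts.query


# Without the question mark, everything is path: the server looks for
# archive/tip.zip below the rep1 folder.
print(path_and_query(
    "http://mypc/~myuser/prog_development/rep1/archive/tip.zip"))
# prints: ('/~myuser/prog_development/rep1/archive/tip.zip', '')

# With the question mark, the path stops at the rep1 folder and the rest
# is a query string handed to whatever serves that folder.
print(path_and_query(
    "http://mypc/~myuser/prog_development/rep1/?archive/tip.zip"))
# prints: ('/~myuser/prog_development/rep1/', 'archive/tip.zip')
```

In the second case the path ends at the rep1 folder, which exists, so 
the index script there gets to run and receives archive/tip.zip as its 
query string.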

> I guess I should have the question mark in both cases because the archive is
> not present on the server and should be created by the scripts on the fly. Isn't
> it??
>   

The archive never actually exists as a file in the location rep1/archive 
- accesses to it are handled dynamically.

> Any other suggestions? I'm really confused.
>   

My advice is to only put hgwebdir.cgi and hgweb.config in the CGI 
directory (or public_html directory). The solution should work without 
any static content residing in that directory or being linked to from 
that directory. (This is in contrast to various PHP or Perl solutions 
I've seen where there's an unhealthy mix of dynamic and static content; 
most Python-based Web solutions I've seen maintain a strict separation 
between dynamic and static content.) The aim is to have the script be 
responsible for serving absolutely everything.

You should then set up hgweb.config as I describe above, where the 
repositories reside in their usual place, not in the public_html 
directory. This will prevent Apache from wanting to serve them up as 
static content and confusing the process of troubleshooting.

Although having an Apache configuration which uses the DirectoryIndex 
setting can help you get an index page under a "nice" URL 
(http://mypc/~myuser), you should be aware that this convenience only 
finds the script at that *precise* location. As soon as you 
add things to the end of the URL (http://mypc/~myuser/rep1), Apache will 
no longer consider using the script to serve these resources, leading 
you in the wrong direction when considering how to fix such problems. I 
recommend verifying that things work by using the full URLs such as 
http://mypc/~myuser/hgwebdir.cgi and browsing the repositories (and 
trying the archive features) before trying to make the URLs look nicer.
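
The behaviour described above can be summarised in a toy Python model. 
It is a deliberate simplification, not a description of how Apache is 
implemented: the function who_serves, its return strings and its 
default arguments are all assumptions made for illustration.

```python
def who_serves(url_path, base="/~myuser", script="hgwebdir.cgi"):
    """Toy model of which mechanism answers a request below a user directory.

    Assumptions: base maps to the public_html folder, script is listed in
    DirectoryIndex, and no other index files or rewrite rules are involved.
    """
    if url_path != base and not url_path.startswith(base + "/"):
        raise ValueError("outside the user directory: " + url_path)
    rest = url_path[len(base):]
    script_part = "/" + script
    if rest in ("", "/"):
        # Only the directory itself triggers the DirectoryIndex lookup.
        return "script (found through DirectoryIndex)"
    if rest == script_part or rest.startswith(script_part + "/"):
        # The script is named explicitly; anything after it is passed on.
        return "script (named in the URL)"
    # Anything else is looked up as a file or folder in public_html.
    return "file lookup in public_html"


for path in ("/~myuser", "/~myuser/rep1", "/~myuser/hgwebdir.cgi/rep1"):
    print(path, "->", who_serves(path))
```

In this model, only the first and last URLs reach the script; the 
middle one is treated as a request for something on disk.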

To get "nice" URLs, if you can't use ScriptAlias, I recommend using a 
more complete solution using something like mod_rewrite where "nice" 
URLs get turned into something like 
http://mypc/~myuser/hgwebdir.cgi/rep1 in a way that the script can 
handle correctly. The documentation on this isn't very good at the 
moment, mostly because the need for it is arguably rather limited, but 
we can surely work towards a decent solution together.
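
The mapping such a rewriting step would need to perform can be sketched 
in Python as follows. This is not mod_rewrite configuration, only a 
model of the intended translation; the function to_script_url and its 
defaults are hypothetical.

```python
def to_script_url(url_path, base="/~myuser", script="hgwebdir.cgi"):
    """Map a "nice" URL path onto the equivalent path through the script.

    Hypothetical model of the translation a rewriting rule would perform.
    Paths outside base, and paths that already name the script, are left
    unchanged so that the translation cannot be applied twice.
    """
    if url_path != base and not url_path.startswith(base + "/"):
        return url_path
    rest = url_path[len(base):]
    script_part = "/" + script
    if rest == script_part or rest.startswith(script_part + "/"):
        return url_path
    return base + script_part + rest


print(to_script_url("/~myuser/rep1"))
# prints: /~myuser/hgwebdir.cgi/rep1
print(to_script_url("/~myuser/hgwebdir.cgi/rep1"))
# prints: /~myuser/hgwebdir.cgi/rep1
```

Leaving already-translated paths alone matters because a rule that 
keeps matching its own output would loop.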

I hope this helps.

Paul

P.S. I've added some documentation for the collections setting in 
hgweb.config to the Wiki page. It doesn't really seem very usable to me, 
but maybe there was a motivating factor in having it behave as it does 
that eludes me.

