[PATCH] Defined REQUEST_URI on Microsoft IIS servers

Ezra.Smith at bentley.com Ezra.Smith at bentley.com
Wed Oct 31 14:19:49 CDT 2007


Argh, I didn't think the Gmail one even went through. Again, sorry for
the excessive spam.

I forgot to mention the issue with PATH_INFO in the commit message.

PATH_INFO on IIS is a bit different than you would expect. It's defined
here:
http://msdn2.microsoft.com/en-us/library/ms524602.aspx

Note that I could set a property on the server to make it compatible
with the canonically expected value, but that would have the side effect
of breaking things like ASP (again, mentioned in the link above).

The issue is that, where the CGI spec would expect something like:
	PATH_INFO = /msj/rev/012a04065db5

IIS actually gives something like:
	PATH_INFO = /list.py/msj/rev/012a04065db5

Without modification, all of your links end up pointing to
http://hgrepo/list.py/address
instead of http://hgrepo/address, which is not what hgwebdir_mod.py
intends.

SCRIPT_NAME actually gives something like:
	SCRIPT_NAME = /list.py

And that's mostly what my patch is doing: stripping the script name off
the front of PATH_INFO to make it valid. I can't speak to whether or not
this is a great idea for complying with standards, but this is how to
make IIS function without changing a server-side toggle that will break
a bunch of other applications.
Similarly, because of quirks specific to IIS, I would argue that it's
probably worth checking SERVER_SOFTWARE specifically for IIS, especially
since my definition for REQUEST_URI is based on the behavior of
PATH_INFO. Which brings me to REQUEST_URI...
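
For reference, the PATH_INFO fix described above can be sketched roughly
as follows. This is only an illustration, not the patch itself: the
helper names strip_script_name and fix_iis_path_info are made up here,
and the "IIS" substring test assumes SERVER_SOFTWARE looks like
"Microsoft-IIS/6.0".

```python
def strip_script_name(environ):
    """Return PATH_INFO with a leading SCRIPT_NAME removed, if present."""
    script_name = environ.get('SCRIPT_NAME', '')
    path_info = environ.get('PATH_INFO', '')
    # Only strip on a path-segment boundary, so that "/list.py" is not
    # removed from something like "/list.pyc/foo".
    if script_name and (path_info == script_name
                        or path_info.startswith(script_name + '/')):
        return path_info[len(script_name):]
    return path_info


def fix_iis_path_info(environ):
    """Hypothetical helper: normalize PATH_INFO only when running on IIS."""
    # Assumption: IIS reports itself as e.g. "Microsoft-IIS/6.0".
    if 'IIS' in environ.get('SERVER_SOFTWARE', ''):
        environ['PATH_INFO'] = strip_script_name(environ)
    return environ


env = {
    'SERVER_SOFTWARE': 'Microsoft-IIS/6.0',
    'SCRIPT_NAME': '/list.py',
    'PATH_INFO': '/list.py/msj/rev/012a04065db5',
}
print(fix_iis_path_info(env)['PATH_INFO'])  # /msj/rev/012a04065db5
```

On any other server the environment is left untouched, which is the
reason for checking SERVER_SOFTWARE at all.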

I don't have an Apache server handy to test this on, but I believe that
REQUEST_URI does not actually include the
"protocol://SERVER_NAME:SERVER_PORT" part of a URI. I mostly base this
on having googled for some dumps of other people's environment
variables, such as:

http://www.cinahl.com/wcgis/testenv.cgi?cookies=off
http://urchin.earth.li/~twic/wml/wap-cgi-environment-variables.html
http://www.onlamp.com/php/2000/12/01/dump-globals.php

And now we have an issue. In hgwebdir_mod.py, you have:

   url = req.env['REQUEST_URI'].split('?')[0]
   if not url.endswith('/'):
       url += '/'
   pathinfo = req.env.get('PATH_INFO', '').strip('/') + '/'
   base = url[:len(url) - len(pathinfo)]

But assume that we've modified PATH_INFO to be as it should be
(/msj/rev/012a04065db5), and REQUEST_URI contains SCRIPT_NAME +
PATH_INFO (/list.py/msj/rev/012a04065db5). At this point, all of my URLs
are being mapped as if they had a list.py in their address, because
base = /list.py/. But...we're using a mod_rewrite clone to give our
repository here a more standard address like the ones most other HG
repositories are using (http://hgserver/hg/ instead of
http://hgserver/hg/list.py/).
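
To make the arithmetic concrete, here is the quoted hgwebdir_mod.py
logic wrapped in a function and run against the values from this
message. The function name compute_base and the plain dicts standing in
for req.env are only for illustration.

```python
def compute_base(env):
    # Same logic as the hgwebdir_mod.py excerpt quoted above.
    url = env['REQUEST_URI'].split('?')[0]
    if not url.endswith('/'):
        url += '/'
    pathinfo = env.get('PATH_INFO', '').strip('/') + '/'
    return url[:len(url) - len(pathinfo)]


# PATH_INFO corrected, REQUEST_URI = SCRIPT_NAME + PATH_INFO.
corrected = {
    'REQUEST_URI': '/list.py/msj/rev/012a04065db5',
    'PATH_INFO': '/msj/rev/012a04065db5',
}
print(compute_base(corrected))  # /list.py/

# Both variables carrying the same value, so nothing is left over.
same_value = {
    'REQUEST_URI': '/list.py/msj/rev/012a04065db5',
    'PATH_INFO': '/list.py/msj/rev/012a04065db5',
}
print(compute_base(same_value))  # /
```

The first case is the one that puts list.py back into every generated
link; the second yields a base of "/", which is what works behind the
rewrite.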

These environment variables know nothing of the rewrite on IIS,
unfortunately, and if REQUEST_URI is formed as you suggested, then all
of the links on my web repo break. This certainly is worth noting, and I
should have done so originally, as the patch ends up being geared
specifically toward how we have things set up here. For what it's worth,
trying to make things work on IIS without a mod_rewrite type extension
wasn't working for me at all. My original post about mercurial on IIS
recommended using ISAPI_Rewrite Lite, which I'd still recommend to
anyone interested in trying this on IIS.

I suppose a better alternative is to create REQUEST_URI as it should be
according to standards, and then modify the code block above to use
SCRIPT_NAME + PATH_INFO for determining the base URL, instead of using
PATH_INFO alone. Of course, this would need to be wrapped in another
"if we're running on IIS" block, as I cannot guarantee the intended
behavior on any platform aside from IIS.
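
One possible reading of that alternative, as an untested sketch: the
function name compute_base_iis_aware is hypothetical, a plain dict
stands in for req.env, and the "IIS" substring check on SERVER_SOFTWARE
is an assumption about how the server identifies itself.

```python
def compute_base_iis_aware(env):
    url = env['REQUEST_URI'].split('?')[0]
    if not url.endswith('/'):
        url += '/'
    tail = env.get('PATH_INFO', '')
    # Assumption: only IIS needs SCRIPT_NAME counted as part of the
    # trailing portion that gets cut off to find the base URL.
    if 'IIS' in env.get('SERVER_SOFTWARE', ''):
        tail = env.get('SCRIPT_NAME', '') + tail
    tail = tail.strip('/') + '/'
    return url[:len(url) - len(tail)]


env = {
    'SERVER_SOFTWARE': 'Microsoft-IIS/6.0',
    'REQUEST_URI': '/list.py/msj/rev/012a04065db5',
    'SCRIPT_NAME': '/list.py',
    'PATH_INFO': '/msj/rev/012a04065db5',
}
print(compute_base_iis_aware(env))  # /
```

On other servers the function falls through to the existing behavior,
since SCRIPT_NAME is only prepended inside the IIS check.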

Lastly, I omitted the query string in my original implementation and
forgot to go back and add it before putting out a patch. Thanks for
pointing it out.

-Ezra

-----Original Message-----
From: Micah Cowan [mailto:micah at cowan.name] 
Sent: Wednesday, October 31, 2007 12:00 PM
To: Ezra Smith
Cc: mercurial-devel at selenic.com
Subject: Re: [PATCH] Defined REQUEST_URI on Microsoft IIS servers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Ezra.Smith at bentley.com wrote:
> Apologies for the spam. Outlook seems quite adamant about formatting my
> emails after I've hit the send button. This one really should be clean,
> though.

It's not really, though.

The plaintext version is double-spaced, so it really depends on how well
people's mailers can copy/paste the HTML version, and whether the NBSPs
get copied literally or translated into normal spaces, etc.

Best is to send patches as attachments if you have formatting problems
(ideally with an inline disposition).

I'm not a Mercurial developer; but this patch doesn't look entirely
appropriate to me: it doesn't seem at all appropriate that you set both
REQUEST_URI and PATH_INFO to the same value, and a value which is
incorrect for the canonically expected value for PATH_INFO. Your patch
claims to set REQUEST_URI, but it does more than that, it changes
PATH_INFO as well, and I don't see a reason for that. And the value it
sets REQUEST_URI to isn't a complete URI; other code might reasonably
expect it to be. We should keep CGI variables at sane values, so that
modules don't find themselves in a strange, unexpected (and
undocumented) runtime environment.

To properly generate REQUEST_URI from actual environment variables,
you'd need something like (borrowed from current CGI 1.2 draft spec:)

protocol "://" SERVER_NAME ":" SERVER_PORT enc-script
               enc-path-info "?" QUERY_STRING

Where protocol is derived from SERVER_PROTOCOL, and enc-script and
enc-path-info are percent-encoded equivalents to SCRIPT_NAME and
PATH_INFO. The question mark of course should be omitted if QUERY_STRING
is empty.
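
A minimal Python sketch of that construction, assuming a CGI-style
environment dict. The function name build_request_uri is hypothetical,
and the scheme is taken from SERVER_PROTOCOL exactly as described above,
which means it would not notice an HTTPS connection.

```python
from urllib.parse import quote


def build_request_uri(env):
    # "HTTP/1.1" -> "http"; follows the description above literally.
    scheme = env.get('SERVER_PROTOCOL', 'HTTP/1.0').split('/')[0].lower()
    uri = '%s://%s:%s' % (scheme, env['SERVER_NAME'], env['SERVER_PORT'])
    # Percent-encode SCRIPT_NAME and PATH_INFO; quote() leaves "/" alone.
    uri += quote(env.get('SCRIPT_NAME', ''))
    uri += quote(env.get('PATH_INFO', ''))
    # Omit the question mark when QUERY_STRING is empty.
    query = env.get('QUERY_STRING', '')
    if query:
        uri += '?' + query
    return uri


env = {
    'SERVER_PROTOCOL': 'HTTP/1.1',
    'SERVER_NAME': 'hgrepo',
    'SERVER_PORT': '80',
    'SCRIPT_NAME': '/list.py',
    'PATH_INFO': '/msj/rev/012a04065db5',
    'QUERY_STRING': 'style=raw',
}
print(build_request_uri(env))
# http://hgrepo:80/list.py/msj/rev/012a04065db5?style=raw
```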

Also, rather than check explicitly for Microsoft IIS, I suspect it'd be
preferable to check for the lack of a REQUEST_URI. That doesn't seem to
be a CGI variable (at least, I can't find it in the spec:
http://hoohoo.ncsa.uiuc.edu/cgi/env.html; nor in the draft spec), so
odds are good IIS isn't the only server to lack support for it.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHKKZx7M8hyUobTrERCHgNAJ0dP9FOb+KOvKZZ8dyUHTkvjftBOQCbBC9H
OQsVGSvLnih89ZdthQ8hhH8=
=9nCs
-----END PGP SIGNATURE-----


