[PATCH] clone on Windows: abort on reserved filenames (1st update)

Adrian Buehlmann adrian at cadifra.com
Thu Jun 12 03:20:25 CDT 2008

On 12.06.2008 01:33, Mads Kiilerich wrote:
> If I understand it correctly now, then it seems that the patch will make 
> it easy to create repositories that never can be cloned on windows 
> because of forbidden names in the history.

We already *are* in this situation. The patch does *not* make
it easier to create such problem repositories. Quite the opposite,
I would say: the patch makes the current situation more
transparent by "converting" a specific hg call (clone to Windows
under the preconditions above) from a silly hang into a bold, explicit abort
that lists the specific repository-relative path it cannot support.

In the meantime I have verified that it is impossible to create a
file having the name "aux.i" on Windows XP.

So, http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx
is incorrect in just _recommending_ to "avoid" file names of the
sort "aux.txt" (or e.g. "nul.i", "PRN.o", etc.).

Mercurial tries to create a file "aux.i" when it wants to write
the revlog inside the .hg dir for a repository containing a
tracked file with the name "aux".

Current Mercurial hangs in this case. The patch intends to
abort clone and pull with an explicit error message, listing the
problem file.
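
A minimal sketch of the kind of check involved, in Python. This is not the
code from the patch: the function names, the error text and the exact list of
reserved names are assumptions made for illustration, and real Windows rules
have more corner cases (trailing spaces and dots, for example).

```python
# Hypothetical sketch; names are illustrative and not taken from the patch.
WINDOWS_RESERVED = frozenset(
    ["con", "prn", "aux", "nul"]
    + ["com%d" % i for i in range(1, 10)]
    + ["lpt%d" % i for i in range(1, 10)]
)


def is_reserved(component):
    """True if this path component looks like a Windows device name."""
    # Only the part before the first dot matters: "aux.i" is as bad as "aux".
    base = component.split(".", 1)[0]
    return base.lower() in WINDOWS_RESERVED


def reserved_paths(tracked_files):
    """Return the repository-relative paths that have a reserved component."""
    return [
        path
        for path in tracked_files
        if any(is_reserved(part) for part in path.split("/"))
    ]


def check_clone(tracked_files):
    """Raise instead of hanging when a path cannot be stored on Windows."""
    bad = reserved_paths(tracked_files)
    if bad:
        raise RuntimeError(
            "abort: reserved Windows filename in path: " + ", ".join(bad)
        )
```

With this sketch, a tracked file "aux" is flagged (and so would its revlog
"aux.i" be), while a name like "auxiliary.c" passes.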

> Some kind of encoding or
> _-escaping of these names could perhaps be a convenient alternative...

I am thinking about this. Patrick Mezard uploaded tentative patches
to issue793 in the bug tracker which encode those specific reserved
names, thus creating a new, incompatible repository format.
Patrick told me on IRC that said patch was deemed ugly and was thus
not applied to the Mercurial codebase.
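
To make the idea of escaping reserved names concrete, here is a purely
hypothetical scheme in Python. It is not what the issue793 patches do (I am
not reproducing them here); the "~XX" hex escape and both function names are
assumptions chosen only to show that such an encoding can be made reversible.

```python
# Hypothetical escaping scheme, for illustration only.
RESERVED = frozenset(
    ["con", "prn", "aux", "nul"]
    + ["com%d" % i for i in range(1, 10)]
    + ["lpt%d" % i for i in range(1, 10)]
)


def escape_component(component):
    """Rewrite a path component so it no longer matches a device name."""
    # Escape literal "~" first so that decoding stays unambiguous.
    escaped = component.replace("~", "~7e")
    base = escaped.split(".", 1)[0]
    if base.lower() in RESERVED:
        # Replace the first character by "~" plus its two-digit hex code.
        escaped = "~%02x" % ord(escaped[0]) + escaped[1:]
    return escaped


def unescape_component(component):
    """Invert escape_component."""
    out = []
    i = 0
    while i < len(component):
        if component[i] == "~":
            out.append(chr(int(component[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(component[i])
            i += 1
    return "".join(out)
```

Under this scheme "aux.i" would be stored as "~61ux.i". Any such change alters
the file names inside .hg, which is why it amounts to a new repository format.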

Yesterday, there was some discussion on IRC (nicks cmason, pmezard, tonfa, djc,
bos) about the potential implications of going back to some hashed encoding
of file/directory names. It seems Mercurial was once in such a state.
As I understand it, that encoding was dropped due to inferior disk access
speed, because file systems seem to be optimized for laying out files on disk
which have real-world, user-supplied names (and not arbitrary random
number/character strings).
I got the impression that Patrick (nick pmezard on IRC) is inclined to revisit
the hashing decision (in some form).

Another major driving force seems to be that painful Windows path length limit
problem, which was exacerbated by the anti-case-folding
"HELLO" -> "_h_e_l_l_o" encoding.
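
A small Python sketch of why this encoding eats into the path length budget.
It assumes that each uppercase letter is written as an underscore followed by
its lowercase form and that a literal underscore is doubled, which is my
understanding of the store encoding; the function name is made up for this
example.

```python
def encode_case(name):
    """Sketch of a case-folding-safe encoding (assumed rules, see above)."""
    out = []
    for ch in name:
        if ch == "_":
            # A literal underscore is doubled so decoding stays unambiguous.
            out.append("__")
        elif "A" <= ch <= "Z":
            # Uppercase letters become "_" plus the lowercase letter.
            out.append("_" + ch.lower())
        else:
            out.append(ch)
    return "".join(out)


# Every uppercase letter costs one extra character on disk.
original = "HELLO"
encoded = encode_case(original)
growth = len(encoded) - len(original)
```

An all-uppercase name doubles in length under these rules, so deeply nested
paths with many capitals reach the Windows limit (MAX_PATH is 260 characters)
much sooner than their working-copy names suggest.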

> And _-escaping is currently done on all platforms so that .hg folders 
> can be moved between platforms. Shouldn't this black-listing be done on 
> all platforms too?

I am thinking about that too. But those who can ignore Windows surely don't want
to be limited by the stupidities of platforms they enjoy not having to deal with.

But, maybe we can add a config option which makes Mercurial yell if a
user tries to push/commit a file with a horrid name like "aux" into a repo.
That way, mixed-platform projects can make sure they don't get into this reserved
file name trap. Once you have committed a file named "aux" to a repo (on unix),
that repo can never ever be cloned to Windows again. This cannot be fixed by
deleting that "aux" file on the unix side, as it is already in the history
of the repo. You can only convert it away (which invalidates all revision IDs
referencing that repo in the world) or change Mercurial.

More information about the Mercurial-devel mailing list