bfiles filename encoding

Adrian Buehlmann adrian at cadifra.com
Mon Jun 7 08:52:24 CDT 2010


On 07.06.2010 15:28, Greg Ward wrote:
> On Sat, Jun 5, 2010 at 6:29 PM, Benjamin Pollack <benjamin at bitquabit.com> wrote:
>> Greg: the more I play with this, and with bfiles on Windows, the more I'm thinking that at least the push destinations should be encoded using the fncache naming strategy.
> 
> Well, I *know* that the structure of bfiles' central store will have
> to change the minute someone tries to bfput a file called "aux" to a
> central store running on Windows.  Or even "foo" and "Foo".  In fact
> the case-sensitivity issue will almost certainly bite on OS X just as
> soon as I write a test for it.  The only solution I can see is to
> encode filenames on the central store, and reusing Mercurial's code
> for doing that seems very desirable.
> 
> But I don't understand what you mean by "the push destinations should
> be encoded".  Are you talking about wire protocol changes?  That seems
> unnecessary; this is all about dealing with filesystems that are not
> 100% traditional Unix filesystems: HFS+ and NTFS.
> 
>> What are your feelings on changing this for HTTP store? What about for the SSH store?
> 
> Those just select a different protocol to access the same underlying
> central store.  Same as the relationship between Mercurial's wire
> protocols and the repo that you're talking to.  bfiles needs to fix
> its relationship with the filesystem, not the network.
> 
>> (There's an argument for .hgbfiles being that way, too, due to file name length limits on Windows, but I'm happy to discuss that issue separately.)
> 
> Oh crap, I hadn't thought about that.  But is it really a problem?  I
> mean, if you have
> 
>   .hgbfiles/really/long/deep/path/to/bigfile
> 
> then that represents
> 
>   really/long/deep/path/to/bigfile
> 
> which is only slightly shorter than the path in .hgbfiles.  So
> mangling paths in .hgbfiles to work around Windows brain damage only
> buys, what, 10 more bytes of headroom in the path?  Not worth it,
> IMHO.
> 
>> The fncache code also currently only escapes files that are located in .hg/store, which has been frustrating for me on other occasions when I've wanted to reuse the logic for other locations. (E.g., Kiln caches annotation output, which required some copy-paste coding unless we wanted to store the annotation data in .hg/store). What are the feelings on abstracting that code so that it can provide names for files in other directories?
> 
> Hmmm.  If fncache is not factored for reusability, that means either
> 1) don't encode filenames that way, 2) submit refactoring patches to
> Mercurial and make bfiles require Mercurial 1.6, or 3) copy the code
> for now and remove the copy once bfiles requires Mercurial 1.6.  Yuck.
> 
> Consider also that there are two known cases not covered by fncache encoding:
> 
> 1) Windows Vista and 7 mangle leading whitespace in filenames, which
> corrupts hg repos
> 2) bad stuff happens if you commit .DS_Store on OS X
> 
> So if we reuse fncache in bfiles, either by copying or refactoring,
> then bfiles will inherit those two bugs.

Explorer on Windows Vista and Windows 7 strips leading spaces from
filenames when copying trees (see issue1713).
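
To make the hazards mentioned in this thread concrete (reserved names
like "aux", "foo" vs. "Foo", and leading spaces), here is a rough
sketch of a check one could run over a list of paths. The names
portability_hazards and WINDOWS_RESERVED are made up for illustration;
this is not fncache's encoding and not anything bfiles does today, and
it only detects problem names rather than encoding them.

```python
# Hypothetical helper for illustration; not part of bfiles or Mercurial.
# Device names that Windows reserves, regardless of case or extension.
WINDOWS_RESERVED = set(["con", "prn", "aux", "nul"])
for prefix in ("com", "lpt"):
    for n in range(1, 10):
        WINDOWS_RESERVED.add("%s%d" % (prefix, n))


def portability_hazards(paths):
    """Return (path, reason) pairs for names likely to break on NTFS or HFS+.

    Paths are assumed to use "/" as the separator.
    """
    hazards = []
    seen = {}  # lowercased path -> first spelling seen
    for path in paths:
        for part in path.split("/"):
            # "aux" and "aux.txt" are both reserved on Windows.
            if part.split(".")[0].lower() in WINDOWS_RESERVED:
                hazards.append((path, "reserved name"))
            # Explorer on Vista/7 strips these when copying trees.
            if part.startswith(" "):
                hazards.append((path, "leading space"))
        # Two spellings that differ only by case collide on
        # case-insensitive filesystems.
        folded = path.lower()
        if folded in seen and seen[folded] != path:
            hazards.append((path, "case collision with " + seen[folded]))
        seen.setdefault(folded, path)
    return hazards
```

For example, portability_hazards(["foo", "Foo", "aux"]) flags "Foo" as
a case collision with "foo" and "aux" as a reserved name.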

I have not looked at bfiles yet, but I assume bfiles *is* already
affected by that. Or do you already encode leading spaces in filenames?

(I'm a bit puzzled by the term "inheriting". How can you inherit
something you already have?)

> Perhaps we should cook up a new filename encoding algorithm for
> bfiles.  If it works, we could even propose it for core Mercurial once
> people have the appetite for yet another change there.

Matt was against doing another repo format change just to work around
issue1713.

