RFC: safe pattern matching for problematic encoding

Martin Geisler mg at aragost.com
Fri May 25 05:03:26 CDT 2012


FUJIWARA Katsunori <foozy at lares.dti.ne.jp> writes:

> At Wed, 23 May 2012 22:50:53 +0200,
> Martin Geisler wrote:
>
>> Matt Mackall <mpm at selenic.com> writes:
>> 
>> > On Wed, 2012-05-23 at 15:14 +0200, Antoine Pitrou wrote:
>> >> On Wed, 23 May 2012 14:53:43 +0200
>> >> Mads <mads at kiilerich.com> wrote:
>> >> > 
>> >> > > As you noticed, wrapping/hooking points are scattered widely, so I
>> >> > > think that this implementation is not so good. But I don't have any
>> >> > > other ideas.
>> >> > >
>> >> > > Are there any other ideas to solve this problem?
>> >> > 
>> >> > The only viable solution is to consistently use utf-8 inside Mercurial.
>> >> 
>> >> Or to consistently use unicode strings ;)
>> >
>> > <rage class=python3>
>> > Yes, please go waste the next year or two of your life working on that
>> > brilliant idea. Don't come back until you can preserve mixed filename
>> > encodings on Linux while interoperating with old hg clients. Best of
>> > luck.
>> > </rage>
>> 
>> Are mixed filename encodings really something we want to support?
>> 
>> It sounds like a super rare situation to me, and a situation that the
>> users would be happy to correct if it is detected.
>> 
>> Some users would probably say "why did you even allow me to make this
>> mess in the first place?!" and consider it a bug that such
>> repositories can exist today.
>
> I understand that Matt worries about the time lag between:
>
>   - changes to the Unicode specification
>   - catch-up of the Unicode (or wchar?) file API on each platform
>   - catch-up of Python's encoding implementation for Unicode decoding
>
> and gaps in the level of support between the components in each environment.

Okay, I can see why there might be some problems there. But for 99.9% of
the cases I think Python's Unicode support is okay. Things that break
must be pretty obscure, right? In those cases I would tell users that
their filename isn't supported.
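
To make "detect it and tell the user" concrete, here is a minimal sketch
of such a check. It is only an illustration: the function name
undecodable_filenames is hypothetical and not part of Mercurial, and it
assumes that filenames arrive as byte strings and that the repository is
expected to use a single encoding (UTF-8 here).

```python
def undecodable_filenames(names, encoding="utf-8"):
    """Return the byte-string filenames that cannot be decoded.

    `names` is an iterable of byte strings; `encoding` is the single
    encoding the repository is assumed to use (an assumption of this
    sketch, not something Mercurial enforces).
    """
    bad = []
    for name in names:
        try:
            name.decode(encoding)
        except UnicodeDecodeError:
            # This name is not valid in the expected encoding, so it
            # would be reported to the user as unsupported.
            bad.append(name)
    return bad


# Hypothetical sample: one UTF-8 name, one Latin-1 name, one ASCII name.
sample = [b"caf\xc3\xa9.txt", b"caf\xe9.txt", b"plain.txt"]
rejected = undecodable_filenames(sample)
```

With the sample above, only the Latin-1 encoded name ends up in
rejected; the UTF-8 and plain ASCII names pass the check.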

-- 
Martin Geisler

aragost Trifork
Commercial Mercurial support
http://aragost.com/mercurial/

