RFC: safe pattern matching for problematic encoding
FUJIWARA Katsunori
foozy at lares.dti.ne.jp
Wed May 23 08:29:27 CDT 2012
At Wed, 23 May 2012 14:53:43 +0200,
Mads wrote:
>
> On 23/05/12 14:38, FUJIWARA Katsunori wrote:
> > Hi, devels.
> >
> > I'm working to achieve safe pattern matching/parsing for problematic
> > encodings (e.g.: cp932), in which strings may contain '\\' as a part
> > of multi-byte characters.
>
> Is that a part of improving hgext/win32mbcs.py ? Or how are they related?
In my patch series, I:
- add some wrapping/hooking points to the core code, and
- wrap/hook them from hgext/win32mbcs, if it is enabled
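
To illustrate the underlying problem and the wrapping idea, here is a
minimal sketch in Python 3. It is not code from the patch series or
from hgext/win32mbcs; the names ENCODING, naive_split, decoded and
safe_split are hypothetical and exist only for this example. It
assumes the local encoding is cp932, in which U+8868 is encoded as
the two bytes 0x95 0x5c, the second of which equals ASCII '\'.

```python
import functools

ENCODING = "cp932"  # assumption: the problematic local encoding


def naive_split(path: bytes) -> list:
    # Byte-level matching: also splits on a 0x5c byte that is really
    # the second half of a multi-byte character.
    return path.split(b"\\")


def decoded(func):
    """Hypothetical wrapper: decode the bytes argument, call the
    character-level function, then re-encode each result."""
    @functools.wraps(func)
    def wrapper(path: bytes) -> list:
        parts = func(path.decode(ENCODING))
        return [p.encode(ENCODING) for p in parts]
    return wrapper


@decoded
def safe_split(path: str) -> list:
    # Character-level matching: only real separators are seen here.
    return path.split("\\")


# "dir\<U+8868>.txt" encoded in cp932
path = "dir\\\u8868.txt".encode(ENCODING)

print(naive_split(path))  # breaks the multi-byte character apart
print(safe_split(path))   # keeps the file name intact
```

The byte-level split yields three pieces because it treats the
trailing byte of the character as a separator, while the wrapped
version yields only the directory and the file name.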
> > As you noticed, the wrapping/hooking points are widely scattered, so I
> > think that this implementation is not so good. But I don't have any
> > other ideas.
> >
> > Are there any other ideas to solve this problem?
>
> The only viable solution is to consistently use utf-8 inside Mercurial.
I think so, too :-)
> > BTW, how is the "using Unicode API on Windows" plan progressing?
> >
> > http://www.selenic.com/pipermail/mercurial-devel/2011-December/036385.html
>
> No progress at all. Windows users do apparently not care enough about
> the problem to contribute in any way.
I had thought that the mail from Matt announced the start of work by
the core development team, but that was my misunderstanding, wasn't
it?
Some people in Japan, myself included, are very interested in
contributing to solving this problem, because it is very important
for spreading Mercurial in Japan!
Should we start implementing immediately, following the policy
described by Matt?
----------------------------------------------------------------------
[FUJIWARA Katsunori] foozy at lares.dti.ne.jp