RFC: safe pattern matching for problematic encoding
mpm at selenic.com
Wed May 23 14:03:26 CDT 2012
On Wed, 2012-05-23 at 22:29 +0900, FUJIWARA Katsunori wrote:
> At Wed, 23 May 2012 14:53:43 +0200,
> Mads wrote:
> > On 23/05/12 14:38, FUJIWARA Katsunori wrote:
> > > Hi, devels.
> > >
> > > I'm working to achieve safe pattern matching/parsing for problematic
> > > encodings (e.g.: cp932), in which strings may contain '\\' as a part
> > > of multi-byte characters.
> > Is that a part of improving hgext/win32mbcs.py ? Or how are they related?
> In my patch series:
> - add some wrapping/hooking points to core code, and
> - wrap/hook them by hgext/win32mbcs, if it is enabled
> > > As you noticed, the wrapping/hooking points are scattered widely, so I
> > > think that this implementation is not so good. But I don't have any
> > > other ideas.
> > >
> > > Are there any other ideas to solve this problem ?
> > The only viable solution is to consistently use utf-8 inside Mercurial.
> I think so, too :-)
> > > BTW, how is the "using Unicode API on Windows" plan progressing?
> > >
> > > http://www.selenic.com/pipermail/mercurial-devel/2011-December/036385.html
> > No progress at all. Windows users apparently do not care enough about
> > the problem to contribute in any way.
> I had thought that the mail from Matt was the announcement that the
> core development team would start work on it, but that was my
> misunderstanding, wasn't it?
That was this:
As you can see, no one expressed ANY interest at all despite
complaining about this issue more or less constantly.
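
For readers unfamiliar with the cp932 problem quoted at the top of this
thread, here is a minimal Python 3 sketch of why byte-level matching on
'\\' is unsafe. It is only an illustration: the helper names naive_split
and safe_split are hypothetical and are not part of Mercurial or of
hgext/win32mbcs.py, and the example path is made up.

```python
# Illustration only: naive_split/safe_split are hypothetical helpers,
# not Mercurial or win32mbcs APIs.

def naive_split(path_bytes):
    """Split on the backslash byte without regard to the encoding."""
    return path_bytes.split(b"\\")


def safe_split(path_bytes, encoding="cp932"):
    """Decode first, split on the backslash character, then re-encode."""
    text = path_bytes.decode(encoding)
    return [part.encode(encoding) for part in text.split("\\")]


# In cp932 the katakana letter U+30BD is encoded as 0x83 0x5C, so its
# second byte is the same byte as an ASCII backslash.
so = "\u30bd".encode("cp932")

# A made-up path: one directory separator followed by a file name that
# starts with that character.
path = "dir\\\u30bd\u30fc\u30b9.txt".encode("cp932")

naive = naive_split(path)  # also splits inside the multi-byte character
safe = safe_split(path)    # splits only at the real separator
```

The byte-level split cuts the file name in the middle of a character,
while the decode-first version sees only the one real separator. This is
the same reasoning behind the suggestion above to use a single
well-behaved encoding internally.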