[PATCH V2] fsmonitor: match watchman and filesystem encoding

Yuya Nishihara yuya at tcha.org
Fri Apr 7 08:54:13 EDT 2017


On Thu, 6 Apr 2017 12:46:38 -0400, Olivier Trempe wrote:
> On Thu, Apr 6, 2017 at 9:45 AM, Yuya Nishihara <yuya at tcha.org> wrote:
> > > > +def _watchmantofsencoding(path):
> > > > +    """Fix path to match watchman and local filesystem encoding
> > > > +
> > > > +    watchman's paths encoding can differ from filesystem encoding. For example,
> > > > +    on Windows, it's always utf-8.
> > > > +    """
> > > > +    try:
> > > > +        decoded = path.decode(_watchmanencoding)
> > > > +    except UnicodeDecodeError as e:
> > > > +        raise error.Abort(e, hint='watchman encoding error')
> > >
> > > Does this need to be str(e)?
> >
> > Perhaps.
> > >
> > > > +
> > > > +    return decoded.encode(_fsencoding, 'replace')
> >
> > Maybe it's better to catch exception here. Encoding error would be more
> > likely to happen because Windows ANSI charset is generally narrower than
> > UTF-*.
> >
> 
> You mean setting the error handler to 'strict' rather than 'replace' and
> wrap the call in a try except block?

Yes.

> Or just wrap the call in a try except block, but keep the 'replace' error
> handler? Using the 'replace' error handler is necessary here to match the
> behavior of osutil.listdir

It appears the 'mbcs' codec replaces unknown characters whether or not
'strict' is specified. Perhaps that is done by WideCharToMultiByte(). I think
using 'strict' here is more consistent because osutil.listdir() does no
encoding handling in the Python layer.

https://msdn.microsoft.com/en-us/library/windows/desktop/dd374130(v=vs.85).aspx
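
To make the suggestion concrete, here is a rough sketch of the function with
'strict' on both steps and each step wrapped in its own try/except. It is
written for Python 3 and is not the patch itself: the Abort class is a
stand-in for error.Abort, the encodings are passed as parameters instead of
the module-level _watchmanencoding and _fsencoding, and the hint
'filesystem encoding error' is a made-up placeholder.

```python
class Abort(Exception):
    """Stand-in for mercurial's error.Abort (assumption, sketch only)."""

    def __init__(self, message, hint=None):
        super().__init__(message)
        self.hint = hint


def watchman_to_fs_encoding(path, watchman_encoding='utf-8',
                            fs_encoding='ascii'):
    """Re-encode a watchman path (bytes) into the filesystem encoding.

    Both steps use the 'strict' error handler, so a character that
    cannot be represented raises instead of being silently replaced.
    """
    try:
        decoded = path.decode(watchman_encoding, 'strict')
    except UnicodeDecodeError as e:
        # str(e) so the abort message is text, not the exception object
        raise Abort(str(e), hint='watchman encoding error')

    try:
        return decoded.encode(fs_encoding, 'strict')
    except UnicodeEncodeError as e:
        # hypothetical hint text, not taken from the patch
        raise Abort(str(e), hint='filesystem encoding error')
```

With an ordinary codec such as 'ascii' or 'latin-1', a path that does not fit
the target charset ends up in the second except block. As noted above, 'mbcs'
may still replace characters even under 'strict', so on Windows that branch
might never be taken.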


More information about the Mercurial-devel mailing list