[PATCH V2] fsmonitor: match watchman and filesystem encoding

Yuya Nishihara yuya at tcha.org
Thu Apr 6 09:45:34 EDT 2017


On Wed, 5 Apr 2017 10:55:17 -0700, Siddharth Agarwal wrote:
> On 4/5/17 08:42, Olivier Trempe wrote:
> > # HG changeset patch
> > # User Olivier Trempe <oliviertrempe at gmail.com>
> > # Date 1488981822 18000
> > #      Wed Mar 08 09:03:42 2017 -0500
> > # Branch stable
> > # Node ID 2021c3032968bef6b8d1cd7bea5a22996ced994c
> > # Parent  68f263f52d2e3e2798b4f1e55cb665c6b043f93b
> > fsmonitor: match watchman and filesystem encoding
> >
> > watchman's paths encoding can differ from filesystem encoding. For example,
> > on Windows, it's always utf-8.
> >
> > Before this patch, on Windows, mismatch in path comparison between fsmonitor
> > state and osutil.statfiles would yield a clean status for added/modified files.
> >
> > In addition to status reporting wrong results, this leads to files being
> > discarded from changesets while doing history editing operations such as rebase.
> 
> This patch looks correct to me, though I have questions about its 
> performance below.
> 
> +cc foozy for another look.

[...]

> > +_watchmanencoding = pywatchman.encoding.get_local_encoding()
> > +_fsencoding = sys.getfilesystemencoding() or sys.getdefaultencoding()
> > +_fixencoding = codecs.lookup(_watchmanencoding) != codecs.lookup(_fsencoding)
> > +
> > +def _watchmantofsencoding(path):
> > +    """Fix path to match watchman and local filesystem encoding
> > +
> > +    watchman's paths encoding can differ from filesystem encoding. For example,
> > +    on Windows, it's always utf-8.
> > +    """
> > +    try:
> > +        decoded = path.decode(_watchmanencoding)
> > +    except UnicodeDecodeError as e:
> > +        raise error.Abort(e, hint='watchman encoding error')
> 
> Does this need to be str(e)?

Perhaps.
> 
> > +
> > +    return decoded.encode(_fsencoding, 'replace')

Maybe it's better to catch the exception here instead of passing 'replace'.
An encoding error is more likely to happen at this step, because the Windows
ANSI charset is generally narrower than UTF-*.
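
Something along these lines, as a rough sketch only. It is written against
Python 3 bytes paths rather than the patch's Python 2 str. The Abort class
is a hypothetical stand-in for error.Abort. The function takes the two
encodings as parameters, with cp1252 as an assumed example of a Windows
ANSI charset, just to keep the snippet self-contained:

```python
class Abort(Exception):
    """Hypothetical stand-in for mercurial's error.Abort."""

    def __init__(self, message, hint=None):
        super().__init__(message)
        self.hint = hint


def watchmantofsencoding(path, watchmanencoding='utf-8', fsencoding='cp1252'):
    """Re-encode a watchman path (bytes) into the filesystem encoding.

    Both steps abort on failure instead of silently replacing characters.
    """
    try:
        decoded = path.decode(watchmanencoding)
    except UnicodeDecodeError as e:
        raise Abort(str(e), hint='watchman encoding error')

    try:
        # 'strict' rather than 'replace': a path that cannot be represented
        # in the filesystem encoding would never match what statfiles sees.
        return decoded.encode(fsencoding, 'strict')
    except UnicodeEncodeError as e:
        raise Abort(str(e), hint='filesystem encoding error')
```

With these assumed defaults, a path containing only Latin-1 range characters
round-trips, while a path with characters outside cp1252 aborts with the
second hint rather than turning into '?' bytes.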

