[PATCH stable] fsmonitor: match watchman and local encoding

Siddharth Agarwal sid at less-broken.com
Mon Mar 6 20:14:13 EST 2017


On 3/6/17 09:50, Olivier Trempe wrote:
> # HG changeset patch
> # User Olivier Trempe <oliviertrempe at gmail.com>
> # Date 1488810111 18000
> #      Mon Mar 06 09:21:51 2017 -0500
> # Branch stable
> # Node ID c9d3f8d1a57346228f5c3bb749acdff90d37e194
> # Parent  6b00c3ecd15b26587de8cca6fab811069cba3b2f
> fsmonitor: match watchman and local encoding
>
> watchman's paths encoding is os dependant. For example, on Windows, it's
> always utf-8. This causes paths comparison mismatch when paths contain non ascii
> characters.

I really doubt this is correct on Unixes, where Watchman returns bytes as
they are on disk, which matches exactly what Mercurial wants.

(On Windows Watchman indeed always returns UTF-8.)
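
To make the Windows mismatch concrete, here is a minimal sketch. It assumes
Python 3, a made-up file name, and cp1252 as the local encoding (chosen only
for illustration; it is not taken from the patch or from Watchman itself).

```python
# Hypothetical file name containing one non-ASCII character.
name = "caf\u00e9.txt"

# Bytes as Watchman would report them on Windows (always UTF-8).
from_watchman = name.encode("utf-8")

# Bytes in the assumed local encoding (cp1252 is an illustrative choice).
local = name.encode("cp1252")

# The two byte strings differ, so a direct comparison fails.
mismatch = from_watchman != local

# Re-encoding the Watchman bytes into the local encoding makes them comparable.
converted = from_watchman.decode("utf-8").encode("cp1252", "replace")
matches = converted == local
```

On a Unix where both sides already use the on-disk bytes, the same conversion
would be unnecessary at best.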

>
> diff -r 6b00c3ecd15b -r c9d3f8d1a573 hgext/fsmonitor/__init__.py
> --- a/hgext/fsmonitor/__init__.py	Thu Mar 02 20:19:45 2017 -0500
> +++ b/hgext/fsmonitor/__init__.py	Mon Mar 06 09:21:51 2017 -0500
> @@ -99,6 +99,7 @@
>   from mercurial import (
>       context,
>       encoding,
> +    error,
>       extensions,
>       localrepo,
>       merge,
> @@ -110,6 +111,7 @@
>   from mercurial import match as matchmod
>   
>   from . import (
> +    pywatchman,
>       state,
>       watchmanclient,
>   )
> @@ -159,6 +161,20 @@
>       sha1.update('\0')
>       return sha1.hexdigest()
>   
> +def _watchmanencodingtolocal(path):
> +    """Fix path to match watchman and local encoding
> +
> +    watchman's paths encoding is os dependant. For example, on Windows, it's

"dependent"

> +    always utf-8. This converts watchman encoded paths to local encoding to
> +    avoid paths comparison mismatch.
> +    """
> +    try:
> +        decoded = pywatchman.encoding.decode_local(path)
> +    except UnicodeDecodeError as e:
> +        raise error.Abort(e, hint='watchman encoding error')

Could you elaborate a bit on when the exception can happen?
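
For reference, one way a UnicodeDecodeError can arise in general, sketched
with plain bytes.decode in Python 3 rather than with pywatchman (whose exact
behaviour I am not asserting here): the input bytes are not valid in the
codec used to decode them. The file name below is hypothetical.

```python
# Latin-1 encoded bytes; 0xE9 followed by "." is not a valid UTF-8 sequence.
raw = b"caf\xe9.txt"

try:
    raw.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    # Reached when the bytes are not well-formed in the chosen codec.
    failed = True
```

Whether Watchman can ever hand back such bytes on a given platform is the
open question.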

> +
> +    return decoded.encode(encoding.encoding, 'replace')
> +
>   def overridewalk(orig, self, match, subrepos, unknown, ignored, full=True):
>       '''Replacement for dirstate.walk, hooking into Watchman.
>   
> @@ -302,7 +318,7 @@
>       # Watchman tracks files.  We use this property to reconcile deletes
>       # for name case changes.
>       for entry in result['files']:
> -        fname = entry['name']
> +        fname = _watchmanencodingtolocal(entry['name'])

Adding a non-trivial function call here is likely going to regress 
performance. Have you measured the impact?
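
A rough way to get a number, as a sketch only: it uses synthetic ASCII
paths, Python 3, and a hypothetical to_local helper standing in for the
patch's function (it does not call pywatchman), so real results on a large
working copy may differ.

```python
import timeit

def to_local(path, local_encoding="cp1252"):
    # Hypothetical stand-in for the patch's helper: decode UTF-8,
    # then re-encode into an assumed local encoding.
    return path.decode("utf-8").encode(local_encoding, "replace")

# Synthetic file list, loosely imitating the entries Watchman returns.
names = [("dir%d/file%d.txt" % (i % 100, i)).encode("utf-8")
         for i in range(10000)]

def baseline():
    # Loop without any per-entry conversion.
    return [n for n in names]

def converted():
    # Same loop with the extra function call per entry.
    return [to_local(n) for n in names]

# Take the best of a few repeats to reduce noise.
t_base = min(timeit.repeat(baseline, number=5, repeat=3))
t_conv = min(timeit.repeat(converted, number=5, repeat=3))
print("baseline: %.4fs, with conversion: %.4fs" % (t_base, t_conv))
```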

- Siddharth

>           if switch_slashes:
>               fname = fname.replace('\\', '/')
>           if normalize:
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel