D5273: hgignore: faster conversion from globs to regexp

Yuya Nishihara yuya at tcha.org
Thu Nov 15 06:31:56 EST 2018


>   I couldn't find documentation on how encoding works for this (user
>   data interpreted by hg). This function appears to assume the
>   encoding of the input pattern is an extension of ascii, so I think
>   my change should be correct for that.

Correct. It expects an ASCII superset.

>  def reescape(pat):
>      """Drop-in replacement for re.escape."""
>      # NOTE: it is intentional that this works on unicodes and not
>      # bytes, as it's only possible to do the escaping with
>      # unicode.translate, not bytes.translate. Sigh.
>      wantuni = True
>      if isinstance(pat, bytes):
> +        if len(pat) == 1:
> +            # fast path for hgignore parsing, which calls this on one
> +            # char at a time
> +            return _regexescapemapb.get(pat, pat)

Doh. I think it's better to add a function that escapes exactly one character.
The caller already goes out of its way to avoid the dict lookup of globals, so
it really wants this to be fast.
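
A minimal sketch of what such a single-character helper could look like,
assuming Python 3 bytes semantics. The names `_escapetable`, `escapechar` and
`escapeliteral` are hypothetical and are not taken from the patch, and the set
of special characters is only an illustrative subset (the regexp
metacharacters, without whitespace):

```python
# Hypothetical sketch, not the code from D5273.
# Assumed set of regexp metacharacters that need a backslash.
_SPECIAL = b'.^$*+?{}[]\\|()'

# Map each special one-byte string to its escaped form, built once at import.
_escapetable = {bytes([c]): b'\\' + bytes([c]) for c in _SPECIAL}


def escapechar(ch):
    """Escape exactly one byte; anything not in the table is returned as-is."""
    return _escapetable.get(ch, ch)


def escapeliteral(pat, escape=escapechar):
    """Escape a whole bytes pattern one byte at a time.

    Binding the helper as a default argument makes it a local name, so the
    loop does not pay for a globals lookup on every character.
    """
    return b''.join(escape(pat[i:i + 1]) for i in range(len(pat)))
```

Since the table only contains ASCII keys, bytes above 0x7f pass through
unchanged, which is consistent with the input being an ASCII superset.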

