[PATCH 2 of 2] match: implement __repr__() and update users (API)

Yuya Nishihara yuya at tcha.org
Wed May 24 09:38:06 EDT 2017


On Tue, 23 May 2017 16:12:27 -0700, Martin von Zweigbergk wrote:
> On Tue, May 23, 2017 at 5:36 AM, Yuya Nishihara <yuya at tcha.org> wrote:
> > On Mon, 22 May 2017 11:22:33 -0700, Martin von Zweigbergk via Mercurial-devel wrote:
> >> # HG changeset patch
> >> # User Martin von Zweigbergk <martinvonz at google.com>
> >> # Date 1495476498 25200
> >> #      Mon May 22 11:08:18 2017 -0700
> >> # Node ID fa82a6f7adb3deef43dacf5059e906eed9a1beba
> >> # Parent  bdc4861ffe597d6dc0c19b57dcb98edaf5aaa89f
> >> match: implement __repr__() and update users (API)
> >>
> >> fsmonitor and debugignore currently access matcher fields that I would
> >> consider implementation details, namely patternspat, includepat, and
> >> excludepat. Let's instead implement __repr__() and have the few users
> >> use that instead.
> >>
> >> Marked (API) because the fields can now be None.
> >>
> >> diff --git a/hgext/fsmonitor/__init__.py b/hgext/fsmonitor/__init__.py
> >> --- a/hgext/fsmonitor/__init__.py
> >> +++ b/hgext/fsmonitor/__init__.py
> >> @@ -148,19 +148,7 @@
> >>
> >>      """
> >>      sha1 = hashlib.sha1()
> >> -    if util.safehasattr(ignore, 'includepat'):
> >> -        sha1.update(ignore.includepat)
> >> -    sha1.update('\0\0')
> >> -    if util.safehasattr(ignore, 'excludepat'):
> >> -        sha1.update(ignore.excludepat)
> >> -    sha1.update('\0\0')
> >> -    if util.safehasattr(ignore, 'patternspat'):
> >> -        sha1.update(ignore.patternspat)
> >> -    sha1.update('\0\0')
> >> -    if util.safehasattr(ignore, '_files'):
> >> -        for f in ignore._files:
> >> -            sha1.update(f)
> >> -    sha1.update('\0')
> >> +    sha1.update(repr(ignore))
> >>      return sha1.hexdigest()
> >
> > This will cause problems on Python 3 where repr() must return a unicode string
> > but sha1 expects bytes.
> 
> Good point. Since the patterns (regexes) are bytes (I think), it seems
> like we'd want the representation to be bytes as well. IIUC, __bytes__
> was introduced in py3, so we can't use that. Should we add a custom
> bytes() (or bytesrepr()? or ...) or what do we do?

repr() seems fine for debugging output, though on Python 3 we'll need to
convert it back to bytes (e.g. with sysbytes()) and glob out the b'' prefixes
in tests. I'm not a fan of using repr() as a cache key, but that's probably
okay; we can encode it with sysbytes().
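
A minimal sketch of the encoding step, for illustration only. FakeMatcher and
to_bytes() are hypothetical stand-ins (not Mercurial's matcher class or its
sysbytes() helper), and UTF-8 is an assumed encoding:

```python
import hashlib


class FakeMatcher(object):
    """Hypothetical stand-in for a matcher; not Mercurial's class."""

    def __init__(self, patterns):
        self._patterns = patterns

    def __repr__(self):
        # On Python 3 this returns a unicode str, never bytes.
        return '<FakeMatcher patterns=%r>' % (self._patterns,)


def to_bytes(s):
    # Assumed stand-in for a sysbytes()-style helper: pass bytes through,
    # encode unicode strings as UTF-8.
    if isinstance(s, bytes):
        return s
    return s.encode('utf-8')


def ignore_hash(ignore):
    # sha1.update() only accepts bytes on Python 3, so the repr() result
    # is encoded before it is hashed.
    sha1 = hashlib.sha1()
    sha1.update(to_bytes(repr(ignore)))
    return sha1.hexdigest()
```

The digest is only as stable as the repr() format, so any change to __repr__()
also changes the resulting key.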

> Okay if I fix this in a followup? Augie said he'd slightly prefer that
> because my patch is already pretty deep down in the stack. I also have
> a long series built on top that will also use the __repr__ format in
> tests.

Yeah, this isn't a blocker anyway.

