D7061: convert: don't pass bytes to, or expect bytes from, emailparser

durin42 (Augie Fackler) phabricator at mercurial-scm.org
Wed Oct 16 16:43:27 EDT 2019


durin42 added inline comments.

INLINE COMMENTS

> Kwan wrote in gnuarch.py:301
> Hmm, I wasn't aware of that drawback of `unifromlocal()` if it's true.  Is there a canonical "Give me unicode from these mercurial bytes" function?  Regardless, `BytesParser` does sound handy, and is even in 3.5 <https://docs.python.org/3.5/library/email.parser.html#email.parser.BytesParser>, but isn't present in 2.7.  Would doing it conditionally be alright?  (and a conditional alias for parsebytes)
> 
>   self.catlogparser = (
>       emailparser.BytesParser()
>       if pycompat.ispy3
>       else emailparser.Parser()
>   )
>   if not pycompat.ispy3:
>       self.catlogparser.parsebytes = self.catlogparser.parsestr
> 
>   -            catlog = self.catlogparser.parsestr(data)
>   +            catlog = self.catlogparser.parsebytes(data)

It depends on which bytes, basically. Some bytes in hg are known to be UTF-8, but any time we have file contents or filenames, we don't know the encoding.

I'm not sure of the context here, but maybe the output from tla is in some known encoding?
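
To make the two options concrete, here is a rough Python 3 sketch, not the actual convert code. `parse_catlog` and the sample data are made up for illustration, and the real tla catalog log format may differ. If the encoding of the tool's output is known, we can decode and use `Parser.parsestr()`; if it isn't, `BytesParser.parsebytes()` accepts the raw bytes directly.

```python
import email.parser

# Hypothetical sample data; real tla catalog logs may look different.
SAMPLE = (
    b"Summary: fix parser\n"
    b"Creator: Jane Doe <jane@example.com>\n"
    b"\n"
    b"body text\n"
)


def parse_catlog(data, encoding=None):
    """Parse RFC 822-style bytes into an email.message.Message.

    If the encoding is known, decode first and parse as text.
    Otherwise hand the raw bytes to BytesParser, which does not
    need to be told the encoding up front.
    """
    if encoding is not None:
        return email.parser.Parser().parsestr(data.decode(encoding))
    return email.parser.BytesParser().parsebytes(data)


unknown = parse_catlog(SAMPLE)           # encoding not known
known = parse_catlog(SAMPLE, "utf-8")    # encoding assumed to be UTF-8
```

For pure-ASCII headers like these, both paths give the same result; they only diverge once non-ASCII bytes show up, which is exactly the case where knowing the encoding matters.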

(I'm also open to the idea of dropping tla convert support as we move to Python 3, as tla has been obsolete for a _long_ time.)

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7061/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7061

To: Kwan, #hg-reviewers
Cc: durin42, martinvonz, mercurial-devel

