[PATCH 02 of 10] localrepo: bytes for errors

Matt Mackall mpm at selenic.com
Fri May 13 16:37:29 EDT 2016


On Fri, 2016-05-13 at 09:30 -0700, Gregory Szorc wrote:
> On Fri, May 13, 2016 at 7:18 AM, Jun Wu <quark at fb.com> wrote:
> 
> > 
> > On 05/13/2016 04:48 AM, Gregory Szorc wrote:
> > 
> > > 
> > > In theory, we might be able to implement a custom module loader on
> > > Python 3 that does source/ast translation when loading .py files. But
> > > this scares me for several reasons.
> > > 
> > I realized the "module loader" may include the "# coding: " hack.
> > Could you explain the reasons?
> > 
> I didn't realize the "# coding" hack is an option. That's very attractive!
> 
> I suggested the module loading hack because we already have a custom module
> loader handling mercurial.* modules. We could likely extend it rather
> easily to do rewriting. But I think I like the "# coding" idea better.
> Feels simpler.

Requirements for whatever solution we pick:

- minimal impact on 2.x performance and maintainability
- protection against rot / active enforcement by check-code
- cannot require devs to do their own unicode/bytes type analysis
- finite amount of churn[1]

Non-requirements:

- elegance
- good 3.x performance (a lost battle before we hit the first line of code)

Another possibility is we subvert unicode objects in Py3:

- force Py3's idea of the system encoding to be Latin1
- because Latin1 has 256 valid codepoints, it's 1:1 with byte strings
- thus unicode objects effectively become fat, slow byte strings
- our ASCII string literals don't need any adjustment
- we continue to do the "real" charset work in tolocal/fromlocal
- vast bulk of code doesn't need touching
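
To illustrate the 1:1 claim, here is a minimal Python 3 sketch. It is not
Mercurial code: the helper names bytes_to_fat_str and fat_str_to_bytes are
made up for this example, and it only shows the Latin1 round trip, not how
the system encoding would actually be forced.

```python
# Sketch only: hypothetical helpers, not part of Mercurial.
# Latin-1 maps byte value N to code point U+00NN for N in 0..255,
# so decoding never fails and encoding back returns the original bytes.

def bytes_to_fat_str(data):
    """Decode arbitrary bytes into a str with one code point per byte."""
    return data.decode('latin-1')

def fat_str_to_bytes(text):
    """Encode a str holding only code points 0..255 back into bytes."""
    return text.encode('latin-1')

# Every possible byte value survives the round trip unchanged.
ALL_BYTES = bytes(range(256))
FAT = bytes_to_fat_str(ALL_BYTES)
assert len(FAT) == 256
assert [ord(c) for c in FAT] == list(range(256))
assert fat_str_to_bytes(FAT) == ALL_BYTES

# Data that is not valid UTF-8 is still representable.
raw = b'\xff\xfe\x00abc'
assert fat_str_to_bytes(bytes_to_fat_str(raw)) == raw

# Plain ASCII literals compare equal without any adjustment.
assert bytes_to_fat_str(b'abc') == 'abc'
```

The round trip only holds for strings whose code points all fall in 0..255;
anything outside that range raises UnicodeEncodeError on the way back, which
is why the "real" charset work would still have to live in
tolocal/fromlocal.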

[1] The level of churn we've gone through lately just for the print and import
statement work is a bit too high and is impacting throughput of higher-priority
development.

-- 
Mathematics is the supreme nostalgia of our time.


