D8022: chg: pass copies of some envvars so we can detect py37+ modifications

spectral (Kyle Lippincott) phabricator at mercurial-scm.org
Wed Jan 29 14:52:19 EST 2020


spectral added a comment.


  In D8022#118428 <https://phab.mercurial-scm.org/D8022#118428>, @yuja wrote:
  
  >>   This would cause a difference in behavior between hg and chg. I don't know how big of an issue that would be.
  >>   hg: starts up, python coerces LC_CTYPE, hg spawns a non-python subprocess, LC_CTYPE is set to the coerced value
  >>   chg: starts up, python coerces LC_CTYPE, chg fixes it, hg spawns a non-python subprocess, LC_CTYPE is set to the original value (or unset).
  >
  > I think a minor behavior difference is acceptable. I see this is the problem
  > of Python 3 design "the world is unicode", and IMHO we want to work around
  > the problem with minimal changes.
  
  Getting this implemented "correctly" using daemon_postexec is getting relatively complicated: there's no way to do this in a backwards- and forwards-compatible manner, since there's no capabilities mechanism for daemon_postexec. If I blindly do it, it fails when connecting to an older hg, which says it doesn't know how to handle that command.
  
  Much of the work in this stack is there to handle the setenv command (sent after the chg daemon is started). If we don't care about overwriting the python-modified LC_CTYPE, we can do this quite a bit more easily:
  
  In execcmdserver, we copy LC_CTYPE to CHGORIG_LC_CTYPE and export that when starting the command server.
  In chgserver.py, during the initial startup process, we check whether CHGORIG_LC_CTYPE is in the environment and, if so, overwrite LC_CTYPE with its value.
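
  A minimal sketch of the chgserver.py half of that idea, not the eventual patch. The helper name `restore_lc_ctype` is hypothetical, it works on a plain dict rather than on Mercurial's real environment handling, and dropping CHGORIG_LC_CTYPE after use is an assumption. It also does not cover the case where LC_CTYPE was originally unset.

```python
def restore_lc_ctype(environ):
    """Return a copy of environ with LC_CTYPE reset from CHGORIG_LC_CTYPE.

    Hypothetical helper: chg is assumed to have exported the pre-coercion
    value of LC_CTYPE as CHGORIG_LC_CTYPE before starting the server.
    """
    env = dict(environ)
    # Assumption: the helper variable is removed once it has been consumed,
    # so that subprocesses spawned later do not inherit it.
    original = env.pop("CHGORIG_LC_CTYPE", None)
    if original is not None:
        # Overwrite whatever value Python's locale coercion put in place.
        env["LC_CTYPE"] = original
    return env


if __name__ == "__main__":
    coerced = {"LC_CTYPE": "C.UTF-8", "CHGORIG_LC_CTYPE": "C"}
    print(restore_lc_ctype(coerced))
```

  If the server was not started by a chg that exports CHGORIG_LC_CTYPE, the environment is returned unchanged, so an older chg talking to a newer hg would keep today's behavior.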
  
  I'll see how easy that is to implement and send a separate review request once I have it working.
  
  > `PYTHONCOERCECLOCALE=0` was rejected because it may break Python subprocesses
  > that expect the "coercing" to occur, but restoring the cheated `LC_CTYPE`
  > should be fine. The chg behavior will be more correct, and non-Python
  > subprocesses should work in the original "C" environment.
  >
  >>   Another case where it matters is: LC_CTYPE *is* important for hg when using any curses programs (or really anything that uses the locale module). If you are on a mac and set your region settings to have a region and a primary language that aren't representable using the locale settings (such as region = Brazil, language = English), then LC_CTYPE starts off as "UTF-8" (not "C.UTF-8"). This is allowed on macOS and I think other BSDs, but Linux doesn't like it. If the LC_CTYPE variable is forwarded when sshing to a Linux machine, this breaks curses. I'm sure this wasn't intentional, but the way that Python 3.7 coerces the locale, it also happens to fix this particular issue and set it to C.UTF-8, which *does* work on Linux and makes `hg commit --interactive` with ui.interface=curses work; otherwise it gets a traceback.
  >
  > Does curses depend on the locale environment variable, not on the
  > `setlocale()`-ed process state?
  
  chunkselector (crecord.py:578) calls `locale.setlocale(locale.LC_ALL, '')`, which calls `_setlocale(category, locale)`, which raises `locale.Error: unsupported locale setting`. :(
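
  A rough way to check this outside of hg, as a sketch only. The helper name `setlocale_succeeds` is hypothetical. It runs the same `locale.setlocale(locale.LC_ALL, '')` call in a child interpreter with a chosen LC_CTYPE, and sets `PYTHONCOERCECLOCALE=0` in the child so that the coercion in Python 3.7+ cannot mask the failure. The result depends on which locales the platform accepts.

```python
import os
import subprocess
import sys


def setlocale_succeeds(lc_ctype):
    """Report whether setlocale(LC_ALL, '') works with the given LC_CTYPE."""
    # Drop LANG and every LC_* variable so only our LC_CTYPE is in effect.
    env = {
        key: value
        for key, value in os.environ.items()
        if key != "LANG" and not key.startswith("LC_")
    }
    env["LC_CTYPE"] = lc_ctype
    # Disable Python's C locale coercion in the child interpreter.
    env["PYTHONCOERCECLOCALE"] = "0"
    child = "import locale; locale.setlocale(locale.LC_ALL, '')"
    result = subprocess.run(
        [sys.executable, "-c", child],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A locale.Error in the child shows up as a non-zero exit status.
    return result.returncode == 0


if __name__ == "__main__":
    for value in ("C", "UTF-8"):
        print(value, setlocale_succeeds(value))
```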
  
  > Anyway, I think this is an environment issue and it's totally fine for
  > chg to crash (or abort) if the environment variable is incorrectly set.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D8022/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D8022

To: spectral, #hg-reviewers
Cc: yuja, mercurial-devel

