D7225: import-checker: open all source files as utf-8
indygreg (Gregory Szorc)
phabricator at mercurial-scm.org
Tue Nov 5 04:58:06 UTC 2019
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.
REVISION SUMMARY
Before, we opened in text mode and used the default encoding
to interpret the bytes within.
This caused problems interpreting some byte sequences in some
files.
This commit changes things to always open files as UTF-8, which
makes the error go away.
test-check-module-imports.t now passes on Python 3.5 and 3.6
with this change.
REPOSITORY
rHG Mercurial
BRANCH
stable
REVISION DETAIL
https://phab.mercurial-scm.org/D7225
AFFECTED FILES
contrib/import-checker.py
CHANGE DETAILS
diff --git a/contrib/import-checker.py b/contrib/import-checker.py
--- a/contrib/import-checker.py
+++ b/contrib/import-checker.py
@@ -4,6 +4,7 @@
import ast
import collections
+import io
import os
import sys
@@ -754,7 +755,11 @@
yield src.read(), modname, f, 0
py = True
if py or f.endswith('.t'):
- with open(f, 'r') as src:
+ # Strictly speaking we should sniff for the magic header that denotes
+ # Python source file encoding. But in reality we don't use anything
+ # other than ASCII (mainly) and UTF-8 (in a few exceptions), so
+ # simplicity is fine.
+ with io.open(f, 'r', encoding='utf-8') as src:
for script, modname, t, line in embedded(f, modname, src):
yield script, modname.encode('utf8'), t, line
To: indygreg, #hg-reviewers
Cc: mercurial-devel
More information about the Mercurial-devel
mailing list