[PATCH 0 of 5] Patches and new win32mbcs extension
Shun-ichi Goto
shunichi.goto at gmail.com
Wed Jan 9 13:44:01 UTC 2008
These patches are a request to change Mercurial core code so that it can
cooperate with the new win32mbcs extension, which handles MBCS filenames
correctly on Windows.
A description of the MBCS problem itself is omitted here.
See the description in my previous patchbomb mail if needed:
Subject: [PATCH 0 of 5] Fix to handle MBCS filename correctly
Message-Id: <patchbomb.1199622371 at yomi>
Date: Sun, 06 Jan 2008 21:26:11 +0900
There are 4 patches and one new extension:
# The extension code is posted for review.
# I'll put it on a wiki page later.
1) (first 4 patches)
Replace code that uses os.sep directly with calls to existing or new functions.
These changes are intended to let the win32mbcs extension hook those functions.
For example:
s.replace('\\', '/') => util.normpath(s) ... use existing function
s.split(os.sep) => util.splitpath(s) ... use new function
s.endswith(os.sep) => util.endswithsep(s) ... use new function
do not use rfindall(os.sep) ... change code
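As a rough illustration of the two new helpers named above, here is a minimal sketch. The function bodies are my assumption (this mail does not show them); the point is only that callers go through one function instead of touching os.sep directly, so an extension has a single place to hook.

```python
import os


def splitpath(path):
    """Split path on os.sep (assumed body, a stand-in for util.splitpath)."""
    return path.split(os.sep)


def endswithsep(path):
    """True if path ends with a separator (stand-in for util.endswithsep).

    Checking os.altsep as well is an assumption of this sketch.
    """
    return path.endswith(os.sep) or bool(os.altsep and path.endswith(os.altsep))
```

Because every caller uses these functions, an extension can replace them with MBCS-aware versions without patching each call site.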
These are almost the same as the previous patches I sent, except:
* changed the commit descriptions and function docstrings as
suggested by Matt.
* fixed a bug in the patch against util.path_auditor().
2) (last patch is the new extension, for review)
Introduce a new extension called 'win32mbcs.py'.
This extension wraps some Python built-in functions (os.path.*, etc.)
and Mercurial functions (util.xxx) so that they handle raw encoded MBCS
strings. Enabling this extension installs and activates the wrappers.
This is useful for:
* Japanese Windows users using the shift_jis encoding.
* (maybe) Chinese Windows users using the big5 encoding.
It is of no use to Unix users.
This extension assumes that path strings are encoded in
util._encoding, the local file system encoding. It first checks that each
passed argument is validly encoded in util._encoding, then calls the
original function with the arguments converted to unicode and re-encodes
the return value. If a string is in some other encoding, it warns the
user and calls the original function without any conversion. If a string
is already unicode, it simply calls the original.
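The wrapping behavior described above can be sketched roughly as follows. This is not the extension's code: it is written for Python 3 bytes/str, ENCODING stands in for util._encoding, and wrap, _to_unicode and _to_local are hypothetical names. ntpath is used so the example behaves the same on any platform.

```python
import ntpath
import warnings

ENCODING = "shift_jis"  # assumption: stand-in for util._encoding


def _to_unicode(arg):
    # Strict decode: raises UnicodeDecodeError if arg is not valid ENCODING.
    return arg.decode(ENCODING) if isinstance(arg, bytes) else arg


def _to_local(value):
    # Re-encode str results, including those nested in tuples or lists.
    if isinstance(value, str):
        return value.encode(ENCODING)
    if isinstance(value, (tuple, list)):
        return type(value)(_to_local(v) for v in value)
    return value


def wrap(func):
    def wrapper(*args):
        if not any(isinstance(a, bytes) for a in args):
            return func(*args)  # already unicode: call the original as-is
        try:
            uargs = [_to_unicode(a) for a in args]
        except UnicodeDecodeError:
            # Other encoding: warn and call the original without conversion.
            warnings.warn("argument is not valid %s; not converted" % ENCODING)
            return func(*args)
        return _to_local(func(*uargs))
    return wrapper


# U+8868 encodes to b"\x95\x5c" in shift_jis; its second byte is a backslash.
name = "\u8868.txt".encode(ENCODING)
raw = b"dir\\" + name

basename = wrap(ntpath.basename)
```

Here ntpath.basename(raw) splits in the middle of the two-byte character, while the wrapped basename(raw) returns name intact.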
With this extension, some important functions (os.path.*, util.*)
are replaced by wrappers with their own behavior, but I believe this is
safe and that they still behave as usual.
Opinions are welcome.