[issue2162] BOM (byte order mark) support for Mercurial.ini
Alexander Belchenko
bialix at ukr.net
Wed Apr 28 02:29:15 CDT 2010
Yuya Nishihara wrote:
> Mads Kiilerich wrote:
>> Alexander Belchenko wrote, On 04/27/2010 04:15 PM:
>>> Yuya Nishihara wrote:
>>>> New submission from Yuya Nishihara <yuya at tcha.org>:
>>>>
>>>> Some text editors, like Notepad.exe, insert BOM (byte order mark)
>>>> silently if you save Mercurial.ini as UTF-8.
>>>>
>>>> IMHO, they shouldn't insert BOM for UTF-8, but it's really hard to
>>>> debug because BOM isn't visible. So it seems reasonable to
>>>> skip/recognize BOM before reading Mercurial.ini.
>>> I was under the impression that UTF-8 may have an optional BOM, and
>>> Python even has this constant defined:
>>>
>>> In [1]: import codecs
>>>
>>> In [2]: codecs.BOM
>>> codecs.BOM codecs.BOM_BE codecs.BOM_UTF32
>>> codecs.BOM32_BE codecs.BOM_LE codecs.BOM_UTF32_BE
>>> codecs.BOM32_LE codecs.BOM_UTF16 codecs.BOM_UTF32_LE
>>> codecs.BOM64_BE codecs.BOM_UTF16_BE codecs.BOM_UTF8
>>> codecs.BOM64_LE codecs.BOM_UTF16_LE
>>>
>>> In [2]: codecs.BOM_UTF8
>>> Out[2]: '\xef\xbb\xbf'
>>>
>>> So, why do you say it "shouldn't"?
>> Because it is optional, has no benefit, and "never" is used?
>
> I heard it can be used for detection of character encoding,
> but it seems silly to lose ascii compatibility just for such reason.
> UTF-8 does exist for ascii transparency.
I don't understand what "ascii transparency" means here. When somebody
talks about "ascii" seriously, to me it sounds the same as pretending we
live on a flat world that stands on the back of a big turtle.
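
For illustration, skipping the BOM before parsing could look roughly like
the sketch below. This is not Mercurial's actual config reader: the helper
name strip_utf8_bom and the sample file contents are hypothetical, and only
codecs.BOM_UTF8 comes from the discussion above.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present.

    Hypothetical helper for illustration; not part of Mercurial.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical Mercurial.ini contents, with and without the BOM that
# some editors prepend when saving as UTF-8.
without_bom = b"[ui]\nusername = someone\n"
with_bom = codecs.BOM_UTF8 + without_bom

# The BOM is removed when present, and the data is untouched otherwise.
assert strip_utf8_bom(with_bom) == without_bom
assert strip_utf8_bom(without_bom) == without_bom

# Pure ASCII text is byte-for-byte identical in UTF-8 as long as no BOM
# is added, which is the compatibility the BOM breaks.
assert "[ui]".encode("utf-8") == "[ui]".encode("ascii")
```

The check is done on raw bytes, so it does not depend on how the rest of
the file is decoded later.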