UTF-16 in Mercurial
Benoît Allard
benoit at aeteurope.nl
Tue Mar 2 08:04:26 CST 2010
Hi there,
I've been experimenting on Windows with some UTF-16 (so-called UNICODE
under Windows) config files (registry exports, to be precise) and the
attached, very small extension, which tries to make UTF-16 (or UTF-32)
files be treated as text rather than binary.
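The attachment was scrubbed from the archive, so the following is not the
actual BOM.py. It is only a minimal sketch of the idea, assuming that the
usual "binary" heuristic is "the data contains a NUL byte" and that the
override is keyed on a leading byte order mark. The names bom_encoding and
looks_binary are hypothetical and are not Mercurial APIs.

```python
import codecs

# Check UTF-32 before UTF-16: the UTF-32-LE BOM (FF FE 00 00) begins with
# the UTF-16-LE BOM (FF FE), so the longer marks must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def bom_encoding(data):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def looks_binary(data):
    """Hypothetical binary test: NUL bytes mean binary, unless a BOM is present."""
    if bom_encoding(data) is not None:
        return False
    return b"\0" in data
```

Note that this only changes the classification of the file. The bytes that
end up in the diff are still UTF-16 or UTF-32.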
It has the drawback of generating inconsistent patches: the body of
the patch is in the encoding of the file, while the metadata (@@, +++,
...) is in ANSI.
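The mix of encodings can be illustrated with a hand-assembled patch. This is
not the output of hg diff, only a sketch under assumptions: the file name
settings.reg and the line content are made up, and the file is taken to be
UTF-16-LE.

```python
# Hypothetical one-line file contents, before and after a change.
old = "key=1\n".encode("utf-16-le")
new = "key=2\n".encode("utf-16-le")

# Metadata is plain single-byte text; the body carries the file's own bytes.
header = b"--- a/settings.reg\n+++ b/settings.reg\n@@ -1 +1 @@\n"
body = b"-" + old + b"+" + new
patch = header + body

# A byte-oriented tool splits on the single byte 0x0A. In UTF-16-LE a newline
# is the two bytes 0A 00, so the trailing 00 spills onto the next line.
lines = patch.split(b"\n")

header_has_nul = b"\0" in header   # False: metadata contains no NUL bytes
body_has_nul = b"\0" in body       # True: every ASCII character is followed by 00
plus_line = lines[4]               # starts with b"\x00+", not with b"+"
```

In this sketch the "+" marker is no longer the first byte of its line once
the patch is split on newlines, which is one plausible way for a
byte-oriented reader to misinterpret such a patch.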
In short, it's a dead end. Let me explain:
My first tests (at home) on my Mac were quite promising: the patch looked
good, and GNU patch was happy with it.
On my Windows workstation (WinXP), hg diff throws garbage at the terminal.
Whether you use Cygwin or the genuine cmd.exe, the terminal shows the
first lines of the patch and then unexpectedly stops displaying at some
point, handing control back to bash (or whatever shell Windows is
running).
I have not been able to test GNU patch on Windows, as it is not installed
on my system. hg import applied the diff without complaining, but it
performed a completely different operation from the one the diff
described (a different part of the file was modified).
As for TortoiseHg, it seems to display the diff line by line or chunk by
chunk, depending on the view you are using, but it consistently stops at
the first <NUL> byte, whether that is the first in the line or the first
in the chunk. It therefore displays no useful information.
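The truncation is consistent with the text being handed to an API that
treats NUL as a string terminator, although that cause is an assumption
and has not been confirmed in TortoiseHg itself. A small sketch of the
effect, where c_string_view is a hypothetical helper that mimics such an
API:

```python
def c_string_view(data):
    """Return the bytes a NUL-terminated-string consumer would see."""
    return data.split(b"\0", 1)[0]


# A diff line: an ASCII "+" marker followed by UTF-16 content.
little_endian = b"+" + "key".encode("utf-16-le")   # b"+k\x00e\x00y\x00"
big_endian = b"+" + "key".encode("utf-16-be")      # b"+\x00k\x00e\x00y"

seen_le = c_string_view(little_endian)   # b"+k"
seen_be = c_string_view(big_endian)      # b"+"
```

For mostly-ASCII content, at most one character of each line or chunk
survives, which matches the observation that nothing useful is displayed.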
I guess we need (at least I do) a way to handle UTF-16 files in our
diffs (export, terminal, thg, ...). So at this point, I'm asking whether
anyone has an idea of how we could proceed.
As a first step, I could turn to the thg people, but I think this is
something the whole Mercurial community could benefit from.
Regards,
Benoît
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BOM.py
Type: text/x-python
Size: 426 bytes
Desc: not available
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20100302/e32e38d6/attachment.py>