[PATCH 4 of 8] encode all output in stdio encoding

Matt Mackall mpm at selenic.com
Mon Nov 20 11:28:55 CST 2006


On Mon, Nov 20, 2006 at 05:29:54PM +0700, Andrey wrote:
> # HG changeset patch
> # User Andrey <grooz-work at gorodok.net>
> # Date 1163471414 -21600
> # Node ID ad1e778d48b8e6ae6a75dd5b1cebf854f9fbc0e4
> # Parent  2e2cc18a4e6dd837305f247ff5e2046084e77960
> encode all output in stdio encoding

This is wrong. What should happen if I run:

 hg cat random-encoding.txt

Answer: Exactly the same thing as when I run:

 cat random-encoding.txt

It should not attempt any conversion. And it absolutely should not
throw an exception because it fails to coerce the data into some
encoding. It should pass the data through exactly as it was given. The
same rule applies to about a dozen other commands.
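
To make the pass-through rule concrete, here is a minimal sketch in
Python. The name write_raw is hypothetical and is not taken from the
patch or from Mercurial; the point is only that file contents are
written as bytes and never decoded or re-encoded on the way out.

```python
import io


def write_raw(out, data: bytes) -> None:
    """Write file contents to a binary stream exactly as given.

    Hypothetical helper: there is no decode/encode step, so bytes that
    are invalid in the terminal's encoding cannot raise an exception.
    """
    out.write(data)


# Bytes that are not valid UTF-8 or ASCII, standing in for the contents
# of a file in an unknown encoding.
payload = b"caf\xe9 \xff\xfe\x00 mixed"

sink = io.BytesIO()  # stands in for a binary stdout
write_raw(sink, payload)
```

In a real program the stream would be something like sys.stdout.buffer
rather than a BytesIO, which is used here only so the result can be
inspected.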

There are two things we know the encoding of:

- message strings (ASCII or whatever the translation tells us)
- changelog data (unlike file contents, we 'own' it)

Now there are two ways we can go:

a) add a second 'encoded' write function and audit all writes
b) convert the two things above to the local encoding at their point
   of origin

Path (a) is likely to be a disaster, both initially and for
maintenance.

Path (b) is a lot less work and less error-prone. We convert changelog
text in two spots, and localize message strings in our implementation
of _.

-- 
Mathematics is the supreme nostalgia of our time.

