Storage format for remotenames.

Yuya Nishihara yuya at tcha.org
Wed Nov 8 09:00:03 EST 2017


On Tue, 7 Nov 2017 09:58:04 -0800, Durham Goode wrote:
> I wish we had some easily reusable serializer/deserializer instead of 
> having to reinvent these every time.  What's our reasoning for not using 
> json? I forget. If there are some weird characters, like control 
> characters or something, that break json, I'd say we just use json and 
> prevent users from creating bookmarks and paths with those names.

Just about json. Using json (in Mercurial) is bad because it's so easy
to leak unicode objects without noticing: all tests pass (because all the
test data is ascii), and then we get a nice UnicodeError in production.
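
A minimal sketch of that failure mode, written for Python 3 as an
illustration only. The helper serialize_names and the sample names are
hypothetical and not taken from Mercurial. The point above is presumably
about Python 2, where json hands back unicode objects that get mixed with
byte strings implicitly; here the decode step is made explicit so the same
class of error is visible.

```python
import json


def serialize_names(names, encoding="ascii"):
    """Hypothetical helper: JSON text cannot hold raw bytes, so every
    byte-string name has to be decoded to unicode before dumping."""
    return json.dumps([name.decode(encoding) for name in names])


# ASCII-only names, like the data in a typical test suite: this works.
ascii_result = serialize_names([b"default", b"stable"])

# A name stored as bytes in a legacy encoding (Shift_JIS here) only fails
# once such data actually shows up, i.e. in production.
legacy_name = "日本語".encode("shift_jis")
try:
    serialize_names([legacy_name])
    failed = False
except UnicodeDecodeError:
    failed = True
```

Picking a different codec for the decode step removes the exception but
not the underlying question of which encoding the stored bytes are in.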

Another concern is that encoding conversion can be lossy even if it
completes without error. There are n:m mappings between unicode and legacy
encodings. For example, we have three major Shift_JIS variants in Japan, and
the Microsoft one allocates multiple code points to a single character for
"compatibility" reasons.
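
A rough way to observe this, assuming Python 3 and its built-in "cp932"
codec (the Microsoft Shift_JIS variant). The function name
find_lossy_round_trips and the byte ranges scanned are choices made for
this sketch. It decodes every candidate two-byte sequence and records the
ones that do not come back as the same bytes after re-encoding, although
neither step raises an error.

```python
def find_lossy_round_trips(encoding="cp932"):
    """Return (original_bytes, text, re_encoded_bytes) for every two-byte
    sequence whose decode/encode round trip changes the bytes."""
    lossy = []
    # Lead-byte ranges used by Shift_JIS style double-byte encodings.
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    for lead in lead_bytes:
        for trail in range(0x40, 0xFD):
            raw = bytes([lead, trail])
            try:
                text = raw.decode(encoding)
                back = text.encode(encoding)
            except UnicodeError:
                # Unassigned or invalid sequence: not the case of interest.
                continue
            if back != raw:
                lossy.append((raw, text, back))
    return lossy


lossy = find_lossy_round_trips()
for raw, text, back in lossy[:5]:
    print(raw.hex(), "->", "U+%04X" % ord(text[0]), "->", back.hex())
```

Each reported entry is a stored name that would silently change its bytes
if it passed through unicode and back.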

