[PATCH 1 of 8] use UTF-8 to encode/decode log text

Brendan Cully brendan at kublai.com
Mon Nov 20 12:52:51 CST 2006


On Tuesday, 21 November 2006 at 00:48, Andrey wrote:
> On 21 November 2006 (Tue) 00:14, Alexis S. L. Carvalho wrote:
> > Thus spake Andrey:
> > > @@ -60,6 +62,7 @@ class changelog(revlog):
> > >          """
> > >          if not text:
> > >              return (nullid, "", (0, 0), [], "", {})
> > > +        text = unicode(text, CHANGELOG_ENCODING)
> >
> > Should we encode/decode the whole changelog text or just the user and
> > comment sections?
> >
> > I'm not sure about the extra section (branch name should be UTF-8, but
> > I don't know if binary data is forbidden), but, at least for now, I
> > think we don't want to encode/decode the list of files.
> 
> I see. Seems like only comment should be encoded for now, and maybe extra.

I doubt extra should be encoded either: it's only accessed via code, and
it supports binary data, so it's probably better to let the users of
particular fields there decide for themselves whether to encode
them.
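
A minimal sketch of the per-field approach, to make the distinction
concrete. It is written for Python 3 bytes/str rather than the Python 2
unicode() call in the patch, and it assumes a simplified entry layout
(manifest, user, "time tz [extra]", file list, blank line, description).
The function name parse_entry and the value of CHANGELOG_ENCODING are
assumptions for illustration, not the actual changelog code:

```python
# Assumed value; the patch only shows the constant's name.
CHANGELOG_ENCODING = "utf-8"


def parse_entry(text):
    """Parse a simplified changelog entry given as bytes.

    Only the user and description are decoded to text. The file list
    and the raw extra field are returned as bytes, untouched.
    """
    # The header and the description are separated by the first blank line.
    header, _, desc = text.partition(b"\n\n")
    lines = header.split(b"\n")
    manifest, user, dateline = lines[0], lines[1], lines[2]
    files = lines[3:]

    # The date line is "time tz" optionally followed by the extra field.
    parts = dateline.split(b" ", 2)
    extra = parts[2] if len(parts) > 2 else b""

    return {
        "manifest": manifest,
        "user": user.decode(CHANGELOG_ENCODING),
        "date": (int(parts[0]), int(parts[1])),
        "extra": extra,   # raw bytes: callers decide how to interpret it
        "files": files,   # raw bytes: filenames are not re-encoded
        "desc": desc.decode(CHANGELOG_ENCODING),
    }
```

With this shape, a filename or extra value that is not valid UTF-8 passes
through unchanged, while user and description come back as text.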

