[PATCH 1 of 8] use UTF-8 to encode/decode log text

Mon Nov 20 12:52:51 CST 2006

On Tuesday, 21 November 2006 at 00:48, Andrey wrote:
> On 21 November 2006 (Tue) 00:14, Alexis S. L. Carvalho wrote:
> > Thus spake Andrey:
> > > @@ -60,6 +62,7 @@ class changelog(revlog):
> > >          """
> > >          if not text:
> > >              return (nullid, "", (0, 0), [], "", {})
> > > +        text = unicode(text, CHANGELOG_ENCODING)
> >
> > Should we encode/decode the whole changelog text or just the user and
> > comment sections?
> >
> > I'm not sure about the extra section (branch name should be UTF-8, but
> > I don't know if binary data is forbidden), but, at least for now, I
> > think we don't want to encode/decode the list of files.
> 
> I see. Seems like only comment should be encoded for now, and maybe extra.

I doubt extra should be encoded either. It's only accessed via code, and
it supports binary data, so it's probably better to let the users of
particular fields there decide for themselves whether to encode them.
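
To make the per-field idea concrete, here is a rough sketch, not the
actual patch. It assumes a simplified entry layout (manifest, user,
date, file list, blank line, description) and leaves extra out
entirely. parse_entry is a hypothetical helper, and the value of
CHANGELOG_ENCODING is an assumption; only the name comes from the patch.

```python
# Sketch only: decode the user and description, keep everything else as bytes.
CHANGELOG_ENCODING = "utf-8"  # name from the patch; value assumed here


def parse_entry(text):
    """Parse a simplified changelog entry given as bytes.

    Assumed layout (illustrative): manifest, user and date on one line
    each, then one file per line, a blank line, and the description.
    """
    header, _, desc = text.partition(b"\n\n")
    lines = header.split(b"\n")
    manifest = lines[0]                          # left as bytes
    user = lines[1].decode(CHANGELOG_ENCODING)   # decoded
    date = lines[2]                              # left as bytes
    files = lines[3:]                            # file names are not decoded
    return manifest, user, date, files, desc.decode(CHANGELOG_ENCODING)


entry = b"abc123\nAndr\xc3\xa9\n0 0\nfile\xff.txt\n\nfix caf\xc3\xa9"
manifest, user, date, files, desc = parse_entry(entry)
```

The file name in this entry contains a byte (0xff) that is not valid
UTF-8, so decoding the whole text at once would raise an error, while
decoding field by field leaves it untouched.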