[PATCH 2 of 2 stable] util: fix ellipsis() not to break multi-byte sequence (issue2564)
Yuya Nishihara
yuya at tcha.org
Thu Dec 23 23:42:23 CST 2010
Matt Mackall wrote:
> On Fri, 2010-12-24 at 01:37 +0900, Yuya Nishihara wrote:
> > # HG changeset patch
> > # User Yuya Nishihara <yuya at tcha.org>
> > # Date 1293121040 -32400
> > # Branch stable
> > # Node ID 2ab92b58076868e42c632828b2487cabe2823e8e
> > # Parent 0ed736fe75b467ad9191f2ef52129992381659e5
> > util: fix ellipsis() not to break multi-byte sequence (issue2564)
> >
> > diff --git a/mercurial/util.py b/mercurial/util.py
> > --- a/mercurial/util.py
> > +++ b/mercurial/util.py
> > @@ -1202,10 +1202,13 @@ def email(author):
> >
> >  def ellipsis(text, maxlength=400):
> >      """Trim string to at most maxlength (default: 400) characters."""
> > -    if len(text) <= maxlength:
> > +    utext = encoding.fromlocal(text).decode('utf-8')
>
> This assumes 'text' is in utf-8, which is not how strings in Mercurial
> generally work:
It doesn't assume that: encoding.fromlocal() first converts 'text' from the
local encoding to a utf-8-encoded string, and only then is .decode('utf-8')
applied. IMO that's better than
    text.decode(encoding.encoding, 'replace')
because encoding.fromlocal() can do the conversion losslessly.
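
To illustrate why trimming by bytes breaks a multi-byte sequence while
trimming by characters does not, here is a minimal standalone sketch. It is
not the patch itself: it is written against Python 3 bytes, the names
naive_ellipsis, ellipsis_chars and LOCAL_ENCODING are made up for this
example, and a plain .decode(LOCAL_ENCODING) stands in for the
encoding.fromlocal(text).decode('utf-8') step.

```python
# Illustrative sketch only; these names are hypothetical, not Mercurial's API.
LOCAL_ENCODING = 'utf-8'  # assumption: stands in for encoding.encoding


def naive_ellipsis(text, maxlength=400):
    """Trim by bytes; may cut through the middle of a multi-byte sequence."""
    if len(text) <= maxlength:
        return text
    return text[:maxlength - 3] + b'...'


def ellipsis_chars(text, maxlength=400):
    """Trim by characters: decode, slice the unicode string, re-encode."""
    utext = text.decode(LOCAL_ENCODING)
    if len(utext) <= maxlength:
        return text
    return utext[:maxlength - 3].encode(LOCAL_ENCODING) + b'...'


# Five hiragana characters, three bytes each in utf-8 (15 bytes in total).
sample = '\u3042\u3044\u3046\u3048\u304a'.encode('utf-8')

broken = naive_ellipsis(sample, 8)   # keeps 5 bytes: one character plus 2/3 of the next
intact = ellipsis_chars(sample, 4)   # keeps 1 whole character, then '...'
```

With this input, broken is no longer valid utf-8, while intact still decodes
cleanly.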
> http://mercurial.selenic.com/wiki/EncodingStrategy
Regards,
Yuya