[PATCH 2 of 2 stable] util: fix ellipsis() not to break multi-byte sequence (issue2564)

Matt Mackall mpm at selenic.com
Fri Dec 24 12:08:35 CST 2010


On Fri, 2010-12-24 at 14:42 +0900, Yuya Nishihara wrote:
> Matt Mackall wrote:
> > On Fri, 2010-12-24 at 01:37 +0900, Yuya Nishihara wrote:
> > > # HG changeset patch
> > > # User Yuya Nishihara <yuya at tcha.org>
> > > # Date 1293121040 -32400
> > > # Branch stable
> > > # Node ID 2ab92b58076868e42c632828b2487cabe2823e8e
> > > # Parent  0ed736fe75b467ad9191f2ef52129992381659e5
> > > util: fix ellipsis() not to break multi-byte sequence (issue2564)
> > > 
> > > diff --git a/mercurial/util.py b/mercurial/util.py
> > > --- a/mercurial/util.py
> > > +++ b/mercurial/util.py
> > > @@ -1202,10 +1202,13 @@ def email(author):
> > >  
> > >  def ellipsis(text, maxlength=400):
> > >      """Trim string to at most maxlength (default: 400) characters."""
> > > -    if len(text) <= maxlength:
> > > +    utext = encoding.fromlocal(text).decode('utf-8')
> > 
> > This assumes 'text' is in utf-8, which is not how strings in Mercurial
> > generally work:
> 
> It converts 'text' to a utf-8-encoded string before .decode('utf-8').

You're right, I was too blind to spot the fromlocal there. However, I
don't think this is sufficient. There's no guarantee that 'text' -can-
be converted to UTF-8: see the use in dispatch.py where we dump some
random unexpected HTTP/SSH garbage. We need a fallback in case we can't
decode the string cleanly.
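
A rough sketch of the kind of fallback I mean, not a tested patch. It is
written against Python 3 bytes so it runs standalone; the 'charset'
parameter is a hypothetical stand-in for Mercurial's local encoding, and
the byte-wise branch just mirrors the old len()/slice behaviour:

```python
def ellipsis(text, maxlength=400, charset='utf-8'):
    """Trim a byte string to at most maxlength characters.

    'charset' is an illustrative parameter, not part of the real
    util.ellipsis() signature.
    """
    try:
        # Count and slice in characters so that a multi-byte
        # sequence is never cut in the middle.
        utext = text.decode(charset)
    except UnicodeDecodeError:
        # Fallback: the input is not valid in the given charset
        # (e.g. unexpected HTTP/SSH garbage), so trim by bytes.
        if len(text) <= maxlength:
            return text
        return text[:max(maxlength - 3, 0)] + b"..."
    if len(utext) <= maxlength:
        return text
    return utext[:max(maxlength - 3, 0)].encode(charset) + b"..."
```

The point is only that undecodable input must still come back trimmed
rather than raising; whether the fallback should reuse the old code path
or do something smarter is open.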

-- 
Mathematics is the supreme nostalgia of our time.



