As title...

To reproduce (in case this doesn't paste properly, the bit in the brackets is character 176, which I produced by opening Windows charmap and copying/pasting).

$ mkdir ascii_issue_bzr
$ cd ascii_issue_bzr
$ bzr init
$ echo "ABC" > abc
$ bzr add
$ bzr ci -m "This is a commit message with a degree sign in it (°)."
$ cd ..
$ hg convert ascii_issue_bzr ascii_issue_hg
initializing destination ascii_issue_hg repository
scanning source...
sorting...
converting...
0 This is a commit message with a degree sign in it (ï°).
transaction abort!
rollback completed
** unknown exception encountered, please report by visiting
** http://mercurial.selenic.com/wiki/BugTracker
** Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01) [GCC 4.3.4 20090804 (release) 1]
** Mercurial Distributed SCM (version 1.9.2)
** Extensions loaded: color, graphlog, progress, convert, extdiff, purge, record, fetch, schemes, hgk, rebase
Traceback (most recent call last):
  File "/usr/bin/hg", line 38, in <module>
    mercurial.dispatch.run()
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 27, in run
    sys.exit(dispatch(request(sys.argv[1:])))
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 64, in dispatch
    return _runcatch(req)
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 87, in _runcatch
    return _dispatch(req)
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 688, in _dispatch
    cmdpats, cmdoptions)
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 463, in runcommand
    ret = _runcommand(ui, options, cmd, d)
  File "/usr/lib/python2.6/site-packages/mercurial/extensions.py", line 182, in wrap
    return wrapper(origfn, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/hgext/color.py", line 368, in colorcmd
    return orig(ui_, opts, cmd, cmdfunc)
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 742, in _runcommand
    return checkargs()
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 696, in checkargs
    return cmdfunc()
  File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 685, in <lambda>
    d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
  File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 385, in check
    return func(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/hgext/convert/__init__.py", line 269, in convert
    return convcmd.convert(ui, src, dest, revmapfile, **opts)
  File "/usr/lib/python2.6/site-packages/hgext/convert/convcmd.py", line 445, in convert
    c.convert(sortmode)
  File "/usr/lib/python2.6/site-packages/hgext/convert/convcmd.py", line 361, in convert
    self.copy(c)
  File "/usr/lib/python2.6/site-packages/hgext/convert/convcmd.py", line 330, in copy
    source, self.map)
  File "/usr/lib/python2.6/site-packages/hgext/convert/hg.py", line 171, in putcommit
    self.repo.commitctx(ctx)
  File "/usr/lib/python2.6/site-packages/mercurial/localrepo.py", line 1112, in commitctx
    user, ctx.date(), ctx.extra().copy())
  File "/usr/lib/python2.6/site-packages/mercurial/changelog.py", line 243, in add
    text = "\n".join(l)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 51: ordinal not in range(128)
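The last frame of the traceback can be reproduced in isolation. This is a minimal sketch, not Mercurial code. It assumes the failing item is the commit description and that it reached the "\n".join(l) call as a byte string next to a unicode item, so that Python 2 attempted an implicit ASCII decode. The sketch is written for Python 3, where that decode has to be spelled out. Only the first non-ASCII byte (0xef) is taken from the report; the bytes that followed it are unknown and are left out. The helper name ascii_decode_error is made up for illustration.

```python
# Sketch only: mimics the implicit ASCII decode that Python 2 performs when
# a byte string is joined with unicode strings. Not Mercurial code.
PREFIX = "This is a commit message with a degree sign in it ("


def ascii_decode_error(raw):
    """Return the UnicodeDecodeError raised by an ASCII decode, or None."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as err:
        return err
    return None


# 0xef is the byte named in the report; what followed it is not known.
raw_message = PREFIX.encode("ascii") + b"\xef"
err = ascii_decode_error(raw_message)

print(err)
print(err.start == len(PREFIX))
```

The failing offset equals the length of the text before the bracketed character, which is consistent with the commit description being the item that could not be decoded.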
Your example does work for me on OSX with bzr 2.4.2 and a development version of hg. Is your shell encoding configured for UTF-8? What is your environment? (I am a little puzzled by the Windows charmap reference alongside the /usr/lib paths in the traceback.) That said, I think the encoding code in bzr.py is wrong.
Can you try the attached patch? It changes the encoding behaviour of the bzr source, making it expect unicode objects everywhere and encode them to UTF-8 before passing them to hg. I believe this is more correct.
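The idea behind the patch can be sketched roughly as follows. This is not the patch itself: to_utf8 and build_entry are hypothetical names, and the code is Python 3, where str plays the role of Python 2's unicode. It assumes all metadata arrives as text and is encoded exactly once, at the boundary, so that only byte strings are ever joined.

```python
def to_utf8(value):
    """Encode text metadata to UTF-8 bytes; refuse anything already encoded."""
    if not isinstance(value, str):
        raise TypeError("expected text, got %s" % type(value).__name__)
    return value.encode("utf-8")


def build_entry(user, description):
    """Join metadata fields as bytes, the way a byte-oriented store would."""
    return b"\n".join([to_utf8(user), b"", to_utf8(description)])


# Hypothetical metadata; \u00b0 is the degree sign.
entry = build_entry("someone", "a degree sign (\u00b0)")
print(entry)
```

Rejecting byte strings up front, rather than passing them through, makes a wrongly encoded value fail at the source instead of deep inside the commit code.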
The environment for the test cases and in which I'm running Mercurial is cygwin (hence the /lib paths) with a (modified) putty terminal running on Windows XP. UTF-8 usually works fine with this set-up. The original branch that caused me problems was almost certainly (I'm not sure as I wasn't the committer) committed using the QBzr "qcommit" tool, which I believe also uses UTF-8. I've just tried the same script on my home PC (running Ubuntu) and it seems to work okay there, so I guess it's something to do with cygwin/Windows (isn't everything?!). Unfortunately, I don't have Windows on any of my home PCs, so I won't be able to test this again until tomorrow (~8am GMT).
No problem, I'll try the patch first thing tomorrow morning.
The patch appears to have fixed the problem. It didn't apply cleanly (presumably my Cygwin version of Mercurial is a little old: there is no 'seen.add(path or topath)' in my bzr.py), but I applied the middle change manually and it seems to work fine.
Well, you are probably missing this fix with 1.9.2: http://hg.intevation.org/mercurial/crew/rev/6ba2fc0a87ab
Fixed by http://selenic.com/repo/hg/rev/f5b6046f6ce8
Patrick Mezard <pmezard@gmail.com>
convert/bzr: expect unicode metadata, encode in UTF-8 (issue3232)

(please test the fix)
--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:27 EDT ---
This bug was previously known as _bug_ 3232 at http://mercurial.selenic.com/bts/issue3232
Imported an attachment (id=1624)