[PATCH 1 of 3 STABLE] convert/svn: make svn sink work with svn 1.7

Matt Mackall mpm at selenic.com
Fri Dec 2 14:35:46 CST 2011


On Fri, 2011-12-02 at 18:28 +0100, Patrick Mezard wrote:
> # HG changeset patch
> # User Patrick Mezard <pmezard at gmail.com>
> # Date 1322842275 -3600
> # Branch stable
> # Node ID e317a36f99b908f1fc8346e174c0a44075dc7ef2
> # Parent  b5f1d7a1dcc5f143a2cd111359171f9fc507b4c7
> convert/svn: make svn sink work with svn 1.7
> 
> "svn add file" now fails if "file" is already tracked. To filter them we have
> to mirror the svn manifest in the sink.

> +                # Entries are compared with names coming from
> +                # mercurial, so bytes with undefined encoding. Our
> +                # best bet is to assume they are in local
> +                # encoding. They will be passed to command line calls
> +                # later anyway, so they better be.
> +                m.add(encoding.tolocal(''.join(name).encode('utf-8')))

Ok, I take it 'name' is a list of ustrings.

I'm a little confused: if this is an hg->svn conversion (svn sink), then
this seems to be going the wrong direction.

But let's speak more generally: converting SVN's notion of a name to
Mercurial's notion of a name is philosophically equivalent to "getting
what SVN would check out on a byte-oriented system like Linux". But
encoding.tolocal doesn't quite embody that philosophy because it's
supposed to be used for display: it turns unencodable chars into '?'
whereas SVN aborts[1]. So I think the above code could result in some
really "interesting" filenames.
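
To make the difference concrete, here is a rough sketch in plain Python 3. It
does not call encoding.tolocal itself; the helper names are made up for
illustration, and the "replace" error handler merely stands in for the
display-oriented behaviour described above.

```python
def to_display(name: str, target: str) -> bytes:
    """Display-style conversion: unencodable characters become '?'."""
    return name.encode(target, "replace")


def to_strict(name: str, target: str) -> bytes:
    """Checkout-style conversion: unencodable characters raise an error."""
    return name.encode(target, "strict")


# Two distinct names collapse into the same byte string once the
# accented characters are replaced.
a = to_display("caf\u00e9.txt", "ascii")
b = to_display("caf\u00e8.txt", "ascii")
collision = (a == b)

# The strict variant refuses instead of inventing a name.
try:
    to_strict("caf\u00e9.txt", "ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With replacement, two different files can end up with the same name in the
mirrored manifest, which is the kind of "interesting" result I mean.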

The flip side of the coin is that if we do something like convert a
UTF-8 Mercurial repo to SVN in a Latin1 locale, it'll (a) succeed and
(b) result in "permanent" mojibake in the SVN history. Again, the
principle is "SVN gets what it would get if you checked in on Linux".
For UTF-8, at least, we can probably do better, as we can fairly
reliably detect UTF-8.
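
As an illustration of the mojibake case (a sketch in plain Python 3, with an
invented file name, not anything taken from the patch): UTF-8 bytes read as
Latin1 always decode, so nothing fails, and the wrong characters are then
stored for good.

```python
# A name as it might be stored in a UTF-8 Mercurial repo (hypothetical).
original = "r\u00e9sum\u00e9.txt"
hg_bytes = original.encode("utf-8")

# Interpreting those bytes as Latin1 never fails, because every byte
# value is a valid Latin1 character...
misread = hg_bytes.decode("latin-1")

# ...and re-encoding the result as UTF-8 for SVN bakes the damage in.
stored = misread.encode("utf-8")

print(repr(original), "->", repr(misread))
```

Each accented character turns into two unrelated characters, and the name
recorded in SVN no longer matches what the user typed.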

So we should probably:

- try to detect whether UTF-8 is in use and just use it
- otherwise fall back to encoding.encoding
- abort on untranscodable characters
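
Roughly, and only as a sketch: the function name and the local_encoding
parameter below are hypothetical (local_encoding stands in for
encoding.encoding), and this is plain Python 3 rather than code against the
convert extension.

```python
def hg_name_to_unicode(name: bytes, local_encoding: str) -> str:
    """Decode a Mercurial file name for handing to SVN.

    1. If the bytes are valid UTF-8, assume UTF-8 is in use.
    2. Otherwise fall back to the local encoding.
    3. If that fails too, abort instead of substituting characters.
    """
    try:
        return name.decode("utf-8")
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding: no '?' replacement.
        return name.decode(local_encoding, "strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            "cannot transcode file name %r from %s" % (name, local_encoding)
        ) from exc


# Valid UTF-8 is used as-is, whatever the locale says.
utf8_name = hg_name_to_unicode("caf\u00e9.txt".encode("utf-8"), "latin-1")

# Bytes that are not valid UTF-8 fall back to the local encoding.
latin1_name = hg_name_to_unicode(b"caf\xe9.txt", "latin-1")
```

Note that a single-byte encoding like Latin1 accepts every byte value, so the
abort in step 3 only ever triggers for stricter local encodings such as ASCII.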

[1] http://www.tigris.org/scdocs/SVNEncoding

-- 
Mathematics is the supreme nostalgia of our time.



