[PATCH 1 of 3 STABLE] convert/svn: make svn sink work with svn 1.7

Patrick Mézard pmezard at gmail.com
Sat Dec 3 04:35:55 CST 2011


On 02/12/11 21:35, Matt Mackall wrote:
> On Fri, 2011-12-02 at 18:28 +0100, Patrick Mezard wrote:
>> # HG changeset patch
>> # User Patrick Mezard <pmezard at gmail.com>
>> # Date 1322842275 -3600
>> # Branch stable
>> # Node ID e317a36f99b908f1fc8346e174c0a44075dc7ef2
>> # Parent  b5f1d7a1dcc5f143a2cd111359171f9fc507b4c7
>> convert/svn: make svn sink work with svn 1.7
>>
>> "svn add file" now fails if "file" is already tracked. To filter them we have
>> to mirror the svn manifest in the sink.
> 
>> +                # Entries are compared with names coming from
>> +                # mercurial, so bytes with undefined encoding. Our
>> +                # best bet is to assume they are in local
>> +                # encoding. They will be passed to command line calls
>> +                # later anyway, so they better be.
>> +                m.add(encoding.tolocal(''.join(name).encode('utf-8')))
> 
> Ok, I take it 'name' is a list of ustrings.
> 
> I'm a little confused: if this is an hg->svn conversion (svn sink), then
> this seems to be going the wrong direction.
> 
> But let's speak more generally: converting SVN's notion of a name to
> Mercurial's notion of a name is philosophically equivalent to "getting
> what SVN would check out on a byte-oriented system like Linux". But
> encoding.tolocal doesn't quite embody that philosophy because it's
> supposed to be used for display: it turns unencodable chars into '?'
> whereas SVN aborts[1]. So I think the above code could result in some
> really "interesting" filenames.
> 
> The flip side of the coin is that if we do something like convert a
> UTF-8 Mercurial repo to SVN -in a Latin1- locale, it'll (a) succeed and
> (b) result in "permanent" mojibake in the SVN history. Again, the
> principle is "SVN gets what it would get if you checked in on Linux".
> For UTF-8, at least, we can probably do better, as we can fairly
> reliably detect UTF-8.
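
To make the two failure modes above concrete, here is a small sketch that uses Python's built-in codec error handlers as a stand-in. It does not call Mercurial's encoding.tolocal; the "replace" handler is only assumed to approximate the display-oriented behaviour described above, and the sample name is made up.

```python
# Hypothetical sample name ('cafe' with an acute accent on the e).
name = u"caf\u00e9"

# Display-style conversion: characters the target encoding cannot
# represent are silently replaced with '?'.
lossy = name.encode("ascii", "replace")

# Strict conversion: the same input raises instead of degrading.
try:
    name.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Mojibake: UTF-8 bytes read as Latin-1 always decode without error,
# but each non-ASCII character turns into two wrong ones.
mojibake = name.encode("utf-8").decode("latin-1")
```

Here lossy is b"caf?", the strict call fails, and mojibake is a five-character string in which the single accented character has become two Latin-1 characters.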
> 
> So we should probably:
> 
> - try to detect whether UTF-8 is in use and just use it

In use where? In the source repository? In the shell we are using to push stuff to svn?

> - otherwise fall back to encoding.encoding
> - abort on untranscodable characters
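
One possible reading of the three steps above, as a rough sketch only. It assumes "in use" means "the bytes of the name decode as strict UTF-8", which is exactly the point that is still open. The names TranscodeError, decode_name and encode_name are hypothetical and are not part of Mercurial or of the patch; fallback_encoding stands in for encoding.encoding.

```python
class TranscodeError(Exception):
    """Raised when a name cannot be converted without losing data."""


def decode_name(raw, fallback_encoding):
    """Turn a byte filename into text.

    Try strict UTF-8 first, then the fallback encoding, and abort
    rather than substituting '?' for anything.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    try:
        return raw.decode(fallback_encoding)
    except UnicodeDecodeError as exc:
        raise TranscodeError("cannot decode %r: %s" % (raw, exc))


def encode_name(name, target_encoding):
    """Encode text for the command line, aborting on unencodable chars."""
    try:
        return name.encode(target_encoding)
    except UnicodeEncodeError as exc:
        raise TranscodeError("cannot encode %r: %s" % (name, exc))
```

With this, encode_name(decode_name(b"caf\xc3\xa9", "latin-1"), "latin-1") yields b"caf\xe9", so a UTF-8 name converted under a Latin-1 locale would be transcoded instead of stored as mojibake. The detection is only a heuristic: pure ASCII names and some Latin-1 byte sequences are also valid UTF-8.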

--
Patrick Mézard

