EOL extension

Martin Geisler mg at lazybytes.net
Thu Dec 3 07:57:22 CST 2009

Colin Caughie <c.caughie at indigovision.com> writes:

> Not quite, but it's related.

>> Diffs are computed based on the encoded forms of files. This little
>> script illustrates it in a nice blunt way on Linux:
>>   #!/bin/sh
>>   hg init encode-test.$$
>>   cd encode-test.$$
>>   echo 'Hello World' > hello.txt
>>   cat > .hg/hgrc <<EOF
>>   [encode]
>>   ** = tr a-z A-Z
>>   [decode]
>>   ** = tr A-Z a-z
>>   EOF
>>   hg add hello.txt
>>   hg diff
>>   cd ..
>>   rm -r encode-test.$$
>> It asks Mercurial to "encode" files when saving them by making the
>> content uppercase, and to "decode" them into the working copy by
>> making content lowercase.
>> The problem is the diff: it looks like this:
>>   diff --git a/hello.txt b/hello.txt
>>   new file mode 100644
>>   --- /dev/null
>>   +++ b/hello.txt
>>   @@ -0,0 +1,1 @@
>>   +HELLO WORLD
>> That is, it works on the encoded form instead of the decoded form. I
>> would expect it to be all lowercase since that is the logical change
>> in the user's environment.
>> However, one could argue that the diff *should* be in uppercase since
>> that is the correct patch, i.e., the one that will apply in another
>> clone that does not have the encode/decode filters installed.
> I agree with the second argument; diffs should be based on the encoded
> (i.e. repository) form so that the same patches can work regardless of
> whether you have the filters enabled.

Yes, you're right -- that is indeed better.
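
For anyone who wants to see the two forms side by side without creating
a repository, here is a minimal sketch that uses plain tr in place of
the Mercurial filters. The variable names are mine and nothing here
calls hg; it only mimics what the [encode] and [decode] settings from
the quoted script would do to the file content.

```shell
#!/bin/sh
# Simulate the filters from the quoted script with plain tr (no hg involved).
working='Hello World'                                # what the user typed

# [encode]: the form that ends up in the repository
encoded=$(printf '%s\n' "$working" | tr a-z A-Z)

# [decode]: the form that is written back to the working copy
decoded=$(printf '%s\n' "$encoded" | tr A-Z a-z)

echo "repository form:   $encoded"   # this is the form the diff is based on
echo "working copy form: $decoded"
```

Note that the two filters are not inverses of each other, which is part
of what makes this such a blunt test.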

I'm actually unsure what happens in a Windows terminal when 'hg diff'
produces a diff of files with LF EOLs. Is that printed correctly?

> The problem is that although diff correctly honours the encode/decode
> filters, patch (specifically patch.applydiff) doesn't. Patching always
> operates on the working directory files, not the "logical" repo files.
> So although your capitalized "HELLO WORLD" patch would apply fine on a
> repo *without* the capitalization filters, it doesn't apply on the
> repo that has them enabled. In other words, having exported a patch
> from your translation-enabled repo, you couldn't then import this
> patch back into the same repo.
> The patch.eol fix in 1.3 works around this by adding an option to have
> patch ignore line ending style entirely. But if we plan to make the
> line ending translation rules more sophisticated than they are in
> win32text, I would prefer to have a patch function that is genuinely
> complementary to the diff function, so that we can be more confident
> that it will always work.
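
For reference, the option mentioned here lives in the [patch] section of
the configuration. The sketch below only writes the fragment to a
scratch file (the variable name is made up for this example) so it can
be inspected before being copied into a real hgrc. I picked crlf as the
value for illustration; lf normalizes the other way, and strict is the
default that preserves line endings.

```shell
#!/bin/sh
# Write the fragment to a scratch file; copy it into your own hgrc by hand.
hgrc_fragment=$(mktemp)
cat > "$hgrc_fragment" <<EOF
[patch]
# ignore line endings while patching and write CRLF in the result
eol = crlf
EOF
cat "$hgrc_fragment"
```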

I agree, thanks for the good explanation! So we should keep doing what
we do now and diff the encoded files in order to generate correct output.

It is the patch function that is wrong when it wants to work on the
decoded (working copy) files -- we should at least make it more tolerant
of EOL mismatches when the eol extension is in effect.
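
To make the mismatch concrete, here is a rough sketch that compares the
two forms by hand. It does not call hg or patch at all: the byte-for-byte
cmp merely stands in for a strict hunk match, and the file names are
invented for the example. It assumes mktemp -d is available.

```shell
#!/bin/sh
# Sketch only: compares lines by hand instead of calling hg or patch.
tmp=$(mktemp -d)
printf 'Hello World\n'   > "$tmp/repo.txt"      # encoded (repository) form, LF
printf 'Hello World\r\n' > "$tmp/working.txt"   # decoded (working copy) form, CRLF

# A strict byte-for-byte comparison of the two forms fails.
if cmp -s "$tmp/repo.txt" "$tmp/working.txt"; then
    strict=match
else
    strict=mismatch
fi

# Dropping the CR before comparing makes the same line match.
tr -d '\r' < "$tmp/working.txt" > "$tmp/normalized.txt"
if cmp -s "$tmp/repo.txt" "$tmp/normalized.txt"; then
    tolerant=match
else
    tolerant=mismatch
fi

echo "strict: $strict, EOL-tolerant: $tolerant"
rm -r "$tmp"
```

The second comparison is the kind of tolerance I have in mind: the hunk
is taken from the repository form, so the working copy line should be
allowed to differ in its line ending only.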

I will try and make a test case for this when I get home but you are
more than welcome to beat me to it! :-)

I'll be happy to give you write access to the Bitbucket repository -- I
consider it sort of a playground for now while we're still banging out
the bugs and design issues. So please submit a failing test case so that
we have something to aim for.

Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.
