[PATCH 1 of 5 rfc] tests: explore some bdiff cases

Mads Kiilerich mads at kiilerich.com
Sun Nov 6 10:56:27 EST 2016


On 11/06/2016 10:07 AM, Yuya Nishihara wrote:
> On Thu, 03 Nov 2016 22:34:11 +0100, Mads Kiilerich wrote:
>> # HG changeset patch
>> # User Mads Kiilerich <madski at unity3d.com>
>> # Date 1478208837 -3600
>> #      Thu Nov 03 22:33:57 2016 +0100
>> # Node ID f6408efe0d0f4179fe6cc2b967164c1b4567f3d6
>> # Parent  d06c049695e6ad3219e7479c65ce98a2f123e878
>> tests: explore some bdiff cases
>>
>> diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t
>> new file mode 100644
>> --- /dev/null
>> +++ b/tests/test-bhalf.t
> '#require no-pure' is necessary since we use difflib in pure.
>
> The other changes in this series look good to me, but it's bdiff.c so I
> don't queue them.

Thanks for reviewing and the positive feedback. I will try to polish it 
for "real" submission.

For the last patch, I wonder if it would be better to add a
post-processing step that, given all the chunks, tries to shift/rotate
all match sequences to be as early as possible (and thus deltas to be as
late and "appending" as possible). That could give more readable diffs,
especially when combined with heuristics for preferring chunks that
start with the least indentation.
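
A minimal sketch of such a shifting step, in Python rather than in
bdiff.c, and assuming the simplest case of a single inserted block of
lines. The function name and the limit parameter are hypothetical and
do not exist in Mercurial.

```python
def shift_insert_late(b, start, end, limit=None):
    """Slide the inserted block b[start:end] as late as possible.

    Hypothetical helper, not part of bdiff. If the line right after the
    block equals the block's first line, then b[start+1:end+1] is an
    equally valid insertion, so the preceding match grows by one line.
    limit is the index where the next chunk starts (default: end of b).
    """
    if limit is None:
        limit = len(b)
    # The start < end guard keeps an empty block from sliding forever.
    while start < end and end < limit and b[start] == b[end]:
        start += 1
        end += 1
    return start, end


new = ["a", "b", "a", "b", "c"]
# Removing new[0:2] or new[2:4] both leave ["a", "b", "c"].
print(shift_insert_late(new, 0, 2))  # (2, 4)
```

An indentation heuristic could then pick among the equivalent
positions visited by the loop instead of always taking the last one.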

One lesson from these changes seems to be that it is a problem that we
use the same low-level diff algorithm both for revlog delta storage and
bundles and for readable patch diffs. One idea that got mentioned at the
latest sprint was to use zstandard for storage and "just" seed it with
the "a" version of the file as dictionary and let it compress the "b"
side. That might be a better long-term solution.
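
To illustrate the dictionary idea without depending on zstandard
bindings, here is a rough sketch that uses zlib's preset dictionary
(zdict) from the Python standard library as a stand-in. It is not
zstandard and not how revlogs store data, and the function names are
made up for this example. Note that zlib can only make use of roughly
the last 32 KB of the dictionary, a limit a zstandard-based approach
would not share in the same way.

```python
import zlib


def compress_against(a, b):
    """Compress b with the "a" version preloaded as dictionary."""
    c = zlib.compressobj(level=9, zdict=a)
    return c.compress(b) + c.flush()


def decompress_against(a, data):
    """Restore b; the same "a" version must be supplied again."""
    d = zlib.decompressobj(zdict=a)
    return d.decompress(data) + d.flush()


# Two versions of a small file that differ in a single line.
a = b"".join(b"line %d\n" % i for i in range(200))
b = a.replace(b"line 100\n", b"line one hundred\n")

seeded = compress_against(a, b)
plain = zlib.compress(b, 9)
print(len(b), len(plain), len(seeded))
```

The compressor finds the unchanged regions of "b" in the dictionary by
itself, so no separate diff step is needed to get a compact delta.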

In the shorter term, I wonder how much we could gain from somehow
teaching bdiff to consider both parents for each chunk instead of just
using deltas from one side and storing chunks from the other verbatim. I
think that could make a significant difference for repositories with a
lot of big merges in files or the manifest.
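
A rough way to estimate the potential gain, assuming line-based
matching and using difflib (which the pure implementation already
uses) as a stand-in for bdiff. This only counts how many lines of a
merge result would have to be stored verbatim; it does not implement a
two-parent delta format, and the helper names are hypothetical.

```python
from difflib import SequenceMatcher


def covered(parent, child):
    """Indexes of child lines that a delta against parent could copy."""
    sm = SequenceMatcher(None, parent, child, autojunk=False)
    seen = set()
    for _i, j, n in sm.get_matching_blocks():
        seen.update(range(j, j + n))
    return seen


def literal_lines(p1, p2, child):
    """Lines stored verbatim: vs p1 only, vs p2 only, vs both parents."""
    c1 = covered(p1, child)
    c2 = covered(p2, child)
    n = len(child)
    return n - len(c1), n - len(c2), n - len(c1 | c2)


p1 = ["a", "X", "b", "c", "d"]          # one side added X
p2 = ["a", "b", "c", "Y", "d"]          # other side added Y
merged = ["a", "X", "b", "c", "Y", "d"]
print(literal_lines(p1, p2, merged))  # (1, 1, 0)
```

In this toy merge each single-parent delta has to carry the other
side's added line, while a delta that may reference both parents
carries none.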

/Mads


