[PATCH 7 of 7 v2] bdiff: give slight preference to removing trailing lines
Mads Kiilerich
mads at kiilerich.com
Fri Nov 25 09:58:28 EST 2016
On 11/24/2016 06:52 PM, Jun Wu wrote:
> Excerpts from Augie Fackler's message of 2016-11-17 12:42:26 -0500:
>> My own cursory perfbdiff runs suggest this is a perf wash (using
>> `perfbdiff -m 3041e4d59df2` in the mozilla repo). Queued. Thanks!
> I'd mention this series changes the behavior of the diff output. The
> difference was caught by fastannotate test.
>
> See the below table (old: e1d6aa0e4c3a, new: 8836f13e3c5b):
>
> a | b | old | new
> --------------------
> a | a | a | -a
> a | z | +z | a
> a | a | a | +z
> | | -a | a
> --------------------
> a | a | a
> a | a | a
> a | | -a
>
> I think we would always prefer putting deletions at the end, to be consistent.
I agree that this end result would be nice.
(My patches "bdiff: early pruning of common suffix before doing
expensive computations" and "bdiff: early pruning of common prefix
before doing expensive computations" happens to give the obvious "good"
result in this case ... but is no general solution to the problem.)
Stating the obvious context:
The algorithms and tweaks for making "good" diffs are just heuristics
and can never be perfect. A counter example doesn't necessarily prove
anything. But also, it might be worth it to add a tweak...
An idea, described informally: While the current recursive algorithm
works "best" by selecting middle-most longest common substrings, I guess
it could make sense to have a post processing step that shifts common
substrings to the first occurrence in the previous diff chunk. Also, if
a previous or later non-common range is empty (because it is an
add/remove), the matching range can be "shifted", perhaps with
preference of not starting with an "empty" line but prefering the lowest
amount of leading spaces.
/Mads
More information about the Mercurial-devel
mailing list