[issue1814] bdiff.c: 4 is too low for popular-line threshold
Jason Orendorff
mercurial-bugs at selenic.com
Wed Aug 26 17:31:34 UTC 2009
New submission from Jason Orendorff <jorendorff at mozilla.com>:
In mercurial/bdiff.c:
> /* compute popularity threshold */
> t = (bn >= 4000) ? bn / 1000 : bn + 1;
The lower the threshold, the stronger the popularity hack's
influence. So at 3999 lines, the hack is disabled (t=4000); and at
4000 lines, the hack is enabled at maximum strength (t=4).
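To make the jump concrete, here is a minimal sketch that wraps the quoted expression in a standalone function. The function name `current_threshold` is made up for illustration and does not exist in bdiff.c; only the expression itself is taken from the source, and treating `bn` as a line count is an assumption based on the figures above.

```c
/* Hypothetical wrapper around the expression quoted from
 * mercurial/bdiff.c; the function name is an assumption.
 * bn is assumed to be a line count. */
int current_threshold(int bn)
{
    /* Below 4000 lines the threshold exceeds bn, so no line can
     * reach it; at 4000 lines and above it is bn / 1000. */
    return (bn >= 4000) ? bn / 1000 : bn + 1;
}
```

Going from 3999 to 4000 lines drops the threshold from 4000 to 4 in one step.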
No source file in mercurial/crew is over 4000 lines. But there are, oh,
a few such files in Mozilla. I can testify that this hack causes hg to
generate some correct but eyebrow-raising patches.
I think the hack should phase in gradually. The threshold should be high
for small files where we don't need it so much. Like this:
t = (bn < 31000) ? 1000000 / bn : bn / 1000;
That would leave the popularity hack disabled for small files, then
gradually phase it in:
  bn <  1000  --  t > bn    (popularity hack is completely disabled)
  bn == 1000  --  t = 1000  (still effectively disabled)
  bn == 2000  --  t = 500   (only hits unusual files)
  bn == 10000 --  t = 100   (only hits especially common lines)
  bn == 31000 --  t = 31    (hack is at maximum power)
  bn == 32000 --  t = 32    (hack could backfire, ease off)
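The proposed rule can be sketched as a standalone function to check the figures in the table. The name `proposed_threshold` is hypothetical, and the guard for `bn <= 0` is an added assumption, not part of the one-line proposal: without it, the expression would divide by zero if `bn` were ever 0.

```c
/* Hypothetical wrapper around the proposed expression; the name
 * and the bn <= 0 guard are assumptions, not code from bdiff.c. */
int proposed_threshold(int bn)
{
    /* Assumed guard: avoid dividing by zero. With no lines there
     * is nothing to count, so any positive threshold will do. */
    if (bn <= 0)
        return 1;

    /* Integer division: the threshold falls as bn grows, bottoms
     * out at bn == 31000, then rises again with bn / 1000. */
    return (bn < 31000) ? 1000000 / bn : bn / 1000;
}
```

With integer division, 30999 lines gives t = 32 and 31000 lines gives t = 31, so the two branches meet without a large jump.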
If I *completely* disable the popularity hack by changing that line to
`t = bn + 1;`, hg becomes 20% slower on a large (~10 sec) qrefresh, but
the diffs really are better for human consumption.
----------
messages: 10425
nosy: jorendorff
priority: bug
status: unread
title: bdiff.c: 4 is too low for popular-line threshold
____________________________________________________
Mercurial issue tracker <mercurial-bugs at selenic.com>
<http://mercurial.selenic.com/bts/issue1814>
____________________________________________________