[PATCH 5 of 5 RFC] revlog: increase I/O bound to 4x the amount of data consumed

Siddharth Agarwal sid at less-broken.com
Wed Nov 12 19:24:05 CST 2014


On 11/12/2014 05:18 PM, Mads Kiilerich wrote:
> On 11/13/2014 12:09 AM, Siddharth Agarwal wrote:
>> # HG changeset patch
>> # User Siddharth Agarwal <sid0 at fb.com>
>> # Date 1415765299 28800
>> #      Tue Nov 11 20:08:19 2014 -0800
>> # Node ID 14df598f1c13fa7fb04daf652efb94bd0f55e8c9
>> # Parent  ec432ebb569fd4537093a3c446ccaefce5298636
>> revlog: increase I/O bound to 4x the amount of data consumed
>>
>> This doesn't affect normal clones since they'd be bound by the CPU bound below
>> anyway -- it does, however, improve generaldelta clones significantly.
>>
>> This also results in better deltaing for generaldelta clones -- in generaldelta
>> clones, we calculate deltas with respect to the closest base if it has a higher
>> revision number than either parent. If the base is on a significantly different
>> branch, this can result in pointlessly massive deltas. This reduces the number
>> of bases and hence the number of bad deltas.
>>
>> Empirically, for a highly branchy repository, this resulted in an improvement
>> of around 15% to manifest size.
>>
>> The sole test change is because in that case the compressed text is smaller
>> than the compressed delta. (We already use fulltexts if the uncompressed text
>> is smaller than the compressed delta.)
>
> There is no test change in the patch. You ended up addressing it in 
> patch 2 instead?

I tweaked a parameter slightly and the test change disappeared. I forgot 
to update the patch description.

- Siddharth

>
> /Mads
>
>>
>> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
>> --- a/mercurial/revlog.py
>> +++ b/mercurial/revlog.py
>> @@ -1267,7 +1267,7 @@
>>           #   the amount of I/O we need to do.
>>           # - 'compresseddeltalen' is the sum of the total size of deltas we need
>>           #   to apply -- bounding it limits the amount of CPU we consume.
>> -        if (d is None or dist > textlen * 2 or l > textlen or
>> +        if (d is None or dist > textlen * 4 or l > textlen or
>>               compresseddeltalen > textlen * 2 or
>>               (self._maxchainlen and chainlen > self._maxchainlen)):
>>               text = buildtext()
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel at selenic.com
>> http://selenic.com/mailman/listinfo/mercurial-devel
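
For readers following the thread, the condition in the hunk above can be
sketched as a standalone predicate. This is only a minimal sketch and not the
revlog code itself: the function name should_store_fulltext, the io_factor
parameter, and the sample numbers are assumptions made for illustration.
deltalen stands in for 'l' in the patch, and passing None for it stands in
for the 'd is None' case.

```python
def should_store_fulltext(textlen, dist, deltalen, compresseddeltalen,
                          chainlen, maxchainlen=None, io_factor=4):
    """Return True if a full text should be stored instead of a delta.

    Hypothetical helper mirroring the condition in the patch:
    - dist bounds the amount of I/O needed to rebuild the revision
    - deltalen is the size of the candidate delta (None: no delta available)
    - compresseddeltalen bounds the CPU spent applying the delta chain
    - maxchainlen, if set, caps the length of the delta chain
    """
    if deltalen is None:
        return True
    return (dist > textlen * io_factor
            or deltalen > textlen
            or compresseddeltalen > textlen * 2
            or bool(maxchainlen and chainlen > maxchainlen))


# Made-up sizes: the chain spans 3x the text length on disk.
sample = dict(textlen=1000, dist=3000, deltalen=100,
              compresseddeltalen=500, chainlen=5)

old_bound = should_store_fulltext(io_factor=2, **sample)  # bound before the patch
new_bound = should_store_fulltext(io_factor=4, **sample)  # bound after the patch
print(old_bound, new_bound)
```

With these sample sizes, the 2x bound forces a new full text while the 4x
bound keeps the delta, which is the sense in which the patch reduces the
number of bases.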


