[PATCH 3 of 8 "] compression: introduce a `storage.revlog.zlib.level` configuration

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Apr 2 09:50:54 EDT 2019



On 4/2/19 9:29 AM, Josef 'Jeff' Sipek wrote:
> On Sun, Mar 31, 2019 at 17:36:19 +0200, Pierre-Yves David wrote:
> ...
>> compression: introduce a `storage.revlog.zlib.level` configuration
>>
>> This option control the zlib compression level used when compression revlog
>> chunk.
>>
>> This is also a good excuse to pave the way for a similar configuration option
>> for the zstd compression engine. Having a dedicated option for each compression
>> algorithm is useful because they don't support the same range of values.
>>
>> Using a higher zlib compression impact CPU consumption at compression time, but
>> does not directly affected decompression time. However dealing with small
>> compressed chunk can directly help decompression and indirectly help other
>> revlog logic.
>>
>> I ran some basic test on repositories using different level. I am user the
> 
> s/user/using/ ?
> 
> ...
>> I also made some basic timing measurement. The "read" timing are gathered using
>> simple run of `hg perfrevlogrevisions`, the "write" measurement using `hg
>> perfrevlogwrite` (restricted to the last 5000 revisions for netbeans and
>> mozilla central). The timing are gathered on a generic machine, (not one  of
>> our performance locked machine), so small variation might not be meaningful.
> 
> You did more than one measurement, so measurement -> measurements, and
> timing -> timings?  Alternatively, keep the singular but then make the verbs
> match: are -> is.
> 
> Sorry to nit-pick, but since this text will end up in the commit messages...
> :)
> 
>> However large trend remains relevant.
>>
>> Keep in mind that these number are not pure compression/decompression time.
> 
> s/number/numbers/
> 
>> They also involve the full revlog logic. In particular the difference in chunk
>> size has an impact on the delta chain structure, affecting performance when
>> writing or reading them.
>>
>> On read/write performance, the compression level has a bigger impact.
>> Counter-intuitively, higher compression level raise better "write" performance
> 
> s/raise better/increase/ ?
> 
> This actually confuses me a bit.  Based on the table below, it looks like
> higher compression level has non-linear effect on read/write performance.
> Maybe I'm not understanding what you meant by 'raise "better"'.
> 
> While I expect to see a "hump" in *write* performance (because high zlib
> compression levels are such cpu hogs), I didn't expect to see one for *read*
> performance.  I suppose the read hump could be explained by the shape of the
> DAG, as you point out.

Yes, we are not doing a pure compression test here. That deserves an
independent, full array of deeper testing.
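
For reference, a pure compression test (with no revlog logic involved) could
look roughly like the sketch below. It only uses Python's standard `zlib`
module; the `measure_levels` helper and the synthetic payload are made up for
illustration and are not part of the patch, so the numbers it prints say
nothing about real repositories.

```python
import time
import zlib


def measure_levels(data, levels=range(1, 10)):
    """Return {level: (compressed_size, compress_seconds, decompress_seconds)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        compress_time = time.perf_counter() - start

        start = time.perf_counter()
        restored = zlib.decompress(compressed)
        decompress_time = time.perf_counter() - start

        # Sanity check: the round trip must be lossless.
        assert restored == data
        results[level] = (len(compressed), compress_time, decompress_time)
    return results


# Synthetic, highly repetitive payload (an assumption for this sketch);
# real revlog chunks will behave differently.
sample = b"".join(
    b"line %d: some fairly repetitive text\n" % i for i in range(20000)
)

results = measure_levels(sample)
for level, (size, ctime, dtime) in sorted(results.items()):
    print(
        "level %d: %7d bytes, compress %.4fs, decompress %.4fs"
        % (level, size, ctime, dtime)
    )
```

Timings from a single run like this are noisy, so only large trends would be
worth reading into.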


>> +``revlog.zlib.level``
>> +    Zlib compression level used when storing data into the repository. Accepted
>> +    Value range from 1 (lowest compression) to 9 (highest compression). Zlib
>> +    default value is 6.
> 
> I know this is very unlikely to change, but does it make sense to say what
> an external library's defaults are?

I do not understand your question.


-- 
Pierre-Yves David

