xrevlog: experimental reimplementation of revlog in C

Tue Nov 16 00:45:22 CST 2010

On Mon, 2010-11-15 at 22:36 -0500, Greg Ward wrote:
> I have finally fixed all the bugs in xrevlog revealed by my first
> round of performance testing.  (Amazing what turns up when you try to
> decompress [nearly] all of the revisions in a large revlog.)  At any
> rate, the performance test now runs without segfaults on a couple of
> fair-sized revlogs.
> 
> So far I've done 2 runs on 2 revlogs: perftest.py using the Mercurial
> Python API and perftest.c using xrevlog against the changelogs from
> Mercurial (12k revs) and our large repo at work (113k revs).
> 
> Performance testing involves four tasks:
> 
> 1) openclose: open the revlog and close (or discard) it

Pretty meaningless comparison, as we generally read most of the revlog
in this step.

> 2) readindex: open the revlog and read the entire index (dumpindex
> without bothering stdout)
> 3) readtip: open the revlog and read the data for the tip revision (hg tip)
> 4) readdata: open the revlog and read a random 500 entries in forward order
> 
> Note that task 4 has changed since my last run.  I changed it from 10%
> of the entries to 500 of them to 1) make it go faster and 2) reduce
> variability between medium and large revlogs.  Now the difference
> should be down to factors inherent to the size of the revlog (or
> that's the theory).
> 
> Each task was run 50 times, and then I took the min() of 3 repeats of
> that.  In Python I used timeit, and in C I reimplemented the same
> algorithm.  All times are the time for 50 iterations of a particular
> task, in milliseconds.
> 
> And now the results.
> 
> Python/Mercurial, medium changelog (12k revs):
>   openclose     min   1496.8 ms
>   readindex     min   6146.1 ms
>   readtip       min   1521.0 ms
>   readdata      min   3941.5 ms

That's weird. readdata implies reading the whole index, so why is it
faster?

> 
> C/xrevlog, medium changelog (12k revs):
>   openclose     min      1.2 ms
>   readindex     min    233.9 ms    # 26x faster
>   readtip       min      3.4 ms    # 450x faster
>   readdata      min    699.6 ms    # 5.6x faster
> 
> Python/Mercurial, large changelog (113k revs):
>   openclose     min    372.2 ms
>   readindex     min 147255.4 ms
>   readtip       min    483.2 ms
>   readdata      min  26509.5 ms
> 
> C/xrevlog, large changelog (113k revs):
>   openclose     min      1.4 ms
>   readindex     min   2057.8 ms    # 71x faster
>   readtip       min      5.7 ms    # 85x faster,
>   readdata      min    923.6 ms    # 27x faster
> 
> Something is fishy with reading the whole index in Python.  It should
> be 10x slower for a 10x larger revlog, but it's more than 20x slower.

That's the overhead of the lazyparser hack. Feel free to tweak the
lazyparser breakpoint over your revlog size for testing.

> I'm pretty sure there was no swapping; I had >1 GB of free physical
> memory during all the runs.  (I went to single user mode, started up a
> few ttys, and flushed all kernel caches before starting.)

"vmstat 1" is your friend.

> Oh yeah, the test system is a 7-year-old laptop with a 1400 MHz
> Pentium M, 1.5 GB of RAM, a vanilla IDE hard disk, and ~500 MB of
> swap.  Hardly a killer performance machine.  ;-)
> 
> Also of note:
>   * this is still using mmap() -- after Matt's feedback, I want to try
> a non-mmap() version, but I wanted to fix all the bugs and get new
> numbers first
>   * one obvious optimization to try: I'm malloc()'ing a fresh buffer
> for gzip to decompress every rev into -- should probably alloc it once
> for the whole revlog and resize as needed. I think.
>   * I'm still not handling the node ID -> revnum mapping, which I
> think will force me to read the whole index into memory much more
> often -- so I'll lose some advantage there, I think

Never more than once per open?

-- 
Mathematics is the supreme nostalgia of our time.