xrevlog: experimental reimplementation of revlog in C
Matt Mackall
mpm at selenic.com
Tue Nov 16 00:45:22 CST 2010
On Mon, 2010-11-15 at 22:36 -0500, Greg Ward wrote:
> I have finally fixed all the bugs in xrevlog revealed by my first
> round of performance testing. (Amazing what turns up when you try to
> decompress [nearly] all of the revisions in a large revlog.) At any
> rate, the performance test now runs without segfaults on a couple of
> fair-sized revlogs.
>
> So far I've done 2 runs on 2 revlogs: perftest.py using the Mercurial
> Python API and perftest.c using xrevlog against the changelogs from
> Mercurial (12k revs) and our large repo at work (113k revs).
>
> Performance testing involves four tasks:
>
> 1) openclose: open the revlog and close (or discard) it
Pretty meaningless comparison, as we generally read most of the revlog
in this step.
> 2) readindex: open the revlog and read the entire index (dumpindex
> without bothering stdout)
> 3) readtip: open the revlog and read the data for the tip revision (hg tip)
> 4) readdata: open the revlog and read a random 500 entries in forward order
>
> Note that task 4 has changed since my last run. I changed it from 10%
> of the entries to 500 of them to 1) make it go faster and 2) reduce
> variability between medium and large revlogs. Now the difference
> should be down to factors inherent to the size of the revlog (or
> that's the theory).
>
> Each task was run 50 times, and then I took the min() of 3 repeats of
> that. In Python I used timeit, and in C I reimplemented the same
> algorithm. All times are the time for 50 iterations of a particular
> task, in milliseconds.
>
> And now the results.
>
> Python/Mercurial, medium changelog (12k revs):
> openclose min 1496.8 ms
> readindex min 6146.1 ms
> readtip min 1521.0 ms
> readdata min 3941.5 ms
That's weird. readdata implies reading the whole index, so why is it
faster?
>
> C/xrevlog, medium changelog (12k revs):
> openclose min 1.2 ms
> readindex min 233.9 ms # 26x faster
> readtip min 3.4 ms # 450x faster
> readdata min 699.6 ms # 5.6x faster
>
> Python/Mercurial, large changelog (113k revs):
> openclose min 372.2 ms
> readindex min 147255.4 ms
> readtip min 483.2 ms
> readdata min 26509.5 ms
>
> C/xrevlog, large changelog (113k revs):
> openclose min 1.4 ms
> readindex min 2057.8 ms # 71x faster
> readtip min 5.7 ms # 85x faster,
> readdata min 923.6 ms # 27x faster
>
> Something is fishy with reading the whole index in Python. It should
> be 10x slower for a 10x larger revlog, but it's more than 20x slower.
That's the overhead of the lazyparser hack. Feel free to tweak the
lazyparser breakpoint over your revlog size for testing.
> I'm pretty sure there was no swapping; I had >1 GB of free physical
> memory during all the runs. (I went to single user mode, started up a
> few ttys, and flushed all kernel caches before starting.)
"vmstat 1" is your friend.
> Oh yeah, the test system is a 7-year-old laptop with a 1400 MHz
> Pentium M, 1.5 GB of RAM, a vanilla IDE hard disk, and ~500 MB of
> swap. Hardly a killer performance machine. ;-)
>
> Also of note:
> * this is still using mmap() -- after Matt's feedback, I want to try
> a non-mmap() version, but I wanted to fix all the bugs and get new
> numbers first
> * one obvious optimization to try: I'm malloc()'ing a fresh buffer
> for gzip to decompress every rev into -- should probably alloc it once
> for the whole revlog and resize as needed. I think.
> * I'm still not handling the node ID -> revnum mapping, which I
> think will force me to read the whole index into memory much more
> often -- so I'll lose some advantage there, I think
Never more than once per open?
--
Mathematics is the supreme nostalgia of our time.
More information about the Mercurial-devel
mailing list