xrevlog: experimental reimplementation of revlog in C

Greg Ward greg at gerg.ca
Wed Nov 10 21:01:10 CST 2010

On Wed, Nov 10, 2010 at 4:14 PM, Matt Mackall <mpm at selenic.com> wrote:
> This looks promising. I've speculated that the right way to fix this is
> by making our parser C extension make an array-like class to replace
> revlog.index. Not sure it's worth rewriting the rest of revlog in C
> though.

My gut instinct says that the core of the problem is the list and dict
storing the index in both directions: it takes time to parse the
index, time to build those data structures, and memory to store them.
So I think that custom C data structure should serve both purposes.
The interesting part is mapping node IDs to revnums, which I have
avoided doing in xrevlog so far.  (Basically it needs a hashtable
where the hash function is identity.  Seems silly to spend effort
hashing SHA-1 hashes.)

As for rewriting the rest of revlog in C: oh well, I've gone and done
it, and now it's out there.  It's ~800 lines plus a stripped down copy
of your mpatch.c (~500 lines).  And so far it only supports looking up
revs by revnum, not node ID.

Honestly, I really did this because 1) I wanted to understand revlog
better and 2) it was a great excuse to learn about mmap().  Success on
both counts.  I think I need to deal with lookups by node ID before we
can see about adapting some/all of it into Mercurial.

Anyone know a good hashtable library for C?  ;-)


More information about the Mercurial-devel mailing list