Hackathon proof-of-concept: faster revsets via indices

Bryan O'Sullivan bos at serpentine.com
Mon Feb 8 16:57:04 UTC 2016


I've been itching for a while to have more efficient ways to query revision
history in Mercurial, so during our last Hackathon at Facebook I built a
proof of concept to demonstrate to myself that (a) it wouldn't be too hard
and (b) it could have nice consequences.

The POC does the following:

   - Incrementally updates a sqlite3 full-text index whenever you run hg log,
   so you only pay for the indexing in small bites, when you actually need it.
   - Intercepts little bits of the revset execution engine to replace a
   filter-based traversal of every commit (very slow) with a database lookup
   followed by set intersection (both very fast).
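
To make those two steps concrete, here is a minimal sketch of the idea in
Python. It is not the POC's code. The table layout, the function names and
the sample history are all made up for illustration, and it uses an ordinary
indexed table rather than a full-text index so that it runs on any sqlite3
build.

```python
import sqlite3

def open_index(path=":memory:"):
    # Hypothetical schema: one row per revision, with an index on user.
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS commits "
               "(rev INTEGER PRIMARY KEY, user TEXT, date INTEGER)")
    db.execute("CREATE INDEX IF NOT EXISTS commits_user ON commits (user)")
    return db

def update_index(db, history):
    """Index only the revisions newer than the last one already stored."""
    (last,) = db.execute(
        "SELECT COALESCE(MAX(rev), -1) FROM commits").fetchone()
    new = [(rev, user, date) for rev, user, date in history if rev > last]
    db.executemany("INSERT INTO commits VALUES (?, ?, ?)", new)
    db.commit()
    return len(new)

def revs_by_user(db, user):
    # Database lookup instead of filtering every commit.
    rows = db.execute("SELECT rev FROM commits WHERE user = ?", (user,))
    return {rev for (rev,) in rows}

def revs_since(db, cutoff):
    rows = db.execute("SELECT rev FROM commits WHERE date >= ?", (cutoff,))
    return {rev for (rev,) in rows}

# Made-up history: (rev, user, date as an integer timestamp).
history = [
    (0, "alice", 100),
    (1, "bob", 200),
    (2, "alice", 300),
    (3, "alice", 400),
]

db = open_index()
first_pass = update_index(db, history[:2])   # indexes revs 0 and 1
second_pass = update_index(db, history)      # only revs 2 and 3 are new
matches = revs_by_user(db, "alice") & revs_since(db, 250)
```

Each lookup returns a plain set of revision numbers, so combining two
predicates is just a set intersection.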

What effects does this have in practice?

I ran my POC over my local clone of the mozilla-central repo, which is the
closest open-source repo in scale to ours. Here are the results of some
queries with the index up to date.

Plain Mercurial:

$ dhg --time log -q --user nnethercote --date -1300 >/dev/null
time: real 8.910 secs

My cheesy POC:

$ dhg --time log -q --user nnethercote --date -1300 >/dev/null
time: real 1.090 secs

This is currently and quite deliberately completely unfit for production,
but it illustrates how valuable revsets could be for repos above modest
size, where revsets are today glacially slow.

What would need to be done to make this proof-of-concept "real"?

   - A little more care in the schema design and use of indices.
   - Munging a parsed revset so that as much as possible of it could be
   turned into a single SQL query. With the POC, each revset results in a
   separate query, but sqlite can probably be relied upon to be much faster at
   joins than the revset machinery.
   - Result optimization – the order of operands to revset intersection
   *significantly* affects performance.
   - Possibly result pagination if the number of results is large and we
   want incremental-ish behaviour.
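
As a rough illustration of the "single SQL query" item, here is a sketch
that turns a tiny parsed expression tree into one WHERE clause. The tree
shape, the predicate names and the schema are assumptions made up for this
example; they are not Mercurial's real revset parser output or the POC's
schema.

```python
import sqlite3

# Hypothetical mapping from predicate name to a SQL fragment.
PREDICATES = {
    "user": "user = ?",
    "since": "date >= ?",
}

def to_sql(tree):
    """Translate a parsed expression tree into (where_clause, params)."""
    op = tree[0]
    if op in ("and", "or"):
        left, lparams = to_sql(tree[1])
        right, rparams = to_sql(tree[2])
        return "(%s %s %s)" % (left, op.upper(), right), lparams + rparams
    if op in PREDICATES:
        return PREDICATES[op], [tree[1]]
    # Anything else would have to fall back to the normal revset machinery.
    raise ValueError("cannot translate %r" % (op,))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE commits "
           "(rev INTEGER PRIMARY KEY, user TEXT, date INTEGER)")
db.executemany("INSERT INTO commits VALUES (?, ?, ?)", [
    (0, "alice", 100),
    (1, "bob", 200),
    (2, "alice", 300),
    (3, "alice", 400),
])

# Stand-in for something like "user(alice) and date(>250)".
tree = ("and", ("user", "alice"), ("since", 250))
where, params = to_sql(tree)
query = "SELECT rev FROM commits WHERE %s ORDER BY rev" % where
revs = [rev for (rev,) in db.execute(query, params)]
```

With the whole expression in one statement, sqlite gets to decide how to
combine the predicates, instead of the caller intersecting the results of
separate queries.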

Still, this was super fun for a couple of hours of work!