RFC: bitmap storage for precursors and phases

Bryan O'Sullivan bos at serpentine.com
Fri Feb 17 16:29:58 EST 2017


On Fri, Feb 17, 2017 at 10:30 AM, Jun Wu <quark at fb.com> wrote:

> Excerpts from Stanislau Hlebik's message of 2017-02-17 11:24:34 +0000:
> > As I said before we will load all non-public revs in one set and all
>
> The problem is, loading a Python set from disk is O(size-of-the-set).
>
> Bitmap's loading cost should be basically 0 (with mmap). I think that's why
> we want bitmap at the first place. There are other choices like packfile
> index, hash tables, but bitmap is the simplest and most efficient.
>

Hey folks,

I haven't yet seen mention of some considerations that seem very important
in driving the decision-making, so I'd appreciate it if someone could fill
me in.

Firstly, what's our current understanding of the sizes and compositions of
these sets of numbers? In theory, we have a lot of data from practical
application at Facebook, but nobody's brought it up.

To clarify why this matters, if an obsstore contains say 100 entries, then
a very naive implementation should be fine. If it's related to something
else, such as the size of, amount of activity in, or local age of a repo,
then maybe we have a different set of decisions to make.

Secondly, what operations are being performed on these sets today, and do
we know if those operations are going to change?

For instance, if a may be extended via new elements added to its end, then
an mmap-based implementation immediately becomes awkward. If a set is
traversed or we perform operations like union or intersection, then a
bitmap might make sense if the set is very small and dense, but not if it's
big and sparse.

(Jun's idea that mmap is effectively free is also not borne out in
practice, but that's a second-order issue here.)

So tell us what the needed API is, what the characteristics of the data
are, and that'll help understand how to make a good decision.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20170217/cbcc7d42/attachment.html>


More information about the Mercurial-devel mailing list