RFC: bitmap storage for precursors and phases

Sean Farley sean at farley.io
Mon Feb 13 20:04:35 EST 2017


Jun Wu <quark at fb.com> writes:

> In general, I think this is a good direction. Some random thoughts:
>
>   - general purpose
>
>     I think the bitmap is not always a cache, so it should only have
>     operations like set/unset/readfromdisk/writetodisk. Practically, I won't
>     couple cache invalidation with the bitmap implementation.
>
>     In addition, I'll try to avoid using Python-only types in the
>     interface. So once we decide to rewrite part of the implementation in
>     native C, we won't have trouble.
>
>     See "revset" below for a possibility that bitmap is used as a non-set.
>
>   - revset
>
>     This is a possibility that probably won't happen any time soon.
>
>     The revset currently uses Python set for maintaining its state. For huge
>     sets, Python sets may not be a good option. And various operations could
>     benefit from an always-topologically-sorted set, which is the bitmap.
>
>   - mmap
>     
>     My intuition is that bitmaps fit better with mmap, which can reduce the
>     loading cost. I think "vfs.mmapread" could be a thing, and various
>     places could benefit from it: Gabor recently showed interest in loading
>     revlog data by mmap, and I had patches that use mmap to read the revlog
>     index.
>
> In addition, not directly related to this series, I'm a big fan of
> single direction data flow. But the current code base does not seem to do a
> good job in this area. As we are adding more caching layers to the code
> base, it'd be nice if we have some tiny framework maintaining the dependency
> of all kinds of data, to be able to understand the data flow easily, and
> just to be more confident about loading orders. I think people more
> experienced with architecture may want to share some ideas here.

I was thinking about a more high-level approach (please feel free to
pick apart):

r = repo.filtered("bitmap1")
r2 = r.filtered("bitmap2")

So that r2 would be the intersection of bitmap1 and bitmap2 (I haven't
thought about a union or the inverse yet).
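
To make the chaining idea concrete, here is a minimal standalone sketch.
It is not Mercurial code: the Bitmap and ToyRepo classes, the fromrevs
helper, and the convention that a bitmap lists the revisions that stay
visible are all assumptions made for illustration. Each call to filtered()
ANDs the current view with the named bitmap, so chaining two calls yields
the intersection.

```python
class Bitmap:
    """Hypothetical bitmap over revision numbers, backed by a Python int."""

    def __init__(self, bits=0):
        self._bits = bits

    @classmethod
    def fromrevs(cls, revs):
        # Build a bitmap with one bit set per revision number.
        bm = cls()
        for rev in revs:
            bm.set(rev)
        return bm

    def set(self, rev):
        self._bits |= 1 << rev

    def unset(self, rev):
        self._bits &= ~(1 << rev)

    def __and__(self, other):
        # Intersection: a revision survives only if both bitmaps have it.
        return Bitmap(self._bits & other._bits)

    def revs(self):
        # Yield set revisions in increasing order.
        bits, rev = self._bits, 0
        while bits:
            if bits & 1:
                yield rev
            bits >>= 1
            rev += 1


class ToyRepo:
    """Hypothetical stand-in for a repo view; not Mercurial's repoview."""

    def __init__(self, nrevs, bitmaps, visible=None):
        self._nrevs = nrevs
        self._bitmaps = bitmaps  # name -> Bitmap of revisions to keep
        if visible is None:
            # Unfiltered: every revision 0..nrevs-1 is visible.
            visible = Bitmap((1 << nrevs) - 1)
        self._visible = visible

    def filtered(self, name):
        # Narrow the current view by the named bitmap (assumed semantics).
        narrowed = self._visible & self._bitmaps[name]
        return ToyRepo(self._nrevs, self._bitmaps, narrowed)

    def revs(self):
        return list(self._visible.revs())


bitmaps = {
    "bitmap1": Bitmap.fromrevs([0, 1, 2, 3]),
    "bitmap2": Bitmap.fromrevs([2, 3, 4]),
}
repo = ToyRepo(6, bitmaps)
r = repo.filtered("bitmap1")
r2 = r.filtered("bitmap2")
```

With these sample bitmaps, r2 ends up holding only the revisions present
in both. A union or an inverse would presumably need a different entry
point than chaining, since each filtered() call here can only narrow the
view.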

