RFC: bitmap storage for precursors and phases

Stanislau Hlebik stash at fb.com
Fri Feb 17 11:06:33 EST 2017


Excerpts from Stanislau Hlebik's message of 2017-02-17 11:24:34 +0000:
> Excerpts from Jun Wu's message of 2017-02-16 13:42:46 -0800:
> > Excerpts from Stanislau Hlebik's message of 2017-02-16 19:39:07 +0000:
> > > Excerpts from Stanislau Hlebik's message of 2017-02-14 09:29:25 +0000:
> > > > Excerpts from Sean Farley's message of 2017-02-13 18:30:25 -0800:
> > > > > Jun Wu <quark at fb.com> writes:
> > > > > 
> > > > > > Excerpts from Sean Farley's message of 2017-02-13 17:04:35 -0800:
> > > > > >> I was thinking about a more high-level approach (please feel free to
> > > > > >> pick apart):
> > > > > >> 
> > > > > >> r = repo.filtered("bitmap1")
> > > > > >> r2 = r.filtered("bitmap2")
> > > > > >> 
> > > > > >> So that r2 would be an intersection of bitmap1 and bitmap2 (haven't
> > > > > >> thought about a union nor the inverse).
> > > > > >
> > > > > > That does not conflict with my comments. It could be implemented as nested
> > > > > > filters, or flatten the bitmap by doing an "or" operation.
> > > > > 
> > > > > Righto. Just wanted to bring it up early before things are set in stone.
> > > > 
> > > > Current `repoview.filtered()` implementation can apply only one filter. I
> > > > think it will be error-prone to change it, won't it?
> > > 
> > > It seems that it's better to use sorted lists instead of bitmaps. In a
> > > couple of places it is expected that repo.filteredrevs supports
> > > iteration. But iteration over a bitmap is very slow. Instead we can
> > > store list of non-public revs and list of precursor revs and load them
> > > in a set.
> > > 
> > > It will be slow for the case where repo has lots of draft commits.
> > > In this case it's probably better to disable this feature completely.
> > 
> > I think it's fine to have slow iteration, as long as the iteration operation
> > is uncommon. The point is, the most common operation should be fast - that
> > is, ishidden(rev) should be fast. Stateful chg will help the read-only case,
> > the bitmap + mmap would ideally make write (like rebase) cases fast.
> 
> As I said before we will load all non-public revs in one set and all
> precursor revs in another set. So `ispublic()` and `isprecursor()`
> checks are very fast, it's just a set lookup. Same with update - it's
> just an insertion in the set.
> 
> > 
> > Some code can be changed - "scmutil.filteredhash" seems to be one user that
> > iterates "filteredrevs". But what it needs is only a hash - it could hash
> > something else, like the mtime, size etc.
> 
> Bookmarks, changelog, obsstore and tags can affect the filtered set.
> For a filtered repo we'll need to use size + mtime of bookmarks,
> changelog, obsstore, tags and maybe even something else. That may be
> error-prone.

This is an implementation of the two caches (nonpublic + precursor) using
serialized sorted lists and sets:
https://bitbucket.org/stashlebik/hg/commits/99879579ac2848a2567810b677d8344150a7b319?at=hiddenbitmaps_lists
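
To give the general idea without reading the whole commit, here is a rough
standalone sketch. It is not the code from the commit above: the on-disk
format (a big-endian count followed by 32-bit revision numbers) and the names
`RevCache`, `serialize_revs` and `load_revs` are made up for illustration.

```python
import struct

# Hypothetical format: 4-byte big-endian count, then that many 32-bit revs.
_HEADER = struct.Struct(">I")


def serialize_revs(revs):
    """Pack revision numbers as a sorted list of unsigned 32-bit integers."""
    ordered = sorted(set(revs))
    body = struct.pack(">%dI" % len(ordered), *ordered)
    return _HEADER.pack(len(ordered)) + body


def load_revs(data):
    """Unpack a serialized sorted list into a set for fast membership checks."""
    if not data:
        return set()
    (count,) = _HEADER.unpack_from(data, 0)
    return set(struct.unpack_from(">%dI" % count, data, _HEADER.size))


class RevCache(object):
    """In-memory set of revs, stored on disk as a sorted list (assumed layout)."""

    def __init__(self, data=b""):
        self._revs = load_revs(data)

    def __contains__(self, rev):
        # The common operation: a plain set lookup.
        return rev in self._revs

    def add(self, rev):
        # Update is a set insertion; sorting is deferred to serialization.
        self._revs.add(rev)

    def __iter__(self):
        # Iteration stays cheap when the number of cached revs is small.
        return iter(sorted(self._revs))

    def serialize(self):
        return serialize_revs(self._revs)
```

One such cache would hold the non-public revs and another the precursor revs,
so both checks are set lookups and iteration cost depends on the number of
cached revs rather than on len(repo).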

> > 
> > Bitmaps could also be smarter - like maintaining the min and max revisions
> > so it does not need to be exactly len(repo).
