RFC: bitmap storage for precursors and phases

Stanislau Hlebik stash at fb.com
Fri Feb 17 06:24:34 EST 2017


Excerpts from Jun Wu's message of 2017-02-16 13:42:46 -0800:
> Excerpts from Stanislau Hlebik's message of 2017-02-16 19:39:07 +0000:
> > Excerpts from Stanislau Hlebik's message of 2017-02-14 09:29:25 +0000:
> > > Excerpts from Sean Farley's message of 2017-02-13 18:30:25 -0800:
> > > > Jun Wu <quark at fb.com> writes:
> > > > 
> > > > > Excerpts from Sean Farley's message of 2017-02-13 17:04:35 -0800:
> > > > >> I was thinking about a more high-level approach (please feel free to
> > > > >> pick apart):
> > > > >> 
> > > > >> r = repo.filtered("bitmap1")
> > > > >> r2 = r.filtered("bitmap2")
> > > > >> 
> > > > >> So that r2 would be an intersection of bitmap1 and bitmap2 (haven't
> > > > >> thought about a union nor the inverse).
> > > > >
> > > > > That does not conflict with my comments. It could be implemented as nested
> > > > > filters, or flatten the bitmap by doing an "or" operation.
> > > > 
> > > > Righto. Just wanted to bring it up early before things are set in stone.
> > > 
> > > Current `repoview.filtered()` implementation can apply only one filter. I
> > > think it will be error-prone to change it, won't it?
> > 
> > It seems that it's better to use sorted lists instead of bitmaps. In a
> > couple of places it is expected that repo.filteredrevs supports
> > iteration. But iteration over a bitmap is very slow. Instead we can
> > store list of non-public revs and list of precursor revs and load them
> > in a set.
> > 
> > It will be slow for the case where repo has lots of draft commits.
> > In this case it's probably better to disable this feature completely.
> 
> I think it's fine to have slow iteration, as long as the iteration operation
> is uncommon. The point is, the most common operation should be fast - that
> is, ishidden(rev) should be fast. Stateful chg will help the read-only case,
> the bitmap + mmap would ideally make write (like rebase) cases fast.

As I said before, we will load all non-public revs into one set and all
precursor revs into another set. So the `ispublic()` and `isprecursor()`
checks are very fast: each one is just a set lookup. The same goes for
updates: an update is just an insertion into the set.
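
To make that concrete, here is a rough sketch of what I have in mind. The
class name `revsetcache` and all of its methods are made up for
illustration (nothing like this exists in the tree), and reading or
writing the on-disk lists is left out.

```python
class revsetcache(object):
    """Hypothetical in-memory cache; all names are illustrative only."""

    def __init__(self, nonpublicrevs, precursorrevs):
        # Both arguments are iterables of revision numbers, for example
        # the sorted lists read from disk.
        self._nonpublic = set(nonpublicrevs)
        self._precursors = set(precursorrevs)

    def ispublic(self, rev):
        # A rev is public when it is not in the non-public set.
        return rev not in self._nonpublic

    def isprecursor(self, rev):
        return rev in self._precursors

    def addnonpublic(self, rev):
        # Update is a plain set insertion.
        self._nonpublic.add(rev)

    def markpublic(self, rev):
        # discard() does nothing if the rev is already public.
        self._nonpublic.discard(rev)

    def addprecursor(self, rev):
        self._precursors.add(rev)

    def tolists(self):
        # Sorted lists are what would be written back to disk.
        return sorted(self._nonpublic), sorted(self._precursors)
```

Memory use grows with the number of non-public revs, which is why a repo
with lots of draft commits is the bad case.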

> 
> Some code can be changed - "scmutil.filteredhash" seems to be one user that
> iterates "filteredrevs". But what it needs is only a hash - it could hash
> something else, like the mtime, size etc.

Bookmarks, changelog, obsstore and tags can all affect the filtered set.
For a filtered repo we'll need to use the size + mtime of bookmarks,
changelog, obsstore, tags and maybe even something else. That may be
error-prone.
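
For illustration only, a stat-based hash could look roughly like the
sketch below. `statfingerprint` is a hypothetical helper, not an existing
function, and the caller is assumed to pass in every file that can
influence filtering.

```python
import hashlib
import os


def statfingerprint(paths):
    """Hypothetical helper: hash the (size, mtime) of each given path.

    A missing file is hashed as "missing" so that creating or deleting
    it also changes the result.
    """
    h = hashlib.sha1()
    for path in sorted(paths):
        try:
            st = os.stat(path)
            entry = "%s %d %d\n" % (path, st.st_size, st.st_mtime_ns)
        except OSError:
            entry = "%s missing\n" % path
        h.update(entry.encode("utf-8"))
    return h.hexdigest()
```

The weak point is the list of paths: leave out one file that influences
filtering and the hash silently goes stale.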
> 
> Bitmaps could also be smarter - like maintaining the min and max revisions
> so it does not need to be exactly len(repo).

