[PATCH 02 of 22] radixlink: introduce a new data structure

Jun Wu quark at fb.com
Wed Jun 7 11:37:19 EDT 2017


Excerpts from Yuya Nishihara's message of 2017-06-08 00:21:58 +0900:
> [...] 
> I was thinking that the size of the obsstore index would be quite big, and
> reading/writing/jumping-around-in-memory it would be somewhat costly. If
> you have an idea to mitigate these costs, shrinking the cache size
> wouldn't be important.

If we store an int32 for each key, and have a key reader function that reads
the 20 bytes from the obsstore raw data, the index will be very efficient.
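
To illustrate the idea, here is a rough sketch. It is not code from this
series. It assumes the int32 is a byte offset into the raw data, and the
names (make_key_reader, pack_offsets, unpack_offsets) and the toy buffer
are made up; the real obsstore layout is more involved.

```python
import struct

NODE_LEN = 20  # length of a binary SHA-1 node, in bytes


def make_key_reader(rawdata):
    """Return a function mapping an int32 offset to the 20-byte key there."""
    def readkey(offset):
        if offset < 0 or offset + NODE_LEN > len(rawdata):
            raise ValueError("offset %d is out of range" % offset)
        return rawdata[offset:offset + NODE_LEN]
    return readkey


def pack_offsets(offsets):
    """Serialize offsets as big-endian int32: 4 bytes per key, not 20."""
    return struct.pack(">%di" % len(offsets), *offsets)


def unpack_offsets(data):
    """Inverse of pack_offsets."""
    return list(struct.unpack(">%di" % (len(data) // 4), data))


# Toy "raw data": a 3-byte header followed by two made-up 20-byte nodes.
raw = b"hdr" + b"\x11" * NODE_LEN + b"\x22" * NODE_LEN

index = pack_offsets([3, 23])   # the index only holds the offsets
readkey = make_key_reader(raw)  # full keys are read back on demand
keys = [readkey(off) for off in unpack_offsets(index)]
```

The index shrinks from 20 to 4 bytes per key here, at the cost of one
extra read into the raw data whenever a full key is needed.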

I think the biggest remaining perf problem (well, 20ms ...) in real-world
obsstore usage after this series comes from unnecessary computation around
'nonpublic()': today we calculate hidden revisions. Ideally we would only
check visible nonpublic revisions.

So I feel the right direction, for efficiency, is to make hidden a lower-layer
concept, lower than obsstore and phases. I'm looking forward to Durham's
improvements in this area.

> > > [...]
> > > Nit: I believe we'll soon need "hg debugradixlink" command. How about splitting
> > > the module into core radixlink + pure impl + cext impl?
> > 
> > The radixlink type needs to be in C to make overhead minimal (so __getitem__
> > won't need a hash lookup). I think it might be fine to duplicate struct
> > definitions in the debug command.
> 
> I meant mercurial/radixlink.py could import pure/ attributes to implement
> dumpradixlink().

I later realized we can assign the native type to another Python variable in
pure code. Good idea.

