D7894: nodemap: introduce an option to use mmap to read the nodemap mapping

marmoute (Pierre-Yves David) phabricator at mercurial-scm.org
Fri Jan 17 11:30:40 EST 2020


marmoute added a comment.


  In D7894#116540 <https://phab.mercurial-scm.org/D7894#116540>, @martinvonz wrote:
  
  > In D7894#116407 <https://phab.mercurial-scm.org/D7894#116407>, @marmoute wrote:
  >
  >> In D7894#116397 <https://phab.mercurial-scm.org/D7894#116397>, @martinvonz wrote:
  >>
  >>> How much does this patch help performance?
  >>> I would also like to see performance numbers (even just rough ones) for the Rust version. Sorry about a possibly stupid question, but why will this on-disk nodemap be faster than building it from the index? Is it that the file is smaller and thus faster to read? Or is it more the building of the tree than the reading that's slow? You mentioned you use some private repo for testing this. How large is the `00changelog.n` file in that repo and how large is `00changelog.i`?
  >>
  >> This saves hundreds of milliseconds at initialization of large repositories. The repositories we are looking at have about 2 million revisions (but this will help smaller repositories too). Mozilla try is a public repository in that range. Its 00changelog.i is 103MB.
  >
  > And 00changelog.n?
  
  
  
    106973952 bytes for changelog.i
     83123200 bytes for the nodemap rawfiles
  
  
  
  >> The information in the .i files is just a flat list of nodes. So anything that needs a mapping has to build it. Building a mapping for millions of revisions is slow (I think Georges mentioned 300ms to build the mozilla-try nodemap). The nodemap we write on disk is directly usable as such. So we just need to mmap the files (mostly instant if the repository has been busy recently, e.g. on a server) and directly query the data from disk.
  >
  > Ah, that's what I was wondering. I was wondering while reviewing this series if your plan was to lazily read from disk, but I didn't see any mention of that. I guess this mmap business should have been a hint :)
  
  Yes, so what we are doing is not really "serialization", since we never actually "deserialize" it in practice.
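
  To make the difference concrete, here is a rough sketch of the two approaches: building a mapping from a flat list of nodes versus querying a pre-built file through mmap. This is not the actual nodemap format or code. The real on-disk data is tree-shaped, while this sketch swaps in a sorted table of fixed-width records searched by bisection, just to keep it short. The record layout and the names `fake_nodes`, `build_mapping`, `write_sorted_table`, `lookup` and `demo` are all made up for illustration.

```python
import hashlib
import mmap
import os
import struct
import tempfile

NODE_LEN = 20  # length of a SHA-1 node id
# Hypothetical record layout (an assumption, not the real format):
# 20-byte node followed by a 4-byte big-endian revision number.
RECORD = struct.Struct(">20sI")


def fake_nodes(count):
    """Stand-in for the flat list of nodes found in the index."""
    return [hashlib.sha1(b"%d" % rev).digest() for rev in range(count)]


def build_mapping(nodes):
    """Without persisted data, every process pays this O(n) cost at startup."""
    return {node: rev for rev, node in enumerate(nodes)}


def write_sorted_table(path, nodes):
    """Persist (node, rev) records sorted by node so they can be searched in place."""
    with open(path, "wb") as fp:
        for node, rev in sorted((n, r) for r, n in enumerate(nodes)):
            fp.write(RECORD.pack(node, rev))


def lookup(data, node):
    """Bisect directly over the mmap'ed bytes; nothing is loaded up front."""
    lo, hi = 0, len(data) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        offset = mid * RECORD.size
        candidate = data[offset:offset + NODE_LEN]
        if candidate == node:
            return RECORD.unpack_from(data, offset)[1]
        if candidate < node:
            lo = mid + 1
        else:
            hi = mid
    return None


def demo(count=1000):
    """Check that the mmap lookups agree with the in-memory mapping."""
    nodes = fake_nodes(count)
    expected = build_mapping(nodes)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nodes.table")
        write_sorted_table(path, nodes)
        with open(path, "rb") as fp:
            with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as data:
                found = [lookup(data, node) for node in nodes]
                missing = lookup(data, b"\0" * NODE_LEN)
    return found == [expected[n] for n in nodes], missing
```

  The point is that `lookup` only touches the few pages it needs, so opening the file costs about the same regardless of how many revisions it holds, whereas `build_mapping` has to walk every node before the first query can be answered.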

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7894/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7894

To: marmoute, #hg-reviewers
Cc: gracinet, martinvonz, mercurial-devel

