[PATCH] changelog: load pending file directly

Jun Wu quark at fb.com
Sun May 14 12:22:57 EDT 2017


Excerpts from Gregory Szorc's message of 2017-05-13 23:20:21 -0700:
> [...]
> > Instead of marking part of the operations "read-only", I wonder if we can
> > do the opposite - repo is immutable by default, write operations need to be
> > marked explicitly. We already have repo.transaction, which provides the
> > information already.
> >
> 
> I agree that the repo should be read-only by default and that some kind of
> function call is necessary to transform it into mutable/slow mode.

I tried replacing "def lock" in repoview to switch between the fast and slow
paths. Sadly, there are a few test failures, namely test-clone. I guess we
still have some code paths that write without proper locking, and those need
to be cleaned up first.
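
To illustrate the idea, here is a toy sketch (not Mercurial code; ToyRepo and
its methods are made up for illustration). The object is read-only by default,
taking the lock is the explicit switch to the mutable/slow path, and a write
outside the lock fails loudly, which is roughly how an unlocked writer would
show up as a test failure:

```python
import contextlib


class ToyRepo:
    """Hypothetical stand-in for a repo object; not Mercurial's API."""

    def __init__(self):
        self._mutable = False  # read-only (fast path) by default
        self._data = {}

    @contextlib.contextmanager
    def lock(self):
        # Taking the lock is the explicit switch to the mutable/slow path.
        previous = self._mutable
        self._mutable = True
        try:
            yield self
        finally:
            # Restore the previous mode so nested locks behave sensibly.
            self._mutable = previous

    def write(self, key, value):
        if not self._mutable:
            # A writer that forgot to lock is caught here.
            raise RuntimeError("write without lock: %r" % (key,))
        self._data[key] = value

    def read(self, key):
        # Reads are always allowed and never need the lock.
        return self._data.get(key)
```

Usage would be "with repo.lock(): repo.write(...)"; anything that writes
without the "with" block raises instead of silently using the fast path.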

> [...]
> Yes, that's certainly possible (and is something I've considered
> implementing). However, we do need hints to go to the next level. For
> example, if you know you will be reading most/all revisions, you can do
> things like fetch all the data and decompress all chunks parallel. That's
> more efficient than going back and forth between I/O and processing.
> python-zstandard has APIs for parallel decompression BTW. If the right
> revlog APIs/hints are in place, revlog read performance can speed up
> drastically. Of course, reading *everything* doesn't scale to infinity. But
> you can set batch size to something ridiculously large like 200 MB to get
> the benefits on 99% of revlogs.

Agree. Some related notes:

  1. Currently, accessing the changelog index requires holding the GIL. Maybe
     re-inventing the changelog index features incrementally without the
     CPython API (in clean C or Rust) is the sane way to move forward.
  2. If revlog could spawn a thread pool implicitly, then chg needs a way to
     stop those threads. Otherwise the threads will get lost at fork time,
     since a forked child only keeps the thread that called fork.
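
A rough sketch of what point 2 could look like (hypothetical names throughout;
zlib stands in for the real compression engine, and nothing here is existing
revlog or chg API). The pool is created lazily on the first batch read, and
stop_threads() is the hook that a forking caller such as chg would invoke
before os.fork():

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class ChunkReader:
    """Hypothetical helper that decompresses a batch of chunks in threads."""

    def __init__(self, workers=4):
        self._workers = workers
        self._pool = None  # created lazily, on the first batch read

    def decompress_all(self, chunks):
        if self._pool is None:
            self._pool = ThreadPoolExecutor(max_workers=self._workers)
        # Executor.map() yields results in input order.
        return list(self._pool.map(zlib.decompress, chunks))

    def stop_threads(self):
        # Meant to be called before os.fork(); the pool is recreated on
        # the next decompress_all() call.
        if self._pool is not None:
            self._pool.shutdown(wait=True)
            self._pool = None
```

The explicit stop_threads() is the point here: with an implicit pool, the
forking side has no other way to know that worker threads exist.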

> [...]

