Making chg stateful

Jun Wu quark at fb.com
Sun Feb 5 03:44:59 EST 2017


Excerpts from Yuya Nishihara's message of 2017-02-05 13:31:29 +0900:
> > > Since we shouldn't pass repo to revlog (it's layering violation), I think
> > > we'll need a thin wrapper for chgcache anyway.
> > 
> > I mentioned this in the second mail, "4) Where to get preloaded results (in
> > worker)", we could just expose some kind of global state, like a
> > "globalcache" module.
> 
> Does it mean any low-level objects will directly access to the global cache?
> That seems ugly (but maybe I'm biased as I'm really allergic to global data.)

Yes. But I don't think the alternative will be better. To avoid using global
state, existing functions need to take extra arguments. That's also
contagious - to preload some low-level objects, all functions through the
call stack need change.

I think the root cause of the fact that global state is considered bad is
because it's mutable by random code, and could make things hard to predict
or debug. We can enforce it to be only mutable by the chg preloading API, so
people cannot modify its state outside the preloading framework. That sounds
good enough to me.

> > > > Things to consider:
> > > > 
> > > >   a) Objects being preloaded have dependency - ex. the obsstore depends on
> > > >      changelog and other things. The preload functions run in a defined
> > > >      order.
> > > 
> > > Maybe dependencies could be passed as arguments?
> > 
> > Ideally, these expensive calculating (ex. obsstore) could be moved to the
> > index object. In the reality, that requires too much work, and obsstore
> > preloading requires a subset of "repo", including "repo.revs".
> > 
> > It's possible to decouple obsstore preloading from the repo object, but that
> > could be a lot of work too.
> 
> My opinion for obsstore is that the most costly part would be populating 100k+
> objects from file, and the other costly parts could be mitigated by some higher-
> level cache in repoview.py.
> 
> But I think this topic was discussed thoroughly between you and pyd before.
> I'm not intended to bring it up again.

Not discussed. I did mention a plan to move it to the index, but that'll
surely take a very long time.

There are multiple levels of logic related to obsstore, namely,
obsstore.getrevs, where the current logic uses revsets (thus needs a heavy
repo object). And repoview.compute*, which calls obsstore.getrevs and does
some extra simple caching and calculation.

Even without obsstore, other things (phases, bookmarks) depend on the
changelog. Therefore I think dependency problem must be solved, if we want
to have more things preloaded to achieve the best perf.

> > > >   b) The index file is not always a single file, depending on "vfs".
> > > 
> > > Yes. vfs could be owned by storage/chgcache class.
> > > 
> > > >   c) The user may want to control what to preload. For example, if they have
> > > >      an incompatible manifest, they could make changelog preloaded, but not
> > > >      manifest.
> > > 
> > > No idea about (c).
> > > 
> > > >   d) Users can add other preloading items easily, not only just the
> > > >      predefined ones.
> > > 
> > > So probably we'll need an extensible table of preloadable items.
> > 
> > If you check my prototype code, it's using a registrar to collect all
> > @preload functions.
> 
> Yes. I wanted to say we would need this kind of abstraction anyway.


More information about the Mercurial-devel mailing list