Making chg stateful
Yuya Nishihara
yuya at tcha.org
Fri Feb 3 10:11:22 EST 2017
On Thu, 2 Feb 2017 16:56:11 +0000, Jun Wu wrote:
> Excerpts from Yuya Nishihara's message of 2017-02-03 00:45:22 +0900:
> > On Thu, 2 Feb 2017 09:34:47 +0000, Jun Wu wrote:
> > > So what state do we store?
> > >
> > > {repopath: {name: (hash, content)}}. For example:
> > >
> > > cache = {'/home/foo/repo1': {'index': ('hash', changelogindex),
> > > 'bookmarks': ('hash', bookmarks),
> > > .... },
> > > '/home/foo/repo2': { .... }, .... }
> > >
> > > The main ideas here are:
> > > 1) Store the lowest level objects, like the C changelog index.
> > > Because higher level objects could be changed by extensions in
> > > unpredictable ways. (this is not true in my hacky prototype though)
> > > 2) Hash everything. For changelog, it's like the file stat of
> > > changelog.i. There must be a strong guarantee that the hash matches
> > > the content, which could be challenging, but not impossible. I'll
> > > cover more details below.
> > >
> > > The cache is scoped by repo to make the API simpler and easier to use. It may
> > > be interesting to have some global state (like passing back the extension
> > > path to import them at runtime).
> >
> > Regarding this and "2) Side-effect-free repo", can or should we design the API
> > as something like a low-level storage interface? That will allow us to not
> > make repo/revlog know too much about chg.
> >
> > I don't have any concrete idea, but that would work as follows:
> >
> > 1. chg injects an object to select storage backends
> > e.g. repo.storage = chgpreloadable(repo.storage, cache)
> > 2. repo passes it to revlog, etc.
> > 3. revlog uses it to read indexfile, where in-memory cache may be returned
> > e.g. storage.parserevlog(indexfile)
> >
> > Perhaps, this 'storage' object is similar to the one you call 'baserepository'.
>
> I'm not sure if I get the idea (probably not). What does the implementation
> in the master server look like?
I was just thinking about how to hack the real repo object without introducing
much mess. Perhaps the master server wouldn't be that different from your idea.
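To make steps 1-3 a bit more concrete, here is a minimal sketch, not a real
Mercurial API. The names plainstorage, _stathash and the cache layout are
assumptions for illustration; only chgpreloadable and parserevlog come from
the text above. A stat tuple stands in for the "hash", which by itself is not
the strong guarantee mentioned earlier.

```python
import os


class plainstorage(object):
    """Hypothetical default backend: always reads the file from disk."""

    def parserevlog(self, indexfile):
        with open(indexfile, 'rb') as f:
            return f.read()  # stand-in for a real parsed revlog index


class chgpreloadable(object):
    """Hypothetical wrapper that serves in-memory content on a hash match."""

    def __init__(self, storage, cache):
        self._storage = storage
        self._cache = cache  # assumed layout: {name: (hash, content)}

    @staticmethod
    def _stathash(path):
        # Assumption: a stat tuple is used as a cheap "hash" of the file.
        st = os.stat(path)
        return (st.st_size, st.st_mtime_ns, st.st_ino)

    def parserevlog(self, indexfile):
        h = self._stathash(indexfile)
        entry = self._cache.get(indexfile)
        if entry is not None and entry[0] == h:
            return entry[1]  # cache hit: skip parsing
        content = self._storage.parserevlog(indexfile)
        self._cache[indexfile] = (h, content)
        return content
```

With this shape, revlog only sees an object with parserevlog() and never
needs to know whether chg is involved.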
> It feels more like "repo.chgcache" to me and the difference is that the
> vanilla hg will be changed to access objects via it (so the interface looks
> more consistent).
Yeah, it might be like repo.chgcache.
Since we shouldn't pass repo to revlog (it's a layering violation), I think
we'll need a thin wrapper for chgcache anyway.
> Things to consider:
>
> a) Objects being preloaded have dependency - ex. the obsstore depends on
> changelog and other things. The preload functions run in a defined
> order.
Maybe dependencies could be passed as arguments?
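A rough sketch of what "passed as arguments" could mean, assuming each
preload function declares the names it depends on and receives the loaded
objects positionally. All names here (preloaders, preload, loadchangelog,
loadobsstore) are hypothetical, and there is no cycle detection.

```python
def loadchangelog():
    return ['rev0', 'rev1']  # stand-in for a real changelog object


def loadobsstore(changelog):
    # The dependency arrives as an argument instead of via the repo object.
    return {'markers': len(changelog)}


# Assumed layout: name -> (names of dependencies, loader function)
preloaders = {
    'changelog': ((), loadchangelog),
    'obsstore': (('changelog',), loadobsstore),
}


def preload(name, loaded):
    """Load 'name' after its dependencies, reusing anything already loaded."""
    if name not in loaded:
        deps, func = preloaders[name]
        args = [preload(dep, loaded) for dep in deps]
        loaded[name] = func(*args)
    return loaded[name]
```

The load order then falls out of the declared dependencies rather than
being hard-coded.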
> b) The index file is not always a single file, depending on "vfs".
Yes. vfs could be owned by storage/chgcache class.
> c) The user may want to control what to preload. For example, if they have
> an incompatible manifest, they could make changelog preloaded, but not
> manifest.
No idea about (c).
> d) Users can add other preloading items easily, not only just the
> predefined ones.
So probably we'll need an extensible table of preloadable items.
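One possible shape for such a table, as a sketch only: a module-level dict
plus a registration decorator, so an extension can add its own item the same
way the predefined ones are added. preloadtable, preloadable and preloadall
are made-up names, not existing Mercurial functions.

```python
preloadtable = {}  # hypothetical: name -> loader function


def preloadable(name):
    """Decorator that registers a loader under the given name."""
    def decorator(func):
        preloadtable[name] = func
        return func
    return decorator


@preloadable('index')
def loadindex(repopath):
    return 'index of %s' % repopath  # stand-in for the real object


# An extension could register an extra item in exactly the same way.
@preloadable('myext-data')
def loadmyextdata(repopath):
    return 'myext data of %s' % repopath


def preloadall(repopath):
    """Run every registered loader for one repo, in name order."""
    return {name: func(repopath)
            for name, func in sorted(preloadtable.items())}
```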
> I think "storage.parserevlog(indexfile)" (treating the index separately,
> without a repo object) may have trouble dealing with "a)".