[PATCH 1 of 6 RFC] manifest: introduce an accessor class for manifests

Fri Nov 11 18:56:19 EST 2016

Excerpts from Pierre-Yves David's message of 2016-11-10 14:53:51 +0000:
> 
> On 11/08/2016 03:53 PM, Durham Goode wrote:
> > On 11/7/16 12:41 PM, Pierre-Yves David wrote:
> >
> >> On 11/05/2016 02:05 PM, Yuya Nishihara wrote:
> >>> On Thu, 3 Nov 2016 15:27:37 -0700, Durham Goode wrote:
> >>>> # HG changeset patch
> >>>> # User Durham Goode <durham at fb.com>
> >>>> # Date 1478208817 25200
> >>>> #      Thu Nov 03 14:33:37 2016 -0700
> >>>> # Branch stable
> >>>> # Node ID 1788ee9e1df92ac94b9be84eac6d16e3bad903a9
> >>>> # Parent  b9f7b0c10027764cee77f9c6d61877fcffea837f
> >>>> manifest: introduce an accessor class for manifests
> >>>>
> >>>> This introduces a revlogaccessor class which can be used to allow
> >>>> multiple
> >>>> objects hold an auto-invalidating reference to a revlog, without
> >>>> having to hold
> >>>> a reference to the actual repo object. Future patches will switch
> >>>> repo.manifest
> >>>> and repo.manifestlog to access the manifest through this accessor.
> >>>> This will fix
> >>>> the circular reference caused by manifestlog and manifestctx holding
> >>>> a reference
> >>>> to the repo
> >>>>
> >>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> >>>> --- a/mercurial/localrepo.py
> >>>> +++ b/mercurial/localrepo.py
> >>>> @@ -514,6 +514,11 @@ class localrepository(object):
> >>>>          # manifest creation.
> >>>>          return manifest.manifest(self.svfs)
> >>>>
> >>>> +    @unfilteredpropertycache
> >>>> +    def manifestaccessor(self):
> >>>> +        return revlogaccessor('00manifest.i', self.svfs,
> >>>> +                              self._constructmanifest)
> >>>
> >>> Honestly I don't get why manifestlog and manifestctxs have to live
> >>> longer
> >>> than the other repo properties.
> >>
> >> I'm also a bit curious about that.
> > From a user expectation perspective, I'm trying to offer the same
> > experience that changectx currently promises.  i.e. a user can hold on
> > to the changectx and access it later, regardless of what locks or
> > transactions happened in the interim.  Arguably changectx does a poor
> > job of this (since you have to access some data first in order for it to
> > be cached), but from a caller's perspective it usually "just works".  So
> > the lifetime of a changectx is already longer than the actual repo or
> > the specific changelog revlog instance.  So this change just makes
> > manifestctx appear to have that same lifetime.
> 
> I'm not sure how this is actually True for the changectx themself, This 
> might work for the most part by accident but I'm unsure this is an 
> explicit goal and (non-)API garantee. Do we actually explicitly rely on 
> that anywhere in the codebase?

I'd like note that keeping changectx objects across strip is definitely a
disaster. It's also problematic if the changectx got stripped for an aborted
transaction.

In general, I don't think keeping changesets across repo objects is a good
idea. Although I could be convinced by good use-cases.