[PATCH 1 of 6 RFC] manifest: introduce an accessor class for manifests

Fri Nov 11 04:24:26 EST 2016


On 11/10/16 2:53 PM, Pierre-Yves David wrote:
>
>
> On 11/08/2016 03:53 PM, Durham Goode wrote:
>> On 11/7/16 12:41 PM, Pierre-Yves David wrote:
>>
>>> On 11/05/2016 02:05 PM, Yuya Nishihara wrote:
>>>> On Thu, 3 Nov 2016 15:27:37 -0700, Durham Goode wrote:
>>>>> # HG changeset patch
>>>>> # User Durham Goode <durham at fb.com>
>>>>> # Date 1478208817 25200
>>>>> #      Thu Nov 03 14:33:37 2016 -0700
>>>>> # Branch stable
>>>>> # Node ID 1788ee9e1df92ac94b9be84eac6d16e3bad903a9
>>>>> # Parent  b9f7b0c10027764cee77f9c6d61877fcffea837f
>>>>> manifest: introduce an accessor class for manifests
>>>>>
>>>>> This introduces a revlogaccessor class which can be used to allow
>>>>> multiple
>>>>> objects hold an auto-invalidating reference to a revlog, without
>>>>> having to hold
>>>>> a reference to the actual repo object. Future patches will switch
>>>>> repo.manifest
>>>>> and repo.manifestlog to access the manifest through this accessor.
>>>>> This will fix
>>>>> the circular reference caused by manifestlog and manifestctx holding
>>>>> a reference
>>>>> to the repo
>>>>>
>>>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>>>> --- a/mercurial/localrepo.py
>>>>> +++ b/mercurial/localrepo.py
>>>>> @@ -514,6 +514,11 @@ class localrepository(object):
>>>>>          # manifest creation.
>>>>>          return manifest.manifest(self.svfs)
>>>>>
>>>>> +    @unfilteredpropertycache
>>>>> +    def manifestaccessor(self):
>>>>> +        return revlogaccessor('00manifest.i', self.svfs,
>>>>> +                              self._constructmanifest)
>>>>
>>>> Honestly I don't get why manifestlog and manifestctxs have to live
>>>> longer
>>>> than the other repo properties.
>>>
>>> I'm also a bit curious about that.
>> From a user expectation perspective, I'm trying to offer the same
>> experience that changectx currently promises.  i.e. a user can hold on
>> to the changectx and access it later, regardless of what locks or
>> transactions happened in the interim.  Arguably changectx does a poor
>> job of this (since you have to access some data first in order for it to
>> be cached), but from a caller's perspective it usually "just works".  So
>> the lifetime of a changectx is already longer than the actual repo or
>> the specific changelog revlog instance.  So this change just makes
>> manifestctx appear to have that same lifetime.
>
> I'm not sure how this is actually True for the changectx themself, 
> This might work for the most part by accident but I'm unsure this is 
> an explicit goal and (non-)API garantee. Do we actually explicitly 
> rely on that anywhere in the codebase?
>
> In all cases, I see changectx significantly higher level than 
> manifestct in the API. Many part of the code needs a "rich" object to 
> represent a change but very few place actually want to look and 
> manipulate the manifest directly. I'm still in the dark regarding you 
> actual plan here, why do you need to augmented the capability of the 
> manifestctx object?
>
> (note, "user" here is a bit confusing because I don't expect user to 
> be ever aware of any code internal, we are talking about caller or 
> extension writer here, right?)
Alright, I'll just send a patch that makes it pass around the 
manifestrevlog instead of the repo to the manifestlog and manifestctx. I 
think it may lead to subtle bugs in the future (like when attempting to 
access an old manifestctx), but it's probably not a big deal for now.