[PATCH 1 of 6 RFC] manifest: introduce an accessor class for manifests

Pierre-Yves David pierre-yves.david at ens-lyon.org
Thu Nov 10 09:53:51 EST 2016



On 11/08/2016 03:53 PM, Durham Goode wrote:
> On 11/7/16 12:41 PM, Pierre-Yves David wrote:
>
>> On 11/05/2016 02:05 PM, Yuya Nishihara wrote:
>>> On Thu, 3 Nov 2016 15:27:37 -0700, Durham Goode wrote:
>>>> # HG changeset patch
>>>> # User Durham Goode <durham at fb.com>
>>>> # Date 1478208817 25200
>>>> #      Thu Nov 03 14:33:37 2016 -0700
>>>> # Branch stable
>>>> # Node ID 1788ee9e1df92ac94b9be84eac6d16e3bad903a9
>>>> # Parent  b9f7b0c10027764cee77f9c6d61877fcffea837f
>>>> manifest: introduce an accessor class for manifests
>>>>
>>>> This introduces a revlogaccessor class which can be used to allow
>>>> multiple
>>>> objects hold an auto-invalidating reference to a revlog, without
>>>> having to hold
>>>> a reference to the actual repo object. Future patches will switch
>>>> repo.manifest
>>>> and repo.manifestlog to access the manifest through this accessor.
>>>> This will fix
>>>> the circular reference caused by manifestlog and manifestctx holding
>>>> a reference
>>>> to the repo
>>>>
>>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>>> --- a/mercurial/localrepo.py
>>>> +++ b/mercurial/localrepo.py
>>>> @@ -514,6 +514,11 @@ class localrepository(object):
>>>>          # manifest creation.
>>>>          return manifest.manifest(self.svfs)
>>>>
>>>> +    @unfilteredpropertycache
>>>> +    def manifestaccessor(self):
>>>> +        return revlogaccessor('00manifest.i', self.svfs,
>>>> +                              self._constructmanifest)
>>>
>>> Honestly I don't get why manifestlog and manifestctxs have to live
>>> longer
>>> than the other repo properties.
>>
>> I'm also a bit curious about that.
> From a user expectation perspective, I'm trying to offer the same
> experience that changectx currently promises.  i.e. a user can hold on
> to the changectx and access it later, regardless of what locks or
> transactions happened in the interim.  Arguably changectx does a poor
> job of this (since you have to access some data first in order for it to
> be cached), but from a caller's perspective it usually "just works".  So
> the lifetime of a changectx is already longer than the actual repo or
> the specific changelog revlog instance.  So this change just makes
> manifestctx appear to have that same lifetime.

I'm not sure how this is actually True for the changectx themself, This 
might work for the most part by accident but I'm unsure this is an 
explicit goal and (non-)API garantee. Do we actually explicitly rely on 
that anywhere in the codebase?

In all cases, I see changectx significantly higher level than manifestct 
in the API. Many part of the code needs a "rich" object to represent a 
change but very few place actually want to look and manipulate the 
manifest directly. I'm still in the dark regarding you actual plan here, 
why do you need to augmented the capability of the manifestctx object?

(note, "user" here is a bit confusing because I don't expect user to be 
ever aware of any code internal, we are talking about caller or 
extension writer here, right?)

>>> But suppose that is necessary, I agree we'll
>>> need this kind of a wrapper.
>>
>> Any reason why we are using the wrapper approach over using a weak
>> reference ? Weakref are not great but we use them in multiple spot
>> when needed. this seems it would be simpler than the current approach,
>> but I might be missing something.
> A weak reference to the localrepo instance?  I'd be fine with that, but
> I figured the community wouldn't like that since it it means the code is
> still circular (even if the memory references aren't).  If both of you
> are ok with the weak reference, I can send that as a series instead.
>>
>>> Maybe we can move the accessor to the manifestlog,
>>> but still the accessor will have to be shared (and updated
>>> transparently) by
>>> the manifestlog and its cachable manifestctxs.
>>>
>>>> +def revlogaccessor(filename, opener, constructor):
>>>> +    """Creates an accessor that provides cached and invalidated
>>>> access to a
>>>> +    revlog, via instance.revlog. This is useful for letting
>>>> multiple objects
>>>> +    hold a reference to the revlog, without having to hold a
>>>> possibly-circular
>>>> +    reference to the actual repository.  """
>>>> +
>>>> +    # We have to use a runtime type here, because the only way to
>>>> create a
>>>> +    # property is to put it on a class itself, and the property is
>>>> dynamically
>>>> +    # defined by the filename parameter.
>>>> +    class accessor(object):
>>>
>>> Perhaps we could refactor filecache to avoid dynamically creating a
>>> class, but
>>> that would be a minor issue of this RFC series.

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list