[PATCH 3 of 5 V2] manifest: introduce manifestlog and manifestctx classes
Martin von Zweigbergk
martinvonz at google.com
Mon Aug 22 18:25:25 EDT 2016
On Mon, Aug 22, 2016 at 2:57 PM, Durham Goode <durham at fb.com> wrote:
>
>
> On 8/22/16 11:24 AM, Martin von Zweigbergk wrote:
>>
>> On Wed, Aug 17, 2016 at 2:07 PM, Durham Goode <durham at fb.com> wrote:
>>>
>>> # HG changeset patch
>>> # User Durham Goode <durham at fb.com>
>>> # Date 1471465513 25200
>>> # Wed Aug 17 13:25:13 2016 -0700
>>> # Node ID 00f8d3832aad368660c69eff4be3e03a5568aebe
>>> # Parent 8ddbe86953023c40beb7be9dcb8025d1055813c5
>>> manifest: introduce manifestlog and manifestctx classes
>>>
>>> This is the start of a large refactoring of the manifest class. It
>>> introduces
>>> the new manifestlog and manifestctx classes which will represent the
>>> collection
>>> of all manifests and individual instances, respectively.
>>>
>>> Future patches will begin to convert usages of repo.manifest to
>>> repo.manifestlog, adding the necessary functionality to manifestlog and
>>> instance
>>> as they are needed.
>>
>> I'd appreciate seeing patch 5/5 squashed into this one. I don't like
>> introducing unused code like this without even tests (IIUC).
>>
>> And are we dropping 4/5? I'm not sure what the status of your
>> discussion with foozy is.
>
> I'd rather keep patch 4 until the fix lands in core, otherwise I'm not
> blocked on that diff. If I squash 5/5 into this one, then I have to squash
> 4 in as well. I can go ahead and do that, but that makes the patch a bit
> more dense.
Oh, I didn't even realize that 5/5 was blocked by 4/5, but that makes
sense now. I'll start tests now and queue this once they've finished.
Then foozy can decide whether to revert 4/5 later.
>
>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>> --- a/mercurial/localrepo.py
>>> +++ b/mercurial/localrepo.py
>>> @@ -504,6 +504,10 @@ class localrepository(object):
>>> def manifest(self):
>>> return manifest.manifest(self.svfs)
>>>
>>> + @storecache('00manifest.i')
>>> + def manifestlog(self):
>>> + return manifest.manifestlog(self.svfs, self.manifest)
>>> +
>>> @repofilecache('dirstate')
>>> def dirstate(self):
>>> return dirstate.dirstate(self.vfs, self.ui, self.root,
>>> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
>>> --- a/mercurial/manifest.py
>>> +++ b/mercurial/manifest.py
>>> @@ -914,6 +914,70 @@ class manifestrevlog(revlog.revlog):
>>> super(manifestrevlog, self).clearcaches()
>>> self._fulltextcache.clear()
>>>
>>> +class manifestlog(object):
>>> + """A collection class representing the collection of manifest
>>> snapshots
>>> + referenced by commits in the repository.
>>> +
>>> + In this situation, 'manifest' refers to the abstract concept of a
>>> snapshot
>>> + of the list of files in the given commit. Consumers of the output of
>>> this
>>> + class do not care about the implementation details of the actual
>>> manifests
>>> + they receive (i.e. tree or flat or lazily loaded, etc)."""
>>> + def __init__(self, opener, oldmanifest):
>>> + self._revlog = oldmanifest
>>> +
>>> + # We'll separate this into it's own cache once oldmanifest is no
>>> longer
>>> + # used
>>> + self._mancache = oldmanifest._mancache
>>> +
>>> + # _revlog is the same as _oldmanifest right now, but we
>>> eventually want
>>> + # to delete _oldmanifest while still allowing manifestlog to
>>> access the
>>> + # revlog specific apis.
>>> + self._oldmanifest = oldmanifest
>>> +
>>> + def __getitem__(self, node):
>>> + """Retrieves the manifest instance for the given node. Throws a
>>> KeyError
>>> + if not found.
>>> + """
>>> + if (self._oldmanifest._treeondisk
>>> + or self._oldmanifest._treeinmem):
>>> + # TODO: come back and support tree manifests directly
>>> + return self._oldmanifest.read(node)
>>> +
>>> + if node == revlog.nullid:
>>> + return manifestdict()
>>> + if node in self._mancache:
>>> + cachemf = self._mancache[node]
>>> + # The old manifest may put non-ctx manifests in the cache,
>>> so skip
>>> + # those since they don't implement the full api.
>>> + if isinstance(cachemf, manifestctx):
>>> + return cachemf
>>> +
>>> + m = manifestctx(self._revlog, node)
>>> + self._mancache[node] = m
>>> + return m
>>> +
>>> +class manifestctx(manifestdict):
>>> + """A class representing a single revision of a manifest, including
>>> its
>>> + contents, its parent revs, and its linkrev.
>>> + """
>>> + def __init__(self, revlog, node):
>>> + self._revlog = revlog
>>> +
>>> + self._node = node
>>> + self.p1, self.p2 = revlog.parents(node)
>>> + rev = revlog.rev(node)
>>> + self.linkrev = revlog.linkrev(rev)
>>> +
>>> + # This should eventually be made lazy loaded, so consumers can
>>> access
>>> + # the node/p1/linkrev data without having to parse the whole
>>> manifest.
>>> + data = revlog.revision(node)
>>> + arraytext = array.array('c', data)
>>> + revlog._fulltextcache[node] = arraytext
>>> + super(manifestctx, self).__init__(data)
>>
>> I think much of this is not yet needed. The pedantic part of me wishes
>> it was left out until it is used, but I won't insist. I suppose it
>> serves as documentation of what future patches will add.
>
> The data stuff down here is required for instantiating the contents of the
> manifestdict. The _node, p1, p2, linkrev stuff isn't used yet, sure, but
> that's the critical piece that differentiates this class from a
> manifestdict, and it's presence makes future patches clearer (like when we
> change the data to be lazily loaded, but node/p1/p2 are not).
More information about the Mercurial-devel
mailing list