[PATCH 3 of 5 V2] manifest: introduce manifestlog and manifestctx classes

Durham Goode durham at fb.com
Tue Aug 23 13:30:24 EDT 2016


On 8/22/16 3:32 PM, Martin von Zweigbergk wrote:
> On Wed, Aug 17, 2016 at 2:07 PM, Durham Goode <durham at fb.com> wrote:
>> # HG changeset patch
>> # User Durham Goode <durham at fb.com>
>> # Date 1471465513 25200
>> #      Wed Aug 17 13:25:13 2016 -0700
>> # Node ID 00f8d3832aad368660c69eff4be3e03a5568aebe
>> # Parent  8ddbe86953023c40beb7be9dcb8025d1055813c5
>> manifest: introduce manifestlog and manifestctx classes
>>
>> This is the start of a large refactoring of the manifest class. It introduces
>> the new manifestlog and manifestctx classes which will represent the collection
>> of all manifests and individual instances, respectively.
>>
>> Future patches will begin to convert usages of repo.manifest to
>> repo.manifestlog, adding the necessary functionality to manifestlog and instance
>> as they are needed.
>>
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -504,6 +504,10 @@ class localrepository(object):
>>       def manifest(self):
>>           return manifest.manifest(self.svfs)
>>
>> +    @storecache('00manifest.i')
>> +    def manifestlog(self):
>> +        return manifest.manifestlog(self.svfs, self.manifest)
>> +
>>       @repofilecache('dirstate')
>>       def dirstate(self):
>>           return dirstate.dirstate(self.vfs, self.ui, self.root,
>> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
>> --- a/mercurial/manifest.py
>> +++ b/mercurial/manifest.py
>> @@ -914,6 +914,70 @@ class manifestrevlog(revlog.revlog):
>>           super(manifestrevlog, self).clearcaches()
>>           self._fulltextcache.clear()
>>
>> +class manifestlog(object):
>> +    """A collection class representing the collection of manifest snapshots
>> +    referenced by commits in the repository.
>> +
>> +    In this situation, 'manifest' refers to the abstract concept of a snapshot
>> +    of the list of files in the given commit. Consumers of the output of this
>> +    class do not care about the implementation details of the actual manifests
>> +    they receive (i.e. tree or flat or lazily loaded, etc)."""
>> +    def __init__(self, opener, oldmanifest):
>> +        self._revlog = oldmanifest
>> +
>> +        # We'll separate this into it's own cache once oldmanifest is no longer
>> +        # used
>> +        self._mancache = oldmanifest._mancache
>> +
>> +        # _revlog is the same as _oldmanifest right now, but we eventually want
>> +        # to delete _oldmanifest while still allowing manifestlog to access the
>> +        # revlog specific apis.
>> +        self._oldmanifest = oldmanifest
>> +
>> +    def __getitem__(self, node):
>> +        """Retrieves the manifest instance for the given node. Throws a KeyError
>> +        if not found.
>> +        """
>> +        if (self._oldmanifest._treeondisk
>> +            or self._oldmanifest._treeinmem):
>> +            # TODO: come back and support tree manifests directly
>> +            return self._oldmanifest.read(node)
>> +
>> +        if node == revlog.nullid:
>> +            return manifestdict()
>> +        if node in self._mancache:
>> +            cachemf = self._mancache[node]
>> +            # The old manifest may put non-ctx manifests in the cache, so skip
>> +            # those since they don't implement the full api.
>> +            if isinstance(cachemf, manifestctx):
>> +                return cachemf
>> +
>> +        m = manifestctx(self._revlog, node)
>> +        self._mancache[node] = m
>> +        return m
>> +
>> +class manifestctx(manifestdict):
> Btw, I don't really like this inheritance. It feels like the context
> *has*, not *is*, a manifest (the revlog data). Will it make sense to
> drop the manifestdict and treemanifest classes at the end and just
> have the context objects that have both the data (manifest) and the
> metadata (p1 etc)?
No, we still use manifestdict and treemanifest to return transient 
results (like diff results). We could probably remove this inheritance, 
but it would make the migration a bit messier, since every place that 
accesses a manifestdict would have to be replaced to access a 
manifestctx and then fetch the dict out of it.  I can look into it 
though, because a future patch makes the dict contents lazy loaded and I 
don't really like how it works.
>> +    """A class representing a single revision of a manifest, including its
>> +    contents, its parent revs, and its linkrev.
>> +    """
>> +    def __init__(self, revlog, node):
>> +        self._revlog = revlog
>> +
>> +        self._node = node
>> +        self.p1, self.p2 = revlog.parents(node)
>> +        rev = revlog.rev(node)
>> +        self.linkrev = revlog.linkrev(rev)
>> +
>> +        # This should eventually be made lazy loaded, so consumers can access
>> +        # the node/p1/linkrev data without having to parse the whole manifest.
>> +        data = revlog.revision(node)
>> +        arraytext = array.array('c', data)
>> +        revlog._fulltextcache[node] = arraytext
>> +        super(manifestctx, self).__init__(data)
>> +
>> +    def node(self):
>> +        return self._node
>> +
>>   class manifest(manifestrevlog):
>>       def __init__(self, opener, dir='', dirlogcache=None):
>>           '''The 'dir' and 'dirlogcache' arguments are for internal use by
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel at mercurial-scm.org
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mercurial-2Dscm.org_mailman_listinfo_mercurial-2Ddevel&d=DQIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=nuarHzhP1wi1T9iURRCj1A&m=tflSVzQmSqajCY29l0TIO5Uf3Z-7TiZrBZ5SbSFpjxE&s=QjS29GfMFKZdLuyEiYlX0BZzh_lnbGy9Vifp9BwwoyQ&e=



More information about the Mercurial-devel mailing list