[PATCH] memctx: allow the memctx to reuse the manifest node

Mateusz Kwapich mitrandir at fb.com
Thu Nov 17 08:27:43 EST 2016


Hi,
I’ll try write some tests for it – but it’s hard since we don’t have any feature that could
directly benefit from it in core.

Best,
Mateusz

On 11/16/16, 10:02 PM, "Jun Wu" <quark at fb.com> wrote:

    I like the direction!
    
    I'm a bit concerned about how other existing "memctx" methods (ex.
    "_status", "__contains__") would work. I'd like to make sure they still have
    reasonable behavior with this change. See inline comments.
    
    Excerpts from Mateusz Kwapich's message of 2016-11-16 20:15:28 +0000:
    > # HG changeset patch
    > # User Mateusz Kwapich <mitrandir at fb.com>
    > # Date 1479327311 0
    > #      Wed Nov 16 20:15:11 2016 +0000
    > # Node ID 0fd8175aa4e8a3a0cd6f637b34bfa25a103c454e
    > # Parent  c27614f2dec1405db606d1ef871dfabf72cc0737
    > memctx: allow the memctx to reuse the manifest node
    > 
    > When we have a lot of files writing a new manifest revision can be expensive.
    > This commit adds a possibility for memctx to reuse a manifest from a different
    > commit. This can be beneficial for commands that are creating metadata changes
    > without any actual files changed like "hg metaedit" in evolve extension.
    > 
    > I will send the change for evolve that leverages this once this is accepted.
    > 
    > diff --git a/mercurial/context.py b/mercurial/context.py
    > --- a/mercurial/context.py
    > +++ b/mercurial/context.py
    > @@ -1160,6 +1160,7 @@
    >                   changes=None):
    >          self._repo = repo
    >          self._rev = None
    > +        self._manifestnode = None
    >          self._node = None
    >          self._text = text
    >          if date:
    > @@ -1268,7 +1269,8 @@
    >          return None
    >  
    >      def manifestnode(self):
    > -        return None
    > +        return self._manifestnode
    > +
    >      def user(self):
    >          return self._user or self._repo.ui.username()
    >      def date(self):
    > @@ -1833,11 +1835,12 @@
    >      # this field to determine what to do in filectxfn.
    >      _returnnoneformissingfiles = True
    >  
    > -    def __init__(self, repo, parents, text, files, filectxfn, user=None,
    > -                 date=None, extra=None, editor=False):
    > +    def __init__(self, repo, parents, text, files, filectxfn=None, user=None,
    > +                 date=None, extra=None, editor=False, manifestnode=None):
    
    IIUC, "manifestnode" conflicts with "files" / "filectxfn" here. If that's
    true, I think we want:
    
      - memctx constructor change:
        - "files" / "filectxfn" are optional, and conflict with "manifestnode"
        - probably worthwhile to update the class docstring
      - if "manifestnode" is provided:
        - change "memctx._manifest" to read "self._manifestnode" directly
        - change "memctx._files" to get the files changed lazily from manifests
    
    Then "memctx"'s other methods like "status", "filenode", "__contains__"
    would probably work as expected and feel consistent.
    
    If the new implementation diverges significantly, I think it may be a good
    idea to do it in a different class, like "memlightctx" which must reuse a
    manifest node.

By providing the “files” we can commit the ctx without reading the manifest which
gives us a serious perf boost if the manifest is large.
    
    >          super(memctx, self).__init__(repo, text, user, date, extra)
    >          self._rev = None
    >          self._node = None
    > +        self._manifestnode = manifestnode
    >          parents = [(p or nullid) for p in parents]
    >          p1, p2 = parents
    >          self._parents = [changectx(self._repo, p) for p in (p1, p2)]
    > diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
    > --- a/mercurial/localrepo.py
    > +++ b/mercurial/localrepo.py
    > @@ -1695,7 +1695,11 @@
    >              tr = self.transaction("commit")
    >              trp = weakref.proxy(tr)
    >  
    > -            if ctx.files():
    > +            if ctx.manifestnode():
    
    I think we want a sanity check here to prevent buggy linkrevs that can
    give us trouble in the future. If ctx.parents' manifestnodes are differnet
    from ctx.manifestnode's parents or ctx.manifestnode (for the empty commit
    case), abort.

I’m fine with doing sanity checks. Thanks for  suggestion.
    
    > +                # reuse an existing manifest revision
    > +                mn = ctx.manifestnode()
    > +                files = ctx.files()
    > +            elif ctx.files():
    >                  m1ctx = p1.manifestctx()
    >                  m2ctx = p2.manifestctx()
    >                  mctx = m1ctx.copy()
    




More information about the Mercurial-devel mailing list