[PATCH 3 of 3] listkeypattern: add listkeypattern wireproto method

Stanislau Hlebik stash at fb.com
Wed Aug 17 16:26:37 EDT 2016


Durham, that’s exactly the implementation I wrote locally:
part generator on the server that reads ‘patterns’ (patterns or empty list) and ‘bookmarks’ (set to True) parameters from kwargs (‘cg’ is set to False);
part handler on client.
Agree, it really looks error-prone and unbundle can probably be better.

On 8/17/16, 9:08 PM, "Durham Goode" <durham at fb.com> wrote:

    On 8/16/16 5:22 PM, Pierre-Yves David wrote:
    >
    >
    > On 08/14/2016 07:17 PM, Gregory Szorc wrote:
    >> On Fri, Aug 12, 2016 at 5:09 AM, Stanislau Hlebik <stash at fb.com
    >> <mailto:stash at fb.com>> wrote:
    >>
    >>     # HG changeset patch
    >>     # User Stanislau Hlebik <stash at fb.com <mailto:stash at fb.com>>
    >>     # Date 1470999441 25200
    >>     #      Fri Aug 12 03:57:21 2016 -0700
    >>     # Node ID c2ee493e216c60ff439ab93cc1efe6ac5922d8eb
    >>     # Parent  fd2185d7c2f7aa529b2ad0a6584832fb2b1b4ecb
    >>     listkeypattern: add listkeypattern wireproto method
    >>
    >>     wireproto method to list remote keys by pattern
    >>
    >>     diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py
    >>     --- a/mercurial/wireproto.py
    >>     +++ b/mercurial/wireproto.py
    >>     @@ -353,6 +353,22 @@
    >>                            % (namespace, len(d)))
    >>              yield pushkeymod.decodekeys(d)
    >>
    >>     +    @batchable
    >>     +    def listkeypattern(self, namespace, patterns):
    >>     +        if not self.capable('pushkey'):
    >>     +            yield {}, None
    >>     +        f = future()
    >>     +        self.ui.debug('preparing listkeys for "%s" with pattern
    >>     "%s"\n' %
    >>     +                      (namespace, patterns))
    >>     +        yield {
    >>     +            'namespace': encoding.fromlocal(namespace),
    >>     +            'patterns': encodelist(patterns)
    >>     +        }, f
    >>     +        d = f.value
    >>     +        self.ui.debug('received listkey for "%s": %i bytes\n'
    >>     +                      % (namespace, len(d)))
    >>     +        yield pushkeymod.decodekeys(d)
    >>     +
    >>          def stream_out(self):
    >>              return self._callstream('stream_out')
    >>
    >>     @@ -676,7 +692,8 @@
    >>          return repo.opener.tryread('clonebundles.manifest')
    >>
    >>      wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 
    >> 'pushkey',
    >>     -                 'known', 'getbundle', 'unbundlehash', 'batch']
    >>     +                 'known', 'getbundle', 'unbundlehash', 'batch',
    >>     +                 'listkeypattern']
    >>
    >>      def _capabilities(repo, proto):
    >>          """return a list of capabilities for a repo
    >>     @@ -791,6 +808,12 @@
    >>          d = repo.listkeys(encoding.tolocal(namespace)).items()
    >>          return pushkeymod.encodekeys(d)
    >>
    >>     + at wireprotocommand('listkeypattern', 'namespace patterns *')
    >>
    >>
    >> Why the "*" here? "others" is not used in the function implementation.
    >>
    >>
    >>     +def listkeypattern(repo, proto, namespace, patterns, others):
    >>     +    patterns = decodelist(patterns)
    >>     +    d = repo.listkeys(encoding.tolocal(namespace),
    >>     patterns=patterns).items()
    >>     +    return pushkeymod.encodekeys(d)
    >>     +
    >>      @wireprotocommand('lookup', 'key')
    >>      def lookup(repo, proto, key):
    >>          try:
    >>
    >>
    >> I think introducing a new wire protocol command is the correct way to
    >> solve this problem (as opposed to introducing a new argument on the
    >> existing command).
    >>
    >> However, if we're introducing a new wire protocol command for obtaining
    >> pushkey values, I think we should improve deficiencies in the response
    >> encoding rather than propagate its problems.
    >>
    >> The "listkeys" response encoding can't transmit the full range of binary
    >> values. This can lead to larger (and slower) responses sizes. For
    >> example, a number of pushkey namespaces exchange lists of nodes. These
    >> have to be represented as hex instead of binary. For pushkey namespaces
    >> like phases or obsolescence that can exchange hundreds or thousands of
    >> nodes, the overhead can add up.
    >>
    >> I think the response from a new listkeys command should be using framing
    >> to encode the key names and values so the full range of binary values
    >> can be efficiently transferred. We may also want a special mechanism to
    >> represent a list of nodes, as avoiding the overhead of framing on fixed
    >> width values would be desirable.
    >>
    >> Of course, at the point you introduce a new response encoding, we may
    >> want to call the command "listkeys2." If others agree, I can code up an
    >> implementation and you can add the patterns functionality on top of it.
    >
    > Sorry to be a bit late to the discussion.
    >
    > I don't think we should introduce a new wire-protocol command for this.
    >
    > Individual listkey call have been a large source of race condition and 
    > related issue. Bundle2 is able to carry listkey.pushkey call just fine 
    > and I think we should prioritize its usage. As bundle2 already have 
    > framing, we could just use your better encoding with bundle2 directly.
    > However, we should probably push things further and use dedicated part 
    > for commonly used request. Having dedicated type will help more 
    > semantic reply and request. The series we are commenting on is a good 
    > example of that need for "pattern" here pretty much only apply to 
    > bookmark, having a dedicated "channel" (within bundle2) for this would 
    > make is painless to add any new arguments we could need.
    If we create a bundle2 part for transmitting bookmark information, what 
    does the request flow look like?  getbundle seems to take kwargs, so 
    would all requests piggy back their query parameters into the getbundle 
    kwargs?  Like passing "cg=False, bookmarks=True, bookmark_pattern=foo/*" 
    as arguments that the receiver would know to produce a bookmarkpart that 
    matches that parameter (and know not to send a changegroup)?  Seems like 
    a recipe for problems if every requestable part type overloads the same 
    kwargs.
    
    Is the alternative to use the unbundle protocol to send empty parts with 
    parameters as the way of requesting data?  Then the response is sent as 
    a reply part?  Would that then eventually deprecate getbundle entirely?
    
    I'm just trying to understand how we might implement the listkeypattern 
    functionality using bundle2 without having to refactor bundle2 much.
    



More information about the Mercurial-devel mailing list