[PATCH 4 of 5 remotefilelog-http] remotefilelog: introduce new getfile method

Durham Goode durham at fb.com
Thu Jul 16 15:09:08 CDT 2015



On 7/16/15, 1:07 PM, "Augie Fackler" <raf at durin42.com> wrote:

>
>> On Jul 16, 2015, at 2:23 PM, Durham Goode <durham at fb.com> wrote:
>> 
>> 
>> 
>> On 7/16/15, 11:14 AM, "Augie Fackler" <raf at durin42.com> wrote:
>> 
>>> 
>>>> On Jul 16, 2015, at 1:36 PM, Durham Goode <durham at fb.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> On 7/16/15, 8:36 AM, "raf at durin42.com" <raf at durin42.com> wrote:
>>>> 
>>>>> # HG changeset patch
>>>>> # User Augie Fackler <augie at google.com>
>>>>> # Date 1435699951 14400
>>>>> #      Tue Jun 30 17:32:31 2015 -0400
>>>>> # Node ID ea8c6483a58df0eb0c296abc17cc846be80ed8cb
>>>>> # Parent  0080461e46716fa9732df990232f968ea3eb6d28
>>>>> remotefilelog: introduce new getfile method
>>>>> 
>>>>> Right now, this is a naive fetch-one-file method. The next change
>>>>>will
>>>>> mark the method as batchable and use a batch in the client so that
>>>>> many files can be requested in a single RPC.
>>>> 
>>>>> 
>>>>> diff --git a/remotefilelog/remotefilelogserver.py
>>>>> b/remotefilelog/remotefilelogserver.py
>>>>> --- a/remotefilelog/remotefilelogserver.py
>>>>> +++ b/remotefilelog/remotefilelogserver.py
>>>>> @@ -53,6 +53,7 @@ def onetimesetup(ui):
>>>>> 
>>>>>   # support file content requests
>>>>>   wireproto.commands['getfiles'] = (getfiles, '')
>>>>> +    wireproto.commands['getfile'] = (getfile, 'file node')
>>>>> 
>>>>>   class streamstate(object):
>>>>>       match = None
>>>>> @@ -155,9 +156,11 @@ def onetimesetup(ui):
>>>>>   def _capabilities(orig, repo, proto):
>>>>>       caps = orig(repo, proto)
>>>>>       if ((shallowrepo.requirement in repo.requirements or
>>>>> -            ui.configbool('remotefilelog', 'server'))
>>>>> -            and isinstance(proto, sshserver.sshserver)):
>>>>> -            caps.append(shallowrepo.requirement)
>>>>> +            ui.configbool('remotefilelog', 'server'))):
>>>>> +            if isinstance(proto, sshserver.sshserver):
>>>>> +                # legacy getfiles method which only works over ssh
>>>>> +                caps.append(shallowrepo.requirement)
>>>>> +            caps.append("getfile")
>>>> 
>>>> By always appending the getfile capability, won't this force even ssh
>>>> connections to use the new behavior? If that's intentional, do we have
>>>> numbers on how that affects perf?
>>> 
>>> I think (though I didn¹t recheck just now) that the remotefilelog
>>> codepath in the client is still preferred. We could certainly make it
>>> that way.
>> 
>> A few lines up in this patch it has:
>> 
>> + if remote.capable("getfile"):
>> +   _getfilesbatch(...
>> + else:
>> 
>> +   ...
>> +   _getfiles(
>> 
>> Which made me think it prefers getfile over getfiles.  If we reverse
>>this,
>> I can take this patch without worrying about perf (and I can test the
>>perf
>> later).
>
>That works for me. Please feel encouraged to just make that change.
>
>(if you’d rather I make the change, let me know and I can take care of it
>eventually)

I'll make it.  Should be pushed along with the rest later today.

>
>> 
>>> 
>>> I don¹t have a good way to get performance numbers for this code -
>>>heck,
>>> I only just started hacking on it. In *theory*, the request batching
>>> should do something reasonable to keep performance from being awful. If
>>> that¹s not happening in hg, we should fix request batching for ssh IMO.
>> 
>> My normal test is to take a large repo (mozilla central might work),
>>mark
>> it as a remotefilelog server, then just do a 'time hg prefetch -r tip'
>> from a client with an empty cache. This first time will be slow (since
>>it
>> builds the blobs), but the second time will be accurate.
>



More information about the Mercurial-devel mailing list