[PATCH 2 of 2 remotefilelog-ext getfile-batching] fileserverclient: add config knob to control batch size

Augie Fackler raf at durin42.com
Fri Aug 21 12:02:31 CDT 2015


On Thu, Aug 20, 2015 at 08:48:46AM -0700, Gregory Szorc wrote:
>
>
> > On Aug 18, 2015, at 13:07, Augie Fackler <raf at durin42.com> wrote:
> >
> > # HG changeset patch
> > # User Augie Fackler <augie at google.com>
> > # Date 1439925241 14400
> > #      Tue Aug 18 15:14:01 2015 -0400
> > # Node ID 802538ce2a9be544828417ff0a86aba847a63307
> > # Parent  1183fcaf20272ca5dbcc0c308e9e11b1c7c97093
> > fileserverclient: add config knob to control batch size
> >
> > Previously we'd just send one enormous batch for everything to the
> > server. This led to prolonged periods of no progress output for the
> > user. Now we send batches in smaller chunks (default is 100) which
> > gives the user some idea that things are working.
> >
> > Includes a trivial test, which doesn't really verify that the batching
> > logic is used as described, but at least prevents the boneheaded error
> > I had in an earlier (unmailed) version of this patch which forgot to
> > use configint() when loading the config setting.
>
> Could you cat the server's access log to verify batching is being performed?

Good idea. I'll do that as a follow-up, since it's slightly orthogonal
to this patch. Thanks!
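
In the meantime, the chunking itself can be sanity-checked without a
server. Below is a minimal sketch, not the test I intend to mail: it
assumes hypothetical stand-ins (FakeRemote, FakeBatch, FakeFuture, none
of which exist in remotefilelog) that only mimic the batch()/getfile()/
submit()/.value shape used in the patch, and it records how many
requests each submitted batch carried.

```python
class FakeFuture(object):
    """Hypothetical stand-in for the future returned by a batched call."""
    def __init__(self):
        self.value = None


class FakeBatch(object):
    """Hypothetical batch object: queues getfile calls until submit()."""
    def __init__(self, remote):
        self._remote = remote
        self._pending = []

    def getfile(self, file_, node):
        future = FakeFuture()
        self._pending.append((future, file_, node))
        return future

    def submit(self):
        # Record the size of this batch so a caller can inspect it later.
        self._remote.submitted.append(len(self._pending))
        for future, file_, node in self._pending:
            future.value = ('%s@%s' % (file_, node)).encode('ascii')


class FakeRemote(object):
    """Hypothetical remote that hands out FakeBatch objects."""
    def __init__(self):
        self.submitted = []

    def batch(self):
        return FakeBatch(self)


def getfilesbatch(remote, receive, missed, idmap, batchsize):
    # Same chunking shape as the patch; batchsize is assumed to be >= 1.
    while missed:
        chunk, missed = missed[:batchsize], missed[batchsize:]
        b = remote.batch()
        futures = {}
        for m in chunk:
            futures[m] = b.getfile(idmap[m], m[-40:])
        b.submit()
        for m in chunk:
            receive(m, futures[m].value)


def demo(nfiles=250, batchsize=100):
    # Made-up keys that end in a 40-character hex node, as in the patch.
    missed = ['path%d/%040x' % (i, i) for i in range(nfiles)]
    idmap = dict((m, 'file%d' % i) for i, m in enumerate(missed))
    remote = FakeRemote()
    received = []
    getfilesbatch(remote, lambda m, v: received.append(m),
                  missed, idmap, batchsize)
    return remote.submitted, len(received)
```

With the defaults above, demo() reports three submitted batches of 100,
100 and 50 requests, and all 250 keys received. One thing the sketch
makes visible: the loop assumes batchsize is at least 1, since a value
of 0 never shrinks the list of missed keys.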

>
> >
> > diff --git a/remotefilelog/fileserverclient.py b/remotefilelog/fileserverclient.py
> > --- a/remotefilelog/fileserverclient.py
> > +++ b/remotefilelog/fileserverclient.py
> > @@ -100,18 +100,21 @@ class cacheconnection(object):
> >
> >         return result
> >
> > -def _getfilesbatch(remote, receivemissing, progresstick, missed, idmap):
> > -    b = remote.batch()
> > -    futures = {}
> > -    for m in missed:
> > -        file_ = idmap[m]
> > -        node = m[-40:]
> > -        futures[m] = b.getfile(file_, node)
> > -    b.submit()
> > -    for m in missed:
> > -        v = futures[m].value
> > -        receivemissing(io.BytesIO('%d\n%s' % (len(v), v)), m)
> > -        progresstick()
> > +def _getfilesbatch(
> > +        remote, receivemissing, progresstick, missed, idmap, batchsize):
> > +    while missed:
> > +        chunk, missed = missed[:batchsize], missed[batchsize:]
> > +        b = remote.batch()
> > +        futures = {}
> > +        for m in chunk:
> > +            file_ = idmap[m]
> > +            node = m[-40:]
> > +            futures[m] = b.getfile(file_, node)
> > +        b.submit()
> > +        for m in chunk:
> > +            v = futures[m].value
> > +            receivemissing(io.BytesIO('%d\n%s' % (len(v), v)), m)
> > +            progresstick()
> >
> > def _getfiles(
> >     remote, receivemissing, fallbackpath, progresstick, missed, idmap):
> > @@ -234,8 +237,12 @@ class fileserverclient(object):
> >                         _getfiles(remote, self.receivemissing, fallbackpath,
> >                                   progresstick, missed, idmap)
> >                     elif remote.capable("getfile"):
> > +                        batchdefault = 100 if remote.capable('batch') else 10
> > +                        batchsize = self.ui.configint(
> > +                            'remotefilelog', 'batchsize', batchdefault)
> >                         _getfilesbatch(
> > -                            remote, self.receivemissing, progresstick, missed, idmap)
> > +                            remote, self.receivemissing, progresstick, missed,
> > +                            idmap, batchsize)
> >                     else:
> >                         raise util.Abort("configured remotefilelog server"
> >                                          " does not support remotefilelog")
> > diff --git a/tests/test-http.t b/tests/test-http.t
> > --- a/tests/test-http.t
> > +++ b/tests/test-http.t
> > @@ -20,6 +20,12 @@ Build a query string for later use:
> >   $ hgcloneshallow http://localhost:$HGPORT/ shallow -q
> >   1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
> >
> > +Clear filenode cache so we can test fetching with a modified batch size
> > +  $ rm -r $TESTTMP/hgcache
> > +Now do a fetch with a large batch size so we're sure it works
> > +  $ hgcloneshallow http://localhost:$HGPORT/ shallow-large-batch \
> > +  >    --config remotefilelog.batchsize=1000 -q
> > +  1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
> >
> > The 'remotefilelog' capability should *not* be exported over http(s),
> > as the getfile method it offers doesn't work with http.

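On the configint() slip mentioned in the commit message: config values
arrive as strings, and a string batch size breaks the slicing in the
chunk loop. A rough sketch of that failure mode follows, assuming a
plain dict as a hypothetical stand-in for the ui config (the configint
helper here is illustrative only, not Mercurial's implementation).

```python
def configint(config, section, name, default):
    """Hypothetical helper: parse a raw string setting as an int."""
    raw = config.get((section, name))
    if raw is None:
        return default
    return int(raw)


def chunks(items, batchsize):
    # Same slicing as the patch's chunk loop.
    while items:
        chunk, items = items[:batchsize], items[batchsize:]
        yield chunk


# Settings given via --config or an hgrc arrive as strings.
config = {('remotefilelog', 'batchsize'): '2'}


def raw_value_fails():
    raw = config[('remotefilelog', 'batchsize')]
    try:
        list(chunks(['a', 'b', 'c'], raw))
    except TypeError:
        # Slice indices must be integers, so the string is rejected.
        return True
    return False


def parsed_value_works():
    size = configint(config, 'remotefilelog', 'batchsize', 100)
    return list(chunks(['a', 'b', 'c'], size))
```

Here raw_value_fails() hits the TypeError, while parsed_value_works()
splits the three items into a chunk of two and a chunk of one. The new
test in test-http.t guards against the same mistake end to end by
passing --config remotefilelog.batchsize=1000.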