[PATCH] lfs: add a progress bar when searching for blobs to upload

Martin von Zweigbergk martinvonz at google.com
Fri Aug 31 01:14:38 EDT 2018


On Wed, Aug 29, 2018 at 8:18 PM Matt Harbison <mharbison72 at gmail.com> wrote:

> On Fri, 24 Aug 2018 18:18:32 -0400, Matt Harbison <mharbison72 at gmail.com> wrote:
>
> > # HG changeset patch
> > # User Matt Harbison <matt_harbison at yahoo.com>
> > # Date 1535147146 14400
> > #      Fri Aug 24 17:45:46 2018 -0400
> > # Node ID 76eca3ae345b261c0049d16269cdf991a31af21a
> > # Parent  c9a3f7f5c0235e3ae35135818c48ec5ea006de37
> > lfs: add a progress bar when searching for blobs to upload
> >
> > The search itself can take an extreme amount of time if there are a lot of
> > revisions involved.  I've got a local repo that took 6 minutes to push 1850
> > commits, and 60% of that time was spent here (there are ~70K files):
> >
> >      \ 58.1%  wrapper.py:     extractpointers      line 297:  pointers = extractpointers(...
> >        | 57.7%  wrapper.py:     pointersfromctx    line 352:  for p in pointersfromctx(ct...
> >        | 57.4%  wrapper.py:     pointerfromctx     line 397:  p = pointerfromctx(ctx, f, ...
> >          \ 38.7%  context.py:     __contains__     line 368:  if f not in ctx:
> >            | 38.7%  util.py:        __get__        line 82:  return key in self._manifest
> >            | 38.7%  context.py:     _manifest      line 1416:  result = self.func(obj)
> >            | 38.7%  manifest.py:    read           line 472:  return self._manifestctx.re...
> >              \ 25.6%  revlog.py:      revision     line 1562:  text = rl.revision(self._node)
> >                \ 12.8%  revlog.py:      _chunks    line 2217:  bins = self._chunks(chain, ...
> >                  | 12.0%  revlog.py:      decompress line 2112:  ladd(decomp(buffer(data, ch...
> >                \  7.8%  revlog.py:      checkhash  line 2232:  self.checkhash(text, node, ...
> >                  |  7.8%  revlog.py:      hash     line 2315:  if node != self.hash(text, ...
> >                  |  7.8%  revlog.py:      hash     line 2242:  return hash(text, p1, p2)
> >              \ 12.0%  manifest.py:    __init__     line 1565:  self._data = manifestdict(t...
> >          \ 16.8%  context.py:     filenode         line 378:  if not _islfs(fctx.filelog(...
> >            | 15.7%  util.py:        __get__        line 706:  return self._filelog
> >            | 14.8%  context.py:     _filelog       line 1416:  result = self.func(obj)
> >            | 14.8%  localrepo.py:   file           line 629:  return self._repo.file(self...
> >            | 14.8%  filelog.py:     __init__       line 1134:  return filelog.filelog(self...
> >            | 14.5%  revlog.py:      __init__       line 24:  censorable=True)
>
> Any ideas how to trim down some of this overhead?


You can possibly save on some of that manifest-reading time by calling
manifestlog.readfast() like changegroup (and verify, I think) does.
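
To illustrate why a readfast()-style call could help, here is a toy sketch
of the idea, not Mercurial's actual code. ToyManifestLog, its methods, and
the entries_applied counter are hypothetical names made up for this example.
It assumes each manifest revision is stored as a delta against its parent,
and that the search only needs the files a commit touched rather than the
full manifest.

```python
# Toy model only: ToyManifestLog is hypothetical, not a Mercurial class.
class ToyManifestLog:
    def __init__(self):
        # Each entry is (parent_rev, delta); a delta maps path -> node,
        # with None meaning "removed in this revision".
        self._entries = []
        self.entries_applied = 0  # rough measure of work done

    def add(self, parent, delta):
        self._entries.append((parent, dict(delta)))
        return len(self._entries) - 1

    def read(self, rev):
        """Rebuild the full manifest by replaying every delta from the root."""
        chain = []
        while rev is not None:
            parent, delta = self._entries[rev]
            chain.append(delta)
            rev = parent
        full = {}
        for delta in reversed(chain):
            for path, node in delta.items():
                self.entries_applied += 1
                if node is None:
                    full.pop(path, None)
                else:
                    full[path] = node
        return full

    def readfast(self, rev):
        """Return only the paths this revision added or modified."""
        _parent, delta = self._entries[rev]
        self.entries_applied += len(delta)
        return {p: n for p, n in delta.items() if n is not None}


log = ToyManifestLog()
base = log.add(None, {"a.txt": "n1", "big.bin": "n2", "c.txt": "n3"})
tip = log.add(base, {"big.bin": "n4", "c.txt": None})

log.entries_applied = 0
full = log.read(tip)
cost_full = log.entries_applied

log.entries_applied = 0
fast = log.readfast(tip)
cost_fast = log.entries_applied
```

In this toy model the fast path only touches the entries changed in the
revision, so its cost tracks the size of the commit rather than the size of
the repository. Whether the real savings are comparable would need to be
measured on the repo in question.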


> revset._matchfiles() has a comment about reading the changelog directly
> because of the overhead of creating changectx[1].  I think that could work
> here too, but falls apart because of the need to access the filelogs too.
> It seems like reading the changelog and accessing the filelogs directly
> here is too low level, especially with @indygreg trying to add support for
> non-filelog storage.
>
> [1] https://www.mercurial-scm.org/repo/hg/file/6f38284b23f4/mercurial/revset.py#l1113