[PATCH] lfs: add a progress bar when searching for blobs to upload
Matt Harbison
mharbison72 at gmail.com
Wed Aug 29 23:17:45 EDT 2018
On Fri, 24 Aug 2018 18:18:32 -0400, Matt Harbison <mharbison72 at gmail.com>
wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison at yahoo.com>
> # Date 1535147146 14400
> # Fri Aug 24 17:45:46 2018 -0400
> # Node ID 76eca3ae345b261c0049d16269cdf991a31af21a
> # Parent c9a3f7f5c0235e3ae35135818c48ec5ea006de37
> lfs: add a progress bar when searching for blobs to upload
>
> The search itself can take an extreme amount of time if there are a lot of
> revisions involved. I've got a local repo that took 6 minutes to push 1850
> commits, and 60% of that time was spent here (there are ~70K files):
>
> \ 58.1% wrapper.py: extractpointers line 297: pointers = extractpointers(...
> | 57.7% wrapper.py: pointersfromctx line 352: for p in pointersfromctx(ct...
> | 57.4% wrapper.py: pointerfromctx line 397: p = pointerfromctx(ctx, f, ...
> \ 38.7% context.py: __contains__ line 368: if f not in ctx:
> | 38.7% util.py: __get__ line 82: return key in self._manifest
> | 38.7% context.py: _manifest line 1416: result = self.func(obj)
> | 38.7% manifest.py: read line 472: return self._manifestctx.re...
> \ 25.6% revlog.py: revision line 1562: text = rl.revision(self._node)
> \ 12.8% revlog.py: _chunks line 2217: bins = self._chunks(chain, ...
> | 12.0% revlog.py: decompress line 2112: ladd(decomp(buffer(data, ch...
> \ 7.8% revlog.py: checkhash line 2232: self.checkhash(text, node, ...
> | 7.8% revlog.py: hash line 2315: if node != self.hash(text, ...
> | 7.8% revlog.py: hash line 2242: return hash(text, p1, p2)
> \ 12.0% manifest.py: __init__ line 1565: self._data = manifestdict(t...
> \ 16.8% context.py: filenode line 378: if not _islfs(fctx.filelog(...
> | 15.7% util.py: __get__ line 706: return self._filelog
> | 14.8% context.py: _filelog line 1416: result = self.func(obj)
> | 14.8% localrepo.py: file line 629: return self._repo.file(self...
> | 14.8% filelog.py: __init__ line 1134: return filelog.filelog(self...
> | 14.5% revlog.py: __init__ line 24: censorable=True)
Any ideas how to trim down some of this overhead? revset._matchfiles()
has a comment about reading the changelog directly because of the overhead
of creating changectx[1]. I think that could work here too, but it falls
apart because the filelogs need to be accessed as well. It seems like
reading the changelog and accessing the filelogs directly here is too low
level, especially with @indygreg trying to add support for non-filelog
storage.
[1]
https://www.mercurial-scm.org/repo/hg/file/6f38284b23f4/mercurial/revset.py#l1113
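
To make the trade-off concrete, here is a toy model of the two approaches. It
is not Mercurial code: ToyRepo, search_via_context and search_via_changelog
are made-up names for illustration, and the model assumes that no touched file
was removed in its revision. It only counts how often the two costly steps from
the profile (reading a manifest, opening a filelog) would be hit.

```python
class ToyRepo(object):
    """Stand-in for a repository; counts how often costly data is loaded."""

    def __init__(self, revisions, lfs_files):
        # revisions: one list of touched file names per revision.
        self._revisions = revisions
        self._lfs_files = set(lfs_files)
        self.manifest_reads = 0
        self.filelog_opens = 0

    def __len__(self):
        return len(self._revisions)

    def touched_files(self, rev):
        # Cheap: comparable to reading the file list of a changelog entry.
        return self._revisions[rev]

    def manifest(self, rev):
        # Costly: comparable to reading and parsing a full manifest.
        self.manifest_reads += 1
        files = set()
        for touched in self._revisions[:rev + 1]:
            files.update(touched)
        return files

    def is_lfs(self, rev, path):
        # Costly: comparable to opening a filelog to inspect its flags.
        self.filelog_opens += 1
        return path in self._lfs_files


def search_via_context(repo):
    """Mimic the profiled path: membership test against the manifest first."""
    found = []
    for rev in range(len(repo)):
        manifest = repo.manifest(rev)
        for path in repo.touched_files(rev):
            if path in manifest and repo.is_lfs(rev, path):
                found.append((rev, path))
    return found


def search_via_changelog(repo):
    """Mimic the changelog-only idea: skip the manifest, keep the filelog."""
    found = []
    for rev in range(len(repo)):
        for path in repo.touched_files(rev):
            if repo.is_lfs(rev, path):
                found.append((rev, path))
    return found
```

In this model the changelog-only variant drops the manifest reads to zero, but
the number of filelog opens stays the same, which is the part that is hard to
avoid without going too low level.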