[PATCH] lfs: add a progress bar when searching for blobs to upload
Martin von Zweigbergk
martinvonz at google.com
Fri Aug 31 01:19:02 EDT 2018
On Thu, Aug 30, 2018 at 10:14 PM Martin von Zweigbergk <martinvonz at google.com> wrote:
>
>
> On Wed, Aug 29, 2018 at 8:18 PM Matt Harbison <mharbison72 at gmail.com> wrote:
>
>> On Fri, 24 Aug 2018 18:18:32 -0400, Matt Harbison <mharbison72 at gmail.com> wrote:
>>
>> > # HG changeset patch
>> > # User Matt Harbison <matt_harbison at yahoo.com>
>> > # Date 1535147146 14400
>> > # Fri Aug 24 17:45:46 2018 -0400
>> > # Node ID 76eca3ae345b261c0049d16269cdf991a31af21a
>> > # Parent c9a3f7f5c0235e3ae35135818c48ec5ea006de37
>> > lfs: add a progress bar when searching for blobs to upload
>> >
>> > The search itself can take an extreme amount of time if there are a lot of
>> > revisions involved. I've got a local repo that took 6 minutes to push 1850
>> > commits, and 60% of that time was spent here (there are ~70K files):
>> >
>> > \ 58.1% wrapper.py: extractpointers line 297: pointers = extractpointers(...
>> > | 57.7% wrapper.py: pointersfromctx line 352: for p in pointersfromctx(ct...
>> > | 57.4% wrapper.py: pointerfromctx line 397: p = pointerfromctx(ctx, f, ...
>> > \ 38.7% context.py: __contains__ line 368: if f not in ctx:
>> > | 38.7% util.py: __get__ line 82: return key in self._manifest
>> > | 38.7% context.py: _manifest line 1416: result = self.func(obj)
>> > | 38.7% manifest.py: read line 472: return self._manifestctx.re...
>> > \ 25.6% revlog.py: revision line 1562: text = rl.revision(self._node)
>> > \ 12.8% revlog.py: _chunks line 2217: bins = self._chunks(chain, ...
>> > | 12.0% revlog.py: decompress line 2112: ladd(decomp(buffer(data, ch...
>> > \ 7.8% revlog.py: checkhash line 2232: self.checkhash(text, node, ...
>> > | 7.8% revlog.py: hash line 2315: if node != self.hash(text, ...
>> > | 7.8% revlog.py: hash line 2242: return hash(text, p1, p2)
>> > \ 12.0% manifest.py: __init__ line 1565: self._data = manifestdict(t...
>> > \ 16.8% context.py: filenode line 378: if not _islfs(fctx.filelog(...
>> > | 15.7% util.py: __get__ line 706: return self._filelog
>> > | 14.8% context.py: _filelog line 1416: result = self.func(obj)
>> > | 14.8% localrepo.py: file line 629: return self._repo.file(self...
>> > | 14.8% filelog.py: __init__ line 1134: return filelog.filelog(self...
>> > | 14.5% revlog.py: __init__ line 24: censorable=True)
>>
>> Any ideas how to trim down some of this overhead?
>
>
> You can possibly save on some of that manifest-reading time by calling
> manifestlog.readfast() like changegroup (and verify, I think) does.
>
>
>> revset._matchfiles() has a comment about reading the changelog directly
>> because of the overhead of creating changectx[1]. I think that could work
>> here too, but falls apart because of the need to access the filelogs too.
>
>
I don't see changectx-creation in the profile output. What makes you think
that's a significant cost here?
>> It seems like reading the changelog and accessing the filelogs directly here
>> is too low level, especially with @indygreg trying to add support for
>> non-filelog storage.
>>
>> [1] https://www.mercurial-scm.org/repo/hg/file/6f38284b23f4/mercurial/revset.py#l1113
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel at mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>>
>
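To make the readfast() suggestion above a bit more concrete, here is a toy
sketch of why reading only a manifest delta can be much cheaper than rebuilding
the full manifest for every commit. It does not use Mercurial at all:
ToyManifestLog, present_files and demo are hypothetical names invented for
illustration, and the model assumes (as I understand readfast()) that the fast
path returns only the entries that changed relative to the parent.

```python
# Toy model only: none of these names are Mercurial's real API.
class ToyManifestLog:
    """Stores one delta (path -> node, None meaning removed) per revision."""

    def __init__(self):
        self._deltas = []
        self.entries_read = 0  # rough stand-in for manifest-reading cost

    def commit(self, changes):
        self._deltas.append(dict(changes))
        return len(self._deltas) - 1

    def read(self, rev):
        """Rebuild the full manifest by replaying every delta up to rev."""
        full = {}
        for delta in self._deltas[:rev + 1]:
            for path, node in delta.items():
                self.entries_read += 1
                if node is None:
                    full.pop(path, None)
                else:
                    full[path] = node
        return full

    def readdelta(self, rev):
        """Return only the entries added or modified in rev."""
        delta = self._deltas[rev]
        self.entries_read += len(delta)
        return {p: n for p, n in delta.items() if n is not None}


def present_files(mlog, rev, touched, fast):
    """Return the touched files that still exist in rev."""
    manifest = mlog.readdelta(rev) if fast else mlog.read(rev)
    return sorted(f for f in touched if f in manifest)


def demo(nfiles=1000, nrevs=50):
    """Run the same search with full reads and with delta reads."""
    results = {}
    for fast in (False, True):
        mlog = ToyManifestLog()
        mlog.commit({"file%d" % i: 0 for i in range(nfiles)})
        found = []
        for r in range(1, nrevs + 1):
            changes = {"file%d" % r: r}
            rev = mlog.commit(changes)
            found.append(present_files(mlog, rev, changes, fast))
        results[fast] = (found, mlog.entries_read)
    return results


if __name__ == "__main__":
    for fast, (_, cost) in demo().items():
        print("fast=%s entries_read=%d" % (fast, cost))
```

Both modes find the same files in this model, because the only candidates are
the files touched by each commit. Whether the real lfs code can rely on that is
something the patch would have to check.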