[PATCH evolve-ext] inhibit: improve transaction marker perf

Durham Goode durham at fb.com
Sun Nov 8 11:24:16 CST 2015



On 11/8/15 12:04 AM, Martin von Zweigbergk wrote:
>
>
> On Sat, Nov 7, 2015 at 10:15 PM Durham Goode <durham at fb.com> wrote:
>
>
>
>     On 11/7/15 9:52 PM, Martin von Zweigbergk wrote:
>>
>>
>>     On Sat, Nov 7, 2015 at 5:21 PM Durham Goode <durham at fb.com> wrote:
>>
>>         # HG changeset patch
>>         # User Durham Goode <durham at fb.com>
>>         # Date 1446945001 28800
>>         #      Sat Nov 07 17:10:01 2015 -0800
>>         # Node ID 7c680f209f7af35c7c476eecc2f9eec13b32ad62
>>         # Parent  48547b4c77defdd17c670b1eb0eb94272edf0207
>>         inhibit: improve transaction marker perf
>>
>>         The old algorithm was a revset "::X and obsolete()". This was
>>         inefficient because it requires walking all the way down the
>>         ancestor chain (since the revset did not know it could stop
>>         walking at public nodes).
>>
>>
>>     I was hoping to reproduce the slowness on the Mozilla repo (270k
>>     revisions), but "hg log -r '::tip and obsolete()'" runs in 180
>>     ms. Do you have a better command for me to try? How many
>>     obsmarkers in the repo you tried it on?
>     You need to make sure you pass --hidden, otherwise the obsolete()
>     revset resolves to an empty set and tests against it are cheap.
>
>
> I'm still not able to reproduce this :-( Some tests take ~800 ms, but 
> most of that time seems to be related to loading obsmarkers and not 
> about iterating over the revset.
>
> I wanted to see if I could get the same results by playing with the 
> revset optimizer. I have never looked at that code before and I don't 
> know if it's a good idea. If it is, I'll have to let you do it 
> yourself since I can't even test it.

I looked into this a bit more.  I think the slowness is due to the
shape of our largest repo.  We had three large repos which were then
merged together, so there are three distinct branches in history.  The
--profile output for this revset is as follows:

~/foo> time hg log -r '::tip and obsolete()' --hidden --profile
| 100.0%  cmdutil.py:     getlogrevs               line 4792:  revs, expr, filematcher = c...
  \ 93.3%  revset.py:      __nonzero__              line 2135:  if not revs:
    | 93.3%  revset.py:      _iterfilter            line 3110:  for r in it:
    | 93.3%  revset.py:      _desccontains          line 3084:  if cond(x):
    | 89.6%  revset.py:      _consumegen            line 3460:  for l in self._consumegen():
    | 82.2%  revset.py:      iterate                line 3508:  for item in self._gen:
    | 30.4%  changelog.py:   parentrevs             line 56:  for parent in cl.parentrevs...
    | 13.3%  revlog.py:      parentrevs             line 231:  return super(changelog, sel...

Most of the time is spent in iterate, which is where the heapq
management happens.  I think having three branches in history causes
more heap work than a single line of history would.  In fact, if I run
"(A::tip | B::tip | C::tip) and obsolete()", where A, B, and C are the
roots of the three histories, it's way faster than plain "::tip", and
iterate disappears from the profile.

Either way, I think my fix still applies, since it fixes the algorithmic
complexity (the O()) entirely.
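
To illustrate the idea of stopping at public nodes, here is a minimal
sketch on a toy history.  It is not the actual patch or revset.py code:
ancestors_desc, make_counting_parentrevs, and the stop predicate are
hypothetical names, and the parents/public/obsolete data is made up.  It
assumes that everything below a public changeset is also public and
that public changesets are never obsolete, which is what makes the
early stop safe.

```python
import heapq


def ancestors_desc(parentrevs, heads, stop=None):
    """Yield ancestors of heads (inclusive) in descending revision order.

    parentrevs: callable mapping a rev to its parent revs (hypothetical
                stand-in for a changelog lookup).
    stop:       optional predicate; a rev for which it returns True is
                not yielded and its parents are never visited.
    """
    # heapq is a min-heap, so store negated revs to pop the highest first.
    heap = [-r for r in heads]
    heapq.heapify(heap)
    seen = set(heads)
    while heap:
        rev = -heapq.heappop(heap)
        if stop is not None and stop(rev):
            continue  # e.g. a public rev: nothing below it can match
        yield rev
        for p in parentrevs(rev):
            if p >= 0 and p not in seen:  # skip the null revision (-1)
                seen.add(p)
                heapq.heappush(heap, -p)


def make_counting_parentrevs(parents):
    """Wrap a parents dict so we can count how many revs were expanded."""
    visited = []

    def parentrevs(rev):
        visited.append(rev)
        return parents[rev]

    return parentrevs, visited


# Toy linear history: revs 0..6 are public, 7..9 are draft, 8 is obsolete.
parents = {r: [r - 1] for r in range(1, 10)}
parents[0] = []
public = set(range(0, 7))
obsolete = {8}

# Equivalent of "::9 and obsolete()": walks the whole ancestor chain.
prevs, visited_full = make_counting_parentrevs(parents)
full = [r for r in ancestors_desc(prevs, [9]) if r in obsolete]
full_visits = len(visited_full)

# Same question, but the walk stops as soon as it reaches a public rev.
prevs, visited_bounded = make_counting_parentrevs(parents)
bounded = [r for r in ancestors_desc(prevs, [9], stop=public.__contains__)
           if r in obsolete]
bounded_visits = len(visited_bounded)

print(full, full_visits)
print(bounded, bounded_visits)
```

Both walks find the same obsolete revision, but the bounded one only
expands the draft revisions, so its cost depends on the number of draft
changesets rather than on the full depth of history.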