Bug 4456 - mercurial.obsolete._load is slow
Summary: mercurial.obsolete._load is slow
Status: RESOLVED WORKSFORME
Alias: None
Product: Mercurial
Classification: Unclassified
Component: evolution
Version: 2.6.1
Hardware: PC Linux
Importance: normal bug
Assignee: Bugzilla
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-17 11:53 UTC by Julien Cristau
Modified: 2014-12-08 19:49 UTC
CC: 3 users

See Also:
Python Version: ---



Description Julien Cristau 2014-11-17 11:53 UTC
on the mercurial repo (over nfs):

$ hg --version; time hg diff --profile
Mercurial Distributed SCM (version 3.2.1)
(see http://mercurial.selenic.com for more information)

Copyright (C) 2005-2014 Matt Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
   CallCount    Recursive     Total(s)    Inline(s) module:lineno(function)
      103990            0      1.5786      0.7521   mercurial.obsolete:148(_fm0readmarkers)
      103989            0      0.2471      0.2471       <sorted>
      103989            0      0.3942      0.2182       mercurial.obsolete:231(_fm0decodemeta)
      198543            0      0.0528      0.0528       <_struct.unpack>
      103989            0      0.0417      0.0417       <method 'split' of 'str' objects>
      103989            0      0.0403      0.0403       <method 'iteritems' of 'dict' objects>
           1            0      2.3674      0.5258   mercurial.obsolete:565(_load)
      103990            0      1.5786      0.7521       mercurial.obsolete:148(_fm0readmarkers)
      206486            0      0.1669      0.1669       <method 'add' of 'set' objects>
      206486            0      0.0783      0.0783       <method 'setdefault' of 'dict' objects>
      103989            0      0.0178      0.0178       <method 'append' of 'list' objects>
      103994            0      0.2476      0.2471   <sorted>
           3            0      0.0004      0.0004       mercurial.phases:282(<genexpr>)
          10            0      0.0000      0.0000       mercurial.changelog:189(rev)
      103989            0      0.3942      0.2182   mercurial.obsolete:231(_fm0decodemeta)
      319949            0      0.1761      0.1761       <method 'split' of 'str' objects>
      423990            0      0.2178      0.2178   <method 'split' of 'str' objects>
      206523            0      0.1669      0.1669   <method 'add' of 'set' objects>
      206486            0      0.0783      0.0783   <method 'setdefault' of 'dict' objects>
      198547            0      0.0528      0.0528   <_struct.unpack>
      103990            0      0.0403      0.0403   <method 'iteritems' of 'dict' objects>
      111789            0      0.0233      0.0233   <method 'pop' of 'dict' objects>
      104413            0      0.0179      0.0179   <method 'append' of 'list' objects>
       50820            0      0.0132      0.0132   <method 'get' of 'parsers.index' objects>
           1            0      0.0486      0.0131   mercurial.obsolete:1028(_computeobsoleteset)
       50820            0      0.0132      0.0132       <method 'get' of 'parsers.index' objects>
         278            0      0.0207      0.0002       mercurial.phases:194(phase)
           3            0      0.0015      0.0000       mercurial.localrepo:28(__get__)
          11            0      0.0000      0.0000       <method 'add' of 'set' objects>
      112589           34      0.0123      0.0123   <len>

real    0m2.902s
user    0m2.720s
sys     0m0.140s
Comment 1 Pierre-Yves David 2014-11-20 17:15 UTC
I'm pulling only 48544 markers from http://selenic.com. Trying to obtain a repo with the 100K markers to understand where they come from (the 50K markers load in 0.4s here, so this is still a valid issue).
Comment 2 Pierre-Yves David 2014-11-20 18:41 UTC
So there are about 45K duplicated markers (the date stored as an int vs. the same value as a float). These are properly detected as duplicates by Mercurial.

I've pushed an updated version of `debugobsconvert` in evolve to get rid of them.
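The kind of deduplication described above can be sketched as follows; the marker layout and the `dedupe` helper are invented for illustration (the real conversion is done by `debugobsconvert` in evolve), but the key idea is normalizing the date so an int timestamp and its float twin compare equal:

```python
# Hypothetical marker tuples for illustration: (precursor, successors, date, metadata).
# The only difference between the duplicates is the date stored as an int
# (e.g. 1416500000) vs. the same value as a float (1416500000.0).

def dedupe(markers):
    """Drop markers that are identical except for int-vs-float date encoding."""
    seen = set()
    out = []
    for prec, sucs, date, meta in markers:
        # float(date) makes 1416500000 and 1416500000.0 produce the same key
        key = (prec, sucs, float(date), meta)
        if key not in seen:
            seen.add(key)
            out.append((prec, sucs, date, meta))
    return out
```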
Comment 3 Pierre-Yves David 2014-11-20 19:59 UTC
A simple attempt at micro-optimizing the two hot functions, _loads and _fm1readmarkers, did not achieve anything useful. Object allocation and dict manipulation are probably killing us. I hope we can find something obvious before falling back to a C version.
Comment 4 Pierre-Yves David 2014-11-26 19:55 UTC
Disabling gc during the parsing provides a 30% speedup (from 0.5s to 0.35s on my machine).
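The pattern is to switch off Python's cyclic garbage collector around the container-heavy parsing loop, which otherwise triggers collections every time enough tuples/dicts are allocated. A minimal sketch (the `nogc` helper name is mine, not Mercurial's API):

```python
import gc
from contextlib import contextmanager

@contextmanager
def nogc():
    """Temporarily disable cyclic GC; re-enable it on exit only if it was
    enabled on entry, so nesting stays safe."""
    wasenabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if wasenabled:
            gc.enable()

# usage sketch:
# with nogc():
#     markers = list(readmarkers(data))
```

Reference counting still frees non-cyclic garbage while the collector is off, so peak memory for a parse loop like this stays bounded; only cycle detection is deferred.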
Comment 5 Pierre-Yves David 2014-11-29 22:33 UTC
I tried various other micro-optimizations, like ensuring all variables are local lookups, but that only gives about a 3% speedup. The bulk of the time is probably spent in the packing. We need to turn that into C.
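The local-lookup trick mentioned above means binding globals and bound methods to local names once, outside the loop, so the interpreter resolves them via fast local slots instead of a dict lookup per iteration. A sketch of the before/after shape (hypothetical example, not Mercurial's code):

```python
import struct

def parse_slow(data, fmt='>I'):
    out = []
    size = struct.calcsize(fmt)
    off = 0
    while off < len(data):
        # global + attribute lookup of struct.unpack_from on every iteration
        out.append(struct.unpack_from(fmt, data, off))
        off += size
    return out

def parse_fast(data, fmt='>I'):
    out = []
    append = out.append           # bind bound method to a local once
    unpack = struct.unpack_from   # bind global attribute to a local once
    size = struct.calcsize(fmt)
    n = len(data)
    off = 0
    while off < n:
        append(unpack(fmt, data, off))
        off += size
    return out
```

As the comment says, this class of tweak only shaves a few percent; the dominant cost (allocating a tuple per record) is unchanged, which is why a C rewrite was the next step considered.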
Comment 6 HG Bot 2014-12-08 18:47 UTC
Fixed by http://selenic.com/repo/hg/rev/5817f71c2336
Pierre-Yves David <pierre-yves.david@fb.com>
obsstore: disable garbage collection during initialization (issue4456)

Python garbage collection is triggered by container creation. So code that
creates a lot of tuples tends to trigger GC a lot. We disable the gc during
obsolescence marker parsing and associated initialization. This provides an
interesting speedup (25%).

Load marker function on my 58758 markers repo:
before: 0.468247 seconds
after:  0.344362 seconds

The benefit is a bit less visible overall. With python2.6 on my system I see:
before: 0.60
after:  0.53

The difference is probably explained by the delaying of a costly GC. (but there
is still a win). Marking involved tuples, lists and dicts as ignorable by the
garbage collector should give us more benefit. But this is another adventure.

Thanks goes to Siddharth Agarwal for the lead.

(please test the fix)
Comment 7 Pierre-Yves David 2014-12-08 19:49 UTC
The landed changeset is not going to make a huge difference, but it helps. I encourage people to remove their duplicated markers too. Next stop: rewrite this in C.