Why is hg identify so slow?

Matt Mackall mpm at selenic.com
Wed Jun 25 11:52:29 CDT 2008


On Wed, 2008-06-25 at 17:16 +0100, Frank A. Kingswood wrote:
> Shun-ichi GOTO wrote:
> >> I've noticed several times in the past that hg identify is quite slow
> >> compared to hg status. It would seem to me that hg identify actually has
> >> slightly less to do - no need to print out modified or unknown files.
> > 
> > If your repository has a large number of revisions, it may hit issue 557 / 548.
> > http://www.selenic.com/mercurial/bts/issue557
> > http://www.selenic.com/mercurial/bts/issue548
> > 
> > In my case (issue 548), repository is emacs:
> > Here is a result on the repository:
> 
> Issue 557 suggests this is because of the number of tags.
> My repository has about 500 tags, and about 18K revisions.
> 
> The flow in commands.py identify() is a bit convoluted (and quite
> possibly wrong) but it seems that I can avoid the scan for tags by
> setting --id. From the help, only adding --tags should force it to read
> the tags, though.
> 
> With --id the performance is similar to hg status.
> 
> $ /usr/bin/time hg id --id
> a259f6e06bc2+
> 0.18user 0.10system 0:03.09elapsed 9%CPU
> 0inputs+0outputs (0major+2440minor)pagefaults 0swaps
> $ /usr/bin/time hg id --id
> a259f6e06bc2+
> 0.20user 0.04system 0:00.51elapsed 48%CPU
> 0inputs+0outputs (0major+2440minor)pagefaults 0swaps

With no options, it prints:

hash[+hash] [tags][+]

...which would have been more obvious if you were on the tip.

You might also ask id to print just the tags (hg id -t) or just the
revision number (hg id -n). Comparing the two will let you isolate the
cost of reading tags.
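
A rough way to run that comparison is sketched below. It is only an
illustration, not anything shipped with Mercurial: the helper name
time_command and the choice of keeping the best of three runs (to damp
the cold-cache effect visible in the timings quoted above) are
assumptions made for this example. It should be run from inside the
repository being measured.

```python
import shutil
import subprocess
import time


def time_command(argv, runs=3):
    """Return the best wall-clock time (seconds) over `runs` executions.

    Hypothetical helper for this sketch; output of the command is discarded.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    if shutil.which("hg") is None:
        print("hg not found on PATH; nothing to measure")
    else:
        # No flags, hash only, revision number only, tags only.
        for flags in ([], ["-i"], ["-n"], ["-t"]):
            argv = ["hg", "id"] + flags
            print("%-10s %.3fs" % (" ".join(argv), time_command(argv)))
```

If the -t line is much slower than the -n and -i lines, the time is
going into reading tags rather than into checking the working directory.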

You're right though, that id doesn't have to walk unknown files and thus
could be faster. I'm currently in the process of sweeping through all
users of status and walk and fixing up the ones that needlessly walk
unknown files.

-- 
Mathematics is the supreme nostalgia of our time.


