hg log with files is very slow

Matt Mackall mpm at selenic.com
Mon Jul 7 15:45:04 CDT 2008


On Fri, 2008-07-04 at 14:25 +1000, Brian Wallis wrote:
> The mercurial eclipse plugin uses the attached style file to gather  
> the revision details about the files in a repository.
> 
> hg log --debug --style /path/to/style/log_style_with_files /path/to/repo/root
> 
> In a large repository this can take a very long time to execute (6
> minutes in an 800M repo that has 6000-8000 files in a typical
> workspace checkout).

Yep, it's indeed very slow. That's why it's hidden under the debug
switch.

> I ran truss on the above command and can see that the log command is  
> continuously opening, seeking, reading and closing 00changelog.d and  
> 00manifest.d.
> 
> These are not large files in our large repository:
> 
> 2% ls -l /Users/bwallis/InfoMedix/Hg-Infomedix/.hg/store/
> total 31288
> -rw-r--r--   1 bwallis  bwallis   1067452 Jul  4 12:19 00changelog.d
> -rw-r--r--   1 bwallis  bwallis    277120 Jul  4 12:19 00changelog.i
> -rw-r--r--   1 bwallis  bwallis  14386898 Jul  4 12:19 00manifest.d
> -rw-r--r--   1 bwallis  bwallis    277056 Jul  4 12:19 00manifest.i
> drwxr-xr-x  16 bwallis  bwallis       544 Jun  4 14:09 data
> -rw-r--r--   1 bwallis  bwallis      2296 Jul  4 12:19 undo
> 
> Would it be very difficult to get mercurial to cache these files
> rather than doing all the open/seek/read/close calls? I suspect that
> would improve the performance somewhat.
>
> In my particular case, open for 00manifest and 00changelog is called  
> 3035 times.

That's not likely to be the problem. Even if a cached open took a
millisecond, that'd still be only 6 seconds. Log without the --debug
switch likely takes less than that with about half as many opens.
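
As a rough check on that estimate, here is a small Python sketch. It
assumes the 3035 opens reported above apply to each of the two files, and
it uses the deliberately pessimistic 1 ms per open. The names are made up
for illustration.

```python
# Back-of-the-envelope check: can open() overhead explain a 6 minute run?
OPENS_PER_FILE = 3035          # from the truss count quoted above
FILES = 2                      # 00changelog and 00manifest (assumed: 3035 opens each)
SECONDS_PER_OPEN = 0.001       # pessimistic guess of 1 ms per open
TOTAL_RUNTIME_SECONDS = 6 * 60 # the reported 6 minutes


def open_overhead_seconds(opens_per_file, files, seconds_per_open):
    """Total time spent in open() under the stated assumptions."""
    return opens_per_file * files * seconds_per_open


overhead = open_overhead_seconds(OPENS_PER_FILE, FILES, SECONDS_PER_OPEN)
share = overhead / TOTAL_RUNTIME_SECONDS
print("open overhead: %.2f s (%.1f%% of the run)" % (overhead, share * 100))
```

Even with that generous per-open cost, the opens account for only a few
percent of the run, so caching the file handles would not change much.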

The problem is this: the extra data reported by log --debug requires
comparing two manifests for every single changeset. As it happens,
uncompressing a manifest is about the single most expensive operation in
Mercurial. 

Now I can see that you've got about 4300 changesets/manifests (your
277kB index files hold one 64-byte entry per revision). And a single
uncompressed manifest consisting of (filename, sha1) pairs with 6k-8k
files is likely to be about 500kB-1MB. Which means you've got somewhere
between 2 and 4GB of manifest data compressed down to 14MB.
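
The arithmetic behind those figures can be sketched in Python. The file
sizes come from the ls -l listing quoted above. The 64-byte index entry
is the RevlogNG layout, and the 500kB-1MB per-manifest range is only the
guess made above, not a measurement. All names are illustrative.

```python
# Sizes in bytes, copied from the ls -l listing quoted above.
CHANGELOG_INDEX_BYTES = 277120   # 00changelog.i
MANIFEST_INDEX_BYTES = 277056    # 00manifest.i
MANIFEST_DATA_BYTES = 14386898   # 00manifest.d (compressed)

# Assumption: one fixed-size 64-byte index entry per revision (RevlogNG).
INDEX_ENTRY_BYTES = 64


def revision_count(index_bytes, entry_bytes=INDEX_ENTRY_BYTES):
    """Number of revisions implied by the size of a revlog index file."""
    return index_bytes // entry_bytes


def uncompressed_total(revisions, bytes_per_manifest):
    """Total bytes if every manifest revision were stored in full."""
    return revisions * bytes_per_manifest


changesets = revision_count(CHANGELOG_INDEX_BYTES)
manifests = revision_count(MANIFEST_INDEX_BYTES)

# Guessed range for one full manifest with 6k-8k files.
low = uncompressed_total(manifests, 500 * 1000)
high = uncompressed_total(manifests, 1000 * 1000)

print("changesets: %d, manifests: %d" % (changesets, manifests))
print("uncompressed manifests: %.1f-%.1f GB" % (low / 1e9, high / 1e9))
print("stored on disk: %.1f MB" % (MANIFEST_DATA_BYTES / 1e6))
```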

Uncompressing each of those is apparently taking you about 0.08
seconds (6 minutes spread over roughly 4300 manifests). That's pretty
modest and isn't noticeable in most operations, which will only want 1
or 2 manifests. But an open-ended log --debug wants them all.
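
To make the scaling concrete, here is a minimal Python sketch. It
assumes, as an upper bound, that the whole 6 minute run is spent
uncompressing manifests, and it uses the rounded count of 4300. The
function names are hypothetical.

```python
TOTAL_RUNTIME_SECONDS = 6 * 60   # the reported 6 minute run
MANIFESTS = 4300                 # approximate revision count from above


def seconds_per_manifest(total_seconds, manifests):
    """Average cost per manifest if manifests account for the whole run."""
    return float(total_seconds) / manifests


def manifest_cost(count, per_manifest):
    """Time to uncompress `count` manifests at the given average cost."""
    return count * per_manifest


per_manifest = seconds_per_manifest(TOTAL_RUNTIME_SECONDS, MANIFESTS)
print("per manifest: %.3f s" % per_manifest)
print("typical command, 2 manifests: %.2f s" % manifest_cost(2, per_manifest))
print("open-ended log --debug: %.0f s" % manifest_cost(MANIFESTS, per_manifest))
```

Limiting the log to a revision range shrinks the manifest count, and
with it the cost, in direct proportion.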

-- 
Mathematics is the supreme nostalgia of our time.


