[PATCH] hg_log: speed up hg log for untracked files (issue1340)
mpm at selenic.com
Tue Sep 11 13:40:14 CDT 2012
On Tue, 2012-09-11 at 10:36 -0700, S Muralidhar wrote:
> Thanks for reviewing this, Matt. A few questions/comments below
> > This patch should be in multiple pieces:
> > - introduce the basic store method
> > - introduce the fncache method
> > - introduce the user in cmdutil + test
> I assume that step #2 will also include the fncachestore method (along
> with the fncache method) - just making sure I understood the protocol
> > I think the approach in fncache should be two-pass:
> > - check for a file matching arg in the cache directly
> > - scan for a directory match by iterating startswith over the cache
> Another point of clarification: should I bundle this all into the
> __contains__ method on fncache?
> Are there any perf concerns about adding an extra scan of the cache
> (although, it'll only happen on cache misses)?
It's not extra? It won't be instant, but it'll be much faster than going
down the log slow path.
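A minimal sketch of the two-pass lookup described above, for illustration only. It assumes the cache contents are available as a plain set of file paths; the name cache_contains is hypothetical, and the sketch ignores the encoded store paths that the real fncache holds.

```python
def cache_contains(entries, path):
    """Return True if path names a file or a directory in entries.

    entries is assumed to be a set of slash-separated file paths
    (a hypothetical stand-in for the fncache contents).
    """
    # Pass 1: exact file match, a single set lookup.
    if path in entries:
        return True
    # Pass 2: directory match, a linear startswith scan over the cache.
    # The trailing slash keeps "di" from matching "dir/b.txt".
    prefix = path.rstrip('/') + '/'
    return any(entry.startswith(prefix) for entry in entries)
```

The second pass only runs when the exact lookup misses, so tracked files stay on the cheap path and only directories and unknown names pay for the scan.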
> >> class basicstore
> >> + def __contains__(self, path):
> >> + '''Checks if this path exists in the store'''
> >> + return self.opener.exists(path)
> > Doesn't work with directories?
> I didn't follow this comment, Matt?
This appears not to be smart enough to work with directories, am I
> >> diff --git a/tests/test-glog.t b/tests/test-glog.t
> >> + +nodetag 3
> >> + nodetag 0
> > ???
> This is an example where hg log shows differences in output between
> the slow path and the fast path (we had a discussion earlier about
> this - http://bz.selenic.com/show_bug.cgi?id=3613). This particular
> testcase is testing differences between hg log and hg log -g, and one
> of them now uses the fast path for an untracked file, while the other
> doesn't (both of them used to go through the slow path earlier)
Huh. I've been focusing on this case, which I happen to hit a lot:
hg log actuallyarevision # long pause, empty output, facepalm
So, at a minimum, we'd like to get rid of the long pause.
But this test seems to be about:
hg log afile notafile # shows deletions and renames on afile
One important thing to know at this juncture is that the log code you're
hacking has been marked for death. It's just a matter of time before
it's replaced by graphlog. So divergence from existing output is not
desirable. Given the test that's getting broken here is probably not a
case we care about performance-wise, we should probably endeavor to
leave it alone, to avoid unnecessarily diverging from the graphlog code.
So instead, we should be doing this up front:
  if there are files specified:
    if they're all explicit filenames (not patterns):
      if they all do not exist according to store:
        return  # exit quickly
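As a rough Python sketch of the check outlined above (not the actual cmdutil code): should_exit_early, looks_like_pattern and the in_store callable are hypothetical names, and the list of pattern prefixes is only an illustrative subset.

```python
# Illustrative subset of pattern-kind prefixes; an assumption, not a full list.
PATTERN_PREFIXES = ('glob:', 're:', 'path:')


def looks_like_pattern(arg):
    """Hypothetical helper: treat 'kind:' prefixed arguments as patterns."""
    return arg.startswith(PATTERN_PREFIXES)


def should_exit_early(files, in_store):
    """Return True when log can return immediately with empty output.

    files is the list of file arguments; in_store is an assumed callable
    that reports whether the store knows a given path.
    """
    if not files:
        return False  # no files specified: normal log
    if any(looks_like_pattern(f) for f in files):
        return False  # patterns need the regular matching path
    # Exit quickly only if none of the explicit names exist in the store.
    return not any(in_store(f) for f in files)
```

With this shape, 'hg log afile notafile' still goes through the existing code because afile is known to the store, so its output would not change.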
Eventually we should start considering ways of being smarter here, for
instance, actually giving useful output for 'hg log stable'. Git does
something like this:
$ git log skjdfs
fatal: ambiguous argument 'skjdfs': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
..but if you ask it for the log of a file that exists in history but not
in the working directory, it'll insist you add '--' (and apparently
sometimes you'll need --follow too) because it can't efficiently search
for all historic files.
But that's all a topic for later.
Mathematics is the supreme nostalgia of our time.