[PATCH] largefiles: optimize status when files are specified (issue3144)

Matt Mackall mpm at selenic.com
Tue Dec 13 11:17:57 CST 2011


[back to the list!]

On Tue, 2011-12-13 at 16:24 +0100, Na'Tosha Bard wrote:
> 2011/12/12 Matt Mackall <mpm at selenic.com>
> 
> > On Fri, 2011-12-09 at 18:27 +0100, Na'Tosha Bard wrote:
> > > # HG changeset patch
> > > # User Na'Tosha Bard <natosha at unity3d.com>
> > > # Date 1323451555 -3600
> > > # Node ID aff80ad8195d64f533480c4d7bfc3a6c81f41568
> > > # Parent  fc8c7a5ccc4a928e7559013ecdf50462c271418c
> > > largefiles: optimize status when files are specified (issue3144)
> > >
> > > This fixes a performance issue with 'hg status' when files are specified
> > > on the command-line.  Previously, a large amount of largefiles code was
> > > executed, even if files were specified on the command-line and those
> > > files were not largefiles.  This patch fixes the problem by first
> > > checking if non-largefiles were specified on the command-line and
> > > just letting the normal status function handle the case if they were.
> >
> > > +                # First check if there were files specified on the
> > > +                # command line.  If there were, and none of them were
> > > +                # largefiles, we should just bail here and let super
> > > +                # handle it -- thus gaining a big performance boost.
> > > +                lfdirstate = lfutil.openlfdirstate(ui, self)
> > > +                if match._files:
> >
> > That underbar of course says "not for general use" - the proper accessor
> > is match.files().
> 
> 
> Good point; I'll fix that and re-submit.
> 
> 
> > But what happens if a user mixes explicit filenames
> > and patterns? What you're looking for is probably match.anypats().
> 
> 
> Well, if any patterns are supplied, I think we really want to fall through
> and rely on the regular code path because it will expand the patterns for
> us.
> 
> But this raises a better question -- should that even work at all?  When I
> try the following (without largefiles enabled on my machine at all), I get
> odd output:
> 
> $ echo "foo" >> hgext/largefiles/overrides.py
> $ echo "bar" >> README
> $ hg status --include glob:hgext/largefiles/*.py README
> // I don't get any output here, but I expect to.

That's not what a 'pat' is. The --include argument is a pattern acting as
an include filter. Your command says "take the set [README] and filter out
anything that doesn't match .../*.py".

> $ hg status --include glob:hgext/largefiles/*.py

"Take the set [all files by default] and filter out anything that
doesn't match .../*.py."
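
To make the two readings above concrete, here is a rough sketch of "take a
set, then filter it". It is not Mercurial code: apply_include is a
hypothetical helper, and Python's fnmatch stands in for Mercurial's glob
handling (note that fnmatch lets * match "/" as well, which is an
assumption of this toy only).

```python
import fnmatch


def apply_include(candidates, include_glob):
    """Keep only the candidates that match the include filter."""
    return [f for f in candidates if fnmatch.fnmatchcase(f, include_glob)]


# Hypothetical working copy contents, for illustration only.
all_files = ["README", "hgext/largefiles/overrides.py"]
include = "hgext/largefiles/*.py"

# Explicit name plus include filter: the base set is [README].
named = apply_include(["README"], include)

# No names given: the base set defaults to all files.
default = apply_include(all_files, include)

print(named)
print(default)
```

In this model the first call comes back empty because README is the only
candidate and it does not survive the filter, which matches the "no output"
observation.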

But then there's this:

 $ hg st -A glob:mercurial/*.py README

which is the same as:

 C:\hg>hg st -A mercurial/*.py README

"[.../*.py] + [README]"

Here we've got patterns and files mixed together in the match.

Internally, a pattern is anything that's not an explicit filename (i.e., one
of the regex or glob types). This allows us to bypass some tree walking
when no patterns are present, and allows us to check for failures on
explicit filenames.
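
As an illustration of why that split is useful, the sketch below checks
explicit names directly and only walks the tree when at least one pattern
is present. Everything in it (resolve, the exists and walk callables, the
sample tree) is made up for the example, and fnmatch is only a stand-in
for real pattern matching.

```python
import fnmatch


def resolve(names, patterns, exists, walk):
    """Return (found, missing) for explicit names plus optional patterns.

    exists(path) reports whether a single path is present; walk() yields
    every path in the tree and is only called when patterns are given.
    """
    found = [n for n in names if exists(n)]
    # Explicit names that are absent can be reported as failures.
    missing = [n for n in names if not exists(n)]
    if patterns:
        for path in walk():
            if path not in found and any(
                fnmatch.fnmatchcase(path, p) for p in patterns
            ):
                found.append(path)
    return sorted(found), missing


# Hypothetical tree, with a counter showing how often it is walked.
tree = ["README", "mercurial/match.py", "mercurial/util.py"]
walks = []


def walk():
    walks.append(1)
    return iter(tree)


only_names = resolve(["README", "nosuchfile"], [], tree.__contains__, walk)
walks_after_names = len(walks)

mixed = resolve(["README"], ["mercurial/*.py"], tree.__contains__, walk)
walks_after_mixed = len(walks)
```

With names only, the walk is never triggered and the missing name is
reported; adding a single pattern forces one full walk.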

The match.anypats() logic is slightly more subtle than that: it is true
if patterns are present in the base file list, or if any include or
exclude filters are present (whether or not those filters contain
patterns themselves). This is because we can't use the files on the
command line verbatim in a fast path - some of them may be filtered out.
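
A toy model of that rule, and of the fast-path test it suggests for the
patch, might look like the following. ToyMatch is not Mercurial's match
class; it only mimics the files()/anypats() behaviour described above, and
it assumes patterns are always spelled with an explicit "glob:" or "re:"
prefix. can_use_fast_path is likewise a hypothetical name.

```python
class ToyMatch:
    """Simplified stand-in for a matcher; not the real implementation."""

    def __init__(self, names, include=(), exclude=()):
        self._names = list(names)
        self._include = list(include)
        self._exclude = list(exclude)

    @staticmethod
    def _ispattern(name):
        # Assumption: only prefixed names count as patterns in this toy.
        return name.startswith(("glob:", "re:"))

    def files(self):
        """Explicit filenames from the base file list."""
        return [n for n in self._names if not self._ispattern(n)]

    def anypats(self):
        """True if the base list has patterns or any filter is present."""
        return (
            any(self._ispattern(n) for n in self._names)
            or bool(self._include)
            or bool(self._exclude)
        )


def can_use_fast_path(m):
    """Explicit files were given and none of them can be filtered out."""
    return bool(m.files()) and not m.anypats()


plain = ToyMatch(["README"])
mixed = ToyMatch(["README", "glob:mercurial/*.py"])
filtered = ToyMatch(["README"], include=["glob:hgext/largefiles/*.py"])
empty = ToyMatch([])
```

Only the plain case qualifies here: the mixed case needs pattern
expansion, the filtered case may drop README, and the empty case means
"all files".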

-- 
Mathematics is the supreme nostalgia of our time.



