[PATCH 3 of 3] treemanifest: optimize treemanifest._walk() to skip directories

Drew Gottlieb drgott at google.com
Tue Apr 7 18:07:06 CDT 2015


Disregard this last commit. The largefiles fix doesn't work. The first two
commits are still good.

Drew

On Tue, Apr 7, 2015 at 3:58 PM Drew Gottlieb <drgott at google.com> wrote:

> # HG changeset patch
> # User Drew Gottlieb <drgott at google.com>
> # Date 1428445132 25200
> #      Tue Apr 07 15:18:52 2015 -0700
> # Node ID 290a04bae0f913f538f5c65b2db56525e1007ed6
> # Parent  73713a0f0471739a8a91cda6d7b8966ebf07cdc1
> treemanifest: optimize treemanifest._walk() to skip directories
>
> This makes treemanifest.walk() not visit submanifests that are known not to
> have any matching files. It does this by calling match.visitdir() on
> submanifests as it walks.
>
> This change also updates largefiles to be able to work with this new
> behavior
> in treemanifests. It overrides match.visitdir(), the function that dictates
> how walk() and matches() skip over directories.
>
> The greatest speed improvements are seen with narrower scopes. For example,
> this commit speeds up the following command on the Mozilla repo from 1.14s
> to 1.02s:
>   hg files -r . dom/apps/
>
> Whereas with a wider scope, dom/, the speed only improves from 1.21s to
> 1.13s.
>
> As with similar a similar optimization to treemanifest.matches(), this
> change
> will bring out even bigger performance improvements once treemanifests are
> loaded lazily. Once that happens, we won't just skip over looking at
> submanifests, but we'll skip even loading them.
>
> diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
> --- a/hgext/largefiles/overrides.py
> +++ b/hgext/largefiles/overrides.py
> @@ -1274,6 +1274,18 @@
>          if not f in notbad:
>              origbadfn(f, msg)
>      m.bad = lfbadfn
> +
> +    origvisitdirfn = m.visitdir
> +    def lfvisitdirfn(dir):
> +        ret = origvisitdirfn(dir)
> +        if ret:
> +            return ret
> +        lf = lfutil.splitstandin(dir)
> +        if lf is None:
> +            return False
> +        return origvisitdirfn(lf)
> +    m.visitdir = lfvisitdirfn
> +
>      for f in ctx.walk(m):
>          fp = cmdutil.makefileobj(repo, opts.get('output'), ctx.node(),
>                                   pathname=f)
> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
> --- a/mercurial/manifest.py
> +++ b/mercurial/manifest.py
> @@ -616,13 +616,6 @@
>          '''
>          fset = set(match.files())
>
> -        # avoid the entire walk if we're only looking for specific files
> -        if fset and not match.anypats():
> -            if util.all(fn in self for fn in fset):
> -                for fn in sorted(fset):
> -                    yield fn
> -                raise StopIteration
> -
>          for fn in self._walk(match):
>              if fn in fset:
>                  # specified pattern is the exact name
> @@ -637,8 +630,18 @@
>              if not self.hasdir(fn):
>                  match.bad(fn, None)
>
> -    def _walk(self, match):
> -        '''Recursively generates matching file names for walk().'''
> +    def _walk(self, match, alldirs=False):
> +        '''Recursively generates matching file names for walk().
> +
> +        Will visit all subdirectories if alldirs is True, otherwise it
> will
> +        only visit subdirectories for which match.visitdir is True.'''
> +
> +        if not alldirs:
> +            # substring to strip trailing slash
> +            visit = match.visitdir(self._dir[:-1] or '.')
> +            if not visit:
> +                return
> +            alldirs = (visit == 'all')
>
>          # yield this dir's files and walk its submanifests
>          for p in sorted(self._dirs.keys() + self._files.keys()):
> @@ -647,7 +650,7 @@
>                  if match(fullp):
>                      yield fullp
>              else:
> -                for f in self._dirs[p]._walk(match):
> +                for f in self._dirs[p]._walk(match, alldirs):
>                      yield f
>
>      def matches(self, match):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150407/97a7d2b8/attachment.html>


More information about the Mercurial-devel mailing list