[PATCH RFC] faster dirstate walks
Bryan O'Sullivan
bos at serpentine.com
Thu Aug 25 13:38:49 CDT 2005
On Thu, 2005-08-25 at 14:29 -0400, Chris Mason wrote:
> 1) os.walk stats every file. dirstate.changes then stats every file again.
Good.
> 2) os.walk yields every file and subdir to dirstate.traverse who yields every
> file and everything in the dirstate map. dirstate.walk then
> filters this mass and yields every file to the caller. There should be
> fewer steps in here.
Do you think that makes a performance difference, or do you just not
like it?
> I've turned os.walk into something hg specific that takes all the dirstate
> ignore and matching rules into account. The new function also takes an
> optional function (interesting()) the caller can supply to help filter out
> files it doesn't care about.
What's the difference between interesting and matchfn?
> Testing has been extremely light, this is really just proof of concept code.
> One big functional change is the results of dirstate.walk are no longer sorted.
> If the caller really wants sorted results (most don't I think) they should
> collect into an array and sort themselves.
No, callers almost always want the results sorted. The reason is that
most UI functions use the results of a walk directly. Humans like
seeing their output in sorted order.
> + def __walk(s):
Please find a different name for this.
> + if not self.ignore(ds) and \
We don't use line continuations. Please wrap these kinds of clauses in
an extra set of parens instead.
Overall, I think the change is sensible. Unfortunately, it's not any
less voodoo-like than the current walk code. If you could add a few
descriptive comments here and there, so that future readers don't get
lost, that would be a good thing.
<b
More information about the Mercurial
mailing list