[PATCH 7 of 8] ignore: add support for including subdir .hgignores

Durham Goode durham at fb.com
Fri May 15 13:22:46 CDT 2015

On 5/15/15 6:09 AM, Martin von Zweigbergk wrote:
> I agree with the other Martin. I'm not at a computer now, so it's hard 
> to check, but where are the ignore patterns used? I know it's used 
> while walking the working copy in dirstate.py. I would think that that 
> piece of code would not mind if it had to create a new union matcher 
> every time it visited a subdirectory mentioned in an include. Would 
> applying the matcher to only the path within the subdirectory be a 
> problem performance-wise?
The matcher is a bit of a hot path (our working copies can have 500,000+ 
files), and right now every match is done entirely in the native re2 
code. I worry that adding extra string handling and Python logic to every 
path match would have a perf impact. I guess I can hack something up to 
test it.
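
A rough way to measure this, as a sketch only: the snippet below compares one
up-front regex against per-path prefix stripping in Python. It uses the
standard `re` module as a stand-in for re2, and the patterns, the `sub-dir/`
layout, and all function names are made up for illustration; none of this is
Mercurial's matcher code.

```python
import re
import timeit

# Hypothetical setup: the root ignore file contributes '.*\.pyc$' and
# sub-dir/.hgignore contributes '^foo|^bar', relative to sub-dir/.
ROOT_RE = re.compile(r'.*\.pyc$')
SUB_PREFIX = 'sub-dir/'
SUB_RE = re.compile(r'^foo|^bar')

# Approach A: one regex built up front; each path costs a single regex call.
COMBINED_RE = re.compile(r'(?:.*\.pyc$)|(?:sub-dir/(?:foo|bar))')


def match_combined(path):
    return COMBINED_RE.match(path) is not None


# Approach B: Python-level logic per path. Check the root patterns, then
# strip the directory prefix and apply the subdirectory's own regex.
def match_stripped(path):
    if ROOT_RE.match(path):
        return True
    if path.startswith(SUB_PREFIX):
        return SUB_RE.match(path[len(SUB_PREFIX):]) is not None
    return False


def make_paths(n):
    """Build 4*n synthetic repo-relative paths (illustrative data only)."""
    paths = []
    for i in range(n):
        paths.append('src/mod%d.py' % i)
        paths.append('src/mod%d.pyc' % i)
        paths.append('sub-dir/foo/f%d.txt' % i)
        paths.append('sub-dir/baz%d.txt' % i)
    return paths


if __name__ == '__main__':
    paths = make_paths(25000)
    for name, fn in (('combined', match_combined),
                     ('stripped', match_stripped)):
        secs = timeit.timeit(lambda: [fn(p) for p in paths], number=3)
        print('%s: %.3fs' % (name, secs))
```

Both functions give the same answers on these paths, so any timing difference
reflects the extra per-path Python work rather than a change in behavior. The
numbers from `re` will not carry over directly to re2.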

From a quick grep, ignores seem to be used only by dirstate.walk.
> Are the .hgignore files always read from the working copy? If not, it 
> would be nice to not have to read all the submanifests (when using 
> such) for operations that care only about some subdirectory.
From what I can tell, it's always read from the working copy.
> Also, does the subdirectory .hgignore have to be called exactly 
> .hgignore? I'm just wondering at this point; I haven't decided whether 
> I think it should be required or not.
Nope, it can be called whatever you want.
> On Fri, May 15, 2015, 03:23 Martin Geisler <martin at geisler.net> wrote:
>     Durham Goode <durham at fb.com> writes:
>     > At the moment we only support globs in sub-ignore files. Regexes will
>     > cause an exception. This is because we can't reliably modify a regex
>     > to have a prefix (e.g. adding a prefix to '^foo|^bar' would require
>     > parsing the regex).
>     That is surprising from a high level, since regular languages are
>     closed under concatenation.
>     However, I see what you're saying: blindly adding a prefix to a regex
>     doesn't do what you expect. I didn't look at the mechanics of the code,
>     but if you could strip off the path elements as you descend down the
>     directory tree, you should get the right behavior.
>     So if we have
>       root/sub-dir/foo/
>                    bar/
>                    .hgignore <- '^foo|^bar'
>                    baz.txt
>            .hgignore <- '#include sub-dir/.hgignore'
>     then don't match
>       sub-dir/foo/
>       sub-dir/bar/
>     against
>       sub-dir/(^foo|^bar)
>     Instead match
>       foo/
>       bar/
>     against
>       ^foo|^bar
>     to conclude the directories should be ignored.
>     Maybe that's not what you want if you intend to make one big regular
>     expression upfront and run all paths through that.
>     --
>     Martin Geisler
>     http://google.com/+MartinGeisler
>     _______________________________________________
>     Mercurial-devel mailing list
>     Mercurial-devel at selenic.com
>     http://selenic.com/mailman/listinfo/mercurial-devel
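
The prefix problem in the quoted example can be reproduced with Python's
standard `re` module. This is only an illustrative sketch: `re` stands in for
re2, the helper name `ignored_in_subdir` is hypothetical, and the code does
not reflect how Mercurial builds its matchers.

```python
import re

SUB_DIR = 'sub-dir/'
SUB_PATTERN = r'^foo|^bar'   # contents of sub-dir/.hgignore in the example

# Blind prefixing: glue the directory onto the raw pattern text. Alternation
# binds loosest, so this parses as (sub-dir/^foo) | (^bar).
blind = re.compile(re.escape(SUB_DIR) + SUB_PATTERN)

# Grouping the alternation does not help either: '^' can only match at the
# start of the string, never after 'sub-dir/'.
grouped = re.compile(re.escape(SUB_DIR) + '(?:' + SUB_PATTERN + ')')

# Stripping instead: leave the pattern untouched and remove the directory
# prefix from the path before matching.
sub_re = re.compile(SUB_PATTERN)


def ignored_in_subdir(path):
    """Return True if path lies under SUB_DIR and its remainder matches."""
    if not path.startswith(SUB_DIR):
        return False
    return sub_re.search(path[len(SUB_DIR):]) is not None


if __name__ == '__main__':
    for p in ('sub-dir/foo/', 'sub-dir/bar/', 'sub-dir/baz.txt', 'bar/'):
        print(p,
              'blind:', blind.match(p) is not None,
              'grouped:', grouped.match(p) is not None,
              'stripped:', ignored_in_subdir(p))
```

With blind prefixing, `sub-dir/foo/` is missed while a top-level `bar/` is
matched by accident. Stripping the prefix from the path keeps the subdirectory
pattern intact, at the cost of some Python-level work per path.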
