Proposed Patch to hgignore

Matt Mackall mpm at selenic.com
Thu Mar 6 16:19:17 CST 2008


On Thu, Mar 06, 2008 at 10:01:09PM +0100, Jakob Krainz wrote:
> On Thu 2008-03-06 14:00:19 -0600, Matt Mackall wrote:
> > 
> > As discussed in the distant past, this feature is very difficult to
> > implement in a way that is simultaneously correct and easy to
> > understand, never mind fast, even though it looks simple on paper.
> > 
> > Ever wonder why regexes themselves don't have a negation operator?
> > It's the sort of deep result that computer science professors tend to
> > skim over but could easily fill a book or two.
> > 
> > -- 
> > Mathematics is the supreme nostalgia of our time.
> > 
> 
> Maybe, but my patch uses only facilities that are already present 
> in mercurial (look at the -I and -X options).

Yes, but you've made all the semantics much more complicated.

For starters, pattern matching now has ordering issues. Which means
there are now precedence issues. Across all the ignore files that are
in effect. So if I say ignore *.c in one and I say don't ignore *.c in
another, I have to know which one gets read first.
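
To make the ordering problem concrete, here is a minimal Python sketch.
It is not Mercurial's code and not the proposed patch: the function
name, the (action, glob) rule tuples, and the "last matching rule wins"
policy are all assumptions chosen for illustration.

```python
import fnmatch

def is_ignored_last_match(path, rules):
    """Hypothetical policy: the last matching rule wins.

    rules is a list of (action, glob) pairs in the order they were read.
    """
    ignored = False
    for action, pattern in rules:
        if fnmatch.fnmatch(path, pattern):
            ignored = (action == "ignore")
    return ignored

# Two hypothetical ignore files, each contributing one rule.
file_a = [("ignore", "*.c")]
file_b = [("unignore", "*.c")]

# The answer for the same path flips with the read order.
print(is_ignored_last_match("main.c", file_a + file_b))  # a read first
print(is_ignored_last_match("main.c", file_b + file_a))  # b read first
```

The same two rules give opposite answers depending only on which file
is read first, which is exactly what a user would have to know.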

Also, if I say "ignore foo/" in one place, and "don't ignore bar.c" in
another (and we happen to have a foo/bar.c), what do we do? It's not
obvious. On the one hand, we want to traverse all files so that we can
pick up all possible exceptions. On the other, we really want to not
traverse all files, because that makes our tea cold. And we can't know
ahead of time that a particular unignore is inside a particular ignore
except for relatively trivial cases.
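
The traversal trade-off can be sketched with a toy in-memory tree. The
tree layout and the walk() helper below are hypothetical, and real
directory walking is more involved; the point is only that pruning an
ignored directory is what hides foo/bar.c from any later exception.

```python
# Toy tree: directories are dicts, files are None (hypothetical layout).
tree = {
    "foo": {"bar.c": None, "baz.c": None, "deep": {"x.c": None}},
    "src": {"main.c": None},
}

def walk(node, prefix="", prune=()):
    """Yield every path visited, skipping directories listed in prune."""
    for name, child in sorted(node.items()):
        path = prefix + name
        if child is None:
            yield path
        elif path + "/" in prune:
            continue  # pruned: nothing below this directory is ever seen
        else:
            yield path + "/"
            yield from walk(child, path + "/", prune)

pruned = list(walk(tree, prune={"foo/"}))  # fast, but never sees foo/bar.c
full = list(walk(tree))                    # sees foo/bar.c, visits everything

print(len(pruned), "foo/bar.c" in pruned)
print(len(full), "foo/bar.c" in full)
```

With pruning the walk is short but the exception can never fire; without
it, every entry under foo/ is visited just in case one of them matches.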

Then let's suppose I say "ignore fo*/" somewhere later. Should that
actually work? If so, then we've got to break down all the rules
and apply them in order, and in this case we'll have to test against
all the directory components. Now we've gone from 1 regex call per
file to O(rules * depth) calls and we're looking at every file in the repo.
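
A rough sketch of what "apply every rule, in order, to every directory
component" costs. The rule format and the function name are assumptions
made for this example, and fnmatch stands in for whatever matcher would
really be used (note that its "*" also matches "/").

```python
import fnmatch

def is_ignored_ordered(path, rules):
    """Apply ordered (action, glob) rules to every prefix of path.

    Returns (ignored, number_of_pattern_tests).
    """
    parts = path.split("/")
    # For "foo/bar.c" the candidates are "foo/" and "foo/bar.c".
    prefixes = ["/".join(parts[:i]) + "/" for i in range(1, len(parts))]
    prefixes.append(path)
    ignored = False
    tests = 0
    for action, pattern in rules:
        for candidate in prefixes:
            tests += 1
            if fnmatch.fnmatch(candidate, pattern):
                ignored = (action == "ignore")  # later rules override
    return ignored, tests

rules = [("ignore", "foo/"), ("unignore", "*/bar.c"), ("ignore", "fo*/")]
ignored, tests = is_ignored_ordered("foo/bar.c", rules)
print(ignored, tests)
```

Here a two-component path and three rules already need six pattern
tests, and that cost is paid for every file in the repository.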

Now if we simply say "there are only two sets, ignore and unignore", then
we're still stuck with lots of unintuitive behavior and probably a
bunch of users who won't understand the implications of the docs.

> A valid comparison would be the hosts.allow / hosts.deny mechanism 
> of Wietse Venema's tcp_wrapper.
> (cf. http://itso.iu.edu/TCP_Wrappers )

First, note that that's not traversing the tree of the namespace, so
it's a much simpler problem. And people still get those sorts of rules
wrong all the time! I wrote just such a simple ruleset for a router
when I worked at Cisco and every single tester a) got it wrong and b)
had their own unique idea of how to make it more intuitive.

The current subtraction-only rules for .hgignore are simple,
unambiguous, and fast. They could stand to be better documented, with
more examples, and a note that no, you can't do an inverse match with
a regex.
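
For contrast, a subtraction-only rule set can be sketched as a single
alternation. This is an illustration under assumptions, not Mercurial's
actual matcher: the patterns and the make_matcher() helper are made up
for the example.

```python
import re

def make_matcher(patterns):
    """Join all ignore patterns into one regex: one search per path."""
    combined = re.compile("|".join("(?:%s)" % p for p in patterns))
    return lambda path: combined.search(path) is not None

# Hypothetical regexp-style ignore patterns.
patterns = [r"\.o$", r"^build/", r"~$"]

is_ignored = make_matcher(patterns)
is_ignored_reversed = make_matcher(list(reversed(patterns)))

for path in ["src/main.o", "build/out.txt", "src/main.c"]:
    # Alternation is order-independent, so both matchers always agree.
    print(path, is_ignored(path), is_ignored_reversed(path))
```

Because a path is ignored if any pattern matches, the order of the
patterns and of the files they came from cannot change the result.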

-- 
Mathematics is the supreme nostalgia of our time.

