[patch] syntax:plain for .hgignore

Jonathan S. Shapiro shap at eros-os.com
Mon Sep 10 20:01:43 UTC 2007


On Mon, 2007-09-10 at 21:32 +0200, Johannes Hofmann wrote:
> However I agree that the performance optimization should better be
> done behind the scenes without adding new syntax options for
> .hgignore. Does anyone know an easy/robust way to check whether a
> string contains special regexp syntax or not?

Depends on the prevailing regexp syntax. For glob syntax, the special
characters are:

  *, ?, [, ], \

Depending on how anchored globs are handled, you may also need to check
for ^ and $. Rules:

  1. If none of these characters appear, it is just a string.
  2. If every occurrence of these characters is preceded by a
     backslash, it is still just a string (after removing the
     backslashes).
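
A minimal sketch of this check in Python, assuming a backslash escapes
exactly the one character that follows it. The names GLOB_SPECIAL and
glob_is_literal are made up for illustration and are not Mercurial
APIs. If anchors matter, ^ and $ could be added to the set.

```python
# Hypothetical helper, not part of Mercurial.
GLOB_SPECIAL = set("*?[]")


def glob_is_literal(pattern, special=GLOB_SPECIAL):
    """Return (True, literal) if pattern contains no unescaped glob
    metacharacters, otherwise (False, None)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # A backslash escapes the next character. A trailing
            # backslash is ambiguous, so report "not literal" to stay
            # on the conservative side.
            if i + 1 >= len(pattern):
                return False, None
            out.append(pattern[i + 1])
            i += 2
        elif ch in special:
            return False, None
        else:
            out.append(ch)
            i += 1
    return True, "".join(out)
```

For example, glob_is_literal("foo.txt") gives (True, "foo.txt") and
glob_is_literal(r"foo\*bar") gives (True, "foo*bar"), while
glob_is_literal("*.o") gives (False, None).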

For regexp, you can look up the magic characters, but it's the same
idea.
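
For the regexp case, a rough sketch in Python, assuming Python's re
syntax and the metacharacter list from the Python regex HOWTO. The
names REGEX_MAGIC and regex_is_plain_string are hypothetical. Unlike
the glob case, this version treats any backslash as magic, because
sequences such as \d or \b are not simply escaped literals.

```python
# Hypothetical helper. Metacharacters of Python's re module:
#   . ^ $ * + ? { } [ ] \ | ( )
REGEX_MAGIC = set(".^$*+?{}[]\\|()")


def regex_is_plain_string(pattern):
    """Return True if pattern contains no regexp metacharacters at all,
    so it could be matched with an ordinary string comparison."""
    return not any(ch in REGEX_MAGIC for ch in pattern)
```

Note that "foo.txt" is not a plain string under this test, since "."
matches any character in a regexp.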

However, I am concerned about something. ThomasAH and I have been
discussing include/exclude mechanisms. This requires that the entries be
processed in order, and I think if that is done the whole thing must be
compiled to a regexp because it is no longer just a union of patterns.
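
To illustrate why ordered entries are no longer a union, here is a
small Python sketch. It assumes a "last matching rule wins" semantics,
which is only one possible design and not necessarily the one under
discussion. The rule format and the name is_ignored are hypothetical.

```python
import re


def is_ignored(path, rules):
    """rules is an ordered list of (ignore, regexp) pairs. The last
    rule whose regexp matches the path decides the outcome (an assumed
    semantics, chosen only for illustration)."""
    ignored = False
    for ignore, pattern in rules:
        if re.search(pattern, path):
            ignored = ignore
    return ignored


# Ignore all .log files, then re-include anything under keep/.
rules = [(True, r"\.log$"), (False, r"^keep/")]
```

With these rules, "keep/a.log" is not ignored. With the same two rules
in the opposite order it is ignored, so the result depends on the
order and cannot be expressed as a plain union of the patterns.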

So three questions:

1. Is the performance gain so compelling that it justifies the added
   complexity?

2. Is it really faster? If the RE is built correctly it really shouldn't
   be that much faster.

3. Is it worth it at all if we will need to remove it later?

I suspect that the cost you are seeing lies in *compiling* the RE rather
than executing the RE. If this is the case, there may be a better
solution. Which cost are you actually concerned about?
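
One way to separate the two costs, sketched in Python. It assumes the
ignore patterns are joined into a single alternation, which may not be
how the real code builds its matcher. The function name
time_compile_and_match is hypothetical. re.purge() is called so that
the re module's internal cache does not hide the compile cost.

```python
import re
import time


def time_compile_and_match(patterns, paths, repeat=5):
    """Return (best compile time, best match time, hit count) for one
    combined regexp built from patterns and searched in every path."""
    combined = "|".join("(?:%s)" % p for p in patterns)
    compile_times = []
    match_times = []
    hits = 0
    for _ in range(repeat):
        re.purge()  # drop cached compiled patterns
        t0 = time.perf_counter()
        matcher = re.compile(combined)
        t1 = time.perf_counter()
        hits = sum(1 for p in paths if matcher.search(p))
        t2 = time.perf_counter()
        compile_times.append(t1 - t0)
        match_times.append(t2 - t1)
    return min(compile_times), min(match_times), hits
```

Running this with the real pattern list and a realistic set of paths
would show whether compiling or matching dominates.
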
-- 
Jonathan S. Shapiro
Managing Director
The EROS Group, LLC
www.coyotos.org, www.eros-os.org


