[PATCH] convert: support glob patterns to exclude/include files

Tessa Starkey testarkey at gmail.com
Sun Mar 14 14:31:48 CDT 2010


On Fri, Mar 12, 2010 at 3:45 PM, Patrick Mézard <pmezard at gmail.com> wrote:

> Too bad we cannot see it in the diff, but right there we split the expressions with shlex module in posix mode. It means posix shell delimiting and escaping rules are used here to split the command tokens. The lex.wordchars set is changed so the parsing is harmless to 99% of regular filenames and probably glob patterns too. But if I specify a regexp matcher with the 're:' or 'relre:' prefixes things are different:
>
>    include "foobar\d+"
>
> is parsed as expected, the expression is preserved. But:
>
>    include foobar\d+
>
> is parsed as ('include', 'foobard+') thanks to posix escaping rules with regard to double quotes.
>
> It means we have to be very careful with documentation, and we might want to add some kind of debugging option to display parsed expressions (which would be useful even without this feature).

I will mention in the documentation that only glob patterns, and not
regexp patterns, can be used in filemap rules.
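
For the record, the splitting you describe can be reproduced outside of
convert with a few lines of Python. This is only a sketch: split_rule is
a name I made up for illustration, and the wordchars string is my
approximation of what the filemap parser adds, not a copy of it.

```python
import shlex


def split_rule(line):
    """Split one filemap-style rule with shlex in posix mode.

    Hypothetical helper, not part of convert. The extra wordchars are an
    assumption meant to approximate the filemap parser's setup.
    """
    lex = shlex.shlex(line, posix=True)
    # Treat common filename and glob punctuation as word characters so
    # that it does not split a path into several tokens.
    lex.wordchars += '!@#$%^&*()-=+[]{}|;:,./<>?'
    return list(lex)


# Unquoted: posix rules treat the backslash as an escape and drop it.
print(split_rule(r'include foobar\d+'))    # ['include', 'foobard+']
# Double-quoted: the backslash survives because 'd' is not escapable there.
print(split_rule(r'include "foobar\d+"'))  # ['include', 'foobar\\d+']
# An ordinary glob pattern passes through unchanged.
print(split_rule('exclude src/*.py'))      # ['exclude', 'src/*.py']
```

Glob patterns rarely contain backslashes, which is why restricting
filemap rules to globs sidesteps the problem.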

>> +                self.includematch.append(match.match(self.root, '', [name]))
>
> 1- Why do you need self.root here? I think you can pass ''. The source path is usually meaningless, input paths will be relative to sink root.

You're right, self.root is not needed here.

> 2- Reusing match.match() is a good idea with one problem, for some reason glob and relpath expressions are filtered through util.canonpath() which forbids anything containing a path component equal to '.hg'. Rerun your tests with an additional rule like:
>
>    exclude .hg/foobar
>
> It will fail to start. This might look like a completely uninteresting case, except a filemap feature is being added to hgsubversion just to filter out such a directory from the Jython repository.

I see two ways to handle this:

a) In the filemapper, have a special case for rules with '.hg' to
work around the behaviour of match.
b) Add an option to match.match to accept paths with .hg.

Option (a) would avoid any changes to the match.match code, which is
used in many places, but option (b) would be cleaner, in my opinion.

For hgsubversion, are you using the convert extension, or are you
just creating a similar filemap? If you are creating your own
filemap, I think the better solution would be to add an 'accept .hg'
option to match.match, so we don't have to write the work-around in two
places.
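
To make option (b) concrete, here is a rough stand-alone sketch of the
idea. None of these names exist in Mercurial: checkpath, globmatcher and
the accepthg flag are all hypothetical, and fnmatch only stands in for
the real glob handling (unlike Mercurial globs, its '*' also matches
'/').

```python
import fnmatch


def checkpath(path, accepthg=False):
    """Reject a path with a '.hg' component unless accepthg is set.

    Hypothetical stand-in for the check described above; this is not
    Mercurial's implementation.
    """
    if not accepthg and '.hg' in path.split('/'):
        raise ValueError("path contains illegal component: %s" % path)
    return path


def globmatcher(patterns, accepthg=False):
    """Build a match function from glob patterns (illustration only)."""
    # Every pattern is validated up front, so a bad rule fails early.
    pats = [checkpath(p, accepthg) for p in patterns]

    def matchfn(name):
        return any(fnmatch.fnmatchcase(name, p) for p in pats)

    return matchfn


# Default behaviour: a rule naming '.hg' is refused.
try:
    globmatcher(['.hg/foobar'])
except ValueError as err:
    print("rejected: %s" % err)

# With the opt-in flag the same rule is accepted and matches normally.
m = globmatcher(['.hg/foobar'], accepthg=True)
print(m('.hg/foobar'))  # True
print(m('.hg/other'))   # False
```

With a flag like this, both convert and hgsubversion could opt in
explicitly, while every other caller would keep the current behaviour.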

> I am not saying you should forget match.match() for this but we need something to avoid this particular behaviour. I think such an option existed in the past.

Do you mean that an option existed for match.match(), or for
util.canonpath()? If this option used to exist, maybe it was
removed for a reason?

Thank you for pointing out these problems.

- Tessa Starkey

