[PATCH 11 of 11 sparse] dirstate: integrate sparse matcher with _ignore (API)

Durham Goode durham at fb.com
Mon Jul 10 17:48:10 EDT 2017



On 7/10/17 1:04 PM, Martin von Zweigbergk wrote:
> On Mon, Jul 10, 2017 at 11:58 AM, Durham Goode <durham at fb.com> wrote:
>>
>>
>> On 7/10/17 11:55 AM, Martin von Zweigbergk wrote:
>>>
>>> On Mon, Jul 10, 2017 at 11:45 AM, Durham Goode <durham at fb.com> wrote:
>>>>
>>>> On 7/10/17 10:01 AM, Martin von Zweigbergk wrote:
>>>>>
>>>>>
>>>>> (For Durham)
>>>>>
>>>>> On Sat, Jul 8, 2017 at 4:29 PM, Gregory Szorc <gregory.szorc at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> # HG changeset patch
>>>>>> # User Gregory Szorc <gregory.szorc at gmail.com>
>>>>>> # Date 1499555309 25200
>>>>>> #      Sat Jul 08 16:08:29 2017 -0700
>>>>>> # Node ID 94f98bc84936defadb959e31012555dba170d8cd
>>>>>> # Parent  a2867557f9c2314aeea19a946dfb8e167def4fb8
>>>>>> dirstate: integrate sparse matcher with _ignore (API)
>>>>>
>>>>>
>>>>>
>>>>> Why does sparse do it this way instead of intersecting the sparse
>>>>> matcher with the user's matcher?
>>>>
>>>>
>>>>
>>>> I'm not sure I understand the question.  What is the "user's matcher"
>>>> here?
>>>> The ignore matcher?
>>>
>>>
>>> I mean the matcher the user may have provided on the command line (or
>>> match.always() by default), as in "hg status dir/" (where the matcher
>>> would be "relpath:dir").
>>>
>>>>
>>>> This code produces a matcher that returns true for any file that should be
>>>> ignored.  Since both hgignore files and sparse-ignored files should be
>>>> ignored, I'm not sure how that could be expressed with intersection of those
>>>> two matchers?
>>>
>>>
>>> For narrowhg, we did it the other way around: filtering in instead of
>>> filtering out. So if the narrowspec (like sparse config, IIUC) says to
>>> include foo/ and bar/ and the user provides 'glob:*.c', we'd intersect
>>> that and list *.c files in those two directories (recursively).
>>
>>
>> I'd have to look at the code to be specific, but I think the dirstate ignore
>> logic covers more cases than the user provided matcher logic. I'd be
>> surprised if all commands that hit dirstate.ignore also happened to take
>> patterns at the command level.
>
> If they don't, then the sparse matcher can be passed as is.
>
>>  It just seemed cleaner to have a unified
>> matcher for ignored files at the repo level.  The user specific matcher can
>> always be added on top of it later for commands that take patterns.
>
> For narrow, we have to apply the matcher when walking the manifest
> too. The user can pass a matcher to e.g. "hg status -c ." or "hg files
> -r ." and in those cases we need to intersect the narrow matcher with
> the user-supplied one. It seemed more natural to do the same for
> dirstate walks.
>
> It also seems simpler to control which directories are visited if
> using a positive matcher than a negative one. For example, let's say
> the narrow matcher is path:dir/. The narrowhg code will then restrict
> the walk to visit only the root directory, dir/, and subdirectories of
> dir/ (both for manifest walks and dirstate walks). I think we can
> simply make negatematcher's visitdir return False iff the
> narrow/sparse matcher returns 'all', so it's probably easy to get it
> to work. It still seems more natural to me to match what should be
> included.
>

I don't have a strong opinion either way. When I made sparse, it was 
specific to the working copy, so it mapped to the ignore matcher very 
tightly. If that needs to change, that's fine.
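
To make the two formulations concrete, here is a minimal sketch using toy
matchers written as plain Python predicates. It is not Mercurial's match
module; every helper name below (path_matcher, glob_matcher, negate, union,
intersect) is made up for illustration, and the file list is invented.

```python
from fnmatch import fnmatchcase

# Toy matchers: each one is just a function from a file path to bool.
# These are hypothetical stand-ins, not Mercurial's matcher classes.

def path_matcher(prefix):
    """Match every file under a directory prefix such as 'foo/'."""
    return lambda f: f.startswith(prefix)

def glob_matcher(pattern):
    """Match files against a shell-style glob ('*' also crosses '/')."""
    return lambda f: fnmatchcase(f, pattern)

def negate(m):
    return lambda f: not m(f)

def union(*matchers):
    return lambda f: any(m(f) for m in matchers)

def intersect(*matchers):
    return lambda f: all(m(f) for m in matchers)

files = ["foo/a.c", "foo/a.o", "bar/b.c", "baz/c.c"]

# The sparse config includes foo/ and bar/; .hgignore ignores object files.
sparse = union(path_matcher("foo/"), path_matcher("bar/"))
hgignore = glob_matcher("*.o")

# Filtering out: one ignore matcher that is true for hgignore'd files
# and for anything outside the sparse checkout.
ignore = union(hgignore, negate(sparse))
visible_negative = [f for f in files if not ignore(f)]

# Filtering in: intersect the sparse matcher with the user's matcher.
user = glob_matcher("*.c")
include = intersect(sparse, user)
visible_positive = [f for f in files if include(f)]
```

With this particular file list both approaches select foo/a.c and bar/b.c;
they differ in where the combination happens, not in which files survive.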

I just want to avoid duplicating matcher logic across individual
commands.
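
For reference, the visitdir suggestion quoted above could look roughly like
the sketch below. It is a toy model under stated assumptions: PathMatcher and
NegateMatcher are hypothetical classes, not the ones in Mercurial's match
module, the root directory is spelled as the empty string, and only the
return values 'all', True and False are modeled.

```python
class PathMatcher:
    """Toy positive matcher for a single directory, like path:dir/."""

    def __init__(self, prefix):
        self.prefix = prefix.rstrip("/")

    def __call__(self, f):
        return f.startswith(self.prefix + "/")

    def visitdir(self, d):
        # Inside the matched directory: everything below matches.
        if d == self.prefix or d.startswith(self.prefix + "/"):
            return "all"
        # The root and other ancestors must be visited to reach the prefix.
        if d == "" or self.prefix.startswith(d + "/"):
            return True
        # Unrelated directories can be skipped entirely.
        return False


class NegateMatcher:
    """Toy negation: matches exactly the files the inner matcher rejects."""

    def __init__(self, inner):
        self.inner = inner

    def __call__(self, f):
        return not self.inner(f)

    def visitdir(self, d):
        # Return False iff the inner matcher claims the whole directory;
        # in every other case something below d may still match.
        return self.inner.visitdir(d) != "all"


sparse = PathMatcher("dir/")
outside_sparse = NegateMatcher(sparse)
```

In this model sparse.visitdir("other") is False, so a positive walk never
enters other/, while outside_sparse.visitdir("dir") is False because nothing
under dir/ can be outside the sparse set.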

