RFC: Transparent subrepository support by match module

Klaus Koch kuk42 at gmx.net
Wed Aug 11 14:01:39 CDT 2010


Hello Martin,

The guy who gave you this forgot I told him that our oh so great WIKI
ate the stars '*' in my proposal.  See below the resurrected version.  And
yes, you are right, the '\0' is translated by the shell into '\\0', so
any other (intuitive) character etc. would be very nice.



The Mercurial match module and commands should support file paths and
file patterns that reach into subrepositories. For example, 'hg diff sub1/a'
should print the diff of file 'a' in subrepo 'sub1'.

By default, Mercurial commits any changed files and subrepositories.
This is good, because it is easier to back out unintentionally committed
data than to regain lost data.

In case one wants to commit only subsets of files and/or
subrepositories, they can be selected/included or excluded by shell
patterns, glob patterns or even regular expressions. This works quite
well for files; for subrepositories, however, the support is not so good.

Today, one can select/include or exclude subrepositories only as a
whole. That is, one can commit all changed files and revision states in
subrepositories, but not just the states or only the files (for several
repositories). For example, '-X sub1' would exclude any changed files or
dirty state of subrepo sub1, whereas '-I sub1' would include sub1's dirty
state *and* changed files. One cannot select just the dirty state of the
subrepo sub1.

Proposed Solution
=================

The match module and the Mercurial commands should transparently support
subrepositories, i.e., 'hg diff sub1/a' should print the diff for file 'a'
in subrepository 'sub1'.

Introduce a subrepo boundary marker defining the border between an outer
repository and a subrepository for Mercurial patterns (glob, relglob,
re). NUL could be used as the marker.

For example, all files in an outer repo could be matched with
'glob:**\0' or 're:.*\0'.
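
To make the idea a bit more concrete, here is a rough sketch of how a
pattern could be cut at the boundary marker into one sub-pattern per
nesting level. Nothing like this exists in the match module; the names
BOUNDARY, KINDS and split_levels are made up for illustration, and it is
only an assumption that the pattern parser would see a real NUL character.

```python
# Hypothetical boundary marker; the proposal suggests NUL.
BOUNDARY = "\0"

# Pattern kinds named in the proposal.
KINDS = ("glob", "relglob", "re")


def split_levels(pattern):
    """Split a pattern into (kind, [sub-pattern per nesting level]).

    kind is None when the pattern carries no known 'kind:' prefix.
    """
    kind, sep, body = pattern.partition(":")
    if not sep or kind not in KINDS:
        # No recognised prefix: treat the whole string as the pattern body.
        kind, body = None, pattern
    return kind, body.split(BOUNDARY)


print(split_levels("glob:**\0"))
print(split_levels("re:.*\0"))
```

In this sketch a trailing empty element means that nothing is selected
beyond the boundary, which is one possible reading of "all files in the
outer repo".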

Compared to an option that tells Mercurial whether it should work
recursively with regard to subrepos, a boundary marker makes it easier
to select files and states, and it allows selecting states and files
across subrepositories at the same time.

For example, selecting all files in the outer repository and the state
in subrepo sub1 can be done with the patterns '**\0' and '**\0sub1', but
not with a hypothetical option --recursive or --nonrecursive.

Some use cases
==============

a) all files in outer repo ('**\0')

b) all new states of subrepos at level x (for the first level: '**\0*/' or
   '**\0*')

c) all files in subrepos at level x, including any nested subrepos (for
   the first level: '**\0**')

d) all files in subrepos at level x, excluding any nested subrepos (for
   the first level: '**\0**\0')

e) all new states of subrepos nested at level v inside subrepos at level x
   (for the first level: '**\0**\0*/')

Why the \0?
===========

a) Apart from the path separator '/', it is the only character not
   allowed in POSIX file names. Windows does not allow it in file names
   either. It is the only character Mercurial will never support for
   file names short of changing its internal data formats.

b) If you call 'hg status --no-status --print0', you get a list like
   'file1\0dir1/file1\0'. Currently, status does not recurse into
   subrepositories, so in a way '\0' already is the character that
   delimits the subrepo directory names.

c) It is not used in regular expressions (so far). Instead of NUL, we
   could use a character which is not allowed in Windows file names
   (<, >, |, :) or which at least needs quoting in most shells
   (<, >, |, (, ), &).
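
For anyone who wants to compare the alternative markers from c), the
following small check may help. It is only a sketch: WINDOWS_RESERVED is
my own hard-coded list of the printable characters Windows forbids in
file names, classify is a made-up helper, and shlex.quote from the Python
standard library stands in for "most shells" (it only models POSIX shell
quoting).

```python
import shlex

# Printable characters Windows forbids in file names (NUL and the other
# control characters are forbidden as well).
WINDOWS_RESERVED = set('<>:"/\\|?*')

# Alternative markers named in the proposal.
CANDIDATES = ["<", ">", "|", ":", "(", ")", "&"]


def classify(char):
    """Return (forbidden_on_windows, needs_posix_shell_quoting)."""
    forbidden = char in WINDOWS_RESERVED
    # shlex.quote returns its argument unchanged when no quoting is needed.
    needs_quoting = shlex.quote(char) != char
    return forbidden, needs_quoting


for char in CANDIDATES:
    forbidden, needs_quoting = classify(char)
    print(f"{char!r}: forbidden on Windows={forbidden}, "
          f"needs quoting={needs_quoting}")
```

A candidate for which both values are true can neither collide with a
file name on Windows nor slip through the shell unquoted by accident.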

Some further examples
=====================

**\0                    # every file in outer repo
**\0/*                  # every subrepo state (no nested subrepos exist)
**\0/**                 # every file in any subrepo, but not their state
**.c\0                  # any C file in outer repo
**\0/sub[0-9]           # state of all subrepos sub0 to sub9
**\0/sub1/**.c          # any C file under any 1st level subrepo called sub1
**\0/sub1/**\0sub1sub1  # state of any nested subrepo named sub1sub1 in a
                        # subrepo named sub1 nested right below sub1





