RFC new template keywords (wcmodified,wcmodifieddate) + sample impl.

Peter Bray peter.darren.bray at gmail.com
Sun May 29 01:32:49 CDT 2011


After over three years using Mercurial on half-a-dozen personal
repositories I'm looking at including Mercurial version information in
automated builds for larger projects, a la wiki/VersioningWithMake.
The first project is XML-based documentation with C++ projects to follow.

This has led to the following observations (not criticisms) and
questions, for which my reading has not turned up decent solutions:

- Programmatically determining if the working copy is modified.

   I have not discovered a simple way to determine whether the working
   copy of a repository has been modified. Think of shell code like
   "if hg modified; then ...". There seem to be many indirect ways to
   get the information: hg st | wc -l, looking for the + in hg id, etc.
   Have I missed something obvious?

- Templates support for determining if the working copy is modified.

   In templates, I can't see a way to determine whether the working
   copy has been modified. With "hg id -i" the plus (+) is not optional
   ("hg parents --template '{node|short}'" is the alternative), while
   in templates neither the plus nor any other indication that the
   working copy is modified seems to be available.

- Programmatically determining when the working copy was last modified.

   Once it has been determined that the working copy is modified, it
   seems to me that I need some basic way to identify that "revision"
   in an automated build system. Coding in the current date and time
   would have the build regenerate the version information on each
   build, even though nothing has changed (e.g. make; make - rebuilds
   things).
   While there is no complete way to determine a "revision" identifier
   for a modified working copy (generating a hash that never appears
   in the history of the project seems pointless), I thought that MAYBE
   determining the date of the last quantifiable change might be a
   reasonable stand-in. This only works for additions and modifications,
   not deletions and removals, but as the former are probably more
   common, it may suffice to use the timestamp of the most recently
   changed file for this purpose.
   The following shell shows my first attempt (g prefix => GNU version):
   hg status -0 -n -q -am \
    | ( cd `hg root`; gxargs -0 --no-run-if-empty gstat --format '%y' ) \
    | sort -n \
    | tail -1 \
    | perl -p -e 's/:\d\d\.\d+//' # Remove excess precision (cf isodate)
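
The two checks above (the "+" marker and the newest timestamp among
changed files) can also be sketched as a standalone Python 3 script,
rather than as Mercurial-internal code. This is only an illustration:
the function names are made up for this example, and the wrapper simply
shells out to the same "hg root", "hg id -i" and "hg status -0 -n -q -am"
commands used above.

```python
import os
import subprocess


def id_is_modified(id_output):
    """Return True if `hg id -i` output ends with the '+' marker."""
    return id_output.strip().endswith("+")


def split_status_output(raw):
    """Split the NUL-separated output of `hg status -0 -n` into names."""
    return [name for name in raw.split("\0") if name]


def latest_mtime(root, names):
    """Return the newest mtime among the named files under root.

    Files that no longer exist (deleted or removed) are skipped, so the
    result is None when no timestamp can be determined.
    """
    newest = None
    for name in names:
        try:
            mtime = os.lstat(os.path.join(root, name)).st_mtime
        except FileNotFoundError:
            continue
        if newest is None or mtime > newest:
            newest = mtime
    return newest


def working_copy_state(repo_dir="."):
    """Hypothetical wrapper: returns (modified, newest mtime or None)."""
    def run(*args):
        return subprocess.run(("hg",) + args, cwd=repo_dir, check=True,
                              capture_output=True, text=True).stdout

    root = run("root").strip()
    names = split_status_output(run("status", "-0", "-n", "-q", "-am"))
    return id_is_modified(run("id", "-i")), latest_mtime(root, names)
```

A build script could call working_copy_state() once and fall back to
the changeset date whenever the second element is None.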

I would like to propose for discussion a few possible enhancements,
some of which I already have an initial implementation for. Please
note, I'm not a python developer, so please don't dismiss the ideas
based on a poor initial implementation or poor python coding.

- A new subcommand, "modified", which returns an exit code to indicate
   whether the working copy of the repository is modified.

   Possible Modifications:
     - CLI option to print the result as text (True|False)

   Possible Considerations / Enhancements:
     - CLI options to match options to repo.status(), so things like
       subrepos, ignored, unknown

   Initial Implementation:
     - Not developed, would be based on existing code from commands.py
     - Happy to do the development if the concept is approved.

- A new template keyword, "wcmodified", a representation of the boolean
   value of whether the working copy has been modified, with the same
   logic as the "+" modifier in "hg id".

   Possible Modifications:
     - Name: what would convey the best meaning?
     - Representation: The value is boolean, but what is the best way
       to represent that in a templating environment and what filters
       might then be appropriate.
       The strings "True" and "False" are great in textual environments,
       but what about using a template to generate a C code fragment?
       The integers 0 and 1 plus filters (like say "bool" and "plus")
       might provide more flexibility. ("bool" being 0:"False", 1:"True",
       and "plus" being 0:"", 1:"+" to allow <node>+ generation in a
       template - eg {node|short}{wcmodified|plus}). I think a filter
       like "int" (if the value is boolean), might be abused on non
       boolean keywords like node and cause overflow issues.

   Initial Implementation: Using integers as the external representation

def showworkingcopymodified(repo, ctx, templ, cache, revcache, **args):
     """:wcmodified: Integer(0|1). Is the working copy of the current
     repository modified? Use the filter bool to convert to "True" or
     "False"."""
     # Using repo.status() defaults on listsubrepos, ignored, unknown, ...
     changed = util.any(repo.status())
     return int(changed)

    and of course,

     'wcmodified': showworkingcopymodified,

    With the following sample implementation of the "bool" filter:

def booleanstring(text):
     """:bool: String. Return the textual boolean value of the provided
     integer ("False" for 0, "True" for 1)."""
     return bool(text)

    N.B. util.parsebool(text) does not work on int
         (maybe stringify would help)

    and of course,

     "bool": booleanstring,

- A new template keyword, "wcmodifieddate", a date compatible with
   the existing date filters (eg isodate) that represents the time of
   last modification of the set of files that have been modified since
   the last commit. With the same limitations as mentioned above.
   Since the date filters will provide the current date and time
   (which varies on each run) when provided with None or "", I think it
   would be best to default to the change context date when there are
   no changes or the date cannot be determined (delete/remove).

   Possible Modifications:
     - Name: what would convey the best meaning?
     - Representation: Other ways to represent edge cases?

   Initial Implementation:

def showworkingcopymodifieddate(repo, ctx, templ, cache, revcache, **args):
     """:wcmodifieddate: Date. Best effort to determine the time of the
     most recent change to the working copy. Possible for modifications
     and additions, not possible for deletions and removals. Defaults
     to the last context change date, as date filters default to the
     current date and time for None and the empty string, this is not
     useful in repeatable build environments."""

     cwd = repo.getcwd()
     t, tz = ctx.date()
     mostrecentchange = -1 # could use t

     # Using repo.status() defaults on listsubrepos, ignored, unknown, ...
     # (requires "import os, errno" in the enclosing module)
     for l in repo.status():
         for f in l:
             path = repo.pathto(f, cwd)
             try:
                 mtime = os.lstat(path).st_mtime
                 if mtime > mostrecentchange:
                     mostrecentchange = mtime
             except OSError, err:
                 # Deleted/removed files have no mtime; skip them
                 if err.errno != errno.ENOENT:
                     raise

     if mostrecentchange == -1:
         return (t, tz)
     else:
         return (int(mostrecentchange), tz)

and of course,

     'wcmodifieddate': showworkingcopymodifieddate,
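
To make the "bool" and "plus" filters discussed under wcmodified more
concrete, here is a minimal standalone Python sketch of the two
mappings (0:"False", 1:"True" and 0:"", 1:"+"). It is not wired into
Mercurial's templater; the names boolfilter, plusfilter and render_id
are hypothetical and exist only for this illustration.

```python
def boolfilter(value):
    """Map the integer 0/1 (or the text "0"/"1") to "False"/"True"."""
    return "True" if int(value) else "False"


def plusfilter(value):
    """Map the integer 0/1 (or the text "0"/"1") to ""/"+"."""
    return "+" if int(value) else ""


def render_id(shortnode, wcmodified):
    """Mimic {node|short}{wcmodified|plus} for an already-short node."""
    return shortnode + plusfilter(wcmodified)
```

Converting through int() first matters if the value ever arrives as
text: in Python any non-empty string, including "0", is truthy, so
bool("0") would be True.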

Thoughts, comments and suggested improvements most welcome. If my
proposal is considered worthwhile, I'll submit a proper patch against
hg-crew incorporating any suggested improvements. I'll also provide an
implementation of the "modified" subcommand to the list for review
before final patch construction, if that is also approved.


