RFC new template keywords (wcmodified,wcmodifieddate) + sample impl.

Matt Mackall mpm at selenic.com
Mon May 30 10:22:05 CDT 2011


On Sun, 2011-05-29 at 16:32 +1000, Peter Bray wrote:
> Greetings,
> 
> After over three years using Mercurial on half-a-dozen personal
> repositories I'm looking at including Mercurial version information in
> automated builds for larger projects, a la wiki/VersioningWithMake.
> The first project is XML-based documentation with C++ projects to follow.
> 
> This has led to the following observations (not criticisms) / questions
> which my reading has not provided me with decent solutions for:
> 
> - Programmatically determining if the working copy is modified.
> 
>    I have not discovered a simple way to determine if the working
>    copy of a repository has been modified. Think shell code like:
>    "if hg modified; then ... ", there seem to be many ways to get
>    the information, hg st | wc -l, looking for the + in hg id, etc.
>    Have I missed something obvious?

Both hg id and hg st will miss out on two classes of change:

- modified subrepos
- changed branch

'hg summary' will notice these on the commit: line.

You seem to be looking for an exit code based method. I don't think we
have anything like that but I can imagine adding it to summary.
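
In the meantime, a wrapper script can get an exit code out of summary. The
following is only a minimal sketch, assuming the commit: line ends in
"(clean)" when nothing is pending and that HGPLAIN=1 keeps the labels
untranslated. The function names are hypothetical, and the summary output is
not a stable interface.

```python
import os
import subprocess


def summary_is_clean(summary_text):
    """Return True if the 'commit:' line of `hg summary` output ends in '(clean)'.

    Assumption: a clean working directory is reported with a trailing
    '(clean)' marker; anything else (modified files, a new branch, a merge)
    is treated as "modified".
    """
    for line in summary_text.splitlines():
        if line.startswith("commit:"):
            return line.rstrip().endswith("(clean)")
    # No commit: line at all; err on the side of "modified".
    return False


def hg_modified_exit_code(repo_dir="."):
    """Run `hg summary` in repo_dir; return 0 if clean, 1 otherwise."""
    env = dict(os.environ, HGPLAIN="1")  # avoid localized labels
    result = subprocess.run(
        ["hg", "summary"],
        cwd=repo_dir,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return 0 if summary_is_clean(result.stdout) else 1
```

A script would end with sys.exit(hg_modified_exit_code()), and its sense can
be inverted to match the "if hg modified; then ..." usage above.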

> - Templates support for determining if the working copy is modified.
> 
>    In templates, I can't see a way to determine if the working copy
>    has been modified. With "hg id -i" the plus (+) is not optional
>    ("hg parents --template '{node|short}'" is the alternate). While
>    in templates the plus or some indication that the working copy is
>    modified does not seem to be available.

Indeed, none of the commands that take templates actually pay any
attention to the working directory in their display and I don't think
it's ever occurred to anyone that they should. So this message is all a
little weird to me.

It seems you've got a multi-line shell expression that you'd like to
reduce to a single line template expression at the cost of adding a
chunk of code and several features to Mercurial. That's only a win if
those features are of general utility.

> - Programmatically determining when the working copy was last modified.
> 
>    Once determined that the working copy is modified, it seems to me,
>    that I need some basic way to identify that "revision" in an
>    automated build system. Coding in the current date and time, will
>    have the build regenerate the version information on each build,
>    even though nothing has changed (e.g. make; make - rebuilds things
>    unnecessarily).
>    While there is no complete way to determine a "revision" identifier,
>    for a modified working copy (generating a hash, that never appears
>    in the history of the project seems pointless), I thought that MAYBE
>    determining the date of the last quantifiable change might be a
>    reasonable stand-in. This only works for additions and modifications,
>    not deletions and removals, but as the former are probably more
>    common, it may suffice to use the timestamp on the most recently
>    changed file for this purpose.
>    The following shell shows my first attempt (g prefix => GNU version):
>    hg status -0 -n -q -am \
>     | ( cd `hg root`; gxargs -0 --no-run-if-empty gstat --format '%y' ) \
>     | sort -n \
>     | tail -1 \
>     | perl -p -e 's/:\d\d\.\d+//' # Remove excess precision (cf isodate)

I guess that's slightly more meaningful than +, but I don't see why it's
better than simply using 'time of build'? I can't see us doing something
like this internally.
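
For anyone who wants this outside of Mercurial without the GNU tools, the
quoted pipeline boils down to "newest mtime among the files status reports".
A rough sketch, assuming the caller already has the list of paths (for
example from 'hg status -n -am'); newest_mtime is a made-up name:

```python
import os


def newest_mtime(paths):
    """Return the most recent mtime among paths, or None if none exist.

    Files that have vanished (deleted or removed) are skipped, which is
    the same limitation the shell version has.
    """
    newest = None
    for path in paths:
        try:
            mtime = os.lstat(path).st_mtime
        except FileNotFoundError:
            continue
        if newest is None or mtime > newest:
            newest = mtime
    return newest
```

A None result means there is nothing to date, so the caller has to pick its
own fallback.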

> - A new template keyword, "wcmodified", a representation of the boolean
>    value of whether the working copy has been modified, with the same
>    logic as the "+" modifier in "hg id".
> 
>    Possible Modifications:
>      - Name: what would convey the best meaning?
>      - Representation: The value is boolean, but what is the best way
>        to represent that in a templating environment and what filters
>        might then be appropriate.
>        The strings "True" and "False" are great in textual environments,
>        but what about using a template to generate a C code fragment?
>        The integers 0 and 1 plus filters (like say "bool" and "plus")
>        might provide more flexibility. ("bool" being 0:"False", 1:"True",
>        and "plus" being 0:"", 1:"+" to allow <node>+ generation in a
>        template - eg {node|short}{wcmodified|plus}). I think a filter
>        like "int" (if the value is boolean), might be abused on non
>        boolean keywords like node and cause overflow issues.
> 
>    Initial Implementation: Using integers as the external representation
> 
> def showworkingcopymodified(repo, ctx, templ, cache, revcache, **args):
>      """:wcmodified: Integer(0|1). Is the working copy of the current
>      repository modified? Use the filter bool to convert to "True" or
>      "False"."""
>      # Using repo.status() defaults on listsubrepos, ignored, unknown, ...
>      changed = util.any(repo.status())
>      return int(changed)
> 
>     and of course,
> 
>      'wcmodified': showworkingcopymodified,

What does this do when passed to 'hg log'? I think it needs to check
that ctx is a working directory parent.
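
Something along these lines, as a sketch only. It assumes
repo.dirstate.parents() returns the parent nodes of the working directory
and that ctx.node() is comparable to them; is_working_parent and
wcmodified_for are made-up names, and returning an empty string for other
changesets is just one possible choice.

```python
def is_working_parent(repo, ctx):
    """True if ctx is one of the working directory's parents."""
    return ctx.node() in repo.dirstate.parents()


def wcmodified_for(repo, ctx):
    """Return 0 or 1 for a working directory parent, '' for anything else.

    For changesets that are not a parent of the working directory the
    question "is the working copy modified?" has no meaning, so nothing
    is emitted for them.
    """
    if not is_working_parent(repo, ctx):
        return ''
    # repo.status() yields lists of file names; any non-empty list
    # means there is a pending change.
    return int(any(repo.status()))
```

With that, 'hg log' would show a value on at most two changesets rather than
repeating the same flag on every one.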

> - A new template keyword, "wcmodifieddate", a date compatible with
>    the existing date filters (eg isodate) that represents the time of
>    last modification of the set of files that have been modified since
>    the last commit. With the same limitations as mentioned above.
>    Since the date filters will provide the current date and time
>    (which varies on each run) when provided with None or "", I think it
>    would be best to default to the change context date when there are
>    no changes or the date cannot be determined (delete/remove).
> 
>    Possible Modifications:
>      - Name: what would convey the best meaning?
>      - Representation: Other ways to represent edge cases?
> 
>    Initial Implementation:
> 
> def showworkingcopymodifieddate(repo, ctx, templ, cache, revcache, **args):
>      """:wcmodifieddate: Date. Best effort to determine the time of the
>      most recent change to the working copy. Possible for modifications
>      and additions, not possible for deletions and removals. Defaults
>      to the change context date, because date filters substitute the
>      current date and time for None and the empty string, which is
>      not useful in repeatable build environments."""
> 
>      cwd = repo.getcwd()
>      t, tz = ctx.date()
>      mostrecentchange = -1 # could use t
> 
>      # Using repo.status() defaults on listsubrepos, ignored, unknown, ...
>      for l in repo.status():
>          for f in l:
>              path = repo.pathto(f, cwd)
>              try:
>                  mtime = os.lstat(path).st_mtime
>                  if mtime > mostrecentchange:
>                      mostrecentchange = mtime
>              except OSError, err:
>                  if err.errno != errno.ENOENT:
>                      raise
> 
>      if mostrecentchange == -1:
>          return (t, tz)
>      else:
>          return (int(mostrecentchange), tz)
> 
> and of course,
> 
>      'wcmodifieddate': showworkingcopymodifieddate,
> 
> Thoughts, comments and suggested improvements most welcome. If my
> proposal is considered worthwhile, I'll submit a proper patch against
> hg-crew incorporating any suggested improvements. I'll also provide an
> implementation for a "modified" subcommand to the list for review before
> final patch construction, if that is also approved.
> 
> Regards,
> 
> Peter
> 
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel


-- 
Mathematics is the supreme nostalgia of our time.



