RFC new template keywords (wcmodified,wcmodifieddate) + sample impl.
Matt Mackall
mpm at selenic.com
Mon May 30 10:22:05 CDT 2011
On Sun, 2011-05-29 at 16:32 +1000, Peter Bray wrote:
> Greetings,
>
> After over three years using Mercurial on half-a-dozen personal
> repositories I'm looking at including Mercurial version information in
> automated builds for larger projects, a la wiki/VersioningWithMake.
> The first project is XML-based documentation with C++ projects to follow.
>
> This has led to the following observations (not criticisms) / questions
> for which my reading has not provided me with decent solutions:
>
> - Programmatically determining if the working copy is modified.
>
> I have not discovered a simple way to determine if the working
> copy of a repository has been modified. Think shell code like:
> "if hg modified; then ... ", there seem to be many ways to get
> the information, hg st | wc -l, looking for the + in hg id, etc.
> Have I missed something obvious?
Both hg id and hg st will miss out on two classes of change:
- modified subrepos
- changed branch
'hg summary' will notice these on the commit: line.
You seem to be looking for an exit code based method. I don't think we
have anything like that but I can imagine adding it to summary.
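Until something like that exists, a wrapper script can derive a yes/no answer from the commit: line. The following is only a minimal sketch, assuming that 'hg summary' marks an unmodified working directory with "(clean)" on that line; the function names are hypothetical and not part of Mercurial.

```python
import subprocess


def summary_reports_clean(summary_text):
    """Return True if the 'commit:' line of `hg summary` output says (clean)."""
    for line in summary_text.splitlines():
        if line.startswith("commit:"):
            # Assumption: a clean working directory is flagged as "(clean)".
            return "(clean)" in line
    # No commit: line at all; do not claim the working directory is clean.
    return False


def working_copy_modified(repo_path="."):
    """Run `hg summary` in repo_path and report whether it looks modified."""
    out = subprocess.check_output(["hg", "summary"], cwd=repo_path)
    return not summary_reports_clean(out.decode("utf-8", "replace"))
```

A build script could then turn working_copy_modified() into an exit status with sys.exit(), which gives the "if hg modified; then ..." shape asked for above.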
> - Templates support for determining if the working copy is modified.
>
> In templates, I can't see a way to determine if the working copy
> has been modified. With "hg id -i" the plus (+) is not optional
> ("hg parents --template '{node|short}'" is the alternative). In
> templates, however, neither the plus nor any other indication that
> the working copy is modified seems to be available.
Indeed, none of the commands that take templates actually pay any
attention to the working directory in their display and I don't think
it's ever occurred to anyone that they should. So this message is all a
little weird to me.
It seems you've got a multi-line shell expression that you'd like to
reduce to a single line template expression at the cost of adding a
chunk of code and several features to Mercurial. That's only a win if
those features are of general utility.
> - Programmatically determining when the working copy was last modified.
>
> Once determined that the working copy is modified, it seems to me,
> that I need some basic way to identify that "revision" in an
> automated build system. Coding in the current date and time, will
> have the build regenerate the version information on each build,
> even though nothing has changed (e.g. make; make - rebuilds things
> unnecessarily).
> While there is no complete way to determine a "revision" identifier,
> for a modified working copy (generating a hash, that never appears
> in the history of the project seems pointless), I thought that MAYBE
> determining the date of the last quantifiable change might be a
> reasonable stand-in. This only works for additions and modifications,
> not deletions and removals, but as the former are probably more common,
> it may suffice to use the timestamp of the most recently changed
> file for this purpose.
> The following shell shows my first attempt (g prefix => GNU version):
> hg status -0 -n -q -am \
> | ( cd `hg root`; gxargs -0 --no-run-if-empty gstat --format '%y' ) \
> | sort -n \
> | tail -1 \
> | perl -p -e 's/:\d\d\.\d+//' # Remove excess precision (cf isodate)
I guess that's slightly more meaningful than +, but I don't see why it's
better than simply using 'time of build'? I can't see us doing something
like this internally.
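For comparison, the mtime scan done by the quoted pipeline can also be written outside Mercurial in a few lines of Python. This is a rough sketch only: newest_mtime is a hypothetical helper, and it assumes the list of added and modified paths (relative to the repository root) has already been collected, for example from 'hg status -n -am'.

```python
import os


def newest_mtime(root, relpaths):
    """Return the newest mtime (whole seconds since the epoch) among
    relpaths under root, or None if none of the paths exist."""
    newest = None
    for rel in relpaths:
        try:
            # lstat, so a symlink is judged by its own timestamp.
            mtime = os.lstat(os.path.join(root, rel)).st_mtime
        except FileNotFoundError:
            # Deleted or removed files have no timestamp to offer.
            continue
        if newest is None or mtime > newest:
            newest = mtime
    return None if newest is None else int(newest)
```

Returning None when nothing can be stat'ed leaves the choice of fallback (commit date, time of build) to the caller.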
> - A new template keyword, "wcmodified", a representation of the boolean
> value of whether the working copy has been modified, with the same
> logic as the "+" modifier in "hg id".
>
> Possible Modifications:
> - Name: what would convey the best meaning?
> - Representation: The value is boolean, but what is the best way
> to represent that in a templating environment and what filters
> might then be appropriate.
> The strings "True" and "False" are great in textual environments,
> but what about using a template to generate a C code fragment?
> The integers 0 and 1 plus filters (like say "bool" and "plus")
> might provide more flexibility. ("bool" being 0:"False", 1:"True",
> and "plus" being 0:"", 1:"+" to allow <node>+ generation in a
> template - eg {node|short}{wcmodified|plus}). I think a filter
> like "int" (if the value is boolean), might be abused on non
> boolean keywords like node and cause overflow issues.
>
> Initial Implementation: Using integers as the external representation
>
> def showworkingcopymodified(repo, ctx, templ, cache, revcache, **args):
>     """:wcmodified: Integer(0|1). Is the working copy of the current
>     repository modified? Use the filter bool to convert to "True" or
>     "False"."""
>     # Using repo.status() defaults on listsubrepos, ignored, unknown, ...
>     changed = util.any(repo.status())
>     return int(changed)
>
> and of course,
>
> 'wcmodified': showworkingcopymodified,
What does this do when passed to 'hg log'? I think it needs to check
that ctx is a working directory parent.
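A rough sketch of such a check follows. It assumes that repo[None] yields the working directory context and that contexts expose parents() and node(), as in Mercurial's internal API; both helper names and the choice to return an empty string for other changesets are hypothetical.

```python
def is_working_dir_parent(repo, ctx):
    """Return True if ctx is one of the working directory's parents."""
    # Assumption: repo[None] is the working directory context.
    wparents = [p.node() for p in repo[None].parents()]
    return ctx.node() in wparents


def wcmodified_for(repo, ctx, status):
    """Return 0 or 1 for a working directory parent, '' for anything else.

    status is the sequence of file lists returned by repo.status().
    """
    if not is_working_dir_parent(repo, ctx):
        # The working copy state says nothing about unrelated changesets.
        return ''
    return int(any(status))
```

With a guard like this, 'hg log' would show the flag only on the changeset(s) the working directory is based on, not on every revision listed.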
> - A new template keyword, "wcmodifieddate", a date compatible with
> the existing date filters (eg isodate) that represents the time of
> last modification of the set of files that have been modified since
> the last commit. With the same limitations as mentioned above.
> Since the date filters will provide the current date and time
> (which varies on each run) when provided with None or "", I think it
> would be best to default to the change context date when there are
> no changes or the date cannot be determined (delete/remove).
>
> Possible Modifications:
> - Name: what would convey the best meaning?
> - Representation: Other ways to represent edge cases?
>
> Initial Implementation:
>
> def showworkingcopymodifieddate(repo, ctx, templ, cache, revcache, **args):
>     """:wcmodifieddate: Date. Best effort to determine the time of the
>     most recent change to the working copy. Possible for modifications
>     and additions, not possible for deletions and removals. Defaults
>     to the last context change date, as date filters default to the
>     current date and time for None and the empty string, this is not
>     useful in repeatable build environments."""
>
>     cwd = repo.getcwd()
>     t, tz = ctx.date()
>     mostrecentchange = -1 # could use t
>
>     # Using repo.status() defaults on listsubrepos, ignored, unknown, ...
>     for l in repo.status():
>         for f in l:
>             path = repo.pathto(f, cwd)
>             try:
>                 mtime = os.lstat(path).st_mtime
>                 if mtime > mostrecentchange:
>                     mostrecentchange = mtime
>             except OSError, err:
>                 if err.errno != errno.ENOENT:
>                     raise
>
>     if mostrecentchange == -1:
>         return (t, tz)
>     else:
>         return (int(mostrecentchange), tz)
>
> and of course,
>
> 'wcmodifieddate': showworkingcopymodifieddate,
>
> Thoughts, comments and suggested improvements most welcome. If my
> proposal is considered worthwhile, I'll submit a proper patch against
> hg-crew incorporating any suggested improvements. I'll also provide an
> implementation of a "modified" subcommand to the list for review before
> final patch construction, if that is also approved.
>
> Regards,
>
> Peter
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
--
Mathematics is the supreme nostalgia of our time.