[RFC] Amend commit messages

Wed Feb 23 10:18:44 UTC 2011

Hi,

This is on a slightly different tangent from your question:

> "I made a mistake in my commit message. How can I change it?"

The question is largely about history editing. Of course there are also
questions such as whether the changes have already been pushed, etc.

From a personal point of view I just think the commit command should have an
--amend option that basically does:

hg qimport -r tip ; hg qrefresh -e ; hg qfinish tip

Just like it says in http://mercurial.selenic.com/wiki/GitConcepts.
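
For anyone who wants to script this today, the three commands can be chained
from Python. This is only a rough sketch: it assumes the mq extension is
enabled and that tip has not been pushed yet, and the helper names
(amend_tip_commands, amend_tip) are made up for illustration, not part of
Mercurial.

```python
import subprocess


def amend_tip_commands(hg="hg", rev="tip"):
    """Return the three mq commands quoted above as argv lists."""
    return [
        [hg, "qimport", "-r", rev],  # turn the changeset into an mq patch
        [hg, "qrefresh", "-e"],      # refresh the patch and edit its message
        [hg, "qfinish", rev],        # turn the patch back into a changeset
    ]


def amend_tip(cwd=".", hg="hg", rev="tip"):
    """Run the commands in order; check_call raises on the first failure."""
    for argv in amend_tip_commands(hg, rev):
        subprocess.check_call(argv, cwd=cwd)
```

Calling amend_tip() inside a repository should then behave like typing the
one-liner by hand, stopping early if any of the three steps fails.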

This gets around probably 60% of the problems.

I mean, git users use this and they are not killing themselves with it. In fact I
implemented exactly this option in MacHg and it's wonderful. See this screenshot:

[Scrubbed attachment: AmendOption.png (image/png, 63189 bytes)
<http://selenic.com/pipermail/mercurial-devel/attachments/20110223/e8c32d36/attachment.png>]

Then the other 40% of the problem is to hurry up LiquidHg.

I don't know about everybody else, but I tend to work on my code for a while,
adding, polishing, and moving bits around, and then push stuff in a clump. I do
a lot of "history editing" before I push. My typical workflow, and I would
assume that of others, goes like this: during development there are a lot of
small tweaks. I often commit before I compile just in case I mess something up.
Then I start testing the new code. I will often then find some simple typos, or
maybe a missing include or something. Then I simply do an amend commit, and once
it's compiling the next changes either go in their own commits if they are
bigger or into that one if they were simple small oversights.

(I will often use histedit to move things around until they are in a nice order
for later historical understanding, so that when I or someone else looks back 6
months from now we have some hope of following what I was doing. In fact I
often don't worry too much about the commit message at first. I often go back
once things are moved around and organized into shape and change the commit
messages to something detailed and sensible... Of course I do this only on the
stuff I haven't yet pushed to the wider world. Note that to do this editing I
often make throwaway clones just in case something goes wrong with the history
editing process, so I can just step back to where I was. At some stage it would
be really nice to have a wider undo, rather than the limited rollback we now
have... But that's a topic for other emails...)

<general comment on history editing>
I commit early and very often. But many of my commits are not logical chunks, or
they go down blind alleys, etc. Thus history editing is really, really useful. In
fact one of the largest examples of this close to home, one that we all have
experience with here, is the Mercurial source code itself. The commit history of
the Mercurial project is quite clean, and the quality level of the patches
before they are applied to the Mercurial sources is quite high. There are very
few "Oops, include that header", "Oops, change the spelling here", "Oops, make
sure it compiles under Linux", "Oops, make sure it passes this test", etc. sorts
of commits in the Mercurial source tree. Thus, in one way or another, the ethos
behind the Mercurial project's revision history is a clean one.

Consequently it appears to me somewhat iniquitous that the Mercurial project
itself keeps a very clean history, where the patches sent in have obviously been
history edited in one way or another, yet tends to espouse the position that
history editing is evil, and that all you people should just keep whatever
intermediate erroneous commits are in your repository because it's good for you
(even though we don't keep them in our history ourselves...).

Note that of course the position might be: well, if you are committing to
Mercurial then you are an advanced user of Mercurial, you know about history
editing, and you know it's not so evil (just don't tell others), since we want
to keep the general message for non-advanced users that history editing is
dangerous.

In any case the clear shining example here is that the Mercurial sources
themselves are clean of the "oops" sorts of commits. So users would naturally
also like to have similarly clean and followable commit histories.
</general comment on history editing>

So in reference to the below:

1. The scheme you are proposing below seems like it will just be adding layers
of complexity.
2. Everyone on your team would need to be using the extension to see this
layered change to history.
3. This will only ever change the commit message. Easily over half of my oops
commits are small typos in the code.
4. Once I have cleaned up my commits before pushing, the incidence of wrong
commit messages is very low. For instance, in the Mercurial source tree, how
many messages would we really need to change?
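
To make points 1 and 2 concrete, here is how I read the overlay scheme, as a
minimal sketch in plain Python rather than real Mercurial code. Everything in
it (wrap_description, the sample commit ids and messages) is hypothetical and
only meant to show the shape of the idea: the stored history is untouched, and
only callers that go through the wrapper see the amended text.

```python
def wrap_description(original, amended):
    """Return a lookup that prefers an amended message over the original.

    `original` maps a commit id to the message as committed; `amended` maps
    a full hex commit id to its replacement message.
    """
    def description(commit_id):
        if commit_id in amended:
            return amended[commit_id]
        return original(commit_id)
    return description


# Hypothetical history: commit id -> message as originally committed.
history = {
    "a" * 40: "fix issue 1243",
    "b" * 40: "add tests",
}
# Amendments are stored beside the history; the commits are not rewritten.
amendments = {"a" * 40: "fix issue 1234"}

describe = wrap_description(history.__getitem__, amendments)
print(describe("a" * 40))  # amended text, seen only through the wrapper
print(describe("b" * 40))  # falls through to the original message
print(history["a" * 40])   # what a user without the wrapper still sees
```

The last line is my point 2: anyone reading the history without the wrapper in
place still gets the old message.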

Hopefully with the LiquidHg extensions all this will be safe and users will know
exactly when they are about to change stuff which has already gone out to the
wider world...

Cheers, Jas

On Feb 23, 2011, at 9:41 AM, Gilles Moris wrote:

> Hello,
> 
> I don't remember any extension in this area, but this is a recurrent question
> on the mailing list:
> "I made a mistake in my commit message. How can I change it?"
> 
> We can redirect them to either mq or histedit, but usually we can only answer
> that this is not possible if the commit is already published.
> However, this question is legitimate. Whatever the number of reviews you
> do, there will always be some typo that escapes, like switched digits in a bug
> number or similar issues. The idea of having a message forever wrong is quite
> frustrating.
> 
> 1. Goal
> The goal is to overlay another commit description to replace a previous commit
> message. The current history (committed SHA1) is not changed and those commit
> message amendments appear as additional commits.
> 
> 2. Design
> The idea is to wrap the changectx.description() method and replace the original
> commit message with something else. The design is somewhat borrowed from tags:
> get the new description from another file; if multiple heads exist for this
> file, take the latest filelog revision of that file by convention.
> 
> Now let us see how the "something else" can be structured:
> 
> - as a single ".hgmessage" file located in the working directory, much like
> ".hgtags". However this is more complex to handle than tags, as commit messages
> can span multiple lines. This would require some markup language to delineate
> the mapping between the commit identifier and the message content. This would
> be a problem specifically during merges of this file, which would require
> knowledge of the format and increase the risk of incorrectly merging this file.
> 
> - the second solution is to create a ".hgmessage" directory this time, and have
> a file for each amended commit. The file name is the full hex SHA1 identifier of
> the amended commit and the content is just the new commit message. This lowers
> the risk of merge conflicts, and when one does occur it is quite
> straightforward to understand how to resolve it. However, this risk is not
> null. The problem here is that the commit message description history is
> intermingled with the regular code history. If the commit amendments are made
> in 2 different branches that are not supposed to be merged from a code
> standpoint, we're stuck with 2 divergent commit descriptions as well.
> 
> - so I came up with a third solution which is to put the commit messages in
> their own named branch "hgmessage". That way the amendment history is
> decorrelated from the code history. The need to amend a commit message should
> be sufficiently rare that the overhead of a separate branch is acceptable.
> 
> 3. Implementation
> As a roughly implemented extension, inlined below.
> 
> 4. Open issues
> - The name of the extension is message. The name of the command is "amend". I
> started with "message", then "editmessage". I like "amend" as it shows I do not
> edit the history.
> - More generally naming and wording.
> - Deleting one of the commit amendment files does not cancel the message
> edit, as I am looking at the latest filelog revision, not at the manifest.
> Currently, you cancel a message amendment by creating a new amendment with the
> original content. That should be acceptable.
> - Performance: I did not find how to walk a specific directory of the store at
> start up, so I am walking the entire store (code taken from the cifiles
> extension). This is also why I kept a directory even though I have a separate
> named branch: to allow further optimization of the start up time. Also there
> is no caching. Is some needed?
> - Next steps: what do we do with this? I will wait for your comments to see if
> I am heading in the right direction, and probably then post the extension on
> Google Code or Bitbucket. But then, does it deserve to be considered for
> inclusion as a bundled extension or a core patch?
> 
> Regards.
> Gilles.
> 
> 
> 
> import os, re
> from mercurial import hg, extensions, context, node, util, commands, cmdutil
> from mercurial import match as matchmod
> from mercurial.i18n import _
> 
> def getmsgfolder(ui):
>    return ui.config('editmessage', 'folder', 'hgmessage')
> 
> def getmsgbranch(ui):
>    return ui.config('editmessage', 'branch', 'hgmessage')
> 
> def msgread(filelog, node):
>    return filelog.read(node).splitlines()
> 
> def amenddesc(orig, ctx):
>    """wraps the changectx.description() method"""
>    repo = ctx._repo
>    nid = node.hex(ctx.node())
> 
>    if nid in repo.amendedmsg:
>        fl = repo.file(getmsgfolder(repo.ui) + '/' + nid)
>        heads = [fl.rev(h) for h in fl.heads()]
> 
>        if repo.ui.debugflag:
>            lines = []
>            for r in range(len(fl)-1, -1, -1):
>                amctx = repo[fl.linkrev(r)]
>                lines.append(_("amendment version: %d (%s)") %
>                             (r + 1, node.hex(amctx.node())))
>                if len(heads) > 1 and r in heads:
>                    ln = _("WARNING: message amendment conflicts in rev %s") % \
>                         ', '.join(str(fl.linkrev(h)) for h in heads if h != r)
>                    lines.append(ln)
>                lines.append(_("amended by: %s") % amctx.user())
>                lines.append(_("amended on: %s") % util.datestr(amctx.date()))
>                lines.append("")
>                lines.extend(msgread(fl, fl.node(r)))
>                lines.append("")
>                lines.append("")
>            lines.append(_("original description:"))
>            lines.append("")
>            lines.extend(orig(ctx).splitlines())
>        elif repo.ui.verbose:
>            lines = msgread(fl, fl.tip())
>            amctx = repo[fl.linkrev(len(fl)-1)]
>            lines.insert(1, _("(amended by %s on %s)") % (
>                amctx.user(),
>                util.datestr(amctx.date())))
>            if len(heads) > 1:
>                lines.insert(1,
>                             _("WARNING: message amendment conflicts in rev %s") %
>                             ", ".join(str(fl.linkrev(h)) for h in heads))
>        else:
>            lines = msgread(fl, fl.tip())
> 
>        msg = '\n'.join(lines)
>    else:
>        msg = orig(ctx)
>    return msg
> 
> def uisetup(ui):
>    extensions.wrapfunction(context.changectx, 'description', amenddesc)
> 
> def reposetup(ui, repo):
>    repo.amendedmsg = set()
> 
>    prefix = "data/%s/" % getmsgfolder(ui)
>    suffix = ".i"
>    plen = len(prefix)
>    slen = len(suffix)
>    lock = repo.lock()
>    try:
>        # this part can probably be greatly optimized by walking only one folder
>        # but using which API
>        for fn, efn, sz in repo.store.datafiles():
>            if fn[-slen:] == suffix and fn[:plen] == prefix:
>                repo.amendedmsg.add(fn[plen:-slen])
>    finally:
>        lock.release()
> 
> def editmsg(ui, repo, n, nidfn):
>    """creates the text template of the commit message to be edited"""
>    ctx = repo[n]
>    # use changelog for original message to bypass the
>    # overloaded ctx.description()
>    origdesc = repo.changelog.read(n)[4]
>    edittext = []
> 
>    # user edits the last commit message available
>    if node.hex(n) in repo.amendedmsg:
>        fl = repo.file(nidfn)
>        edittext.extend(msgread(fl, fl.tip()))
>    else:
>        edittext.extend(origdesc.splitlines())
>    edittext.append("")
> 
>    # standard HG comments when editing commit messages.
>    edittext.append(_("HG: Enter commit message."
>                      "  Lines beginning with 'HG:' are removed."))
>    edittext.append(_("HG: Leave message as is to abort edit."))
>    edittext.append("HG: --")
>    edittext.append(_("HG: user: %s") % ctx.user())
>    if ctx.p2():
>        edittext.append(_("HG: branch merge"))
>    if ctx.branch():
>        edittext.append(_("HG: branch '%s'") % ctx.branch())
>    edittext.extend([_("HG: files %s") % f for f in ctx.files()])
> 
>    # Display all the previous version of the message for reference
>    edittext.append("HG:" + '=' * 35 + "8<" + '-' * 35)
>    if node.hex(n) in repo.amendedmsg:
>        for r in range(len(fl)-1, -1, -1):
>            amctx = repo[fl.linkrev(r)]
>            edittext.append("HG: amendment version: %d (%s)" %
>                            (r + 1, node.hex(amctx.node())))
>            edittext.append("HG: amended by: %s" % amctx.user())
>            edittext.append("HG: amended on: %s" % util.datestr(amctx.date()))
>            edittext.append("HG:")
>            edittext.extend("HG: " + l for l in msgread(fl, fl.node(r)))
>            edittext.append("HG:")
>            edittext.append("HG:" + '=' * 35 + "8<" + '-' * 35)
>    edittext.append("HG: original description:")
>    edittext.append("HG:")
>    edittext.extend(map(lambda l: "HG: " + l, origdesc.splitlines()))
>    edittext.append("")
>    return '\n'.join(edittext)
> 
> def amend(ui, repo, rev, **opts):
>    """amend the commit message of the given revision
> 
>    Change a previous commit message. The commit is given as the only argument
>    of the command. The history is not really edited. Another commit is created
>    by this command and will overlay the previous commit message. This command
>    can be run multiple times: the log will show the latest message.
> 
>    Without -m or -l options, an editor will show up with the commit message
>    edit history as a reference.
> 
>    If divergent amendments already exist for this commit, the command will
>    refuse to work. You will first have to merge the multiple heads on the named
>    branch (by default "hgmessage") used to handle the message amendments. The -f
>    option overrides this behavior.
> 
>    Only the commit message is amended. The -u and -d options only
>    change the user and date of the amendment commit.
> 
>    Returns 0 on success, 1 if nothing changed.
>    """
>    cmdutil.bail_if_changed(repo)
> 
>    n = repo.lookup(rev)
>    msgbranch = getmsgbranch(ui)
>    hgmsgdir = getmsgfolder(ui)
>    fl = repo.file(hgmsgdir + '/' + node.hex(n))
>    if not opts.get('force') and len(fl.heads()) > 1:
>        raise util.Abort("multiple heads for %s: merge %s branch or use -f" %
>                         (node.short(n), msgbranch))
>    savectx = os.getcwd(), repo.dirstate.parents()
> 
>    os.chdir(repo.root)
>    # Create or check out the hgmessage branch
>    if msgbranch not in repo.branchtags():
>        hg.clean(repo, node.nullid, False)
>        repo.dirstate.setbranch(msgbranch)
>    else:
>        hg.clean(repo, msgbranch, False)
> 
>    if not os.path.lexists(repo.wjoin(hgmsgdir)):
>        os.mkdir(repo.wjoin(hgmsgdir))
>    nidfn = os.path.join(hgmsgdir, node.hex(n))
> 
>    text = cmdutil.logmessage(opts)
>    if not text:
>        temp = editmsg(ui, repo, n, nidfn)
>        text = ui.edit(temp, repo[n].user())
>        if text == temp:
>            # nothing changed
>            text = ""
>        else:
>            text = re.sub("(?m)^HG:.*(\n|$)", "", text)
> 
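>    if not text:
>        ui.status(_("nothing changed\n"))
>        # restore the saved context before returning, as on the success path
>        hg.clean(repo, savectx[1][0], False)
>        os.chdir(savectx[0])
>        return 1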
> 
>    text = text.strip() + '\n'
>    f = open(repo.wjoin(nidfn), 'w')
>    f.write(text)
>    f.close()
>    wctx = repo[None]
>    wctx.status(ignored=True, unknown=True)
>    if nidfn in wctx.ignored() or nidfn in wctx.unknown():
>        wctx.add([nidfn])
>    match = matchmod.exact(repo.root, '', [nidfn])
>    date = opts.get('date')
>    if date:
>        date = util.parsedate(date)
>    cimsg = "amend commit message of %s" % node.hex(n)
>    repo.commit(cimsg, opts.get('user'), date, match)
> 
>    # restores the saved context
>    hg.clean(repo, savectx[1][0], False)
>    os.chdir(savectx[0])
>    return 0
> 
> cmdtable = {
> 'amend':
>        (amend,
>         [('f', 'force', None, _('force edit message')),
>          ] + commands.commitopts + commands.commitopts2,
>        _('hg amend [OPTION]... REV')),
> }
> 
> 
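
As a side note on the editor template above: the step that removes the "HG:"
helper lines can be tried on its own. The sketch below reuses the same regular
expression and the same strip-then-newline step as amend(); the function name
strip_hg_lines and the sample text are made up for illustration.

```python
import re


def strip_hg_lines(text):
    """Drop every line that starts with 'HG:', then normalise the ending."""
    stripped = re.sub(r"(?m)^HG:.*(\n|$)", "", text)
    return stripped.strip() + "\n"


edited = (
    "fix bug 1234 in parser\n"
    "\n"
    "HG: Enter commit message.  Lines beginning with 'HG:' are removed.\n"
    "HG: --\n"
    "HG: user: someone\n"
)
print(repr(strip_hg_lines(edited)))
```

Only lines that begin with "HG:" are dropped; an "HG:" appearing in the middle
of a line is left alone.
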
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel