[PATCH 2 of 2] record: allow splitting of hunks by manually editing patches

Angel Ezquerra Moreu angel.ezquerra at gmail.com
Tue Feb 21 03:58:26 CST 2012

On Tue, Feb 21, 2012 at 10:38 AM, A. S. Budden <abudden at gmail.com> wrote:
> On 21 February 2012 09:08, Angel Ezquerra Moreu
> <angel.ezquerra at gmail.com> wrote:
>> On Mon, Feb 20, 2012 at 2:30 PM, A. S. Budden <abudden at gmail.com> wrote:
>>> # HG changeset patch
>>> # User A. S. Budden <abudden at gmail.com>
>>> # Date 1329683565 0
>>> # Node ID 58f3e35efe716309e64bf498f44ffb0117554aad
>>> # Parent  ea6086ad210db327591965ad58f9e656313ce2fe
>>> record: allow splitting of hunks by manually editing patches
>>>
>>> It is possible that unrelated changes in a file are on sequential lines.  The
>>> current record extension does not allow these to be committed independently;
>>> this patch is intended to overcome that limitation.
>>>
>>> In order to take control over which lines in the hunk are applied, an editor is
>>> opened with a single-hunk patch.  Instructions on how to edit the patch are
>>> included with the patch (this follows Git's method of doing things).  Although
>>> patch editing sounds complicated, in practice, editing is actually very simple
>>> as all the user needs to do is either replace the '-' at the start of line with
>>> a space or delete lines starting with a '+' (this is explained in the
>>> instructions).  Given how rarely I'd expect this to be used in general, I felt
>>> that this was an acceptable level of complexity.
>>>
>>> An example use case for this is in software development for deeply embedded
>>> real-time systems.  In these environments, it is not always possible to use a
>>> debugger (due to time-constraints) and hence inline UART-based printing is
>>> often used.  When fixing a bug in a module, it is often convenient to add a
>>> large number of 'printf's (linked to the UART via a custom fputc) to the module
>>> in order to work out what is going wrong.  printf is a very slow function (and
>>> also variadic so somewhat frowned upon by the MISRA standard) and hence it is
>>> highly undesirable to commit these lines to the repository.  If only a partial
>>> fix is implemented, however, it is desirable to commit the fix without deleting
>>> all of the printf lines.  A partial commit also simplifies removal of the
>>> printf lines as once the final fix is committed, 'hg revert' does the rest.  It
>>> is likely that the printf lines will be very near the actual fix, so being able
>>> to split the hunk is very useful in this case.
>>>
>>> There were two alternatives I considered for the user interface.  One was to
>>> manually edit the patch, the other to allow a hunk to be split into individual
>>> lines for consideration.  The latter option would require a significant
>>> refactor of the record module and is less flexible.  While the former is
>>> potentially more complicated to use, this is a feature that is likely to only
>>> be used in certain exceptional cases (such as the use case proposed above) and
>>> hence I felt that the complexity would not be a considerable issue.
>>>
>>> In my opinion, this is a valuable addition to Mercurial: Git can do this and I
>>> often find myself using Git on some projects for this feature alone.  However,
>>> I dislike the way that the partial commit is essentially the default way of
>>> committing in Git and I feel that it should be available for exceptional
>>> circumstances but not the default behaviour; this is one of the reasons I'd
>>> rather use Mercurial.
>> I guess that a similar thing could be done with a UI?
> I really really really hope so!  Git GUI does this quite well.
>> It'd be great to have this capability in TortoiseHg, for example.
> I think it's in the feature requests for TortoiseHG somewhere if
> memory serves me correctly.  The old GTK version supported record
> (hunk-by-hunk only) and there's something on the bug tracker
> discussing re-adding that.  In amongst the comments is discussion of
> line-by-line operation.

Yes, that was probably one of the very few capabilities that were lost
when moving from TortoiseHg 1.x to 2.x. You can still do it by using
shelve, but it is not nearly as nice as it used to be.

>> Please forgive my ignorance, but could you please give a bit more
>> details about how it works? In particular, if you were to manually
>> edit an unapplied mq patch file _without_ your extension, in order to
>> split a hunk in two, how would you do it? Does the diff format allow
>> having two patches whose context overlaps?
> Actual editing of diffs can be quite a risky process.  You could
> certainly do it manually in an unapplied mq patch, but you're likely
> to have to mess around with recountdiff and the like to try to get it
> to work well (I've had very mixed results with this).  All that is
> required with this patch is the removal of lines starting with '+'
> signs and the replacement of '-' signs with space.  You can do more if
> you choose, but then you run the risk of the patch not applying
> cleanly (this is explained in the comment added to the patch).
> There's an example of the Git implementation (on which I based this)
> in footnote 1 of http://mercurial.selenic.com/wiki/RecordExtension
> I've no idea whether the diff format allows having two patches whose
> context overlaps.
> Does that answer your questions or have I missed the point?
> Al

I think it does. The reason I asked how you would go about manually
splitting a hunk is that I was thinking about how this could be
implemented in TortoiseHg. I don't think it is practical for
TortoiseHg to rely on the record extension if it requires you to
manually edit the patch file.

It seems that in order to support line-by-line shelving/unshelving we
would need to find an automated way to remove one or more lines from a
hunk and vice-versa. So having a set of steps that you can safely
follow to do that would be great.
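
To make that concrete, here is a rough Python sketch of what such an
automated step might look like. It applies the same two rules Al
describes for manual editing (drop unwanted '+' lines, turn unwanted
'-' lines into context) and then recomputes the line counts in the
hunk header. The names filter_hunk, keep and example_hunk are made up
for illustration; this is not code from the record extension or from
TortoiseHg. It assumes a single unified-diff hunk given as a list of
lines without trailing newlines, and it does not handle the
"\ No newline at end of file" marker.

```python
import re

# Unified diff hunk header: "@@ -old_start,old_count +new_start,new_count @@ tail"
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@(.*)$")


def filter_hunk(hunk_lines, keep):
    """Return a copy of the hunk that contains only the selected changes.

    hunk_lines -- the '@@' header followed by the body lines of one hunk
    keep       -- set of 0-based body indices of the '+'/'-' lines to keep
    (Both names are hypothetical; this is an illustrative sketch.)
    """
    match = HUNK_HEADER.match(hunk_lines[0])
    if match is None:
        raise ValueError("not a unified diff hunk header: %r" % hunk_lines[0])
    old_start = int(match.group(1))
    new_start = int(match.group(3))
    tail = match.group(5)

    body = []
    for index, line in enumerate(hunk_lines[1:]):
        kind = line[:1]
        if kind == " ":
            body.append(line)              # context is always kept
        elif kind == "+":
            if index in keep:
                body.append(line)          # selected addition
            # an unselected addition is simply dropped
        elif kind == "-":
            if index in keep:
                body.append(line)          # selected removal
            else:
                body.append(" " + line[1:])  # unselected removal becomes context
        else:
            raise ValueError("unsupported hunk line: %r" % line)

    # Recount: the old side sees context and '-' lines, the new side sees
    # context and '+' lines.
    old_count = sum(1 for l in body if l[:1] in (" ", "-"))
    new_count = sum(1 for l in body if l[:1] in (" ", "+"))
    header = "@@ -%d,%d +%d,%d @@%s" % (
        old_start, old_count, new_start, new_count, tail)
    return [header] + body


# A fix and a debugging printf that ended up in the same hunk.
example_hunk = [
    "@@ -1,3 +1,4 @@",
    " int x = read();",
    "-x = x + 1;",
    '+printf("debug");',
    "+x = x + 2;",
    " return x;",
]

# Keep only the real fix (body lines 1 and 3), leaving the printf out.
for line in filter_hunk(example_hunk, {1, 3}):
    print(line)
```

Since only removals are turned into context and only additions are
dropped, the old side of the hunk is unchanged, so the result should
still apply to the same base file. The sketch leaves the start offsets
alone, which is only safe when the hunk is considered on its own; a
real implementation would also have to adjust later hunks in the same
file.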


More information about the Mercurial-devel mailing list