Why can't I merge when there are uncommitted outstanding changes?

Giorgos Keramidas keramida at ceid.upatras.gr
Fri Apr 23 05:34:39 CDT 2010


On Thu, 22 Apr 2010 08:34:45 -0700 (PDT), Aardwolf <toiletpot at gmail.com> wrote:
> The first version control system I worked with was SVN. Then later I
> had unfortunately to work with CVS, and I liked SVN a lot better. Then
> now we switched to HG and I hoped it'd be better than SVN, but I
> appear to have a very hard time adjusting, simply because I still
> can't grasp why working in different files than other people could
> even require a merge at all or give problems with my working directory
> at all. Technically there seems to be no reason for problems here. And
> SVN handled this always exactly as I would expect.

There usually *is* a reason, but it won't bite you until the most
inopportune moment.  The reason why you have to merge even if you made
changes to different files is simple: Mercurial does not do *semantic*
merges.  It does not know what the contents of the files _mean_ for
everyone and anyone who will ever read them.

Files tracked under Mercurial are very often source code.  Every file
interacts with parts of other files, or uses the interfaces they
provide.  This means that if you use a model of history like the one
used by Subversion, you may start with a history like this:

    --------- [30] --- [31] --- [32]

When you check out revision 32 in your working directory you run some
tests and you see that two files don't work together very well: one of
the functions from lib/libfoo/foo_open.c fails when called from another
library at lib/libbar/bar_init.c.

So you run the tests a few more times, and you finally come up with a
patch that sits on top of revision [32] and appears to fix the problem:

               State of the files       |    Local working-copy
               in the remote            |    changes
               repository               |
                                        |
    --------- [30] --- [31] --- [32] ------- [patch 1]
                                        |

You run the tests once more and you see that it all works.  So you
prepare to svn commit your [patch 1] state.  But someone else has
managed to commit his own stuff before you, so the repository really now
looks like this:


               State of the files               |    Local working-copy
               in the remote                    |    changes
               repository                       |
                                                |
    --------- [30] --- [31] --- [32] --- [33] ------ [patch 1]
                                 *              |

Now, if there are no conflicts at the _file-level_, svn will permit your
patch to be committed.  But does this new state of the entire repository
work?  What if the changes in revision [33] affected other parts of the
lib/libbar/ library and introduced even *more* instances of the bogus
interface you tried to fix in [patch 1]?  What if a _third_ person
committed version [34] while you were merging your patch on top of [33]?
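
To make the "no conflicts at the file level" point concrete, here is a
minimal Python sketch.  It is a toy model and not how svn or Mercurial
actually merge: each file is reduced to the set of functions it defines
and the set it calls.  The names foo_open/1 and foo_open/2 (standing for
the old and the fixed interface) and the file lib/libbar/bar_close.c are
made up for illustration.

```python
# Toy model only: a file is a pair (functions it defines, functions it calls).
# "foo_open/1" and "foo_open/2" stand for the old and the fixed interface;
# both names, and lib/libbar/bar_close.c, are hypothetical.

def source_file(defines=(), calls=()):
    return (frozenset(defines), frozenset(calls))


def merge_by_file(base, mine, theirs):
    """Three-way merge at whole-file granularity.

    A path conflicts only when both sides changed it, and changed it
    differently.  Nothing here looks at what the contents mean.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(path), mine.get(path), theirs.get(path)
        if m == t or t == b:
            result = m          # both agree, or only my side changed
        elif m == b:
            result = t          # only their side changed
        else:
            conflicts.append(path)
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


def unresolved_calls(tree):
    """Stand-in for 'running the tests': calls to functions nobody defines."""
    defined = set()
    for defines, _calls in tree.values():
        defined |= defines
    return sorted(
        (path, call)
        for path, (_defines, calls) in tree.items()
        for call in calls
        if call not in defined
    )


rev32 = {
    "lib/libfoo/foo_open.c": source_file(defines=["foo_open/1"]),
    "lib/libbar/bar_init.c": source_file(calls=["foo_open/1"]),
}
# [patch 1]: fix the interface and its one known caller.
patch1 = {
    "lib/libfoo/foo_open.c": source_file(defines=["foo_open/2"]),
    "lib/libbar/bar_init.c": source_file(calls=["foo_open/2"]),
}
# [33]: somebody else adds a new caller of the old interface.
rev33 = dict(rev32)
rev33["lib/libbar/bar_close.c"] = source_file(calls=["foo_open/1"])

merged, conflicts = merge_by_file(rev32, patch1, rev33)
print("file-level conflicts:", conflicts)
print("broken calls in my tested tree:", unresolved_calls(patch1))
print("broken calls after the merge:", unresolved_calls(merged))
```

In this sketch both [patch 1] and [33] pass the check on their own, and
the file-level merge reports nothing, yet the merged tree contains a
call that no file satisfies any more.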

You will never know, until you check out a full and clean copy of the
_entire_ branch again, and run the tests once more.  By then the commit
you just pushed to the repository may have been pulled into the working
copies of any number of people.  So you just committed something that
gives you the impression of having fixed the bug, but the important
detail is that you don't really know *yet*.

What you just did with Subversion is a bit of a juggling operation
between four different states of the repository:

    1. The original, unpatched revision [32] code

    2. Your patched version of revision [32]

    3. The original, unpatched revision [33] somebody else committed

    4. The merged patch of yours on top of revision [33]

You started from state (1), and patched your source to state (2).  The
tests seemed to pass for state (2), so you committed on top of state (3),
creating a final state (4) in the repository.  The most important detail
is, however, that you are *allowed* to commit files to the repository
that create a 'mix and match' state between these different copies of
the source tree _without_ being able to test that state in advance of
the commit itself.

You have to race other committers between "svn update" and "svn commit"
for a consistent state of the source tree.  If nobody else commits even
a trivial change between your own "svn update" and "svn commit"
commands, then all is good.  If they do, you just created a "mixed copy"
of the source tree files that has *never* been tested as a consistent
set of files before hitting the repository.
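
The race can be sketched in a few lines of Python.  This is a toy model
under stated assumptions, not Subversion's real implementation: the
class ToyCentralRepo and its methods are invented here, and the only
behaviour it imitates is that a commit is refused when one of the files
being committed is out of date, while changes to *other* files are not
checked at all.

```python
class ToyCentralRepo:
    """Hypothetical central repository that checks staleness per file."""

    def __init__(self, files):
        self.trees = [dict(files)]      # trees[n] is the full tree at revision n

    @property
    def head(self):
        return len(self.trees) - 1

    def last_changed(self, path):
        """Newest revision in which `path` changed (0 if it never did)."""
        for rev in range(self.head, 0, -1):
            if self.trees[rev].get(path) != self.trees[rev - 1].get(path):
                return rev
        return 0

    def commit(self, base_rev, changes):
        """Accept `changes` unless one of the touched files is out of date."""
        stale = sorted(p for p in changes if self.last_changed(p) > base_rev)
        if stale:
            raise RuntimeError("out of date: " + ", ".join(stale))
        tree = dict(self.trees[self.head])
        tree.update(changes)
        self.trees.append(tree)
        return self.head


repo = ToyCentralRepo({
    "lib/libfoo/foo_open.c": "original",
    "lib/libbar/bar_init.c": "original",
    "lib/libbar/bar_close.c": "original",   # hypothetical third file
})
base = repo.head                            # plays the role of revision [32]

my_changes = {"lib/libfoo/foo_open.c": "patch 1"}
their_changes = {"lib/libbar/bar_close.c": "their change"}

# What each of us actually ran the tests against.
my_tested_tree = {**repo.trees[base], **my_changes}
their_tested_tree = {**repo.trees[base], **their_changes}

their_rev = repo.commit(base, their_changes)   # they win the race
my_rev = repo.commit(base, my_changes)         # still accepted: my file is not stale

committed_tree = repo.trees[my_rev]
print("tested by me:    ", committed_tree == my_tested_tree)
print("tested by them:  ", committed_tree == their_tested_tree)
```

Both comparisons come out False in this model: the tree stored as the
newest revision is one that neither committer ever had on disk.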

The downsides of this are:

  * If you are asked to reason about the state of the repository before
    and after your own commit, it is hard or even impossible to do with
    svn (it depends on who else got a chance to commit their own stuff
    between the time you checked out the source and the time you
    committed your own change).

  * You are literally *forced* in some cases to push changes to the
    repository that you have *not* tested as a consistent, coherent
    whole.

Having said that, there are many projects that find Subversion useful
and can mostly get away with these two downsides.  The world is not
going to end if you commit a set of changes that update the
documentation of something without testing that it matches the actual
code.  The world is not going to end if you break the build for half an
hour either.

But it's still annoying...


