Differences between revisions 3 and 4

Branching and merging in Mercurial (and Git) explained

Since there seems to be some confusion about which branching methods Mercurial provides as opposed to, e.g., Git (at least I was confused...), this page tries to explain the different branching mechanisms and alternatives in more detail.

Short-term branching in Mercurial

Let's start with the branching model in Mercurial, since it is somewhat simpler. Consider three developers: Alice, Bob and Clark. Alice has started work on a new project, and Bob and Clark are to take part in the development. For the sake of simplicity, let's assume we are in a local network, and all three developers can pull changes from each other. Alice's repository currently looks like this:

attachment:branching_expl_00.png

She has created two revisions, A and B. In her repository, these have been given the revision numbers 1 and 2. Revision B is the newest in the repository (called the tip in Mercurial terminology, marked green), and also the parent of her current working directory state (marked with an asterisk). Now Bob and Clark both clone her repository and start working. Meanwhile, Alice continues to work too, of course. So after a short while, Alice's repository has changed:

attachment:branching_expl_01.png

Bob's repository

attachment:branching_expl_02.png

and Clark's repository contain changes too:

attachment:branching_expl_03.png

It is important to note that Bob's and Clark's changes are independent of Alice's, which means that each of them has created his own branch. Now they want to merge their work, and since Alice is the project leader, she decides to do it in her repository. She pulls both Bob's and Clark's changes into her repository, which now looks quite different from before:

attachment:branching_expl_04.png

Her repository now contains three so-called “heads”, which are changesets without children. Revision C is still the parent revision of her working directory, but since she pulled first Bob's and then Clark's changes, the tip of the repository is now changeset E. Mercurial supports any number of heads in a repository, and they don't even have to be named. One can name them using the bookmarks extension, which would allow Alice to track Bob's and Clark's changes without merging them into her own branch. This is somewhat similar to Git's remote-tracking branches (but it is not the same!). One should be especially aware that Mercurial treats these heads as permanent parts of Alice's development repository. Alice can manually strip them out of her tree, or create a clone containing only her changesets (thus discarding the additional heads), but this always requires a bit of extra work.
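To make the definition of a head concrete, here is a minimal Python sketch. It is only a toy model, not Mercurial code: the dictionary layout and the `heads` function are my own assumptions, and I assume that C, D and E each have B as their parent, as the story above suggests.

```python
# Toy model of a revision graph: each changeset maps to its parents.
# Insertion order stands in for the order in which changesets arrived.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # Alice's own work
    "D": ["B"],  # pulled from Bob
    "E": ["B"],  # pulled from Clark
}


def heads(graph):
    """Return all changesets that no other changeset lists as a parent."""
    with_children = {p for ps in graph.values() for p in ps}
    return sorted(cs for cs in graph if cs not in with_children)


tip = list(parents)[-1]  # the changeset added most recently

print(heads(parents))  # ['C', 'D', 'E']
print(tip)             # E
```

Note that the tip is simply the newest changeset in the repository, so it says nothing about where the working directory currently sits.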

In this case, however, Alice just wants to merge the changes made by Bob and Clark with her own work, which is the usual situation. She therefore merges twice, first with changeset D and then with changeset E, resulting in a single branch of development again:

attachment:branching_expl_05.png

Now, Alice's repository has one single head G, which is also the parent revision of her working directory. Bob and Clark can pull this merged repository state from Alice, and everyone is synchronised once again.
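The effect of the two merges on the number of heads can be sketched in the same toy style. Again, this is just an illustration under assumptions: the helper names are hypothetical, and I assume the intermediate merge changeset is called F, since the text only names the final head G.

```python
def heads(graph):
    """Return all changesets that no other changeset lists as a parent."""
    with_children = {p for ps in graph.values() for p in ps}
    return sorted(cs for cs in graph if cs not in with_children)


def merge(graph, name, first, second):
    """Add a merge changeset, which is simply a changeset with two parents."""
    graph[name] = [first, second]


# Alice's repository right after pulling from Bob and Clark.
graph = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["B"]}

merge(graph, "F", "C", "D")   # first merge: Alice's work with Bob's
after_first = heads(graph)

merge(graph, "G", "F", "E")   # second merge: the result with Clark's work
after_second = heads(graph)

print(after_first)   # ['E', 'F']
print(after_second)  # ['G']
```

Each merge gives two former heads a common child, so the head count drops by one per merge.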

Short-term branching in Git

At first, Alice tries exactly the same approach in Git, and somehow it does not work quite as expected. The reason is that Git does not keep track of unnamed heads the way Mercurial does. When Alice runs two consecutive “git fetch” commands to get Bob's and Clark's work, the fetched commits are stored as objects in her local database, but no branch points to them. The result of the latest fetch is saved under the name FETCH_HEAD, with the consequence that the second fetch, from Clark, overwrites the saved SHA-1 ID of Bob's data. So this is apparently not how things are usually done in Git, and Alice reads the documentation again.
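The overwriting can be pictured with a very small Python sketch. It is a deliberate simplification and not how Git is implemented: the `refs` dictionary and the `fetch` function are hypothetical, and the real FETCH_HEAD is a file that can hold more than one entry for a single fetch.

```python
def fetch(refs, remote_tip):
    """Record the fetched tip under the single name FETCH_HEAD."""
    refs["FETCH_HEAD"] = remote_tip


refs = {"master": "C"}  # Alice's own branch points at C

fetch(refs, "D")  # Bob's tip
fetch(refs, "E")  # Clark's tip replaces Bob's

print(refs)  # {'master': 'C', 'FETCH_HEAD': 'E'}
```

After the second fetch no name refers to D any more, which is exactly the situation Alice runs into.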

To achieve the same workflow as in the Mercurial example, Alice actually has two options. The first is to pull Bob's changes and merge those first. In fact, if she synchronises with Bob's repository using the “pull” command, the merge takes place automatically. Afterwards, she does the same with Clark's repository, which results in a revision graph that looks exactly like the one in the Mercurial example (which is why I will not repeat it here).

The second option is to use two light-weight branches, one holding Bob's changes and one holding Clark's. This is an additional abstraction level that Git offers, which is not supported directly by Mercurial (although a third-party extension providing that functionality exists). Light-weight branches are completely separate branches inside one common repository. They are quite cheap with respect to space requirements, since they have some properties in common with Mercurial's heads. In fact, light-weight branches in Git are a lot like heads in Mercurial on the technical level, but they are separated much more strongly (and, e.g., can easily be removed).

For her purposes, Alice creates two so-called “remote-tracking” branches in her repository, which are just branches that “remember” their origin (making it easier to update them). Afterwards, Alice can simply merge from those two mirroring branches into her own. Git can even do a single merge using both Bob's and Clark's changes at the same time (which is called an octopus merge), but this can create its own unique merge conflicts. The result is the same as before, and as soon as Bob and Clark pull from Alice's repository, they will end up with the merged state again.
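As a rough illustration of why named branches avoid the FETCH_HEAD problem, and of what an octopus merge looks like in the graph, consider the following Python sketch. It is a toy model under assumptions: the ref names `bob/master` and `clark/master`, the merge changeset name M, and the function `fetch_into` are all made up for this example.

```python
def fetch_into(refs, remote, tip):
    """Store a fetched tip under a name derived from the remote."""
    refs[f"{remote}/master"] = tip


graph = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["B"]}
refs = {"master": "C"}

fetch_into(refs, "bob", "D")
fetch_into(refs, "clark", "E")  # does not disturb Bob's entry

# Octopus merge: a single changeset with more than two parents.
graph["M"] = [refs["master"], refs["bob/master"], refs["clark/master"]]
refs["master"] = "M"

print(refs)        # {'master': 'M', 'bob/master': 'D', 'clark/master': 'E'}
print(graph["M"])  # ['C', 'D', 'E']
```

Because every remote gets its own name, a later fetch from Clark can never hide what was fetched from Bob.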

Implicit (unnamed) branches

Both Git and Mercurial support unnamed local branches. In fact, we have already seen them, since Mercurial's heads and Git's FETCH_HEAD are essentially the same thing. We have also seen that Git does not offer many tools to work like this, since it more or less expects you to name every branch. However, one can create unnamed branches quite quickly in both systems. Consider the state of Alice's repository from the beginning:

attachment:branching_expl_06.png

Instead of pulling Bob's and Clark's changes, Alice rethinks her changes in C and steps back to the B state. She then makes different changes to try something out and commits these, resulting in yet another changeset H:

attachment:branching_expl_07.png

attachment:branching_expl_08.png

Alice has thus created an “anonymous” branch. In Mercurial, this is absolutely no problem, since it is just another head. Git, however, expects you to name this branch if you want to continue to work in it. If you do not, the only way to find it after switching back to the master branch is to consult the reflog, and if you wait too long, it can even be pruned by a future garbage collection cycle. By the way, although Mercurial does not require you to do so, naming such a branch is a good idea there as well (again through the bookmarks extension).
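Why an unnamed branch is at risk in Git can be sketched as a reachability check: anything that cannot be reached from a named ref is a candidate for garbage collection. The Python below is only a simplified model under assumptions. The function `reachable` and the branch name `experiment` are hypothetical, and the sketch ignores the reflog, which in real Git keeps such commits alive for a while.

```python
def reachable(graph, refs):
    """Collect every changeset reachable from any named ref."""
    seen = set()
    stack = list(refs.values())
    while stack:
        cs = stack.pop()
        if cs not in seen:
            seen.add(cs)
            stack.extend(graph[cs])  # continue with the parents
    return seen


# Alice's repository with the experimental changeset H on top of B.
graph = {"A": [], "B": ["A"], "C": ["B"], "H": ["B"]}
refs = {"master": "C"}

unnamed = sorted(set(graph) - reachable(graph, refs))
print(unnamed)  # ['H']

refs["experiment"] = "H"  # naming the branch makes H reachable
after_naming = sorted(set(graph) - reachable(graph, refs))
print(after_naming)  # []
```

In the Mercurial model there is no such check, since every head counts as part of the repository whether it has a name or not.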

One should perhaps note an important difference in behaviour between Mercurial and Git at this point. While for Git, branches are more or less separate entities (and can be addressed from outside), the heads in Mercurial are considered a part of the whole branch formed by the repository. Thus, if you do a simple synchronisation in Git, you will (by default) only get the master branch, while in Mercurial you will always get all heads. You can clone/synchronise specific heads in Mercurial as well, but this is a bit more work, since the head names (bookmarks) are local to the repository and cannot be used from outside.

CategoryHowTo

Revision 3 as of 2009-01-25 17:02:23
  Size: 1684
  Editor: RobertFendt
  Comment: Initial version
Revision 4 as of 2009-01-25 17:09:31
  Size: 7677
  Editor: RobertFendt
  Comment: