Handling third-party 'vendor' branches

Giorgos Keramidas keramida at ceid.upatras.gr
Thu Jul 10 07:31:23 CDT 2008


I'm trying to understand how we can create 'vendor branches' in
Mercurial.

The main idea of vendor branches in the places where I've used them
(i.e. in CVS and Perforce repositories) is that there is a place where
'clean imports of code drops' are committed.  For example, in Perforce
the depot path

    //depot/vendor/bind/9.x/...

would be used to import clean, unmodified snapshots of the BIND 9.X
releases.

Then in other parts of the repository, we would `integrate' changes from
the bind vendor branch, by checking out the target branch:

    //depot/project/main/...

and pulling changes from the vendor branch of bind:

    p4 integ //depot/vendor/bind/9.x/... contrib/bind/...

Having the history of 'imports' in //depot/vendor/bind/9.x/... is pretty
nice, and it is very useful that other branches can pull only when they
are ready to pull & merge.  Another nice aspect of the vendor branches
is that there is a well-known place where clean copies of the code
exist, including the full history of who imported each snapshot, when,
and why.

I've been trying to understand how we could simulate something similar
with Mercurial.  My attempts so far to reproduce Perforce- and CVS-style
vendor branches are described below.

  * Do they look like the 'right thing' to do to track vendor
    code independently in a second repository, with an import
    history of its own?

  * Are there any gotchas that I missed and should be careful
    about?

Using a clean 'vendor branch' repository for each third-party component
=======================================================================

Using a separate 'vendor branch' repository for each component, I tried
creating the following repositories:

  /tmp/vendor/foo
  /tmp/hgtest

In the first repository, I created 2 'sample imports':

  keramida at kobe:/tmp/vendor/foo$ hg short
  1:e465e1e83467 | 2008-07-10 14:17 +0300 | keramida: Import foo version 8.0 (800040)
  0:cb38408752cc | 2008-07-10 14:16 +0300 | keramida: Import foo version 8.0 (800001)
  keramida at kobe:/tmp/vendor/foo$

At this point the manifest of the 'foo' vendor repository contains:

  keramida at kobe:/tmp/vendor/foo$ hg manifest
  README.TXT
  version.h
  keramida at kobe:/tmp/vendor/foo$

Then I started an unrelated `project' in `/tmp/hgtest' with a different
set of files:

  keramida at kobe:/tmp/hgtest$ hg short -r0
  0:a915c132d2f1 | 2008-07-10 14:19 +0300 | keramida: Add new script: tslog
  keramida at kobe:/tmp/hgtest$ hg manifest 0
  bin/tslog
  keramida at kobe:/tmp/hgtest$

The next step was to pull from /tmp/vendor/foo.  The -f option is needed
because the two repositories are unrelated.  After pulling I ended up
with a tree with two 'origin' (root) commits:

  keramida at kobe:/tmp/hgtest$ hg glog
  @  changeset:   2:e465e1e83467
  |  user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
  |  date:        Thu Jul 10 14:17:43 2008 +0300
  |  summary:     Import foo version 8.0 (800040)
  |
  o  changeset:   1:cb38408752cc
     parent:      -1:000000000000
     user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
     date:        Thu Jul 10 14:16:22 2008 +0300
     summary:     Import foo version 8.0 (800001)

  o  changeset:   0:a915c132d2f1
     user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
     date:        Thu Jul 10 14:19:27 2008 +0300
     summary:     Add new script: tslog

  keramida at kobe:/tmp/hgtest$

Now I have two unrelated manifests, and the vendor files are not yet
laid out the way I want them to look in the project tree:

  keramida at kobe:/tmp/hgtest$ hg manifest 2
  README.TXT
  version.h
  keramida at kobe:/tmp/hgtest$ hg manifest 0
  bin/tslog
  keramida at kobe:/tmp/hgtest$

Renaming the vendor code locally was easy:

  keramida at kobe:/tmp/hgtest$ hg up -C 2
  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
  keramida at kobe:/tmp/hgtest$ mkdir -p contrib/foo
  keramida at kobe:/tmp/hgtest$ hg rename * contrib/foo
  keramida at kobe:/tmp/hgtest$ hg st
  A contrib/foo/README.TXT
  A contrib/foo/version.h
  R README.TXT
  R version.h
  keramida at kobe:/tmp/hgtest$ hg ci -m 'Rename foo version 8.0 (800040) to contrib/foo'

Then, the vendor 'branch' of the history is merged with the latest local
version:

  keramida at kobe:/tmp/hgtest$ hg update --clean 0 && hg merge
  1 files updated, 0 files merged, 2 files removed, 0 files unresolved
  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  keramida at kobe:/tmp/hgtest$ hg ci -m 'Merge import of foo version 8.0 (800040)'
  keramida at kobe:/tmp/hgtest$

Now I have a history with two 'starting' commits, which seems ok:

  keramida at kobe:/tmp/hgtest$ hg glog
  @    changeset:   4:ee857a4a0f3b
  |\   tag:         tip
  | |  parent:      0:a915c132d2f1
  | |  parent:      3:bce5b70cb331
  | |  user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
  | |  date:        Thu Jul 10 15:19:12 2008 +0300
  | |  summary:     Merge import of foo version 8.0 (800040)
  | |
  | o  changeset:   3:bce5b70cb331
  | |  user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
  | |  date:        Thu Jul 10 15:17:39 2008 +0300
  | |  summary:     Rename foo version 8.0 (800040) to contrib/foo
  | |
  | o  changeset:   2:e465e1e83467
  | |  user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
  | |  date:        Thu Jul 10 14:17:43 2008 +0300
  | |  summary:     Import foo version 8.0 (800040)
  | |
  | o  changeset:   1:cb38408752cc
  |    parent:      -1:000000000000
  |    user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
  |    date:        Thu Jul 10 14:16:22 2008 +0300
  |    summary:     Import foo version 8.0 (800001)
  |
  o  changeset:   0:a915c132d2f1
     user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
     date:        Thu Jul 10 14:19:27 2008 +0300
     summary:     Add new script: tslog

  keramida at kobe:/tmp/hgtest$

This is very nice.  It's also a fairly accurate representation of what
is going on with the history of the source code, because there really
_are_ two projects whose histories are consolidated with a merge to form
the full source tree I want to use.

It also seems to work nicely for future imports & merges:

  keramida at kobe:/tmp/hgtest$ hg inc /tmp/vendor/foo
  comparing with /tmp/vendor/foo
  searching for changes
  changeset:   2:4e4d8e109a26
  tag:         tip
  user:        Giorgos Keramidas <keramida at ceid.upatras.gr>
  date:        Thu Jul 10 14:22:30 2008 +0300
  summary:     Import foo version 8.0 (800052)

  keramida at kobe:/tmp/hgtest$ hg heads \
      --template '{rev}:{node|short} | {date|age} | {author|user} | {desc|firstline}\n'
  5:4e4d8e109a26 | 62 minutes | keramida | Import foo version 8.0 (800052)
  4:ee857a4a0f3b | 5 minutes | keramida | Merge import of foo version 8.0 (800040)

Then merging the two heads does the Right Thing(TM) with the locally
renamed files:

  keramida at kobe:/tmp/hgtest$ hg up -C 4
  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
  keramida at kobe:/tmp/hgtest$ hg merge
  merging contrib/foo/README.TXT and README.TXT to contrib/foo/README.TXT
  merging contrib/foo/version.h and version.h to contrib/foo/version.h
  0 files updated, 2 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  keramida at kobe:/tmp/hgtest$ hg st
  M contrib/foo/README.TXT
  M contrib/foo/version.h
  keramida at kobe:/tmp/hgtest$ hg ci -m 'Merge import of foo version 8.0 (800052)'

That's amazing.  It's _exactly_ what I wanted Hg to do: track the
rename of the vendor code only in the 'target' tree, and then DTRT when
merges need to touch/affect files in the target tree.  Fantastic :)


Now what?
=========

Now my main concerns about this are:

  * Does this look like a reasonable way to handle third-party
    code imports?

  * Am I missing something that may potentially mess things up,
    if I keep pulling and merging from a 'clean vendor branch' of
    imports like this?

  * This seems to work nicely with just one third-party component,
    but do you see any potential pitfalls when pulling and
    merging dozens of third-party components?



