Thoughts and suggestions around working with shared libraries
Fred Sundvik
fsundvik at nic.fi
Sat Jan 15 13:43:40 CST 2011
Hi,
I have been thinking of moving from Perforce to Mercurial for a while
now. What has mostly been stopping me is the support for shared and
external libraries. Now that subrepository support finally seems to be
stable and good enough, things are changing.
But I still have a few issues. First let me explain my proposed project
setup. I have two types of libraries. The first type is shared libraries
that are just my own code. They live in their own repositories and are
structured in the traditional way, with a main branch and possibly
feature branches.
* version1
|
* version2
| \
| * feature1
| |
| * feature1 v2
| /
* merged with feature 1
Then we have external libraries that are mainly maintained by someone
else and could be controlled by any version control system. However,
these quite often need local patches that should stay local and can
be merged when the official version changes. Occasionally we need to
make official patches that are pushed to the official server. These
libraries also get their own repositories.
* initial version
| \
| * official version 1
| / |
* | merged with official version
| |
* | local patch
| |
| * official version2
| / | \
* | | merged with official version 2
| | |
| | * official patch by us
| | /
| * official patch merged and updated to the official version
| /
* merged with the patched version
And finally we have different projects that consist of subrepositories,
so the structure is something like the one below, where each library is
a subrepository referenced by a relative path. The projects can then of
course make their own local patches to the libraries, and different
projects can use different versions of the libraries.
libs/local_library_1
libs/local_library_2
external_libs/external_library
src/
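As a rough illustration of the layout above, Mercurial declares
subrepositories in a .hgsub file with one "path = source" line each. The
sketch below only renders such a file; the relative source locations
(../local_library_1 and so on) and the helper name render_hgsub are
assumptions made up for this example, not something from my actual setup.

```python
def render_hgsub(subrepos):
    """Render .hgsub text: one 'path = source' line per subrepository."""
    lines = [f"{path} = {source}" for path, source in sorted(subrepos.items())]
    return "\n".join(lines) + "\n"


# Hypothetical relative sources; adjust to wherever the libraries live.
LAYOUT = {
    "libs/local_library_1": "../local_library_1",
    "libs/local_library_2": "../local_library_2",
    "external_libs/external_library": "../external_library",
}

if __name__ == "__main__":
    print(render_hgsub(LAYOUT), end="")
```

The src/ directory is ordinary project content, so it does not get an
entry.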
Everything might seem to be OK on the surface, but there are a few
problems so far. Let's start with the external library setup. I would
like to have the official branch point to the external repository, and
the default branch just behave as a normal Mercurial repository. This is
not something that Perforce can do either, but it's something that I
really would want to see support for.
The workaround is of course to use the native version control system,
grab the required version, and then just copy the files manually to the
official branch. The subrepository extension supports external version
control systems, but this doesn't help here, since I want the
subrepository to be at the root. It also doesn't support branching like
I want.
Instead my proposal is to be able to create a branch as a remote
repository of any type, mercurial, git, svn and so on. This branch will
always have the initial repository as the base. Inside this branch, you
would be using the native commands of that version control system.
You can commit a snapshot at any time when you don't have any locally
changed files, with an additional parameter to hg commit so as not to
confuse the system if the external branch is a Mercurial one. Snapshots
are saved normally, but they only contain the history of the snapshots,
not the full external history. This is clearly the simplest way to do
it, and it also saves a lot of space. You can always switch to this
branch and use the native version control system to see the full
history.
Along with the snapshots, a special file is included, with the path to
the external repository and its version, just like the subrepositories
extension currently does. Internal version control files and
directories (like .svn) are not included in the snapshot.
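A minimal sketch of the two pieces just described, under stated
assumptions: the set of metadata directory names, the function names,
and the one-line "source revision" record format are all hypothetical
and chosen only for illustration (this is not the .hgsubstate format).

```python
from pathlib import PurePosixPath

# Assumed names of native version control metadata directories.
VCS_DIRS = {".svn", ".git", ".hg", "CVS"}


def snapshot_paths(paths):
    """Keep only paths that have no VCS metadata directory as a component."""
    return [p for p in paths
            if not VCS_DIRS.intersection(PurePosixPath(p).parts)]


def external_state_line(source, revision):
    """Hypothetical record of where the snapshot came from."""
    return f"{source} {revision}\n"
```

For example, given the paths src/a.c and src/.svn/entries, only src/a.c
would go into the snapshot.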
You can use hg update to switch to any snapshot, and in that case it
always calls the native version control system to switch to that
version. When you switch to another branch, which again shouldn't be
allowed with local changes, the internal version control files are
copied to a hidden place, for example inside the .hg directory. Along
with them is stored the changeset of the active snapshot. When you
switch back, the first thing it does is update to the last active
snapshot, and then it copies back the internal version control files.
Finally it does a native version control update to the correct version,
followed by an hg update to that version.
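The stash-and-restore part of this could look roughly like the sketch
below. It covers only the file moving and the recorded changeset, not
the hg or native updates around it. The location .hg/external-state, the
file name active-snapshot, and both function names are assumptions for
illustration; nothing like this exists in Mercurial today.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical hiding place inside the main .hg directory.
STASH_DIR = Path(".hg") / "external-state"


def stash_metadata(workdir, meta_name, changeset):
    """Move the native metadata directory away and record the active snapshot."""
    workdir = Path(workdir)
    stash = workdir / STASH_DIR
    stash.mkdir(parents=True, exist_ok=True)
    shutil.move(str(workdir / meta_name), str(stash / meta_name))
    (stash / "active-snapshot").write_text(changeset + "\n")


def restore_metadata(workdir, meta_name):
    """Move the metadata directory back and return the recorded changeset."""
    workdir = Path(workdir)
    stash = workdir / STASH_DIR
    changeset = (stash / "active-snapshot").read_text().strip()
    shutil.move(str(stash / meta_name), str(workdir / meta_name))
    return changeset


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / ".svn").mkdir()
        stash_metadata(tmp, ".svn", "abc123")
        print(restore_metadata(tmp, ".svn"))
```

The returned changeset is what the switch-back step would update to
before handing control to the native version control system.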
The special case where the local state doesn't exist (remember, the
special version control files are stored only locally) is also easy to
support: just start with an empty workspace and call the native version
control system to update.
Note that this system now allows us to do Mercurial branching for those
snapshots, as long as we only store one official last external state,
but these branches are also always in sync with a changeset of the
external version control system.
But remember, we wanted to do local changes, that are not in sync with
the external version control system, like the default branch in the
example above. This can now be done using normal mercurial merge and
branching support.
The above might seem confusing, partially because my native language is
not English, but mostly because I skipped a lot of details, but I have a
very clear idea, of how almost every special case could work, so just
ask me if you don't understand.
We now support the external library setup above. Now that we have
this support, the subrepositories should be simplified to support only
relative paths and nothing else. I'm always in favor of simplicity, and
two systems that can be used for the same thing are never good, both
codewise and userwise.
Continuing in terms of simplicity, I also propose that the .hg
directories of subrepositories should be stored inside the main .hg
one. This would get rid of the pull/update problem: pull would always
pull all subrepositories, making offline work possible. I don't know the
internal structure of Mercurial well enough to tell exactly how they
should be stored, but I'm sure you could come up with a way.
But there's one huge problem left with the setup above: when you have
made project-specific changes to a subrepository and want to push only
some of those changes back to the main repository. In this case Perforce
is really superior, allowing you to integrate single files or
directories, or even take parts of a file (although in that case you
need to force another integrate if you need further changes later). This
problem is not related to only this use case; it's a common problem when
merging branches. For example, you have a release branch, but fixed a
bug in the default branch, and now you want only this single bugfix in
the release branch.
I have seen suggestions about cherry picking and hg transplant,
export/import, or to use the mq extension, but to be honest they are a
mess, there should really be an easier way for such a common problem.
One option would be to have some kind of forced merge: the two branches
are merged, but you as a user are given a chance to select exactly which
parts to merge. These merges are not part of further normal merges, but
they still store extra meta information, so that you can see where the
merge came from. To filter the merge before you make the choices, you
could also specify file wildcard patterns.
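The wildcard pre-filter could be sketched as follows. This is only the
filtering step, not a merge; the function name select_for_merge is
hypothetical, and the shell-style matching from Python's fnmatch module
is just one possible choice of pattern syntax.

```python
from fnmatch import fnmatchcase


def select_for_merge(changed_files, patterns):
    """Return the changed files that match at least one wildcard pattern."""
    return [f for f in changed_files
            if any(fnmatchcase(f, p) for p in patterns)]
```

Note that with this matcher "*" also matches "/", so a pattern such as
src/*.c would select files in subdirectories of src as well. The user
would then pick the parts to merge from the filtered list.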
Transplant already almost does this, but for almost all use cases it's
way too complicated; remember simplicity. It's also not documented very
well, and I have a hard time figuring out exactly what it does. My
suggestion might also not be the best, so I'm open to other, better
suggestions or corrections, for example if transplant is perfect for
this case.
This message is getting way too long, so I think I will stop here for
now. OK, just some general comments about Mercurial.
Overall Mercurial is quite good, but it definitely suffers from not
being an integrated package, with all the different extensions, some
of which do almost the same thing. It also suffers from trying to
support every possible workflow that you can imagine.
What you should do instead is figure out a minimal set of workflows
that should suit all projects. Document those workflows very well, and
make sure that they can be handled by Mercurial natively, without any
extensions. Make the most common things that users do as easy as you
can. For example, pulling should automatically update unless you ask
otherwise. This could need a new set of commands and different
parameters, but that would be well worth it IMO.