
Request Tracker Specific

Some repositories are extremely difficult to convert. For me, the most difficult one came from Best Practical, the makers of Request Tracker.

Many repositories follow the model suggested and encouraged by the Subversion maintainers: each project gets its own repository or its own directory at the top of a shared repository, and beneath that it gets three directories named trunk, branches, and tags. The Request Tracker repository (as of this writing) has about 19,700 revisions, of which about 9,700 relate to Request Tracker itself; the rest belong to other projects. Early in the history of the project, the trunk directory was abandoned and deleted, and all development was done from branches.

Mercurial strongly encourages a development model close to the Subversion convention of a trunk, branches, and tags. As such, I had to find a way to extract the trunk up to the point where it was deleted, extract the branches, rebuild the trunk, merge the branches, and finally reapply the tags. This was definitely not an easy task.

In the end, I wrote a program to perform the conversion. It is available at http://bitbucket.org/pedersen/request-tracker-converter/

Note that this script is not a general purpose tool. It has been tailored to convert the Request Tracker repository only, so certain pieces specific to Request Tracker are hard-coded. You can update or remove them, but you need to be aware of these limitations. It should not be overly difficult to turn this into a general purpose tool, but I reached my goal and called it done. I'll happily accept patches to do more, but have no intention of doing more myself.

The tool also makes several assumptions: the user is running Linux (required for some of the path names to work), has a directory named src/vcs in their home directory, and has the "dot" tool installed (part of the graphviz package).
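These assumptions are easy to check before starting a run that takes hours. The following is a minimal sketch of such a check, not part of the converter script itself; the function name and its parameters are hypothetical, and checking for "hg" on the PATH is my own addition to the three assumptions listed above.

```python
import shutil
import sys
from pathlib import Path


def missing_prerequisites(home=None, which=shutil.which, platform=sys.platform):
    """Return a list of problems; an empty list means every check passed.

    `which` and `platform` are injectable only so the check can be tested.
    """
    home = Path(home) if home is not None else Path.home()
    problems = []
    if not platform.startswith("linux"):
        problems.append(f"unsupported platform: {platform}")
    work_dir = home / "src" / "vcs"
    if not work_dir.is_dir():
        problems.append(f"missing directory: {work_dir}")
    # "dot" comes from the text above; "hg" is an extra check (assumption).
    for tool in ("hg", "dot"):
        if which(tool) is None:
            problems.append(f"missing executable on PATH: {tool}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Run as a script, it prints one line per unmet assumption and nothing when everything is in place.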

When the tool is run, it reads the log history of the project and turns it into a PNG graph showing the tags and the branching tree. Using this graph, it is possible to reconstruct the path of the actual trunk. From my reading of the graph, the trunk was continuous until work on the 3.2 release began. After that, the trunk seemed to follow the development versions (3.3, 3.5, 3.7), with the release versions being branches of their own.
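To give an idea of what such a graph looks like as input to "dot", here is a small sketch that turns a list of branch copies and tags into DOT text. It is an illustration only and is not taken from the converter script; the function name, the tuple layout, and the sample paths and revision number are all made up.

```python
def copies_to_dot(copies, tags=()):
    """Build DOT source for a branching tree.

    copies: iterable of (source_path, source_rev, dest_path) tuples
    tags:   iterable of (tag_name, tagged_path) tuples
    """
    lines = ["digraph history {"]
    for src, rev, dst in copies:
        # One edge per Subversion copy, labelled with the revision copied from.
        lines.append(f'  "{src}" -> "{dst}" [label="r{rev}"];')
    for tag, path in tags:
        # Tags are drawn as boxes hanging off the path they were taken from.
        lines.append(f'  "{tag}" [shape=box];')
        lines.append(f'  "{path}" -> "{tag}" [style=dashed];')
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, not real Request Tracker history.
    print(copies_to_dot(
        [("/rt/trunk", 10, "/rt/branches/3.0")],
        [("rt-3.0.0", "/rt/branches/3.0")],
    ))
```

Saving the output as history.dot and running "dot -Tpng history.dot -o history.png" renders it as a PNG.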

During my initial reviews, I found a number of smaller issues with the branches and tags caused by the way in which Subversion tracks the source revisions. I had to manually modify the resulting branches and tags dictionaries in order for their information to be correct. Once I had this done, the main work on splitting and recreating could begin.

First, the convert extension is used to create a reduced copy of the Subversion repository, showing only the data in the Request Tracker project. After that, the trunk is split from this repository into its own Mercurial repository. Once that is done, the remainder of the branches are split into their own Mercurial repositories, with their work being on the default/trunk in those repositories.
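The splitting is driven by the filemap option of the convert extension, which accepts "include", "exclude", and "rename" lines. The sketch below shows one way to generate a filemap and the matching command line. It is not the code from the converter script; the function names and the example paths are hypothetical, and paths containing spaces would need quoting that this sketch does not handle.

```python
def build_filemap(includes=(), excludes=(), renames=()):
    """Return filemap text for "hg convert --filemap"."""
    lines = [f"include {path}" for path in includes]
    lines += [f"exclude {path}" for path in excludes]
    lines += [f"rename {old} {new}" for old, new in renames]
    return "\n".join(lines) + "\n"


def convert_command(source, dest, filemap_path):
    """Return the argument list for one filtered conversion.

    The --config switch enables the convert extension for this run only.
    """
    return ["hg", "--config", "extensions.convert=", "convert",
            "--filemap", filemap_path, source, dest]


if __name__ == "__main__":
    # Hypothetical example: keep only rt/trunk and move it to the repository root.
    print(build_filemap(includes=["rt/trunk"], renames=[("rt/trunk", ".")]))
    print(" ".join(convert_command("svn-mirror", "rt-trunk", "trunk.filemap")))
```

The command list can be passed to subprocess.run once the filemap text has been written to the named file.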

I next had to take advantage of something done by Subversion: The initial revision in each of those branch repositories is the result of a copy. As such, I am able to ignore the initial revision in the merge back. But now we come to the most significant issue for the conversion: Merging branch repositories (in which all work is done on default) into named branches in the main repository.

This required the creation of a new feature in the convert extension, the branchmap. It allows the user to specify that branchA in a source repository is to be named branchB in the destination repository. This feature will be included with the 1.3 Mercurial release, and is already in the development version of Mercurial.
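A branchmap file holds one mapping per line: the branch name in the source, a space, and the name it should have in the destination. As a rough sketch, such a file could be generated like this. The function name and the example branch name are hypothetical, and rejecting names that contain whitespace is a conservative assumption of mine rather than a documented rule.

```python
def build_branchmap(mapping):
    """Return branchmap text for "hg convert --branchmap".

    mapping: dict of source branch name -> destination branch name
    """
    lines = []
    for old, new in mapping.items():
        for name in (old, new):
            # Assumption: keep names whitespace-free so each line stays unambiguous.
            if not name or any(ch.isspace() for ch in name):
                raise ValueError(f"unsupported branch name: {name!r}")
        lines.append(f"{old} {new}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical example: work done on "default" in a branch repository
    # lands on a named branch in the destination.
    print(build_branchmap({"default": "3.8-RELEASE"}), end="")
```

The resulting file is passed to the conversion with "hg convert --branchmap FILE SOURCE DEST".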

Along with the merge back, I had to learn how to use the splicemap feature of the convert extension. This sounds more complex than it really is, especially since I was dealing with branching only, which means each spliced revision has a single parent. The splicemap requires the long revision id, which can be found from the command line with "hg id --debug -r <version_number>". I needed the long revision id of both the source revision to use as the starting point *and* the destination revision to use as the new parent for that starting point. With these in hand, I wrote a splicemap file containing two entries separated by a space: the source revision id and the destination revision id. If I were specifying a second parent for the source revision id, I could just add a third entry to the line.
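The single-parent case described above can be sketched in a few lines. This is an illustration under the assumption that both repositories are Mercurial repositories, so both ids are 40-character hexadecimal strings. The function names are hypothetical, and the two-parent case is deliberately left out; check "hg help convert" for the exact separator your version expects there.

```python
import re
import subprocess

FULL_ID = re.compile(r"[0-9a-f]{40}")


def full_id(repo, rev):
    """Look up the long revision id with "hg id --debug -r REV"."""
    out = subprocess.run(
        ["hg", "-R", repo, "id", "--debug", "-r", str(rev)],
        check=True, capture_output=True, text=True,
    ).stdout
    match = FULL_ID.search(out)
    if match is None:
        raise ValueError(f"no long revision id in output: {out!r}")
    return match.group(0)


def splicemap_line(child, parent):
    """Return one splicemap line attaching `child` to the new `parent`."""
    for node in (child, parent):
        if FULL_ID.fullmatch(node) is None:
            raise ValueError(f"not a long revision id: {node!r}")
    return f"{child} {parent}\n"
```

A typical use would be to call full_id once in the branch repository and once in the trunk repository, write the result of splicemap_line to a file, and pass that file to "hg convert --splicemap FILE SOURCE DEST".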

With these two features in hand, I can write a simple merge procedure: It merges all changes from a branch repository into the trunk repository. Optionally, it can ignore the branchmap and attach the branch to the trunk. This is how I managed to rebuild the whole trunk. Finally, it deletes the branch repository from disk.

With that completed, I wrote a simple routine that merges all of the branches into the trunk repository correctly. Unfortunately, I could not find an easy way to automate this, so each branch has its own line in that routine.

For a final cleanup, I then rebuilt the tags, attaching them to the correct location in the final repository using the Subversion revision that the convert extension stores in the resulting repository. After that, I deleted an extra branch that was created but had no commits, and was only there due to a typo. This was followed by one final conversion to allow Mercurial to try to reorganize the repository to save disk space.
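As far as I can tell, the convert extension records the source revision in a changeset extra field named convert_revision, and for Subversion sources the value ends in "@" followed by the revision number. Under that assumption, mapping a Subversion revision back to a Mercurial changeset could look like the sketch below. The function names and the input format are hypothetical, and the exact layout of the stored value should be verified against your own repository (it appears in the output of "hg log --debug").

```python
def svn_revision(convert_revision):
    """Return the Subversion revision number from a convert_revision value.

    Assumes values shaped like "svn:<uuid>/<path>@<rev>"; returns None otherwise.
    """
    prefix, sep, rev = convert_revision.rpartition("@")
    if not sep or not prefix.startswith("svn:") or not rev.isdigit():
        return None
    return int(rev)


def index_by_svn_revision(changesets):
    """Map Subversion revision -> Mercurial node.

    changesets: iterable of (node, convert_revision) pairs
    """
    index = {}
    for node, convert_revision in changesets:
        rev = svn_revision(convert_revision)
        if rev is not None:
            index[rev] = node
    return index
```

With such an index, a tag known to point at a given Subversion revision can be reapplied with "hg tag -r NODE TAGNAME".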

Generalized Lessons

It is entirely possible to use this script as a baseline for converting other difficult repositories. The primary drawback is the time required. For Request Tracker, the total time is about 13 hours from start to finish. This could be greatly reduced by running multiple branch conversions in parallel, but I did not invest the time to make it do so. As it stands, it runs in single-threaded mode only.
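The script does not do this, but a parallel version of the branch splitting step might be sketched as follows. The function names are hypothetical, and the sketch assumes that each branch is converted into its own destination repository, so the conversions do not interfere with each other. Threads are sufficient here because the real work would happen in separate "hg convert" processes.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_all(branches, convert_one, max_workers=4):
    """Run convert_one(branch) for every branch, several at a time.

    convert_one is supplied by the caller; in practice it would launch one
    "hg convert" process and return once that process has finished.
    Returns a dict of branch name -> result of convert_one.
    """
    branches = list(branches)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps results in input order and re-raises worker exceptions.
        results = list(pool.map(convert_one, branches))
    return dict(zip(branches, results))
```

The merge back into the trunk repository would still have to run one branch at a time, since every merge writes to the same destination.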

The primary points to change will be in the walkLog, splitBranch, and mergeBranchDirs methods.

walkLog :: Update the branch detection and tag detection. Also make sure to replace the manual corrections with the ones your own repository needs. Finally, remove references to /rt/ and replace them as appropriate for your conversion.

splitBranch :: Update the filemap that gets written. You will need to make sure that files are renamed and/or excluded as appropriate for your conversion.

mergeBranchDirs :: Rebuild the trunk and branches in the right order for your desired tree.

It's also worth checking out finalCleanup to make sure it will do what you wish it to do.


CategoryConversion

ProblematicConversions (last edited 2013-08-27 07:55:32 by mw)