Command Server Project Proposal

My first encounter with Mercurial was at my previous workplace. When I started working there, the development teams were using a (godforsaken) source control system called StarTeam. I quickly grew tired of it and started looking for alternatives to take its place. Among the top DVCSs at the time, my absolute favorite was Mercurial, thanks to its user-friendly approach, gentle learning curve, cross-platform support and very open and helpful community.
Since then I've been following Mercurial, looking for opportunities to give back. GSoC looks like a great one.

Most of my programming experience is in C++; I've also done some Java and C# here and there. I've been using Python for a lot of small tasks over the past couple of years, but I've always wanted to see how a real application is written in it, and in my opinion Mercurial is an excellent example of one.

When integrating with Mercurial, the approach recommended by the Mercurial team is to use the command-line interface. Mercurial goes to great lengths to make sure the command-line interface doesn't change very often, so that existing tools which rely on it remain stable across upgrades. The other, unrecommended option (available to Python applications) is to use Mercurial's internal API [1], which yields better performance and more control at the cost of possibly breaking between Mercurial releases (examples of such tools can be seen here [2], [5]).

The command server will aim to offer the best of those two worlds. It will remain stable across Mercurial releases and offer better performance than calling the command-line interface directly.

Existing tools I've looked at (MercurialEclipse, TortoiseHg, VisualHG, MacHG etc.) take the recommended approach and use the command-line interface. This is done by opening a process for every hg command. Tools written in Python usually import Mercurial and call it directly (saving process creation).

The specifics of the requests to the server are yet to be determined, but an initial thought is something of this sort: "<path-to-repository>;<command-line>". The server's answer might look like this: "<exit-status>;<output>". A Mercurial developer has already attempted something that behaves roughly like that (hgrpc, source here [4]). It saves process creation but offers nothing beyond that in terms of performance. It could be improved in that regard by caching the repository object for each path it serves and reusing it in subsequent requests. Something similar might be possible with the ui object.
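As a rough illustration of the request/response idea and of the per-path caching, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names parse_request, format_response, parse_response and CommandServer are hypothetical, and the functions that open a repository and run a command are passed in by the caller rather than taken from Mercurial's internal API.

```python
import shlex


def parse_request(line):
    """Split '<path-to-repository>;<command-line>' into (path, argv).

    Assumes the repository path itself contains no ';'.
    """
    path, sep, cmdline = line.partition(";")
    if not sep or not path:
        raise ValueError("expected '<path-to-repository>;<command-line>'")
    return path, shlex.split(cmdline)


def format_response(exit_status, output):
    """Build the '<exit-status>;<output>' answer."""
    return "%d;%s" % (exit_status, output)


def parse_response(data):
    """Client side: split an answer back into (exit_status, output)."""
    status, sep, output = data.partition(";")
    if not sep:
        raise ValueError("expected '<exit-status>;<output>'")
    return int(status), output


class CommandServer(object):
    """Toy server that keeps one repository object per path."""

    def __init__(self, open_repo, run_command):
        # Hypothetical hooks supplied by the caller:
        #   open_repo(path) -> repository object
        #   run_command(repo, argv) -> (exit_status, output)
        self._open_repo = open_repo
        self._run_command = run_command
        self._repos = {}

    def handle(self, line):
        path, argv = parse_request(line)
        repo = self._repos.get(path)
        if repo is None:
            # First request for this path: open the repository once
            # and reuse the object for all later requests.
            repo = self._repos[path] = self._open_repo(path)
        status, output = self._run_command(repo, argv)
        return format_response(status, output)
```

A real server would also need to decide when a cached repository object has gone stale (for example after another process commits), which this sketch does not attempt.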

There are many hg commands that give the user meaningful output (status, log, diff...) beyond an exit status. The tedious part is parsing that output, and this part is bound to be duplicated among tools. In this regard, Mercurial helps by offering a way to customize its output (as explained here [3]). Tools use this facility to arrange the output of commands such as 'hg log' in a way that suits their needs. We might be able to use this to provide output that can be parsed more easily.
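To make the output-customization point concrete, this is a small Python sketch of how a tool might ask 'hg log' for tab-separated fields and parse them. The choice of fields, the tab separator and the function names hg_log and parse_log are assumptions for the example; the template keywords used ({rev}, {node|short}, {author}, {desc|firstline}) are standard Mercurial ones.

```python
import subprocess

# One changeset per line, fields separated by tabs. Mercurial expands
# the \t and \n escapes in the template itself.
LOG_TEMPLATE = r"{rev}\t{node|short}\t{author}\t{desc|firstline}\n"


def parse_log(text):
    """Turn templated 'hg log' output into a list of dicts."""
    entries = []
    for line in text.split("\n"):
        if not line:
            continue
        # Split at most three times so tabs in the summary survive.
        rev, node, author, summary = line.split("\t", 3)
        entries.append({
            "rev": int(rev),
            "node": node,
            "author": author,
            "summary": summary,
        })
    return entries


def hg_log(repo_path, limit=10):
    """Run 'hg log' in a new process and parse its output."""
    out = subprocess.check_output(
        ["hg", "log", "-R", repo_path, "-l", str(limit),
         "--template", LOG_TEMPLATE])
    return parse_log(out.decode("utf-8", "replace"))
```

Each tool currently has to write something like parse_log for itself; a command server could do this once on behalf of all of them.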

The above request/response 'protocol' is quite simple and can be taken further, perhaps by introducing a small protocol for the output of various hg commands. For instance, a response to an 'hg status' might look like this in JSON (or some other suitable format):
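One possible shape, purely as an illustration: the Python sketch below converts the plain-text output of 'hg status' into a JSON document. The JSON layout (a "command" key and a "files" list with "status" and "path" entries), the long status names and the function name status_to_json are all assumptions, not part of any agreed protocol.

```python
import json

# Status codes printed by 'hg status', mapped to readable names.
STATUS_NAMES = {
    "M": "modified",
    "A": "added",
    "R": "removed",
    "C": "clean",
    "!": "missing",
    "?": "unknown",
    "I": "ignored",
}


def status_to_json(text):
    """Convert 'hg status' output ('<code> <path>' per line) to JSON."""
    files = []
    for line in text.split("\n"):
        # Expect a one-character code, a space, then the path.
        if len(line) < 3 or line[1] != " ":
            continue
        code, path = line[0], line[2:]
        if code == " ":
            # Copy-source lines (shown with 'hg status -C') are skipped
            # in this sketch.
            continue
        files.append({
            "status": STATUS_NAMES.get(code, code),
            "path": path,
        })
    return json.dumps({"command": "status", "files": files},
                      sort_keys=True)
```

With such a response, a client in any language only needs a JSON parser instead of its own 'hg status' parsing code.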



Basically this will remove all the boilerplate code tools need to write to parse output of certain hg commands (the server will have to take care of that) by having it in a nice data structure. But going this route means that the transition from how tools integrate with Mercurial today won't be as seamless.

The command server also opens up the possibility of querying a Mercurial repository without having to clone it locally (or having Mercurial installed, for that matter), similar to what hgweb offers. Some tools only need read access to a repository. They can benefit from talking to a command server running on the central server rather than keeping a local clone that has to be constantly updated. Another idea is for GUI tools to add a 'repository explorer' that lets the user explore the tree, logs and diffs of a remote repository (somewhat similar to SVN's repository explorer).

I see the command server being useful mostly for applications written in languages other than Python. But Python applications that choose not to depend on Mercurial's internal API can also gain some performance improvements.


SummerOfCode/2011/IdanKamara (last edited 2013-08-29 12:34:47 by AugieFackler)