[GSoC] My application (draft)

altanis at ceid.upatras.gr
Sat Mar 29 19:53:11 CDT 2008

This is my application for GSoC. I explained the core concepts of the
project as though this would be read by someone who doesn't know anything
about it, although I am not sure I should have. Any comments and
corrections are welcome.



Mercurial is a Distributed Version Control System. Central to distributed
version control is the ability to copy a project's code repository, or
'clone' it, in Mercurial terminology.

Sometimes it is useful to obtain only part of a repository: the history
going back to a certain point, and nothing older. That is called shallow
partial cloning (or History Trimming). Examples include:
-a user wants to test the latest version of the program, not take part in
development: they only need the latest snapshot, to compile and use.
-a user who has limited bandwidth (e.g. is on dialup) or space (e.g. is on
a company or school network, with accounts with limited quotas) would
prefer to download only the latest snapshots of a repository and work on
those.
-a version-controlled project has a very long and extensive history, and
the changes older than a certain point are useless for routine development.

This is in no way an exhaustive list. In general, it is useful for a VCS
to offer the flexibility of partial history cloning.

However, Mercurial can only clone whole, intact repositories. For Google's
Summer of Code 2008, I will implement shallow partial cloning.

Some work has already been done in Mercurial that can provide a basis for
this project. This consists of Chris Mason's History Punching:


and Brendan Cully's Overlay Repositories:


Both are old (17 and 11 months respectively), unmaintained, and do not work
with the current version of Mercurial. However, their designs can be
re-implemented on the current Mercurial codebase.

The two approaches are quite different. History Punching tries to solve
the History Trimming problem directly, providing a switch for clone that
removes certain patches from history when cloning. That calls for some
care, because Mercurial's revision logs do not store every revision in
full: they store some revisions whole and, for all the rest, only the
information needed to reconstruct them from an earlier revision.
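
To illustrate why trimming needs this care, here is a minimal Python
sketch of a toy revision log. It is not Mercurial's revlog format or API:
ToyLog, make_delta and apply_delta are hypothetical names, and the
prefix-based delta is only a stand-in for real deltas. The point it shows
is that the oldest kept revision may exist only as a delta against a
removed one, so it has to be rebuilt as a full snapshot.

```python
# Toy model only: NOT Mercurial's revlog format or API.

def make_delta(old, new):
    """Describe `new` relative to `old` as (shared prefix length, new tail)."""
    prefix = 0
    limit = min(len(old), len(new))
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    return (prefix, new[prefix:])


def apply_delta(old, delta):
    """Rebuild the newer text from the older text and a delta."""
    prefix, tail = delta
    return old[:prefix] + tail


class ToyLog:
    """Stores the first revision in full and every later one as a delta."""

    def __init__(self):
        self.entries = []  # ("full", text) or ("delta", (prefix, tail))

    def add(self, text):
        if not self.entries:
            self.entries.append(("full", text))
        else:
            previous = self.read(len(self.entries) - 1)
            self.entries.append(("delta", make_delta(previous, text)))

    def read(self, rev):
        # Walk back to the nearest full snapshot, then replay the deltas.
        base = rev
        while self.entries[base][0] != "full":
            base -= 1
        text = self.entries[base][1]
        for i in range(base + 1, rev + 1):
            text = apply_delta(text, self.entries[i][1])
        return text

    def trimmed(self, first_kept):
        """Return a new log holding only revisions >= first_kept.

        Each kept revision is reconstructed first, so the oldest kept
        one is stored in full and no longer depends on dropped history.
        """
        new_log = ToyLog()
        for rev in range(first_kept, len(self.entries)):
            new_log.add(self.read(rev))
        return new_log


log = ToyLog()
for text in ["hello", "hello world", "hello world!"]:
    log.add(text)
shallow = log.trimmed(1)  # drop revision 0, keep revisions 1 and 2
```

Simply deleting the first entry of log.entries would leave revision 1
unreadable, which is why trimmed() reconstructs each kept revision
before storing it again.
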
Overlay Repositories are a broader concept that provides more varied
functionality, but can also serve as the basis for shallow partial
cloning. Using this concept, hg can create an Overlay Repository: an
abstract repository that does not store all the required information
itself, but acts as a full repository by relaying requests for
information it doesn't hold to some other full repository (in theory,
possibly more than one). Partial cloning emerges from the choice of how
much data the overlay repository holds: it can begin as a partial clone
of another repository (the parent), and when the user requests an older
piece of information, it can relay the request to the parent and respond
with the information the parent returns.
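
The relaying idea can be sketched in a few lines of Python. This is only
an illustration under my own assumptions, not Brendan Cully's design or
any Mercurial API: ToyRepo, ToyOverlayRepo, lookup and first_local_rev
are hypothetical names, and revisions are reduced to plain strings keyed
by number.

```python
# Toy model only: NOT Mercurial's repository classes or wire protocol.

class ToyRepo:
    """A 'full' repository: holds every revision itself."""

    def __init__(self, revisions):
        self.revisions = dict(revisions)  # revision number -> content

    def lookup(self, rev):
        return self.revisions[rev]


class ToyOverlayRepo:
    """Holds only recent revisions; relays older requests to a parent."""

    def __init__(self, parent, first_local_rev):
        self.parent = parent
        # Start out as a partial clone: copy only the recent revisions.
        self.local = {
            rev: text
            for rev, text in parent.revisions.items()
            if rev >= first_local_rev
        }
        self.relayed = []  # record of requests passed to the parent

    def lookup(self, rev):
        if rev in self.local:
            return self.local[rev]
        # Not held locally: ask the parent and hand back its answer.
        self.relayed.append(rev)
        return self.parent.lookup(rev)


parent = ToyRepo({0: "initial import", 1: "add parser", 2: "fix bug"})
overlay = ToyOverlayRepo(parent, first_local_rev=2)

recent = overlay.lookup(2)  # answered from the overlay's own data
old = overlay.lookup(0)     # relayed to the parent
```

To the caller both lookups look the same, which is what lets the overlay
act as a full repository while storing only part of the history.
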

Which method will be used will be decided during the Community Bonding
Period of the GSoC timeline. History Punching appears to be more in line
with the principles of fast, focused programming (KISS, YAGNI): it is more
straightforward and focused on the resulting feature. However, the Overlay
Repository is perhaps a more useful feature for the project in general,
albeit more daunting for a student to implement, so I would mainly
implement the parts of it that are integral to History Trimming (although,
if time permits, I will try to complete as much of the Overlay Repository
concept as I can).

About me:
I am a university student, in my fifth year at the Computer Science and
Informatics Department of the University of Patras. I have completed
several coding projects for my school, in C, C++, Java, Prolog and Python.
I recently discovered Python and have found it very elegant and powerful.
I have never used a VCS, but for SoC I studied VCS concepts and I believe
I now understand them sufficiently.

My only commitment is my diploma thesis, which I am working on now and
will probably still be working on in the summer. I might also need a
couple of lighter weeks during my school's exam period (in June), to study
for three examinations (that isn't to say that I won't work during that
period, only a little less).

