|Line 8:||Line 8:|
|IRC : vsh||Irc : vsh|
|Line 10:||Line 10:|
The updated version is available at http://bitbucket.org/vsh/hg-shallow/src/
The updated version is available at http://bitbucket.org/vsh/shallow-proposal/src
|Line 13:||Line 13:|
|== Shallow Cloning in Mercurial [GSoC proposal] ==
:Author: Vishakh Harikumar <firstname.lastname@example.org>
:Description: Google Summer of Code proposal to work on Mercurial Shallow Clone feature
|Line 17:||Line 15:|
|=== Abstract ===
This proposal covers adding support for shallow cloning to
Mercurial. The feature will allow cloning only the most recent parts of [large] repositories
when limits on storage space, network bandwidth, or network reliability
prevent the creation of a full clone.
=== Introduction ===
Mercurial is widely used by people and organizations as their tool for version
control. Many large repositories are managed by it. The drawback is that
anybody who wants to work with such a repository has to clone it in its
entirety. The use cases in these situations boil down to cloning a limited
subset of the complete repository, starting from a particular revision: a shallow clone.
The shallow clone should work seamlessly with other clones, which may be
full or shallow, when performing push or pull operations. When earlier history is
required, it should be possible to deepen the clone by retrieving earlier revisions.
The implementation will follow the guidelines in the Shallow Clone Plan, and
details will be fleshed out in discussion with the rest of the community.
=== Goals ===
The goals I see for the project are:
* Implementing Trimming History
* Creation of local Shallow Clones
* Push, Pull for local Shallow Clones
* Tests to define Shallow Clones
* Support deepening of Shallow Clones
* Update bundle format and wire protocol
* Additional tests for network clones
===== Trimming History =====
Trimming will allow unwanted history to be removed from the repository, from
individual revisions and ranges up to entire branches. I plan to implement this using
the punch approach described in the wiki. This involves removing deltas from the
datafile and setting their length in the indexfile to -1. The problems to solve with this
approach are cases where deltas might not patch correctly, and making sure hg
itself is aware of the trimmed history. Trimming reduces the size of the
repository by keeping only the parts of the history that are needed.
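
The punch idea can be illustrated with a toy model. The sketch below is not Mercurial's revlog format or API: `ToyRevlog`, `IndexEntry` and `PUNCHED` are hypothetical names, the "datafile" is an in-memory byte buffer, and delta chains are ignored, so it does not address deltas that no longer patch correctly once their base is gone. It only shows the bookkeeping: drop the delta bytes, mark the index entry with length -1, and shift the offsets of later entries.

```python
from dataclasses import dataclass

PUNCHED = -1  # sentinel length for a trimmed revision, as in the proposal


@dataclass
class IndexEntry:
    offset: int  # where this revision's delta starts in the data buffer
    length: int  # delta length in bytes, or PUNCHED once trimmed


class ToyRevlog:
    """Hypothetical in-memory stand-in for an indexfile plus a datafile."""

    def __init__(self):
        self.index = []
        self.data = bytearray()

    def add(self, delta: bytes) -> int:
        """Append a delta and return its revision number."""
        self.index.append(IndexEntry(len(self.data), len(delta)))
        self.data += delta
        return len(self.index) - 1

    def is_punched(self, rev: int) -> bool:
        return self.index[rev].length == PUNCHED

    def read(self, rev: int) -> bytes:
        entry = self.index[rev]
        if entry.length == PUNCHED:
            raise LookupError(f"revision {rev} has been trimmed")
        return bytes(self.data[entry.offset:entry.offset + entry.length])

    def punch(self, rev: int) -> None:
        """Remove the delta for rev and mark its index entry as trimmed."""
        entry = self.index[rev]
        if entry.length == PUNCHED:
            return  # already trimmed, nothing to do
        removed = entry.length
        del self.data[entry.offset:entry.offset + removed]
        entry.length = PUNCHED
        # Every later delta now starts `removed` bytes earlier.
        for later in self.index[rev + 1:]:
            later.offset -= removed


log = ToyRevlog()
for delta in (b"aaaa", b"bb", b"cccccc"):
    log.add(delta)
log.punch(0)
```

After `log.punch(0)`, revisions 1 and 2 can still be read, while `log.read(0)` raises `LookupError`. A real implementation would also have to decide what happens to revisions whose deltas were computed against a punched one.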
===== Creation of local Shallow Clones =====
Local shallow cloning will work by keeping the complete changelog while truncating
the manifests and file logs, using the trimming command to remove all history
before that of the shallow root. This phase will also involve making decisions
about Mercurial's view of shallow clones, such as the storage of the full version
and the deltas of the text, and modifications to the revlog and bundle format to support
shallow clones. Tests at this stage will define the structure of the clone
and will be used for regression testing as more goals are added.
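
As a rough illustration of choosing what to trim, the sketch below walks a history graph and collects everything "before" a shallow root. It assumes that "before" means the strict ancestors of the root, which the proposal does not spell out; `revisions_to_trim` and the `history` mapping are hypothetical names and do not correspond to anything in Mercurial.

```python
def revisions_to_trim(parents, shallow_root):
    """Return the strict ancestors of shallow_root.

    `parents` maps each revision number to a tuple of its parent
    revisions (empty for a root revision).
    """
    to_trim = set()
    stack = list(parents[shallow_root])
    while stack:
        rev = stack.pop()
        if rev in to_trim:
            continue  # already visited through another path
        to_trim.add(rev)
        stack.extend(parents[rev])
    return to_trim


# Hypothetical history: 0 <- 1, then 1 branches into 2 and 3, merged in 4.
history = {0: (), 1: (0,), 2: (1,), 3: (1,), 4: (2, 3)}
trimmed = revisions_to_trim(history, 2)
```

With revision 2 as the shallow root, revisions 0 and 1 are selected for trimming. Revision 3 is not an ancestor of 2 and is kept even though its parent is trimmed, which is one of the cases the tests for shallow clones would need to pin down.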
===== Push, Pull and Bundle local Repos =====
===== Tests to define Shallow Clones =====
At this point shallow cloning of local repositories will be complete. I will write
additional tests to exercise all possible cases. A comprehensive test suite will
define all the functions of shallow clones and can later be used to test shallow
clones that have been created over the network.
===== Support Shallow Cloning over network =====
Cloning over networks is done with the wire protocol. It does not currently support
shallow cloning, since it cannot work with individual changesets, only a stream of
changegroups encoded in the bundle format. First I will update the bundle format
to include enough information to create a shallow clone at a given revision. This will
be useful in the wire protocol. There is already a plan for updating the wire
protocol. I will coordinate with others working on it and add support
for shallow clones. This will enable shallow cloning over networks as well.
===== Additional tests for Network Shallow Clones =====
I will write tests for the wire protocol, the bundle format and network clones. This should
complete the test suite for shallow clones. I will also update the wiki
and the help to cover all aspects of shallow clones.
=== Timeline ===
I am working through the details of shallow clones and will probably start
coding before the official start date of the program. I have my final exams
in the first 2 weeks of May. The rest of the time I should be able to concentrate
on shallow cloning.
* Implementing Trimming History [ 2 weeks ]
* Creation of local Shallow Clones [ 1 week ]
* Push, Pull for local Shallow Clones [ 1.5 weeks ]
* Tests for local Shallow Clones [ 0.5 weeks ]
* Support deepening of Shallow Clones [ 1.5 weeks ]
* Update bundle format and wire protocol [ 1 week ]
* Shallow cloning over network [ 2 weeks ]
* Additional tests/testing for network clones [ 1 week ]
=== About ===
I am a final-year BTech student at MPSTME, India. I have written programs in C
and Basic, with short stints in Java and Visual Basic (they made me do it :). Currently
most of my programming is in Python. I discovered Mercurial over a year ago and
have been using it for all my projects since. When I found Mercurial I read through the
earliest commits in its repo and in the process gained a better understanding
of its internals. I have since read through many modules in tip for a better
understanding of shallow cloning as well. I intend to make contributions to Mercurial
in the future, via GSoC or otherwise.
This document and all related work are available at http://bitbucket.org/vsh/hg-shallow/
* Email: email@example.com
* IRC: vsh
=== References ===
|= Journal =
100616 write script to get size stats of revlog in a repo, look into discovery.py
100617 investigate consequences of pruning revisions.
100618 look into performance issue.
100619-100622 fix trimming in pull and changegroupsubset
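
A size-stats script of the kind mentioned in the first journal entry could look roughly like the sketch below. It is not the script that was actually written: `revlog_size_stats` is a hypothetical name, and the sketch assumes the repository keeps its revlogs under `.hg/store` as `.i` (index) and `.d` (data) files, counting nothing else.

```python
import os


def revlog_size_stats(repo_root):
    """Count revlog files under .hg/store and sum their sizes.

    Returns a dict mapping suffix (".i" or ".d") to [file count, total bytes].
    """
    store = os.path.join(repo_root, ".hg", "store")
    stats = {".i": [0, 0], ".d": [0, 0]}
    for dirpath, _dirnames, filenames in os.walk(store):
        for name in filenames:
            suffix = os.path.splitext(name)[1]
            if suffix in stats:
                path = os.path.join(dirpath, name)
                stats[suffix][0] += 1
                stats[suffix][1] += os.path.getsize(path)
    return stats
```

Calling `revlog_size_stats(".")` from the root of a working copy gives a before/after measure for checking how much space trimming saves.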
Email: <vsh426 AT SPAMFREE gmail DOT com>
Irc : vsh
GSoC proposal: The updated version is available at http://bitbucket.org/vsh/shallow-proposal/src