Interested in working on Partial Cloning

Peter Arrenbrecht peter.arrenbrecht at gmail.com
Wed Mar 18 12:46:43 CDT 2009


On Wed, Mar 18, 2009 at 7:27 AM, Madhusudan C.S <madhusudancs at gmail.com> wrote:
> Hi Peter and all,
>       I am a prospective GSoC 2009 student. I am willing to
> work on Partial Cloning. I read the description on the ideas page.
> Seems interesting to me. The ideas page says it is a work in
> progress and Peter is working on it. I also read the comments
> on the idea. A link to issue 105 is given there. I am going through
> the comments on the issue tracker. The number of comments there
> is pretty large, so I am going through them carefully one after the
> other. Other than that, I have cloned the mercurial repository and am
> going through the mercurial source code. Can any of you please link me
> to the work that is in progress now, if there is anything other than
> the issue 105. Can any of you kindly give links and pointers to other
> materials and documents I may have to go through to get a fair
> idea about the internal concepts and architecture of Mercurial for
> me to better understand what needs to be done and what not?

Hi Madhusudan

Great to hear you are interested in this area. Beware, though, that it
is a fairly complex project, especially for a newcomer to Mercurial.
Nevertheless, here are a few comments.

We have to differentiate two separate but related projects in this area:

 * "shallow cloning", which clones only more recent history, and
 * "narrow cloning", which clones only a subset of the files and
   folders (issue105).
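
To make the distinction concrete, here is a rough sketch in Python. It
is a toy model only: the Changeset type, the sample history and both
function names are made up for illustration and are not taken from
Mercurial or from hg-shallow.

```python
# Toy model only: a history is a list of changesets, each touching some files.
# None of these names come from Mercurial itself.
from collections import namedtuple

Changeset = namedtuple("Changeset", "rev files")

HISTORY = [
    Changeset(0, {"README", "src/a.py"}),
    Changeset(1, {"src/a.py"}),
    Changeset(2, {"docs/intro.txt"}),
    Changeset(3, {"src/b.py", "docs/intro.txt"}),
]


def shallow_clone(history, first_rev):
    """Keep every file, but only changesets from first_rev onwards."""
    return [cs for cs in history if cs.rev >= first_rev]


def narrow_clone(history, prefix):
    """Keep every changeset, but only the files under the given prefix."""
    return [Changeset(cs.rev, {f for f in cs.files if f.startswith(prefix)})
            for cs in history]
```

In this model a shallow clone cuts the history by revision, while a
narrow clone keeps the full history and cuts each changeset by path
(so some changesets may end up touching no files at all).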

My current work is focused on the former. You can find it at:

  http://bitbucket.org/parren/hg-shallow/
  http://www.selenic.com/mercurial/wiki/index.cgi/ShallowClone

Last year's SoC had a project addressing the latter, though sadly it
did not succeed:

  http://bitbucket.org/frechtenstein/partial-soc/  (a queue repository)

Background on Mercurial's internals can be found on the wiki:

  http://www.selenic.com/mercurial/wiki/index.cgi/DeveloperInfo
  (see "Mercurial Internals")

I plan on pushing shallow clones forward in the near future (needs yet
another rewrite). This will involve extending the revlog format with
more general support for locally modified fields (like nulled
parentids). It will also require extending the wire protocol
(selectable diff parent, announce root node, etc.). And then a lot of
other work. The extended revlog format might come in handy for narrow
clones as well (we'll likely be faking manifests instead of
parentids).
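
The idea of nulled parentids can be sketched roughly as follows. This
is a toy model under my own assumptions, not the real revlog format:
the dictionary layout, the NULL marker and the shallow_parents helper
are all hypothetical.

```python
# Toy model only, not the real revlog format.
# NULL stands in for a null parent id.
NULL = None

# node -> (parent1, parent2), as the full repository knows them
FULL_PARENTS = {
    "a": (NULL, NULL),
    "b": ("a", NULL),
    "c": ("b", NULL),
    "d": ("b", NULL),
    "e": ("c", "d"),
}


def shallow_parents(full_parents, kept):
    """Return (local, original) parent tables for the kept nodes.

    Parents that fall outside the kept set are replaced by NULL, so the
    local graph is self-contained. The true parents of every modified
    entry are recorded separately, so a locally nulled parent can be
    told apart from a genuine root.
    """
    local, original = {}, {}
    for node in kept:
        parents = full_parents[node]
        nulled = tuple(p if p in kept else NULL for p in parents)
        local[node] = nulled
        if nulled != parents:
            original[node] = parents
    return local, original
```

Keeping only "c", "d" and "e" here turns "c" and "d" into local roots,
while the side table still remembers that their real parent is "b".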

An essential part of both projects is getting a good understanding of
the different scenarios one might encounter and the corner cases they
provoke, and writing tests with good coverage. Writing the tests is
somewhat tricky, as we will likely have to test the same basic scenario
under varying modes of interaction (local clone / remote clone via http /
via ssh / via bundle / from/to other partial repos with more/less
data, etc.).
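
One way to picture the resulting test matrix is the small Python sketch
below. The scenario, transport and peer labels are illustrative
placeholders, not real test names or hg options, and test_matrix is a
hypothetical helper.

```python
import itertools

# Illustrative labels only; they are not real test names or hg options.
SCENARIOS = ["pull into shallow clone", "push from shallow clone"]
TRANSPORTS = ["local", "http", "ssh", "bundle"]
PEERS = ["full repo",
         "partial repo with more data",
         "partial repo with less data"]


def test_matrix():
    """Return every (scenario, transport, peer) combination to cover."""
    return list(itertools.product(SCENARIOS, TRANSPORTS, PEERS))


for scenario, transport, peer in test_matrix():
    print("%s / %s / %s" % (scenario, transport, peer))
```

Even this small example already yields 2 x 4 x 3 = 24 combinations,
which is why generating the variations from one basic scenario seems
preferable to writing each test by hand.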

For narrow clones, this is still fairly unexplored territory, I'm
afraid. Last year's student failed to make progress here. So you
should be prepared to spend a considerable amount of time on just
thinking about scenarios and their implications, and *communicating
and discussing* them, not actual coding. This is an area where I think
someone already acquainted with Mercurial as a user would have an
easier time.

Forgive me if I sound a little discouraging, but last year's
experience left me wary. So I'd rather not pick this up again unless I
get a strong feeling the student is up to the task. But I'd be ever so
happy should your endeavours turn out to instill that feeling.

-parren



More information about the Mercurial-devel mailing list