[PATCH 00 of 14] shallow clone patches and status

Peter Arrenbrecht peter.arrenbrecht at gmail.com
Thu Jul 29 11:05:02 CDT 2010


On Fri, Jul 16, 2010 at 9:15 AM, Vishakh H <vsh426 at gmail.com> wrote:
> This patch series enables creation of shallow clones with truncated revlogs.
>
> All shallow clones contain the complete changelog, but for manifests
> and filelogs we keep only the nodes needed in shallow clone. If any
> parent nodes in the revlogs are missing we create a punched entry
> for the missing node with nullrevs for parents.
>
> Shallow clones are created with shallowroot stored in .hg/shallow file.
> All revlogs in shallow clone are created with REVLOGSHALLOW flag, and
> can be identified as shallow. Thus on subsequent interactions with full
> clone the shallow one can mention its shallowroot and get only the manifest
> and file nodes of missing descendants of shallowroot.
>
> Comparing sizes of repos with empty working dir:
>
> Repo     Full   Shallow at/near tip
> hg       28M    9M
> linux2.6 1.4G   290M

Hi Vishakh,

I finally got around to looking at your patches. Sorry for the delay.
Here are some high-level comments. I'll send detailed comments in
replies to the individual patches tomorrow. Thanks by the way for
splitting them into incremental chunks. I guess in the final commit
these would again get collapsed, but it's fine for review. Thanks also
for the work you have done so far. It looks promising, though I fear
the road is still long.

At the Paris sprint I understood that Matt wanted to add shallow
cloning in a two-step process:

 1) Add punching of unwanted data. A shallow clone still contains the
entire changeset graph and all the manifests, but has punched entries for
individual filerevs that are outside the shallow scope.

 2) Add pruning of unwanted manifests and filerevs. This is what you
did in one step if I understand your work correctly.

Maybe I am mistaken. Please check with Matt.
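
To make the distinction between the two steps concrete, here is a toy
sketch in Python. It is only an illustration: Entry, punch() and prune()
are hypothetical names, and a plain list of entries is nothing like the
real revlog format. The prune() variant follows your description of
creating a punched entry with null parents for any missing parent.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

NULLREV = -1  # conventional "no parent" revision number


@dataclass
class Entry:
    p1: int
    p2: int
    data: Optional[bytes]  # None marks a punched entry (data absent)


def punch(log: List[Entry], wanted: Set[int]) -> List[Entry]:
    """Step 1: keep every slot and its parents, drop unwanted data."""
    return [e if rev in wanted else Entry(e.p1, e.p2, None)
            for rev, e in enumerate(log)]


def prune(log: List[Entry], wanted: Set[int]) -> List[Entry]:
    """Step 2: keep only wanted revisions, renumbering as we go.

    A parent outside the wanted set gets a punched placeholder
    whose own parents are NULLREV.
    """
    out: List[Entry] = []
    remap = {}  # old revision number -> new revision number

    def place(old: int, entry: Entry) -> None:
        remap[old] = len(out)
        out.append(entry)

    # Parents always precede children, so ascending order is safe.
    for old in sorted(wanted):
        e = log[old]
        for p in (e.p1, e.p2):
            if p != NULLREV and p not in remap:
                place(p, Entry(NULLREV, NULLREV, None))
        place(old, Entry(remap.get(e.p1, NULLREV),
                         remap.get(e.p2, NULLREV),
                         e.data))
    return out


# A linear history of four revisions; only the last two are in scope.
log = [Entry(NULLREV, NULLREV, b"a"), Entry(0, NULLREV, b"b"),
       Entry(1, NULLREV, b"c"), Entry(2, NULLREV, b"d")]
wanted = {2, 3}
punched = punch(log, wanted)  # same length as log, two entries lose data
pruned = prune(log, wanted)   # one placeholder plus the two wanted entries
```

In this model punching never changes revision numbers, while pruning
renumbers everything, which is part of why doing both in one step is a
bigger jump.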

It seems to me that possibly-target-absent filerevs [1] are not fully
addressed. You do have code for the initial clone, but I see no code
for the situation where a merge with an absent branch pulls in files
unchanged from how they were on that absent branch. This, however, is
one of the most crucial problems with shallow clones, both because you
need to make sure changegroupsubset() sends enough data and because
you need to determine how you are going to store the data locally. The
storage question could be problematic if it is possible that you
already have a punched entry for the formerly-absent filerev in your
revlog. Then where do you store it now? Do you rewrite the revlog? Or
can you prove this situation cannot arise? That last option is what I
would hope for.
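
A small Python sketch of the sending half of this problem, under loud
assumptions: the dict-based data model and the names naive_by_linkrev()
and filenodes_to_send() are hypothetical, and this is not how
changegroupsubset() is actually written. It only shows why selecting
filerevs by the changeset that introduced them misses a file carried
over unchanged from an absent branch.

```python
def naive_by_linkrev(linkrevs, outgoing):
    """Send only filerevs that were introduced by an outgoing changeset."""
    out = set(outgoing)
    return {key for key, cs in linkrevs.items() if cs in out}


def filenodes_to_send(manifests, outgoing, target_has):
    """Send every filerev an outgoing manifest references that the
    target does not already have. Scanning whole manifests is costly;
    this is for illustration only."""
    needed = set()
    for cs in outgoing:
        for path, node in manifests[cs].items():
            if (path, node) not in target_has:
                needed.add((path, node))
    return needed


# S is the shallow root, B lives on an absent branch, M merges S and B.
# f2 was introduced in B and is unchanged in M.
manifests = {
    "S": {"f1": "f1@S"},
    "B": {"f2": "f2@B"},
    "M": {"f1": "f1@S", "f2": "f2@B"},
}
# (path, filenode) -> changeset that introduced the filerev
linkrevs = {("f1", "f1@S"): "S", ("f2", "f2@B"): "B"}

target_has = {("f1", "f1@S")}  # the shallow clone only holds S
outgoing = ["M"]               # the target pulls the merge

naive = naive_by_linkrev(linkrevs, outgoing)
complete = filenodes_to_send(manifests, outgoing, target_has)
```

Here the naive selection is empty, because no filerev links to M, while
the manifest-based selection contains f2 from the absent branch. The
storage half (what to do if a punched entry for f2 already exists) is
not modelled.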

The above leads me to another gripe. You now have a test. One test.
And I don't think it exposes any really interesting situations: no
merges in there, no pushing back from the shallow clone, no merging
inside the shallow clone.

I am attaching three files with a bunch of scenarios from my earlier
work on shallow clones, one of which contains unit tests for my revlog
changes. Also, if you look at the *.rextile files in [2] you'll see a
bunch of elaborated test situations. I'm not saying you have to be as
verbose and incremental as I was there. But you should cover those
situations as well. You should also test what commands like log, glog,
incoming, outgoing, etc. do in your shallow clones.

I'll be on vacation for the next three weeks (starting Saturday) so
I'm afraid I won't be able to follow up on this. Hope you'll find
other sparring partners to continue your work.

-parren

[1] http://mercurial.selenic.com/wiki/ShallowClone#Send_filerevs_which_are_.28possibly.29_absent_in_the_target
[2] http://bitbucket.org/parren/hg-shallow/src/p.pull-http/tut/src/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-shallow
Type: application/octet-stream
Size: 1895 bytes
Desc: not available
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20100729/c83bcb73/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-shallow-situations
Type: application/octet-stream
Size: 9412 bytes
Desc: not available
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20100729/c83bcb73/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-revlog-partial.py
Type: application/octet-stream
Size: 6049 bytes
Desc: not available
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20100729/c83bcb73/attachment-0002.obj>

