[PATCH 0 of 1] RFC: bundle code path refactoring

Sune Foldager cryo at cyanite.org
Sun Jun 5 17:36:21 CDT 2011


On 2011-06-05 16:41, Matt Mackall wrote:
>On Sun, 2011-06-05 at 16:29 +0200, Sune Foldager wrote:
>> This is a just a light refactoring patch, but it's essentially a backout of:
>>
>> changeset:   12343:6a6149487817
>> user:        Matt Mackall <mpm at selenic.com>
>> date:        Mon Sep 20 14:32:21 2010 -0500
>> summary:     bundle: encapsulate all bundle streams in unbundle class
>>
>> I don't know the reason for the above changeset, though, and it doesn't have
>> a longer description or comments to clarify it. I'll elaborate on my reasons:
>
>Well it's still a work in progress, but the basic concept is quite
>simple. A bundle stream is a 'thing', an object if you will. And an
>unbundler is how we represent that object and attach properties (such as
>capabilities and state) to it.

Unfortunately this only allow "end-user access" to the data. No headers,
no compression.

>Admittedly this is a bit confusing because it's named 'unbundler' and
>not simply 'bundlestream'.

Well all it can generally do is help with unbundling..

>
>> In order to move towards a more unified wire format for bundles, we'd like to
>> transfer "full" (i.e. with format header) bundles for pull (changegroup(subset)
>> and getbundle). We already do this for push on http, and push on ssh also
>> accepts it in newer code, although we don't have a capability and thus don't
>> use it.
>>
>> The unbundler (unbundle10) is used to parse the bundle format. It also does
>> on-the-fly decompression of the stream, if the bundle is compressed. Without
>> the present patch we will not be able to use compressed bundles in the new
>> wire commands. Note that we currently never do so, instead relying on a
>> compression wrapper on top of the data.
>
>Let's look at an existing use case. 'hg bundle' generates bundles with
>various types of compression. It does so by calling
>localrepo.changegroup, which generates a changegroup stream (aka an
>uncompressed bundle). It then iterates through these chunks, passes them
>through a compressor, thus creating a new stream of chunks that get
>written to disk.
>
>This is more or less identical to what you'd want to do in the wire
>protocol, except rather than writing to disk, you'd want to create a new
>stream. ie:
>
>compressedbundlestream = bundlecompressor(localrepo.changegroup(...),
>alg)
>
>Converting changegroup.writebundle into bundlecompressor() is probably
>all of 10 minutes work.

Sure, but writebundle is pretty horrible with its semi-parsing of the
bundle format. But yes, this is likely possible.. it will still keep
the weirdness of the unbundler which is only useful to anyone but the
end-user if compressed, alive though (as well as the confusion about
what a bundle is).

>And we could of course do this without having a class in there. But the
>point here is that localrepo.changegroup now returns an object of the
>type expected by localrepo.addchangegroup. In other words, the plugs now
>match between the two most basic components.

After this patch, localrepo.changegroup still returns an object of the
type expected (and, indeed, documented, although that's likely an
oversight).


>What you're proposing is a step back: 

Meh, that's how you see it; I see it as a step forward :)

>cutting off the plug on one side
>and having bare wires because sometimes we want to go through a
>converter box and then forcing everyone to wire up their own connectors.
>Instead we should have some standard adapters.

What converter box? I am not converting anything, and this is the
point. If we use the bundlestream, IT will magically convert things
when we don't want to since it can only return uncompressed
bundles regardless of whether the bundle was compressed or not when it
was put in. This is not very obvious to the person who just want to pass
on the stream to a wire (or to anything else for that matter) as it is.

It also means that bundle compression always has to be the last step
you do before writing it somewhere, and that you can't sensibly put
a newly compressed bundle into a bundlestream again. All this is
extremely illogical, I think, and makes it difficult to use such
bundle streams in a general way.

Also, whatever its name, unbundle10 is clearly an unbundler: it only
contains methods to help with taking the existing bundle apart. That
and non-transparent access to the stream.

With the changes coming up, unbundle10 will uniformly handle bundles
with or without headers alike, and the bundle producers will have the
same options.


More information about the Mercurial-devel mailing list