[PATCH 2 of 2] bundle2: add a part to import external bundles

Pierre-Yves David pierre-yves.david at ens-lyon.org
Sat Oct 11 20:37:57 CDT 2014



On 10/11/2014 04:54 PM, Mads Kiilerich wrote:
> On 10/11/2014 10:38 AM, Mike Hommey wrote:
>> # HG changeset patch
>> # User Mike Hommey <mh at glandium.org>
>> # Date 1413016657 -32400
>> #      Sat Oct 11 17:37:37 2014 +0900
>> # Node ID 538073e6c7f6ae680cbdd9bcb44c214fd6653eb1
>> # Parent  225306662bbf7c8cc6d867d484eab1e038f6c2a5
>> bundle2: add a part to import external bundles
>
> It took me some time to figure out what this was all about.
>
> What do "a part" mean here? "Functionality"/"support"/"infrastructure"?
> Or more like "multi part"?\

It refer to a bundle2 part. This is a central concept of bundle2 and 
fairly clear in this context.

> It also seems essential that it is client side support only - that
> deserves mentioning in the short summary in the first line.

That is a good point.

>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
>> --- a/mercurial/bundle2.py
>> +++ b/mercurial/bundle2.py
>
>> +class digester(object):
>> +    """helper to compute multiple digests for the same data."""
>> +
>> +    DIGESTS = {
>> +        'md5': util.md5,
>
> Why care about md5 at all?
>
> Why support multiple digests? That seems like unnecessary complexity.
> The rest of Mercurial might be moving away from sha1 one day, but then
> this should be handled like everything else.
>
> Digests and validation could perhaps be put in a different patch than
> the multi bundle handling itself so we got smaller parts that were
> easier to review.
>
>>   capabilities = {'HG2X': (),
>>                   'b2x:listkeys': (),
>>                   'b2x:pushkey': (),
>>                   'b2x:changegroup': (),
>> +                'digests': tuple(digester.DIGESTS.keys()),
>
> As mentioned, I don't see the point in supporting multiple digests, but
> I would expect these references to external bundles requires a new
> capability flag?
>
>> + at parthandler('b2x:import-bundle')
>> +def handleimportbundle(op, inpart):
>> +    """apply a bundle on the repo, given an url and validation
>> information
>> +
>> +    The part contains a bundle url, as well as hash digest(s) and size
>> +    information to validate the downloaded content.
>> +    The expected format for this part is:
>> +        <URL>
>> +        <SIZE in bytes>
>> +        <DIGEST-TYPE>: <DIGEST-VALUE>
>> +        ...
>> +    Multiple digest types are supported in the same part, in which
>> case they
>> +    are all validated.
>> +    This currently only supports bundle10, but is intended to support
>> bundle20
>> +    eventually.
>> +    """
>> +    lines = inpart.read().splitlines()
>> +    try:
>> +        url = lines.pop(0)
>> +    except IndexError:
>> +        raise util.Abort(_('import-bundle format error'))
>> +    parsed_url = urlparse.urlparse(url)
>> +    if parsed_url.scheme not in ('http', 'https'):
>> +        raise util.Abort(_('import-bundle only supports http/https
>> urls'))
>> +
>> +    try:
>> +        size = int(lines.pop(0))
>> +    except (ValueError, IndexError):
>> +        raise util.Abort(_('import-bundle format error'))
>> +    digests = {}
>> +    for line in lines:
>> +        if ':' not in line:
>> +            raise util.Abort(_('import-bundle format error'))
>> +        k, v = line.split(':', 1)
>> +        digests[k.strip()] = v.strip()
>> +
>> +    real_part = digestchecker(urllib2.urlopen(url), size, digests)
>
> How will this work with authentication and server certificate validation
> etc? I would expect it should use (and perhaps extend) the existing URL
> abstraction layers in url.py and elsewhere.
>
> I also wonder why it only supports http(s). I would expect the most
> common case would be to get all bundles from the same location. Other
> protocols can of course be added later, but it seems wrong to have any
> protocol knowledge at all at this level.

Probably simplicity? What would be the other protocol you want 
supported and how would that work?

I can see a case for local file system bundle  in some case.

The list of supported protocol should probably be advertise so we can 
change it later (eg: a client may acces the server through ssh behind a 
firewall withou http access).

> Breaking bundles up in multiple parts do of course raise the question of
> whether the reference between them should use "relative" or "absolute"
> "paths". There are pros and cons, but I think I would prefer relative in
> most cases.

I do not get what your are trying to say here


-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list