Note:

This page appears to contain material that is no longer relevant. Please help improve this page by updating its content.

{i} This page does not meet our wiki style guidelines. Please help improve this page by cleaning up its formatting.

Status: Draft/Experiment

Proposed new discovery/changegroup protocol

The wire protocol has several flaws:

See WireProtocol for the current protocol.

Overview

The Protocol

Discovery

Instead of exploring the missing remote nodes like the current protocol, this proposal explores the local nodes, exploiting more local knowledge and having a stateful approach (which can't be done on the server).

It only needs one new server call:

discover(nodes)

  return a boolean array, value is True if the server has the node, False otherwise. (TODO: investigate what is cheaper, between list and boolean array / to avoid races, should we pass the head as argument?)

The client keeps all it's nodes between three states: unknown, missing and common. At each iteration, it builds a sample of the nodes in unknown state, send it to discover and update the sets accordingly. The sample size should obviously be bounded.

Proposed sample construction: first breadth first search, starting from the nodes at headsof(unknown), keep nodes at a distance 0, 1, 2, 4, 8, 16, ... from the heads. Second breadth first search starting at rootsof(unknown), keep the nodes at a distance 0, 1, 2, 4, 8, 16, ... from the roots. If the sample is bigger than MAX_SAMPLE_SIZE, take a random sample from it, but make sure it includes the heads. If the number of unknown nodes is smaller than MAX_SAMPLE_SIZE, send all the unknown nodes to discover.

During testing in largish repository (NetBeans, Linux Kernel), the number of iteration was almost always lower than 5.

Proposed tweak: the server can compute it's own unknown set and return a sample on it's own. Testing showed that it didn't make a big difference in the number of iterations (it usually removes at most one round trip)

Changegroup

The current changegroup() uses base nodes, it should instead use common nodes.

changegroup(roots, heads)

 find all changesets ancestors from heads and not descended from roots and return them as a single changegroup

A changegroup is a single stream containing:

A group is a list of chunks:

Prototypes

Prototypes can be found here:

Wishlist


CategoryInternals

DiscoveryPlan (last edited 2013-04-18 18:37:27 by KevinBullock)