RFC: Incoming/outgoing discovery with fewer roundtrips / more bandwidth

Matt Mackall mpm at selenic.com
Wed Sep 30 12:31:40 CDT 2009


On Tue, 2009-09-29 at 08:28 +0200, Peter Arrenbrecht wrote:
> Folks,
> 
> I recently wanted to update an oldish (roughly hg 1.2) clone to crew
> tip. It needed 44 discovery roundtrips, which took unpleasantly long
> (use `hg incoming --debug` to see them). Specifically, I was at
> aaaf4af1c173 versus crew at 32ec70799172.
> 
> So I tried to come up with a different approach.

Before you get too far down this road, let me suggest some guidelines:

- client asks short questions
- client keeps query state
- server may give long answers
- server keeps no state between queries
- time to compute query answers should be reasonable for a 1M cset repo
- total discovery traffic should not exceed 1% of the bundle size

That last is important. We don't want to hear about 1M csets when
pulling only 10 csets from a large repo. Discovery time should be
similarly proportional to transfer time, but that's harder to quantify.
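
To put a rough number on the 1% guideline, here is a back-of-the-envelope sketch. It assumes 20 bytes per changeset id (a binary SHA-1) and ignores all protocol overhead. The function names are made up for illustration and are not part of Mercurial.

```python
# Assumption: one changeset id costs 20 bytes on the wire (binary SHA-1).
NODE_ID_BYTES = 20


def ids_within_budget(bundle_bytes, percent=1):
    """How many changeset ids fit in `percent` of the bundle size."""
    budget = bundle_bytes * percent // 100  # integer math, no float rounding
    return budget // NODE_ID_BYTES


def min_bundle_for_ids(n_ids, percent=1):
    """Smallest bundle size for which sending n_ids ids stays within budget."""
    return n_ids * NODE_ID_BYTES * 100 // percent


if __name__ == "__main__":
    # A 100 kB bundle leaves room for 50 ids of discovery traffic.
    print(ids_within_budget(100_000))
    # Listing 1M csets only fits the budget if the bundle is about 2 GB.
    print(min_bundle_for_ids(1_000_000))
```

Under these assumptions, hearing about 1M csets would only be acceptable for a bundle of roughly 2 GB, which is why a small pull must never trigger it.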

I've been thinking about this a fair amount today (ugh, now yesterday)
and I think something that combines deterministic answers with random
hints looks quite promising. Something that looks like:

client: what are the immediate descendants of set X?
server: set Y; also you might or might not know about random set Z
client: ahh thanks, how about the descendants of (some mix of X, Y, and
Z)
server: set Y'; also set Z'
client: ok, looks like I don't know about anything in Y', better send me
everything starting there.
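
The exchange above could be sketched roughly as follows. This is only a toy model under my own assumptions, not a proposed wire format: the remote repo is a dict mapping each node to its children, hints are a uniform random sample of remote nodes, and the names server_answer and discover are invented for the example. The server function is a pure function of the query (no state between calls); the client keeps the frontier and the set of nodes it has already asked about.

```python
import random


def server_answer(children, query, n_hints, rng):
    """Stateless server: return (Y, Z) for a query set X.

    Y is the set of immediate descendants of the queried nodes,
    Z is a random sample of nodes the client may or may not know.
    """
    y = set()
    for node in query:
        # Nodes the server has never heard of simply have no children.
        y.update(children.get(node, ()))
    all_nodes = sorted(children)
    z = set(rng.sample(all_nodes, min(n_hints, len(all_nodes))))
    return y, z


def discover(local_nodes, start, ask):
    """Client side: keeps all query state between roundtrips.

    Returns (missing_roots, roundtrips), where missing_roots are remote
    nodes unknown locally whose parent is known locally.
    """
    frontier = set(start)
    asked = set()
    missing_roots = set()
    roundtrips = 0
    while frontier:
        y, z = ask(frontier)
        roundtrips += 1
        asked |= frontier
        # Unknown children of known nodes: the pull has to start there.
        missing_roots |= y - local_nodes
        # Known children and known hints are where we ask next.
        frontier = ((y | z) & local_nodes) - asked
    return missing_roots, roundtrips


if __name__ == "__main__":
    # Linear history 0 -> 1 -> ... -> 9 on the server; client has 0..5.
    remote = {i: [i + 1] for i in range(9)}
    remote[9] = []
    rng = random.Random(0)
    roots, trips = discover(set(range(6)), {0},
                            lambda q: server_answer(remote, q, 2, rng))
    print(roots, trips)
```

The hints let the client jump ahead instead of walking one generation per roundtrip, while the deterministic Y part guarantees the result does not depend on which hints happened to come back. Note that this sketch makes no attempt to keep the questions short: the frontier can grow, and a real client would want to prune it using its local DAG.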

Not sure if this is sufficient for push.

-- 
http://selenic.com : development and support for Mercurial and Linux



