Unbound size of discovery
Gregory Szorc
gregory.szorc at gmail.com
Mon Jun 30 18:20:27 CDT 2014
The size of the wire protocol payload for discovery requests and
responses is proportional to the number of heads in the peer
repositories. For esoteric repositories, such as Mozilla's Try
repository which grows to over 10,000 heads before it is reset, we can
see discovery response payloads grow to over 1 MB! We've also brushed up
against default HTTP server limits. Mozilla has hit both HTTP header
size and count limits due to x-hgarg-n headers during discovery.
Fortunately, we operate our own servers, so we can increase the limits.
But sometimes a load balancer or security device that you cannot
reconfigure sits between your Mercurial server and your users (e.g. on
EC2, although I'm not sure ELB imposes such limits).
This kind of unbounded growth is not good for scalability and
performance. It may rule out Mercurial as a solution for you.
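As a back-of-the-envelope sketch of that linear growth, assume each head travels as a 40-character hex node ID with a one-byte separator. The function name and the separator size are illustrative assumptions; the real wire encoding, and the number of round trips in a discovery exchange, will differ.

```python
HEX_NODE_LEN = 40  # a SHA-1 changeset ID written as hex characters


def estimate_heads_payload(num_heads: int, separator_len: int = 1) -> int:
    """Rough byte count for a flat list of `num_heads` hex node IDs.

    Illustrative only: ignores framing, compression, and HTTP overhead.
    """
    if num_heads <= 0:
        return 0
    return num_heads * HEX_NODE_LEN + (num_heads - 1) * separator_len


if __name__ == "__main__":
    for n in (10, 1_000, 10_000, 25_000):
        kib = estimate_heads_payload(n) / 1024
        print(f"{n:>6} heads -> ~{kib:.1f} KiB per heads list")
```

The point is only that the cost is paid per head on every exchange, regardless of how small the push or pull itself is.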
One idea I had was to limit returned heads to only public changesets.
Another is to allow servers to execute a config-defined revset as part
of calculating returned heads. Either approach would likely result in
clients sending redundant changeset data to the remote. But for certain
scenarios (such as Mozilla's Try where nearly every head stems from a
public changeset), the redundancy should be negligible.
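To illustrate the public-only idea on toy data: the sketch below computes the heads of the public subgraph, i.e. public changesets with no public children. The dict-based DAG and the `public_heads` helper are hypothetical stand-ins, not Mercurial's internal API.

```python
def public_heads(parents, phases):
    """Return the heads of the public part of a toy changeset DAG.

    parents: dict mapping node -> list of parent nodes
    phases:  dict mapping node -> "public", "draft", or "secret"
    """
    public = {node for node, phase in phases.items() if phase == "public"}
    # A public node stops being a head once any public child points at it.
    has_public_child = set()
    for node in public:
        for parent in parents.get(node, []):
            has_public_child.add(parent)
    return public - has_public_child


if __name__ == "__main__":
    # Try-like shape: many draft heads hanging off a short public line.
    parents = {"a": [], "b": ["a"], "d1": ["b"], "d2": ["b"], "d3": ["a"]}
    phases = {"a": "public", "b": "public",
              "d1": "draft", "d2": "draft", "d3": "draft"}
    print(sorted(public_heads(parents, phases)))
```

In this shape the three draft heads collapse to a single advertised head. A client that already has `d1` on both sides would not learn that from the reply, which is where the redundant data comes from.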
I've also had other crazy ideas such as having the client skip heads and
go straight to querying for existence of ancestors in the pushed
changeset(s).
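A minimal sketch of what skipping heads could look like, assuming the remote can answer a "which of these nodes do you have?" query. The `remote_known` callable, the dict-based DAG, and the batching are all hypothetical; a naive walk like this can take many round trips on long unshared histories, so a real design would need sampling or a depth limit.

```python
from collections import deque


def find_common_by_probing(pushed, parents, remote_known, batch_size=2):
    """Walk ancestors of `pushed` breadth-first, probing the remote.

    pushed:       list of nodes the client wants to push
    parents:      dict mapping node -> list of parent nodes (local DAG)
    remote_known: callable(list of nodes) -> list of bools (hypothetical)
    Returns (common, missing): nodes the remote already has on the
    boundary, and nodes that would need to be sent.
    """
    common, missing = set(), set()
    seen = set(pushed)
    queue = deque(pushed)
    while queue:
        # Take up to batch_size nodes for one simulated round trip.
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        for node, known in zip(batch, remote_known(batch)):
            if known:
                common.add(node)  # remote has it; stop walking this path
                continue
            missing.add(node)
            for parent in parents.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
    return common, missing


if __name__ == "__main__":
    parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
    on_remote = {"a", "b"}
    common, missing = find_common_by_probing(
        ["d"], parents, lambda nodes: [n in on_remote for n in nodes])
    print(sorted(common), sorted(missing))
```

The request size here depends on how far the pushed changesets are from shared history, not on how many heads the remote holds.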
Perhaps these modes of operation could be selected via a capability:
e.g., if a remote advertises its head count, the client can determine
whether classical full-heads-based discovery is appropriate.
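The client-side decision could be as small as the following sketch. The `headcount` capability name, the threshold, and the strategy labels are hypothetical; nothing here reflects an existing Mercurial capability.

```python
def choose_discovery(capabilities: dict, max_heads: int = 1000) -> str:
    """Pick a discovery strategy from a remote's advertised capabilities.

    Falls back to classical full-heads discovery whenever the
    (hypothetical) `headcount` capability is absent or malformed.
    """
    raw = capabilities.get("headcount")
    if raw is None:
        return "full-heads"
    try:
        count = int(raw)
    except (TypeError, ValueError):
        return "full-heads"
    return "full-heads" if count <= max_heads else "ancestor-probe"


if __name__ == "__main__":
    print(choose_discovery({}))
    print(choose_discovery({"headcount": "50"}))
    print(choose_discovery({"headcount": "12000"}))
```

Defaulting to the classical mode keeps older servers, which advertise nothing, working unchanged.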
Before I get too far down the rabbit hole, I was curious what solutions
have been considered/attempted for dealing with this "discovery bloat."