Naturally, as a distributed version control system, Mercurial remains highly available for individual users even when remote servers are unavailable.
However, in many organisations, build, review, CI and change control tooling may depend heavily on the centralised repositories, and high availability of these repositories would therefore be desirable in order to avoid major interruptions. How might one deploy a Mercurial service to maximise availability?
Facebook has created a Mercurial extension, hgsqlExtension, that uses a MySQL database for centralised storage of repository data.
This improves availability by allowing individual servers to be taken down for repair without any downtime to the system as a whole. To some extent, however, it merely replaces the problem of making Mercurial highly available with the problem of making MySQL highly available.
Information about Mozilla's infrastructure for running Mercurial is available at https://mozilla-version-control-tools.readthedocs.io/en/latest/hgmo/index.html. tl;dr: Mozilla operates a read/write SSH server and a separate layer of read-only HTTP mirrors. The mirrors are kept in sync using a fault-tolerant replication system based on Apache ZooKeeper and Kafka.
Pretty much everything is open source and available in the version-control-tools repository at https://hg.mozilla.org/hgcustom/version-control-tools/. The Kafka-based replication tool lives in pylib/vcsreplicator.
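To illustrate why splitting a read/write server from read-only mirrors helps availability, here is a minimal Python sketch of how a client or load balancer might route requests under that layout. It is not taken from version-control-tools: the URLs, the `choose_endpoint` function and the `is_healthy` callback are all hypothetical names made up for this example.

```python
from typing import Callable, Sequence

# Hypothetical endpoints, for illustration only; not Mozilla's actual hosts.
PRIMARY = "ssh://hg.example.org/repo"
MIRRORS = [
    "https://mirror1.example.org/repo",
    "https://mirror2.example.org/repo",
]


def choose_endpoint(
    operation: str,
    is_healthy: Callable[[str], bool],
    primary: str = PRIMARY,
    mirrors: Sequence[str] = MIRRORS,
) -> str:
    """Return the URL to use for a 'read' or 'write' operation.

    `is_healthy` is an assumed health-check callback supplied by the caller.
    """
    if operation == "write":
        # Pushes can only go to the single read/write server.
        if not is_healthy(primary):
            raise RuntimeError("primary is down: pushes are unavailable")
        return primary
    if operation == "read":
        # Clones and pulls can be served by any healthy mirror, so losing
        # one mirror (or the primary) does not interrupt read traffic.
        for url in mirrors:
            if is_healthy(url):
                return url
        raise RuntimeError("no healthy mirror: reads are unavailable")
    raise ValueError(f"unknown operation: {operation!r}")
```

With this kind of routing, CI and review tooling that only reads from the repository keeps working while any single mirror, or the write server itself, is down; only pushes depend on the one read/write host. In a real deployment this decision would more likely sit in a load balancer than in client code.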