Dealing with broken connections

Matt Mackall mpm at selenic.com
Fri Mar 14 15:20:37 CDT 2008


On Fri, 2008-03-14 at 20:46 +0100, Guido Ostkamp wrote:
> On Fri, 14 Mar 2008, Benoit Boissinot wrote:
> 
> > On Fri, Mar 14, 2008 at 10:50 AM, Guido Ostkamp <hg at ostkamp.fastmail.fm> wrote:
> >> Hello,
> >>
> >>  I would like to know how stable Mercurial is when it comes
> >>  to dealing with broken connections.
> >>
> >>  When I do a clone of a very large remote repository and
> >>  the connection breaks in between and I have to kill
> >>  the Mercurial process which got stuck, what happens?
> >
> > It removes the repository.
> >>
> >>  Am I left with a broken repository?
> >>
> > It doesn't leave anything.
> >
> >>  Can I somehow resume the cloning, for example by using
> >>  'hg pull' instead of 'hg clone' to retrieve the rest of it?
> >
> > No, but what you can do for unreliable connections is to provide a 
> > bundle (the full history in one file) that you can get via rsync or 
> > http.
> 
> well, the problem is that this doesn't happen inhouse where a bundle would 
> certainly be an option but out in the wild (internet) where connection 
> breaks are rather common.
> 
> Looking at generic large repositories like OpenSolaris I am the one who 
> does the cloning and I don't think I can get them to provide the stuff as 
> bundle.
> 
> Would it for some technical reason be impossible to resume a cloning 
> process or is it just not yet implemented at the moment?

Yes, there are technical reasons. The wire protocol streams changelog
data, followed by manifest data, followed by file data. This minimizes
seeking and maximizes compression.

This has two implications for resume. First, if you get interrupted in
the middle, odds are you don't have a single valid changeset (because
you're likely to be missing its files).

Second, if a commit happens between starting a clone and resuming a
clone, the contents of the new stream will change at various places in
the middle. That is, we'll have new changesets, manifests, and files
inserted in the middle of the stream. So we can't simply say "resume
transfer at byte x, please". Nor can we say "resume at changeset x" or
"file f".
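
To make the second point concrete, here is a toy model in Python. It is
only a sketch: the chunk labels and the toy_stream helper are invented
for illustration and are not the real wire format. It keeps just the
ordering described above (all changelog entries, then all manifests,
then file revisions grouped by file) and shows how one new commit
changes the stream in several places.

```python
def toy_stream(commits):
    """Build a toy clone stream from a list of commits.

    Each commit is a list of the file names it touches. The ordering
    mimics the description above; the labels are hypothetical.
    """
    revs = range(len(commits))
    changelog = [f"changelog:{rev}" for rev in revs]
    manifests = [f"manifest:{rev}" for rev in revs]
    files = []
    # File data is grouped per file, so a new revision of an existing
    # file lands next to that file's older revisions.
    for name in sorted({n for touched in commits for n in touched}):
        for rev, touched in enumerate(commits):
            if name in touched:
                files.append(f"file:{name}@{rev}")
    return changelog + manifests + files


def common_prefix_len(a, b):
    """Number of leading chunks that two streams share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


before = toy_stream([["a"], ["b"]])
after = toy_stream([["a"], ["b"], ["a"]])  # one more commit touching "a"

print(before)
print(after)
print(common_prefix_len(before, after), "of", len(before), "chunks still line up")
```

In this toy, the third commit adds a chunk after the old changelog
section, another after the old manifest section, and a third inside the
file section, so an offset into the old stream no longer points at the
same data in the new one.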

Fortunately, there's a workaround you can do today: you can use clone -r
and pull -r to pull the history incrementally. This will behave
similarly to cloning a repo in the distant past and doing occasional
updates.
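
A minimal sketch of that workaround, as a Python helper that only
builds the hg command lines and runs nothing. The helper name, the step
size, and the example URL are assumptions for illustration. It also
assumes revision numbers that the remote repository understands;
changeset hashes or tags could be passed to -r instead.

```python
def incremental_clone_commands(source, dest, tip_rev, step):
    """Return hg command lines that fetch history in slices.

    The first command clones up to an early revision; each following
    command pulls up to the next checkpoint. A last plain pull picks up
    anything that is not an ancestor of tip_rev (other heads, for
    example).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # Checkpoints every `step` revisions, always ending at tip_rev.
    checkpoints = list(range(step - 1, tip_rev, step)) + [tip_rev]
    commands = [["hg", "clone", "-r", str(checkpoints[0]), source, dest]]
    for rev in checkpoints[1:]:
        commands.append(["hg", "-R", dest, "pull", "-r", str(rev)])
    commands.append(["hg", "-R", dest, "pull"])
    return commands


# Hypothetical repository with tip at revision 2500, fetched in steps of 1000.
for cmd in incremental_clone_commands("http://example.com/repo", "repo", 2500, 1000):
    print(" ".join(cmd))
```

The idea is that a dropped connection should cost at most one slice:
rerun the command that failed and continue from there.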

-- 
Mathematics is the supreme nostalgia of our time.


