Network Problems

Benoit Boissinot bboissin at gmail.com
Tue Sep 9 09:12:00 CDT 2008


On Tue, Sep 9, 2008 at 3:49 PM, Waldemar Augustyn
<waldemar at beechwoods.com> wrote:
> Hello,
>
> I have been chasing network problems for some time and I have posted a
> number of messages to the general mercurial list.  At this point, I was able
> to narrow the problem down to ssh/popen/auth.  I still don't know exactly
> what's causing it but things have gotten sufficiently arcane that I decided
> to post to the developers list only.
>
> [snip]
> The suspects:
>
> The problem seems to be either in ssh, ssh authentication, or popen.  Why
> popen?  It's because it re-pipes std streams which are also used by ssh for
> ssl and authentication.  There is, perhaps, an opportunity for a race.  But
> that's just a theory.  I was unable to reproduce the problem using echo
> "hello\nbetween 0" piped to ssh but maybe I was not trying very hard.
>
> The latest finds:
>
> I modified sshrepo.py to run ssh with -vv.  The output is printed after a
> hung connection is killed.  I am not sure if it prints all ssh output, but
> it's the best to this point.
>
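
On the popen theory: one way to test it outside of hg is to start a child
with all three std streams re-piped, feed it a payload, and see whether it
ever finishes. Below is a minimal Python 3 sketch of that idea. It is not
what sshrepo.py does; probe_stdio, its parameters and the timeout value are
made up for illustration.

```python
import subprocess


def probe_stdio(argv, payload, timeout=30):
    """Run argv with stdin/stdout/stderr re-piped, send payload, wait.

    Returns (stdout, stderr, hung). hung is True when the child did not
    finish within `timeout` seconds and had to be killed.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        # communicate() writes the payload, closes stdin, and drains both
        # output pipes, so a full pipe buffer cannot block the child.
        out, err = proc.communicate(payload, timeout=timeout)
        return out, err, False
    except subprocess.TimeoutExpired:
        proc.kill()
        out, err = proc.communicate()  # collect whatever was written
        return out, err, True
```

To mimic the echo test, argv could be something like
["ssh", "user@example.com", "hg -R repo serve --stdio"] with the payload
b"hello\nbetween 0\n". The host and repository path are placeholders, and
the payload is only the string from the echo test above, not a verified
copy of what hg sends.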

Can you try with -vvv? (By the way, you probably don't need to modify the
.py; just pass the ssh command on the command line, e.g. --ssh "ssh -vvv".)
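
For example, here is a small Python 3 sketch that builds such a command
line, assuming hg's -e/--ssh option for choosing the ssh command. The helper
name, the default subcommand and the remote URL are illustrative only.

```python
def hg_ssh_debug_argv(remote, subcommand="outgoing", verbosity=3):
    """Build an hg command line that runs ssh with -v repeated.

    verbosity=3 gives "ssh -vvv". The ssh command is passed to hg as a
    single argument via --ssh, so sshrepo.py needs no modification.
    """
    ssh_cmd = "ssh -" + "v" * verbosity
    return ["hg", subcommand, "--ssh", ssh_cmd, remote]
```

The resulting list can be handed to subprocess.call(), or typed by hand as
hg outgoing --ssh "ssh -vvv" ssh://user@example.com/repo (placeholder URL).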

> Note, the first part, starting with "--nvram: hg out" is the output from a
> successful transaction.   The second part, starting with "--pms: hg out" is
> the output from a hung transaction.  The failed transaction log was produced
> by mercurial after the ssh child was explicitly killed from the command
> line.
>
> (successful):
[snip]
> remote: debug1: Next authentication method: publickey
> remote: debug1: Offering public key: /home/wnet/.ssh/id_dsa
> remote: debug2: we sent a publickey packet, wait for reply
> remote: debug1: Server accepts key: pkalg ssh-dss blen 434
> remote: debug2: input_userauth_pk_ok: fp 1e:67:1f:f1:5c:a5:b6:c6:23:31:9a:95:0c:bf:53:5e
> remote: debug1: Authentication succeeded (publickey).
> remote: debug2: fd 4 setting O_NONBLOCK
> remote: debug2: fd 6 setting O_NONBLOCK
> remote: debug2: fd 7 setting O_NONBLOCK
[continues...]

> (failed):
[snip]
> remote: debug1: Next authentication method: publickey
> remote: debug1: Offering public key: /home/wnet/.ssh/id_dsa
> remote: debug2: we sent a publickey packet, wait for reply
> remote: debug1: Server accepts key: pkalg ssh-dss blen 434
> remote: debug2: input_userauth_pk_ok: fp 1e:67:1f:f1:5c:a5:b6:c6:23:31:9a:95:0c:bf:53:5e
> abort: no suitable response from remote hg!

I suspect it is not returning from dispatch_run() in sshconnect2.c. More
debug output could help, as would attaching gdb to the ssh process after it
hangs.
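
A rough Python 3 sketch of the gdb step, assuming gdb is installed and you
are allowed to attach to the hung ssh process. It builds a batch-mode gdb
command that dumps backtraces, and pulls the function names out of the
result so you can check whether dispatch_run is on the stack. Both helper
names are made up, and the regular expression only covers the usual
"#N  0x... in func (...)" frame layout.

```python
import re


def gdb_backtrace_argv(pid):
    """Command line that attaches gdb to pid, prints backtraces, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


# Matches frame lines such as "#1  0x0000aa in dispatch_run (...) at ..."
# and "#2  main () at ssh.c:10"; the address part is optional.
_FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \(", re.MULTILINE)


def frame_names(backtrace_text):
    """Return the function names found in gdb backtrace output, in order."""
    return _FRAME.findall(backtrace_text)
```

If "dispatch_run" shows up in frame_names() for the hung process, that
supports the suspicion above. If the ssh binary has no symbols, the names
will be missing and this check tells you nothing.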

regards,

Benoit

