> hg pull
pulling from https://hg.mozilla.org/mozilla-central
searching for changes
** unknown exception encountered, please report by visiting
** http://mercurial.selenic.com/wiki/BugTracker
** Python 2.6.2 (r262:71600, Oct 19 2011, 11:12:53) [GCC 4.3.2 [gcc-4_3-branch revision 141291]]
** Mercurial Distributed SCM (version 2.0.2)
** Extensions loaded: hgk, mq, purge
Traceback (most recent call last):
  File "/.../local/bin/hg", line 38, in <module>
    mercurial.dispatch.run()
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 27, in run
    sys.exit(dispatch(request(sys.argv[1:])))
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 64, in dispatch
    return _runcatch(req)
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 87, in _runcatch
    return _dispatch(req)
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 684, in _dispatch
    cmdpats, cmdoptions)
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 466, in runcommand
    ret = _runcommand(ui, options, cmd, d)
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 738, in _runcommand
    return checkargs()
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 692, in checkargs
    return cmdfunc()
  File "/.../local/lib64/python2.6/site-packages/mercurial/dispatch.py", line 681, in <lambda>
    d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
  File "/.../local/lib64/python2.6/site-packages/mercurial/util.py", line 458, in check
    return func(*args, **kwargs)
  File "/.../local/lib64/python2.6/site-packages/mercurial/extensions.py", line 139, in wrap
    util.checksignature(origfn), *args, **kwargs)
  File "/.../local/lib64/python2.6/site-packages/mercurial/util.py", line 458, in check
    return func(*args, **kwargs)
  File "/.../local/lib64/python2.6/site-packages/hgext/mq.py", line 3229, in mqcommand
    return orig(ui, repo, *args, **kwargs)
  File "/.../local/lib64/python2.6/site-packages/mercurial/util.py", line 458, in check
    return func(*args, **kwargs)
  File "/.../local/lib64/python2.6/site-packages/mercurial/commands.py", line 4247, in pull
    modheads = repo.pull(other, heads=revs, force=opts.get('force'))
  File "/.../local/lib64/python2.6/site-packages/mercurial/localrepo.py", line 1463, in pull
    force=force)
  File "/.../local/lib64/python2.6/site-packages/mercurial/discovery.py", line 45, in findcommonincoming
    abortwhenunrelated=not force)
  File "/.../local/lib64/python2.6/site-packages/mercurial/setdiscovery.py", line 148, in findcommonheads
    commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
  File "/.../local/lib64/python2.6/site-packages/mercurial/setdiscovery.py", line 148, in <genexpr>
    commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
IndexError: list index out of range
This looks like a rather nasty error in the Mozilla repository. It also cannot be cloned:

HGPLAIN=1 hg clone https://hg.mozilla.org/mozilla-central
warning: hg.mozilla.org certificate with fingerprint 10:78:e8:57:2d:95:de:7c:de:90:bd:22:e1:38:17:67:c5:a7:9c:14 not verified (check hostfingerprints or web.cacerts config setting)
warning: hg.mozilla.org certificate with fingerprint 10:78:e8:57:2d:95:de:7c:de:90:bd:22:e1:38:17:67:c5:a7:9c:14 not verified (check hostfingerprints or web.cacerts config setting)
destination directory: mozilla-central
requesting all changes
adding changesets
transaction abort!
rollback completed
abort: 00changelog.i@9b5f1ccdb021: unknown parent!

The most important bug in hg 2.0.2 here seems to be that it produces a backtrace on pull instead of that error.

$ hg vers
Mercurial Distributed SCM (version 2.0.1)
I can successfully clone mozilla-central, so is this problem solved? At any rate, this looks like something that shouldn't be possible.
I still get a backtrace with hg pull. Other users have reported "clean" transaction aborts:

http://www.serpentine.com/bugzilla/show_bug.cgi?id=718186 - multiple reports of 'abort: 00changelog.i@c27a041a2ce4: unknown parent!' after hg update
https://bugzilla.mozilla.org/show_bug.cgi?id=718186
Ok, here's what the error message means:

client says: "send me all changesets that are descendants of my heads X, Y, and Z"
server sends along a stream of changesets.
client gets to one and says "hey, this one has a parent I've never heard of? FAIL"

As of 1.9, Mercurial has a new set-based discovery protocol that should radically reduce the number of round trips. If both ssh and http services are using 1.9+, they should be experiencing the same issues as the application-level protocol is unified. That would point to some sort of http infrastructure issue. But I suspect you've got an old hg being run over ssh and instead we've hit a discovery protocol bug. For instance, this trace:

$ hg pull --debug
using http://hg.mozilla.org/releases/comm-aurora
sending capabilities command
pulling from http://hg.mozilla.org/releases/comm-aurora
query 1; heads
sending batch command
searching for changes
all local heads known remotely
sending getbundle command

..seems to have not done enough work. If my graph looks like:

    a-b-c-d

and the remote graph looks like:

    a-b-c-d-f
         \ /
          e

Then discovering that d is known remotely is not necessarily sufficient to discover we need to send e?
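To make the "unknown parent" condition concrete, here is a toy model of the client side of that exchange. It is only a sketch: `apply_changesets` and `UnknownParent` are hypothetical names, nodes are plain strings, and none of this is Mercurial's actual changelog code. It uses the a-b-c-d / e / f graph from the comment above.

```python
class UnknownParent(Exception):
    """Raised when an incoming changeset names a parent we do not have."""


def apply_changesets(known, incoming):
    """Apply a stream of (node, parents) pairs on top of the known nodes.

    Hypothetical helper: mimics the check that leads to
    'abort: ...: unknown parent!', not Mercurial's real implementation.
    """
    known = set(known)
    for node, parents in incoming:
        for parent in parents:
            if parent not in known:
                raise UnknownParent("%s: unknown parent %s" % (node, parent))
        known.add(node)
    return known


local = {"a", "b", "c", "d"}

# Server sends e (child of c) and then f (merge of d and e): fine.
complete = [("e", ("c",)), ("f", ("d", "e"))]
result = apply_changesets(local, complete)

# Server sends only f, because it believed everything before f was known.
incomplete = [("f", ("d", "e"))]
try:
    apply_changesets(local, incomplete)
    failed = False
except UnknownParent:
    failed = True
```

In this model the second stream fails exactly when the sender's idea of what the client already has is wrong, whether because of a discovery bug or because something in between altered the request or the response.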
"That would point to some sort of http infrastructure issue." Yes - I can reproduce the problem through our whole stack but when bypassing it internally and hitting the apache/wsgi direct I never reproduce the problem. The one thing standing in the way is a varnish cache. :( I've tweaked http headers to allow for the 1024 header sizes, bumped up limits for max request sizes and such, and the underlying "abort .. unknown parent!" still shows up for some. still digging...
Regarding the original bug, could it be that the server sent a partial response? As far as I can tell the issue is that the known() call sent fewer results than expected (we could check len(sample) == len(yesno) earlier too, or even in the known call). And looking at cshields' comment, it does indeed point to the result being somehow truncated.
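A minimal sketch of the length check suggested above, assuming `sample` is the list of nodes we asked about and `yesno` is the list of booleans that came back. The function name `common_in_sample` and the error text are made up for illustration; this is not the fix that went into Mercurial.

```python
def common_in_sample(sample, yesno):
    """Return the sampled nodes the remote reported as known.

    Fails with a readable error instead of an IndexError when the
    reply is shorter (or longer) than the sample we sent.
    """
    if len(yesno) != len(sample):
        raise ValueError(
            "known() returned %d answers for %d sampled nodes; "
            "the response may have been truncated" % (len(yesno), len(sample))
        )
    return set(n for n, known in zip(sample, yesno) if known)


sample = ["n1", "n2", "n3"]

# Full reply: works as before.
ok = common_in_sample(sample, [True, False, True])

# Truncated reply: the expression from the traceback raises IndexError ...
truncated = [True]
try:
    set(n for i, n in enumerate(sample) if truncated[i])
    original_error = None
except IndexError as exc:
    original_error = exc

# ... while the checked version reports what actually went wrong.
try:
    common_in_sample(sample, truncated)
    checked_error = None
except ValueError as exc:
    checked_error = exc
```

The check does not fix the truncation itself, it only turns the backtrace into an error message that points at the response.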
It is possible that the original report for this bug specifically was triggered during a varnish restart (which would have cut it off randomly) while trying to fix the unknown parent issue.
Ok, the next thing I'd look at is whether Mercurial's private http headers (X-HgArg-%d) traverse varnish safely. That probably calls for wireshark. Also note that whether we use magic headers depends on whether we can safely stuff the whole request in the URL or not. I guess it's possible we put the whole command in headers and Varnish decides it can use an old response, but we should be setting up an appropriate Vary header, so that shouldn't happen.
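For anyone following along with wireshark, this is roughly what to look for. The sketch below is a simplified model of packing command arguments into numbered X-HgArg-%d headers plus a matching Vary header. The function name `args_to_headers`, the 1024-byte limit (taken from the header size mentioned earlier in this thread) and the exact chunking arithmetic are assumptions for illustration, not a copy of Mercurial's client code.

```python
from urllib.parse import urlencode


def args_to_headers(args, header_size=1024):
    """Split url-encoded command arguments across X-HgArg-N headers.

    Simplified model: each header line (name, separator, value, CRLF)
    is kept within header_size bytes, and Vary lists every header used.
    """
    encoded = urlencode(sorted(args.items()))
    fmt = "X-HgArg-%d"
    # Space left for the value after the header name and ": \r\n".
    room = header_size - len(fmt % 0 + ": \r\n")
    headers = {}
    for num, start in enumerate(range(0, len(encoded), room), 1):
        headers[fmt % num] = encoded[start:start + room]
    names = list(headers)
    if names:
        # A cache must treat requests with different argument headers
        # as different requests; Vary is what tells it to do so.
        headers["Vary"] = ",".join(names)
    return headers


args = {"nodes": "ab" * 1500}
headers = args_to_headers(args)
arg_names = [name for name in headers if name != "Vary"]
reassembled = "".join(headers[name] for name in arg_names)
```

If the cache in the middle ignores or mismatches the names listed in Vary, two requests with different arguments look identical to it, which is the scenario described above.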
Degrading to bug: known to be caused by third party config. We probably need to patch up $TITLE still.
Just speculation right now (as I have to leave the keyboard for a couple of hours) but I wonder if varnish is treating vary: in a case-sensitive way.

x-hgarg-1: ....
vary: X-HgArg-1
Good catch. Varnish uses memcmp and there's no sign of folding. And Python helpfully folds header names to lowercase in httplib.
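A toy model of why that matters, assuming a cache that builds its variant key by looking up each name listed in Vary among the request headers. The functions `vary_key_exact` and `vary_key_folded` are hypothetical and are not Varnish's code; they only contrast a byte-for-byte name comparison with a case-insensitive one.

```python
def vary_key_exact(request_headers, vary):
    """Variant key using exact (case-sensitive) header-name matching."""
    return tuple(request_headers.get(name.strip()) for name in vary.split(","))


def vary_key_folded(request_headers, vary):
    """Variant key using case-insensitive header-name matching."""
    folded = dict((k.lower(), v) for k, v in request_headers.items())
    return tuple(folded.get(name.strip().lower()) for name in vary.split(","))


# The client lowercases header names, the Vary value keeps mixed case.
vary = "X-HgArg-1"
req_a = {"x-hgarg-1": "cmd=known&nodes=aaaa"}
req_b = {"x-hgarg-1": "cmd=known&nodes=bbbb"}

# Exact matching finds no such header in either request, so both
# requests collapse onto the same cache variant.
exact_collides = vary_key_exact(req_a, vary) == vary_key_exact(req_b, vary)

# Case-insensitive matching keeps the two requests apart.
folded_collides = vary_key_folded(req_a, vary) == vary_key_folded(req_b, vary)
```

Under this model a cached answer for one set of nodes can be served for a different request, which would fit both the short known() reply and the "unknown parent" aborts.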
Ok, going to close this. According to: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44 "The field-names given are not limited to the set of standard request-header fields defined by this specification. Field names are case-insensitive." ..which officially makes this a Varnish bug.
--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:27 EDT --- This bug was previously known as _bug_ 3204 at http://mercurial.selenic.com/bts/issue3204