Hello,

While writing a push benchmark for Mercurial using https://bitbucket.org/octobus/bighgperf, I dug into a weird bug that seems to have been introduced recently.

When pushing to an hgweb server, the client hangs after transferring changesets. strace shows a read() hanging on the server socket, while hgweb sits in its select() loop.

clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk= is a clone of mercurial-2017 stripped of the 'last(all(), 1000)' changesets.

I bisected the issue, and it appears to have been introduced by f0a851542a05.

I'll try to provide a reproducible test case, thanks!

% env -i HGRCPATH= /home/ppepiot/src/bighgperf/mercurial/hg serve --cwd /home/ppepiot/src/bighgperf/repos -a localhost -p 0 --config web.push_ssl=False --config 'web.allow_push=*' --webdir-conf /home/ppepiot/src/bighgperf/hgweb.config --debug --traceback
listening at http://localhost:41001/ (bound to 127.0.0.1:41001)
127.0.0.1 - - [24/Apr/2018 16:52:11] "GET /.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk%3D?cmd=capabilities HTTP/1.1" 200 -
127.0.0.1 - - [24/Apr/2018 16:52:11] "GET /.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk%3D?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=heads+%3Bknown+nodes%3Dc19e66dacaa184feba31136c18a369ba995ddfe4+aefb75730ea34f545f0756bf8441fc9ae07bf8dc x-hgproto-1:0.1 0.2 comp=zstd,zlib,none,bzip2
listing keys for "phases"
127.0.0.1 - - [24/Apr/2018 16:52:11] "GET /.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk%3D?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=phases x-hgproto-1:0.1 0.2 comp=zstd,zlib,none,bzip2
listing keys for "bookmarks"
127.0.0.1 - - [24/Apr/2018 16:52:12] "GET /.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk%3D?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks x-hgproto-1:0.1 0.2 comp=zstd,zlib,none,bzip2

% env -i HGRCPATH= /home/ppepiot/src/bighgperf/mercurial/hg --cwd /home/ppepiot/src/bighgperf/repos/mercurial-2017 push -f http://localhost:41001/.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk= --debug
pushing to http://localhost:41001/.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk%3D
using http://localhost:41001/.tmp/clone-partial-mercurial-2017-bGFzdChhbGwoKSwgMTAwMCk%3D
sending capabilities command
query 1; heads
sending batch command
searching for changes
all remote heads known locally
preparing listkeys for "phases"
sending listkeys command
received listkey for "phases": 15 bytes
checking for updated bookmarks
preparing listkeys for "bookmarks"
sending listkeys command
received listkey for "bookmarks": 42 bytes
1000 changesets found
list of changesets:
[...]
8d0b0b533e09354cc2f9d002ad2a93e0035e340a
93943eef696f413fe743682bbc7201e2c10a3ffa
1eee42aed306d9212beebfe519f7ae8c6e3fb913
e218830f6f0a3229decafc6d9c2e3b41cf630cc6
aefb75730ea34f545f0756bf8441fc9ae07bf8dc
bundle2-output-bundle: "HG20", 4 parts total
bundle2-output-part: "replycaps" 205 bytes payload
bundle2-output-part: "changegroup" (params: 1 mandatory) streamed payload
bundle2-output-part: "phase-heads" 48 bytes payload
bundle2-output-part: "bookmarks" 23 bytes payload
sending unbundle command
sending 2530267 bytes
I successfully reproduced it locally, but only when I give `hg serve` a --webdir-conf. If I launch `hg serve` directly in the target repository, I cannot reproduce the issue. I can provide an strace output file if needed.
I can also reproduce on a .t test for a Mozilla extension. Thanks for bisecting: it really helps. I'll try to look into this today.
Patch up for review at https://phab.mercurial-scm.org/D3427
Patch landed on hg-committed as (currently) 877185d.
Fixed by https://mercurial-scm.org/repo/hg/rev/877185de62cf
Gregory Szorc <gregory.szorc@gmail.com>

hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)

An unintended side-effect of f0a851542a05 was that the request body file object (which uses a util.cappedreader) was constructed twice when hgwebdir called into hgweb. Since we attempt to read all remaining data from this file object when Content-Length is defined and since there were two instances of this object and the client supplied no additional data to read, this resulted in deadlock.

The fix implemented in this commit is to reuse the request body file object when it is passed from hgwebdir to hgweb.

A test demonstrating `hg clone` and `hg push` via hgwebdir has been added. Without this patch, the test hangs when doing `hg clone`. Surprisingly, this must mean that we have effectively no test coverage of the wire protocol when run via hgwebdir.

Differential Revision: https://phab.mercurial-scm.org/D3427

(please test the fix)
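For readers following the commit message, here is a minimal sketch of why two capped readers over one stream can deadlock. It is an illustration under assumptions, not Mercurial's code: `CappedReader` is a hypothetical stand-in for `util.cappedreader`, the 10-byte body is made up, and an in-memory `io.BytesIO` replaces the socket, so the read that would block on a real connection returns an empty result here instead.

```python
import io


class CappedReader:
    """Hypothetical length-capped reader (NOT Mercurial's util.cappedreader).

    It refuses to read more than `limit` bytes in total from `fh`.
    """

    def __init__(self, fh, limit):
        self._fh = fh
        self.remaining = limit

    def read(self, n=-1):
        if self.remaining <= 0:
            return b""
        if n < 0 or n > self.remaining:
            n = self.remaining
        data = self._fh.read(n)
        self.remaining -= len(data)
        return data


body = b"x" * 10  # assumed request body; Content-Length would be 10

# Buggy shape: two capped readers built over the same underlying stream.
raw = io.BytesIO(body)
outer = CappedReader(raw, len(body))  # e.g. built by the outer application
inner = CappedReader(raw, len(body))  # built again by the inner application

consumed = inner.read()            # the inner reader takes the whole body
outer_thinks_left = outer.remaining  # the outer reader still expects 10 bytes

# Draining the outer reader now asks the stream for 10 bytes that the client
# will never send. On a blocking socket this read would hang; BytesIO simply
# reports end-of-file, so we get an empty result.
leftover = outer.read()

# Fixed shape: one reader object is created and passed along.
shared = CappedReader(io.BytesIO(body), len(body))
shared_consumed = shared.read()
shared_left = shared.remaining     # the single counter reflects reality
shared_drain = shared.read()       # nothing left, returns without touching the stream
```

The point of the sketch is only the bookkeeping: each reader tracks its own "bytes remaining" counter, so a second instance has no way of knowing the first one already consumed the body. Sharing one object keeps a single counter.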
Bug was set to TESTING for 7 days, resolving