We’re running Mercurial 2.1.1 under IIS using CGI on Windows 2008. We have two server-side hooks (written in PowerShell) that run on pretxnchangegroup and can take up to a minute to run. If developer #2 pushes while developer #1 is pushing (developer #1’s python.exe CGI process has locked the repo and our hooks are running), developer #2’s CGI process sits and waits for developer #1’s push to finish, as expected. However, once developer #1’s push succeeds, developer #2’s CGI process doesn’t detect that the repo is available/unlocked, and never locks the repo or runs any hooks. It just hangs, using no CPU and not growing in memory.

I would expect developer #2 to get a “waiting for lock” message, but the last message Mercurial outputs is “searching for changes”. Hitting CTRL+C doesn’t stop the push. Developer #2 has to kill hg.exe, or I have to log into our Mercurial server and kill developer #2’s CGI process. No repository corruption occurs on either the client or the server.

# Steps to Reproduce

On the server:

    > hg init push-hangs
    > cd push-hangs
    > echo '[hooks]' > .hg\hgrc
    > echo 'pretxnchangegroup.sleep = echo. | powershell -NoProfile -Command "Start-Sleep -Seconds 10"' >> .hg\hgrc

On the client:

    > hg clone http://server/push-hangs
    no changes found
    updating to branch default
    0 files updated, 0 files merged, 0 files removed, 0 files unresolved
    > hg clone http://server/push-hangs push-hangs2
    no changes found
    updating to branch default
    0 files updated, 0 files merged, 0 files removed, 0 files unresolved
    > cd push-hangs
    > echo '' > a.txt
    > hg add a.txt
    > hg commit -m "Adding file."
    > cd ..\push-hangs2
    > echo '' > b.txt
    > hg add b.txt
    > hg commit -m "Adding file."
    > hg push

While that is pushing, within ten seconds, open a new console:

    > cd push-hangs
    > hg push

Notice that when the first push finishes, the second hangs and never finishes.
I hacked hgweb.cgi to have it output stderr to a file so I could see what's going on:

    import os
    errlog = "C:/inetpub/logs/httperr.%d.log" % os.getpid()
    sys.stderr = open(errlog, "w")
    sys.stderr.write("Writing to standard error.\n")
    sys.stderr.flush()

Per a suggestion from mpm on the users mailing list, I added a bunch of debugging statements to wireproto.py:

    try:
        proto.getfile(fp)
        sys.stderr.write("%d: at step 1\n" % os.getpid()); sys.stderr.flush()
        lock = repo.lock()
        sys.stderr.write("%d: at step 2\n" % os.getpid()); sys.stderr.flush()
        try:
            if not check_heads():
                sys.stderr.write("%d: at step 3\n" % os.getpid()); sys.stderr.flush()
                # someone else committed/pushed/unbundled while we
                # were transferring data
                return pusherr('unsynced changes')

            # push can proceed
            sys.stderr.write("%d: at step 4\n" % os.getpid()); sys.stderr.flush()
            fp.seek(0)
            sys.stderr.write("%d: at step 5\n" % os.getpid()); sys.stderr.flush()
            gen = changegroupmod.readbundle(fp, None)

            try:
                sys.stderr.write("%d: at step 6\n" % os.getpid()); sys.stderr.flush()
                r = repo.addchangegroup(gen, 'serve', proto._client())
            except util.Abort, inst:
                sys.stderr.write("abort: %s\n" % inst); sys.stderr.flush()
        finally:
            sys.stderr.write("%d: at step 7\n" % os.getpid()); sys.stderr.flush()
            lock.release()
        sys.stderr.write("%d: at step 8\n" % os.getpid()); sys.stderr.flush()
        return pushres(r)

And this is the output from the hung process:

    9876: at step 1
    9876: at step 2
    9876: at step 3
    9876: at step 7

It looks like it's hanging on the call to lock.release(). Should it even be getting the lock in the first place?
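One caveat about reading that trace: on the "unsynced changes" path, the return statement sits inside the inner try block, so step 8 is never reached even when lock.release() returns normally. The sketch below is a stand-alone simulation of the same control flow and is not Mercurial code. FakeLock and unbundle_trace are hypothetical stand-ins, and heads_match stands in for the result of check_heads(). If the sketch matches the real code path, the trace by itself may not show whether the hang is inside lock.release() or somewhere after the function returns. A debug line placed directly after lock.release(), inside the finally block, would help tell the two apart.

```python
# Stand-alone simulation of the instrumented control flow.
# FakeLock and unbundle_trace are hypothetical names, not Mercurial APIs.


class FakeLock:
    """Stand-in for the repository lock; release() only records the call."""

    def __init__(self, log):
        self.log = log

    def release(self):
        self.log.append("released")


def unbundle_trace(heads_match):
    """Return (result, log) for one simulated push.

    heads_match is an assumed stand-in for the result of check_heads().
    """
    log = []

    def run():
        log.append("step 1")
        lock = FakeLock(log)
        log.append("step 2")
        try:
            if not heads_match:
                log.append("step 3")
                # Early return: the finally block below still runs,
                # but the code after it (step 8) is skipped.
                return "unsynced changes"
            log.append("step 4")
            log.append("step 5")
            log.append("step 6")
        finally:
            log.append("step 7")
            lock.release()
        log.append("step 8")
        return "ok"

    result = run()
    return result, log


if __name__ == "__main__":
    print(unbundle_trace(heads_match=False))
    print(unbundle_trace(heads_match=True))
```

In this simulation the failing path records steps 1, 2, 3 and 7 followed by "released", and no step 8. Those are the same four steps as in the log from the hung process.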
--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:30 EDT --- This bug was previously known as _bug_ 3401 at http://mercurial.selenic.com/bts/issue3401
Bulk close: no activity for >2 years -> WONTFIX
Bulk change recent WONTFIX -> new, more descriptive ARCHIVED state (sorry for the spam)