Apache 2.2.27 (Win64) mod_wsgi/3.5 Python 2.7.8

This issue seems an awful lot like 3953; however, that was fixed in 2.6, and I'm seeing this issue in 2.7. If we sync a lot of different repos at once, we get HTTP 500 errors being thrown, with tracebacks like this in the Apache logs:

mod_wsgi (pid=460): Exception occurred processing WSGI script 'E:/webroot/hgweb.wsgi'.
Traceback (most recent call last):
  File "C:\\Python27\\lib\\site-packages\\mercurial\\hgweb\\hgwebdir_mod.py", line 153, in __call__
    return self.run_wsgi(req)
  File "C:\\Python27\\lib\\site-packages\\mercurial\\hgweb\\hgwebdir_mod.py", line 218, in run_wsgi
    return hgweb(repo).run_wsgi(req)
  File "C:\\Python27\\lib\\site-packages\\mercurial\\hgweb\\hgweb_mod.py", line 68, in __init__
    r.baseui.setconfig('ui', 'report_untrusted', 'off', 'hgweb')
  File "C:\\Python27\\lib\\site-packages\\mercurial\\ui.py", line 165, in setconfig
    cfg.set(section, name, value, source)
  File "C:\\Python27\\lib\\site-packages\\mercurial\\config.py", line 64, in set
    self._data[section][item] = value
  File "C:\\Python27\\lib\\site-packages\\mercurial\\util.py", line 237, in __setitem__
    self._list.remove(key)
ValueError: list.remove(x): x not in list

Or:

mod_wsgi (pid=460): Exception occurred processing WSGI script 'E:/webroot/hgweb.wsgi'.
Traceback (most recent call last):
  File "C:\\Python27\\lib\\site-packages\\mercurial\\hgweb\\hgwebdir_mod.py", line 153, in __call__
    return self.run_wsgi(req)
  File "C:\\Python27\\lib\\site-packages\\mercurial\\hgweb\\hgwebdir_mod.py", line 218, in run_wsgi
    return hgweb(repo).run_wsgi(req)
  File "C:\\Python27\\lib\\site-packages\\mercurial\\hgweb\\hgweb_mod.py", line 68, in __init__
    r.baseui.setconfig('ui', 'report_untrusted', 'off', 'hgweb')
  File "C:\\Python27\\lib\\site-packages\\mercurial\\ui.py", line 165, in setconfig
    cfg.set(section, name, value, source)
  File "C:\\Python27\\lib\\site-packages\\mercurial\\config.py", line 64, in set
    self._data[section][item] = value
  File "C:\\Python27\\lib\\site-packages\\mercurial\\util.py", line 237, in __setitem__
    self._list.remove(key)

We're throwing several of these a second on average. I can reproduce it at will. It does not happen at low loads; we only see it when around a hundred simultaneous syncs are happening.

Anything I can do to help, let me know.

--Steve
#3953 was actually fixed in 2.8.2 (released Jan 1), but this does indeed seem identical. As your backtrace fingerprint actually matches 3.1, going to mark this confirmed.
Thanks for the quick response. That's really appreciated. Sorry about the version snafu, I got the Hg and Python versions confused... Long day at the office.

This is severely impacting a rather busy Hg server. We typically see bursts of hundreds of repos being synced simultaneously, and a significant fraction of them are failing now. The race condition happens more and more often as server load increases. Is it appropriate to set the priority to urgent?

If I can be of any assistance, let me know.
Fixed by http://selenic.com/repo/hg/rev/af62f0280a76

Matt Mackall <mpm@selenic.com>

    hgweb: avoid config object race with hgwebdir (issue4326)

    Turns out hgwebdir passes full repo objects to each hgweb request instance,
    but with a shared baseui. We explicitly break the sharing.

(please test the fix)
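For anyone following along, here is a rough sketch of the kind of race the commit message describes and of what "breaking the sharing" means. It is not the actual patch and does not use Mercurial's code: `SortDict` is a hypothetical stand-in, loosely modeled on the `__setitem__` frame in the traceback above, and the interleaving is simulated by hand rather than with real threads so that it is reproducible. Written for Python 3.

```python
class SortDict(dict):
    """Hypothetical insertion-ordered dict (not Mercurial's implementation)."""

    def __init__(self):
        super().__init__()
        self._list = []  # keys in insertion order

    def __setitem__(self, key, value):
        if key in self:
            # The membership test and the remove are not atomic. If another
            # thread removes the key from _list in between, this raises
            # ValueError: list.remove(x): x not in list
            self._list.remove(key)
        self._list.append(key)
        dict.__setitem__(self, key, value)

    def copy(self):
        # Build an independent object with its own _list.
        new = SortDict()
        for k in self._list:
            new[k] = self[k]
        return new


# 1. Shared object: simulate the losing side of the race by hand.
shared = SortDict()
shared['report_untrusted'] = 'on'
# Pretend a concurrent request has already run _list.remove(key)
# but has not yet re-appended the key.
shared._list.remove('report_untrusted')
try:
    shared['report_untrusted'] = 'off'
    raced = False
except ValueError:
    raced = True

# 2. Breaking the sharing: every request works on its own copy.
base = SortDict()
base['report_untrusted'] = 'on'
per_request = [base.copy() for _ in range(3)]
for cfg in per_request:
    cfg['report_untrusted'] = 'off'
```

With per-request copies, no two requests ever touch the same `_list`, so the check-then-remove sequence cannot be interleaved; the cost is one copy of the config per request.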
Ran a test against our test server with the code change provided. Test was 450 simultaneous syncs. No HTTP 500 errors thrown. All fixed! Thanks guys!

--Steve