Upon upgrading various servers at Mozilla from 4.1.2 to 4.2.2, we started seeing a number of abandoned transactions on the server. This appears to occur with both SSH and HTTP servers. I believe e9646ff34d55 and 1bfb9a63b98e are to blame. Before those changesets, various IOError during ui._write_err() and ui.flush() were ignored. After those changesets, they get converted into exceptions. What I think is happening is something like this: 1) Client connects 2) Transaction opened on server 3) Client disconnects via ^C 4) Server-side process keeps going or triggers a KeyboardInterrupt, which causes the transaction to unwind 5) `self.report(_("transaction abort!\n"))` from transaction._abort (or any other write or flush on ui) attempts to write to a pipe that has disappeared, triggering IOError. 6) Post e9646ff34d55 and 1bfb9a63b98e, the IOError/StdioError is unhandled by transaction.py, leading to an orphaned journal file and abandonded transaction errors Essentially, the underlying bug is that IOError on stdout and stderr interfere with transaction semantics.
Bug marked urgent for 10 days, bumping
Fixed by cde4cfeb6e3e "ui: restore behavior to ignore some I/O errors (issue5658)"
Bug was set to TESTING for 7 days, resolving
Fixed by https://mercurial-scm.org/repo/hg/rev/2debf1e3cfa4 Gregory Szorc <gregory.szorc@gmail.com> tests: test behavior of IOError during transactions (issue5658) ui._write(), ui._write_err(), and ui.flush() all trap IOError and re-raise as error.StdioError. If a caller doesn't catch StdioError when writing to stdio, it could bubble all the way to dispatch. This commit adds tests for I/O failures around various transaction operations. The most notable badness is during abort. Here, an uncaught StdioError will result in incomplete transaction rollback, requiring an `hg rollback` to recover. This can result in a client "corrupting" a remote repo via terminated HTTP and SSH socket. (please test the fix)
Bug was set to TESTING for 12 days, resolving