When I run hg --traceback convert -s svn http://svn.xxx.com/svn/... ~/hg/... I get this traceback: File "/usr/lib/pymodules/python2.6/hgext/convert/convcmd.py", line 398, in convert c.convert(sortmode) File "/usr/lib/pymodules/python2.6/hgext/convert/convcmd.py", line 312, in convert parents = self.walktree(heads) File "/usr/lib/pymodules/python2.6/hgext/convert/convcmd.py", line 109, in walktree commit = self.cachecommit(n) File "/usr/lib/pymodules/python2.6/hgext/convert/convcmd.py", line 267, in cachecommit commit = self.source.getcommit(rev) File "/usr/lib/pymodules/python2.6/hgext/convert/subversion.py", line 433, in getcommit self._fetch_revisions(revnum, stop) File "/usr/lib/pymodules/python2.6/hgext/convert/subversion.py", line 814, in _fetch_revisions for entry in stream: File "/usr/lib/pymodules/python2.6/hgext/convert/subversion.py", line 122, in __iter__ entry = pickle.load(self._stdout) IOError: [Errno 4] Interrupted system call abort: Interrupted system call Apparently, a signal is sent to Python which interrupts pickle.load(). Is there something which Mercurial can do or do I have to file a bug report with Python? If you can't do anything about this, what signal could that be? SIGCHILD? Or is there a timer running?
Could it be the child process retrieving svn logs is being killed because it consumes too much memory or something like this?
Forgot the version info: hg version 1.5.2 OS: Linux xxx 2.6.32-22-generic-pae #33-Ubuntu SMP i686 GNU/Linux Subversion: version 1.6.6
@pmezard: No, when I call "svn log" or the other operations which I see with "-debug -v", then they proceed. ulimit also looks good: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) unlimited virtual memory (kbytes, -v) unlimited file locks (-x) unlimited
Ok, then could it be the http/svn server closing the connection? Perhaps you can measure how long it takes for the error to happen. To explain a bit, before starting the revision conversion, convert extension discovers the whole revision graph it has to convert. In svn case, it may request large amount of changelog for many branches and your server may not tolerate it. Also, because the svn python bindings where historically very leaky, this operation is performed in a child process, killed once the log is retrieved which probably explains the weird error message. You can try passing --config convert.svn.startrev=1000 or something not too large and see if it works better.
Making title more accurate
I had a strange failure which ended with "Error 200 OK" but that was because of a corrupted Subversion repository which we fixed. Now, I'm using convert to update this first clone so I only need about 10 revisions. There is no timeout, the processing happens very fast and then it fails. time(1) says 1.246s. I doubt it's a memory issue. But killing the child could be the problem. Can you tell me how to delay the killing until after the pickle.load() returns None?
Oh, and it happens sometimes after "scanning source..." and before "sorting..." and sometimes after "converting...". So it's not only "svn log" but also the download of the revisions itself which fails.
Can you run strace on it? strace -o trace.out -f <hg command> That might give some hints.
And can you try to comment this line: http://hg.intevation.org/mercurial/file/b5ef95b5ae14/hgext/convert/subversion.py#l100 An early termination issue was reported a long time ago and it seemed to fix it.
Sometimes, it fails after all the revisions are converted and sometimes, it completes without error. Looks like a race condition. strace makes it better but I got it to fail again ;-) I've cleaned the trace (500KB). Should I add it or sent it to you directly?
Uncommenting exit(0) doesn't help.
What I see is a lot of calls to execve("/usr/local/bin/hg", ["/usr/local/bin/hg", "debugsvnlog"],...) What happens when this script throws an exception? Is such an exception correctly pickled?
One step closer. I added "traceback.print_exc()" in "get_log_child()" in the exception handler for IOError. I get this stack trace: Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/hgext/convert/subversion.py", line 90, in get_log_child receiver) File "/usr/lib/pymodules/python2.6/libsvn/ra.py", line 736, in svn_ra_get_log return apply(_ra.svn_ra_get_log, args) File "/usr/local/lib/python2.6/dist-packages/hgext/convert/subversion.py", line 80, in receiver fp, protocol) IOError: [Errno 32] Broken pipe Like 80 is the call to "pickle.dump()" in the inner function "receiver()".
Add it. Or at least the interesting bits.
Okay, I found a bug but that didn't solve the whole issue: Search for a line for entry in self._getlog([self.module], from_revnum, to_revnum): The returned stream is never closed. You must change this into: stream = self._getlog([self.module], from_revnum, to_revnum) try: for entry in stream: ... finally: stream.close() I'll now analyze the strace dump. I can't post this easily because it contains internal data.
Ouch. Could you give us a "Name <email@domain>" for bug report/fix attribution?
You should check the main loop in _fetch_revisions(): It can stop before all the output of the child has been read which causes the child to terminate on SIGPIPE. I'm Aaron Digulla <digulla@hepe.com>
I pushed a first fix for the missing .close() in gettags() before rereading the issue again. So it's not solved but improving. In crew: http://hg.intevation.org/mercurial/crew/rev/46bb49134498
About the SIGPIPE issue, it seems to be the expected behaviour and in theory python ignores it (http://docs.python.org/library/signal.html). Could you check this is the case on your system? Something like: import signal print signal.getsignal(signal.SIGPIPE) should print '1' (signal.SIG_IGN) when run in get_log_child() Otherwise, we may have to handle it explicitely.
Fixed by http://hg.intevation.org/mercurial/crew/rev/46bb49134498 (convert/svn: close gettags() log stream (issue2196))
It does print '1' (one) in get_log_child() but operations like read(), flush() and close() will still fail with an errno of EPIPE. The problem is that the parent aborts with '[Errno 4] Interrupted system call'. Right now, my guess is that this happens: 1. The parent starts to fetch a log 2. Child starts to send log entries 3. Parent has enough, breaks out of the loop in _fetch_revisions() and closes the input stream. 4. Eventually the child tries to send more log entries which fails with EPIPE. Child terminates. 5. Parent gets unexpected SIGCHILD while reading data from a different child. Maybe the solution would be to terminate the child from the parent instead of killing it implicitly by closing the input stream.
Okay, I have a solution: Instead of returning the the streams in util.popen2/3(), you *must* return the process instance. Then, in logstream.close() and similar places, you *must* call process.wait() so the SIGCHILD signal is properly processed in the parent process.
Here is a patch against 1.5.2 which works around the issue for people who can't wait for 1.5.4. It contains a lot of debug output which I left in because I'm running out of time. Clean up at your leisure :-)
Damn ... that helps but still doesn't always solve the issue :-(
I tried to write a version of logstream.__iter__ which reads the characters from the child individually. This works ... it's not even very slow. But the code is pretty hideous.
What python version do you have? I am suprised this is the first time someone raise this issue, so I'd like to know why.
I'm running Python 2.6.5. Maybe it's because my computer and the lan is so fast that sometimes, child processes terminate before the parent can read everything and sometimes afterwards.
Updated patch for convert for Mercurial 1.6
See http://selenic.com/repo/hg/rev/5a0fdc715769 Augie Fackler <durin42@gmail.com> check-code: disallow use of hasattr()
--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:09 EDT --- This bug was previously known as _bug_ 2196 at http://mercurial.selenic.com/bts/issue2196 Imported an attachment (id=1384) Imported an attachment (id=1385) Imported an attachment (id=1386)
I think I encounter the same issue when using hgsubversion and mercurial 2.6: --- [rM] ... pulled N revisions updating to branch default abort: Interrupted system call --- At other times the error is: abort: Error --- Mercurial gives no further information. hgsubversion: f0bde4bf1f78 Subversion: 1.7.6 bindings: Subvertpy 0.9.0
Bug was inactive for 640 days, archiving