trying to make tracebacks reproducible

Christian Ebert blacktrash at gmx.net
Tue Jul 6 03:16:35 CDT 2010


* Matt Mackall on Monday, July 05, 2010 at 14:05:49 -0500
> On Sun, 2010-07-04 at 01:42 +0200, Christian Ebert wrote:
>> * Christian Ebert on Sunday, July 04, 2010 at 01:19:31 +0200
>>> * Christian Ebert on Sunday, July 04, 2010 at 01:03:50 +0200
>>>> * Christian Ebert on Friday, July 02, 2010 at 02:12:44 +0200
>>>>> * Martin Geisler on Thursday, July 01, 2010 at 23:33:24 +0200
>>>>>> Christian Ebert <blacktrash at gmx.net> writes:
>>>>>>> For a few days now -- sorry for being vague, but this is actually
>>>>>>> part of the problem, I _sometimes_ get tracebacks with
>>>>>>> crew-stable (basically 1.6 I'd say). I cannot reproduce them
>>>>>>> "reliably", i.e. in the following example, I reissued the command
>>>>>>> and got the diff as expected. I can reduce the loaded extensions
>>>>>>> etc. but I'd like to reproduce this reliably first. It seems to
>>>>>>> happen at random - well, in the true sense of the word, if you
>>>>>>> look at the final lines of the traceback.
>>>>>> 
>>>>>> Heh, nice one :-)
>>>>>> 
>>>>>>> I'd be grateful for any ideas.
>>>>>> 
>>>>>>> mod = _origimport(name, globals, locals)
>>>>>>> File "/sw/lib/python2.6/random.py", line 59, in <module>
>>>>>>> LOG4 = _log(4.0)
>>>>>>> ValueError: math domain error
>>>>>> 
>>>>>> Can you make it fail if you do something like
>>>>>> 
>>>>>> while python -c 'import random; print random.LOG4'; do :; done
>>>>> 
>>>>> That's evil!
>>>>> 
>>>>> Yes, I can/could. After rebooting the problem has gone away.
>>>>> Probably I overtortured my machine with multithreaded video
>>>>> conversion. Well, let's hope this is not a sign of senility - of
>>>>> the machine I mean.
>>>> 
>>>> And now for the strangest thing: it does NOT happen with
>>>> Mercurial 1.5! But reliably with 1.6. Will bisect.
>>> 
>>> And the winner is:
>>> 
>>> changeset:   11182:3c368a1c962d
>>> branch:      stable
>>> parent:      11171:3b3261f6d9ba
>>> user:        Brodie Rao <brodie at bitheap.org>
>>> date:        Mon May 03 14:00:34 2010 -0500
>>> summary:     pager: fork and exec pager as parent process
>>> 
>>> Conditions:
>>> 
>>> [extensions]
>>> pager=
>>> [pager]
>>> pager = less -FX
>> 
>> and (!)
>> 
>> [diff]
>> git = True
>> 
>>> and running a cpu intensive video conversion. Then, with every
>>> second or third call -- obviously one that involves the pager --
>>> I get the traceback ...
>> 
>> So: "hg diff" under the above conditions breaks.
>> 
>> Very weird.
> 
> You've probably found a kernel bug. In particular, some failure to
> save/restore floating point state correctly during task switch. Since
> this is expensive, most operating systems try to avoid doing it if
> floating point is not in use by a given task.

I see (sort of). Nothing one can do about it then?

@Brodie out of curiosity: what errors did you get without your
patch using just util.popen (as far as I understand)?
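
For anyone who wants to check their own machine, here is a rough sketch of Martin's loop as a script. It counts failures instead of stopping at the first one, so it can be left running next to a CPU-heavy job. The function name count_import_failures and the default run count are made up for illustration; nothing here comes from Mercurial itself.

```python
import subprocess
import sys


def count_import_failures(runs=20):
    """Start a fresh interpreter `runs` times and count non-zero exits."""
    # "import random" is where the traceback above was raised; the explicit
    # math.log(4.0) repeats the same floating point call in case a given
    # Python version does not compute it at import time.
    cmd = [sys.executable, "-c", "import random, math; math.log(4.0)"]
    failures = 0
    for _ in range(runs):
        # A ValueError in the child shows up as a non-zero exit status.
        if subprocess.call(cmd) != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    print("%d failures" % count_import_failures())
```

On a healthy machine this should report no failures; any non-zero count without Mercurial involved would point away from hg and towards the interpreter or the system underneath it.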

c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org/
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions

