trying to make tracebacks reproducible

Matt Mackall mpm at selenic.com
Mon Jul 5 14:05:49 CDT 2010


On Sun, 2010-07-04 at 01:42 +0200, Christian Ebert wrote:
> * Christian Ebert on Sunday, July 04, 2010 at 01:19:31 +0200
> > * Christian Ebert on Sunday, July 04, 2010 at 01:03:50 +0200
> >> * Christian Ebert on Friday, July 02, 2010 at 02:12:44 +0200
> >>> * Martin Geisler on Thursday, July 01, 2010 at 23:33:24 +0200
> >>>> Christian Ebert <blacktrash at gmx.net> writes:
> >>>>> For a few days now -- sorry for being vague, but this is actually
> >>>>> part of the problem -- I _sometimes_ get tracebacks with
> >>>>> crew-stable (basically 1.6 I'd say). I cannot reproduce them
> >>>>> "reliably", i.e. in the following example, I reissued the command
> >>>>> and got the diff as expected. I can reduce the loaded extensions
> >>>>> etc. but I'd like to reproduce this reliably first. It seems to
> >>>>> happen at random - well, in the true sense of the word, if you
> >>>>> look at the final lines of the traceback.
> >>>> 
> >>>> Heh, nice one :-)
> >>>> 
> >>>>> I'd be grateful for any ideas.
> >>>> 
> >>>>> mod = _origimport(name, globals, locals)
> >>>>> File "/sw/lib/python2.6/random.py", line 59, in <module>
> >>>>> LOG4 = _log(4.0)
> >>>>> ValueError: math domain error
> >>>> 
> >>>> Can you make it fail if you do something like
> >>>> 
> >>>> while python -c 'import random; print random.LOG4'; do :; done
> >>> 
> >>> That's evil!
> >>> 
> >>> Yes, I can/could. After rebooting the problem has gone away.
> >>> Probably I overtortured my machine with multithreaded video
> >>> conversion. Well, let's hope this is not a sign of senility - of
> >>> the machine I mean.
> >> 
> >> And now for the strangest thing: it does NOT happen with
> >> Mercurial 1.5! But reliably with 1.6. Will bisect.
> > 
> > And the winner is:
> > 
> > changeset:   11182:3c368a1c962d
> > branch:      stable
> > parent:      11171:3b3261f6d9ba
> > user:        Brodie Rao <brodie at bitheap.org>
> > date:        Mon May 03 14:00:34 2010 -0500
> > summary:     pager: fork and exec pager as parent process
> > 
> > Conditions:
> > 
> > [extensions]
> > pager=
> > [pager]
> > pager = less -FX
> 
> and (!)
> 
> [diff]
> git = True
> 
> > and running a cpu intensive video conversion. Then, with every
> > second or third call -- obviously one that involves the pager --
> > I get the traceback ...
> 
> So: "hg diff" under the above conditions breaks.
> 
> Very weird.

You've probably found a kernel bug: in particular, some failure to
save/restore floating point state correctly during a task switch. Since
saving and restoring that state is expensive, most operating systems try
to avoid doing it if floating point is not in use by a given task.
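
If you want to count how often this happens under load, here is a rough
Python version of Martin's loop. It is only a sketch: the names run_once
and count_failures are made up for illustration, it assumes Python 3.7 or
later, and it checks math.log(4.0) directly, since that is the call that
raised in random.py.

```python
import math
import subprocess
import sys

# Child program: the same computation random.py did at import time
# in the traceback above (LOG4 = _log(4.0)).
CHILD = "import math; print(repr(math.log(4.0)))"

# Natural log of 4 as a double; used only for a tolerant comparison.
EXPECTED_LOG4 = 1.3862943611198906


def run_once():
    """Run the check in a fresh interpreter; return (ok, detail)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        # The child raised, e.g. "ValueError: math domain error".
        return False, proc.stderr.strip()
    output = proc.stdout.strip()
    try:
        value = float(output)
    except ValueError:
        # The child printed something that is not a number at all.
        return False, output
    # A wrong or non-finite result (nan, inf) also counts as a failure.
    return math.isclose(value, EXPECTED_LOG4, rel_tol=1e-12), output


def count_failures(runs):
    """Spawn `runs` fresh interpreters and count the ones that misbehave."""
    failures = 0
    for i in range(runs):
        ok, detail = run_once()
        if not ok:
            failures += 1
            print("run %d failed: %s" % (i, detail))
    return failures
```

Calling count_failures(1000) while the video conversion is running should
report a nonzero count if the floating point state is being clobbered; on
a healthy machine it should stay at zero.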

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list