[PATCH STABLE] worker: ignore meaningless exit status indication returned by os.waitpid()

Jun Wu quark at fb.com
Thu Feb 23 20:32:29 EST 2017


Looks great to me. Thanks for examining it carefully!

I can confirm it from FreeBSD kernel code [1]:

  sys_wait4  -> kern_wait | sys_wait6
    -> kern_wait6
         -> proc_to_reap # write status

[1]: https://github.com/freebsd/freebsd/blob/127162a8d9ad171b0f45cb8553384d013722a495/sys/kern/kern_exit.c


Excerpts from FUJIWARA Katsunori's message of 2017-02-24 06:05:29 +0900:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1487883654 -32400
> #      Fri Feb 24 06:00:54 2017 +0900
> # Branch stable
> # Node ID d879917b416a305a42ab92a6d3ac2121d6830560
> # Parent  aa25989b0658dcefbd4c1bce7c389f006f22af30
> worker: ignore meaningless exit status indication returned by os.waitpid()
> 
> Before this patch, worker implementation assumes that os.waitpid()
> with os.WNOHANG returns '(0, 0)' for still running child process. This
> is explicitly specified as below in Python API document.
> 
>     os.WNOHANG
>         The option for waitpid() to return immediately if no child
>         process status is available immediately. The function returns
>         (0, 0) in this case.
> 
> On the other hand, POSIX specification doesn't define the "stat_loc"
> value returned by waitpid() with WNOHANG for such child process.
> 
>     http://pubs.opengroup.org/onlinepubs/9699919799/functions/waitpid.html 
> 
> CPython implementation for os.waitpid() on POSIX doesn't take any care
> of this gap, and this may cause unexpected "exit status indication"
> even on POSIX conformance platform.
> 
> For example, os.waitpid() with os.WNOHANG returns non-zero "exit
> status indication" on FreeBSD. This implies os.kill() with own pid or
> sys.exit() with non-zero exit code, even if no child process fails.
> 
> To ignore meaningless exit status indication returned by os.waitpid()
> with os.WNOHANG, this patch skips assignment to problem[0], if
> non-blocking os.waitpid() returns 0 as pid.
> 
> This patch also adds "assert not st", to detect this kind of breakage
> immediately. In this unexpected case, blocking os.waitpid() with
> explicit pid returns '(0, non-zero)' without any exception.
> 
> diff --git a/mercurial/worker.py b/mercurial/worker.py
> --- a/mercurial/worker.py
> +++ b/mercurial/worker.py
> @@ -123,6 +123,15 @@ def _posixworker(ui, func, staticargs, a
>              if p:
>                  pids.discard(p)
>                  st = _exitstatus(st)
> +            elif not blocking:
> +                # ignore st in this case, because it might be non-zero
> +                # on some platforms, even though it should be zero
> +                # according to Python document
> +                #
> +                # See also https://bugs.python.org/issue27808 
> +                continue
> +            else:
> +                assert not st
>              if st and not problem[0]:
>                  problem[0] = st
>      def sigchldhandler(signum, frame):


More information about the Mercurial-devel mailing list