[PATCH 1 of 6] util: add iterfile to workaround a fileobj.__iter__ issue with EINTR
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Mon Nov 14 19:31:58 EST 2016
On 11/14/2016 11:49 PM, Gregory Szorc wrote:
> On Mon, Nov 14, 2016 at 3:35 PM, Jun Wu <quark at fb.com
> <mailto:quark at fb.com>> wrote:
>
> # HG changeset patch
> # User Jun Wu <quark at fb.com <mailto:quark at fb.com>>
> # Date 1479166374 0
> # Mon Nov 14 23:32:54 2016 +0000
> # Node ID 7e3bb7754d338399dee3ee41770b7d6624b81fc1
> # Parent 038547a14d850f14ecd2671852093dc07848a134
> # Available At https://bitbucket.org/quark-zju/hg-draft
> <https://bitbucket.org/quark-zju/hg-draft>
> # hg pull https://bitbucket.org/quark-zju/hg-draft
> <https://bitbucket.org/quark-zju/hg-draft> -r 7e3bb7754d33
> util: add iterfile to workaround a fileobj.__iter__ issue with EINTR
>
> The fileobj.__iter__ implementation in Python 2.7.12 (hg changeset
> 45d4cea97b04) is buggy: it cannot handle EINTR correctly.
>
> In Objects/fileobject.c:
>
> size_t Py_UniversalNewlineFread(....) {
> ....
> if (!f->f_univ_newline)
> return fread(buf, 1, n, stream);
> ....
> }
>
> According to the "fread" man page:
>
> If an error occurs, or the end of the file is reached, the
> return value
> is a short item count (or zero).
>
> Therefore it's possible for "fread" (and "Py_UniversalNewlineFread") to
> return a positive value while errno is set to EINTR and ferror(stream)
> changes from zero to non-zero.
>
> There are multiple "Py_UniversalNewlineFread": "file_read",
> "file_readinto",
> "file_readlines", "readahead". While the first 3 have code to handle the
> EINTR case, the last one "readahead" doesn't:
>
> static int readahead(PyFileObject *f, Py_ssize_t bufsize) {
> ....
> chunksize = Py_UniversalNewlineFread(
> f->f_buf, bufsize, f->f_fp, (PyObject *)f);
> ....
> if (chunksize == 0) {
> if (ferror(f->f_fp)) {
> PyErr_SetFromErrno(PyExc_IOError);
> ....
> }
> }
> ....
> }
>
> It means "readahead" could ignore EINTR, if "Py_UniversalNewlineFread"
> returns a non-zero value. And at the next time "readahead" got
> executed, if
> "Py_UniversalNewlineFread" returns 0, "readahead" would raise a
> Python error
> without a incorrect errno - could be 0 - thus "IOError: [Errno 0]
> Error".
>
> The only user of "readahead" is "readahead_get_line_skip".
> The only user of "readahead_get_line_skip" is "file_iternext", aka.
> "fileobj.__iter__", which should be avoided.
>
> There are multiple places where the pattern "for x in fp" is used. This
> patch adds a "iterfile" method in "util.py" so we can migrate our
> code from
> "for x in fp" to "fox x in util.iterfile(fp)".
>
>
> Bleh. Is this bug reported upstream? (Not that it helps us much.)
>
> This seems like a pretty trivial workaround. So LGTM.
This seems okay and I trust Greg review here. This is pushed.
--
Pierre-Yves David
More information about the Mercurial-devel
mailing list