[PATCH V2] posix: insert seek between reads and writes on Solaris (issue4943)

Gregory Szorc gregory.szorc at gmail.com
Mon Nov 30 19:18:12 CST 2015


On Mon, Nov 30, 2015 at 4:35 PM, Matt Mackall <mpm at selenic.com> wrote:

> On Mon, 2015-11-30 at 15:28 -0800, Gregory Szorc wrote:
> > On Mon, Nov 30, 2015 at 2:23 PM, Matt Mackall <mpm at selenic.com>
> > wrote:
> >
> > > On Thu, 2015-11-26 at 18:10 -0800, Gregory Szorc wrote:
> > > > # HG changeset patch
> > > > # User Gregory Szorc <gregory.szorc at gmail.com>
> > > > # Date 1447625312 28800
> > > > #      Sun Nov 15 14:08:32 2015 -0800
> > > > # Branch stable
> > > > # Node ID a7d5be598b347ae2c26566aa2bcdcf5cb1acb30b
> > > > # Parent  6979fe2a6d75105affcacd9e298262a92641cb98
> > > > posix: insert seek between reads and writes on Solaris
> > > > (issue4943)
> > > >
> > > > At least some versions of Solaris fail to properly write to file
> > > > descriptors opened for both reading and writing. When the
> > > > descriptor
> > > > is seeked and read, subsequent writes effectively disappear into
> > > > a
> > > > black hole.
> > >
> > > A quick check of all the "a+" users suggests this is restricted to
> > > revlog writing. So I think it'd be much simpler to do:
> > >
> > > diff -r 61fbf5dc12b2 mercurial/scmutil.py
> > > --- a/mercurial/scmutil.py      Tue Nov 24 21:41:12 2015 +0000
> > > +++ b/mercurial/scmutil.py      Mon Nov 30 16:21:33 2015 -0600
> > > @@ -516,6 +516,12 @@
> > >          fp = util.posixfile(f, mode)
> > >          if nlink == 0:
> > >              self._fixfilemode(f)
> > > +
> > > +        # The position when opening in append mode is
> > > implementation
> > > defined, so
> > > +        # make it consistent across platforms by positioning at
> > > EOF
> > > +        if mode in ('a', 'a+'):
> > > +            fp.seek(0, os.SEEK_END)
> > > +
> > >          return fp
> > >
> > >      def symlink(self, src, dst):
> > > diff -r 61fbf5dc12b2 mercurial/windows.py
> > > --- a/mercurial/windows.py      Tue Nov 24 21:41:12 2015 +0000
> > > +++ b/mercurial/windows.py      Mon Nov 30 16:21:33 2015 -0600
> > > @@ -99,12 +99,6 @@
> > >      '''Open a file with even more POSIX-like semantics'''
> > >      try:
> > >          fp = osutil.posixfile(name, mode, buffering) # may raise
> > > WindowsError
> > > -
> > > -        # The position when opening in append mode is
> > > implementation
> > > defined, so
> > > -        # make it consistent with other platforms, which position
> > > at EOF.
> > > -        if 'a' in mode:
> > > -            fp.seek(0, os.SEEK_END)
> > > -
> > >          if '+' in mode:
> > >              return mixedfilemodewrapper(fp)
> > >
> > > Thoughts?
> >
> >
> > My patch is about inserting extra seeks between reads and writes:
> > yours is
> > about seeking when opening in append mode. Apples and oranges.
>
> Indeed!
>
> > Did you intend to suggest inserting the seek to EOF on every write()
> > to a
> > file opened in append mode? That will blow up the number of system
> > calls
> > during writing. It probably doesn't matter. But it feels wasteful on
> > the
> > platforms where it isn't necessary.
>
> Problem is: we don't know where it isn't necessary.
>

> In fact, I'm now pretty unclear on what the problem is. The standard
> problem here is that in POSIX:
>
> fh = open("foo", O_APPEND)
>
> ..does not specify where the file pointer is (and it's known to vary
> from system to system), but Posix DOES specify that all writes first
> reposition the pointer to the end of file.
>
> If Solaris isn't doing that, we've got big problems. But I suspect
> what's actually happening here is not related to the kernel's handling
> of the file offset at all, but is in fact connected to Python's own
> internal I/O buffering. In particular, Python has a single read and
> write buffer that it tries to manage around seeking:
>
> (bufferedio.c)
>
> /*
>     Implementation
> notes:
>
>
>     * BufferedReader, BufferedWriter and BufferedRandom try to share
> most
>       methods (this is helped by the members `readable` and `writable`,
> which
>       are initialized in the respective
> constructors)
>     * They also share a single buffer for reading and writing. This
> enables
>       interleaved reads and writes without flushing. It also makes the
> logic
>       a bit trickier to get
> right.
>     * The absolute position of the raw stream is cached, if possible, in
> the
>       `abs_pos` member. It must be updated every time an operation is
> done
>       on the raw stream. If not sure, it can be reinitialized by
> calling
>       _buffered_raw_tell(), which queries the raw stream
> (_buffered_raw_seek()
>       also does it). To read it, use RAW_TELL().
>
>
> It looks like there are at least 2 bugs here that affect versions we care
> about:
>
> changeset:   71984:cf2010e9f941
> branch:      2.7
> parent:      71979:c816479f6aaf
> user:        Antoine Pitrou <solipsis at pitrou.net>
> date:        Sat Aug 20 15:40:58 2011 +0200
> summary:     Issue #12213: Fix a buffering bug with interleaved reads and
> writes that
>
> changeset:   70060:0d24d4c537a6
> branch:      2.7
> parent:      70051:534a9e274d88
> user:        Antoine Pitrou <solipsis at pitrou.net>
> date:        Fri May 13 00:31:52 2011 +0200
> summary:     Issue #12062: In the `io` module, fix a flushing bug when
> doing a certain
>
> Looks like both of these can trigger if reads and writes occur at
> different positions inside the same 8k buffer window.
>
> Antoine is attached to both of these, so maybe he knows what's up.
>

See also (from hg's repo):

changeset:   36484:3686fa2b8eee
user:        Gregory Szorc <gregory.szorc at gmail.com>
date:        Sun Sep 27 18:46:53 2015 -0700
summary:     windows: insert file positioning call between reads and writes

Also worth repeating what's in the comments in Mercurial's bug tracker (
https://bz.mercurial-scm.org/show_bug.cgi?id=4943) that this issue seems to
go away on Python 3.0. I'm pretty sure CPython is automatically inserting
the extra seek() call on that Python version to work around platform
behavior differences.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20151130/708032c4/attachment.html>


More information about the Mercurial-devel mailing list