Speed on Windows for big repos

Matt Mackall mpm at selenic.com
Sun Jan 27 18:52:17 CST 2008


On Mon, 2008-01-28 at 01:09 +0100, Adrian Buehlmann wrote:
>  > hg clone --noupdate --time --lsprof http://www.selenic.com/hg/ mercurial
> requesting all changes
> adding changesets
> adding manifests
> adding file changes
> added 5930 changesets with 11128 changes to 800 files
>     CallCount     Total(s)    Inline(s) module:lineno(function)
>          3859    105.8043    105.8043   <win32file.FlushFileBuffers>

That looks interesting. And completely wrong. That's from here:

mercurial/util_win32.py:
class posixfile_nt(object):
...
    def flush(self):
        try:
            win32file.FlushFileBuffers(self.handle)
        except pywintypes.error, err:
            raise WinIOError(err)

Here's what file.flush is supposed to do:
 |  flush(...)
 |      flush() -> None.  Flush the internal I/O buffer.
 |  

And here's what we're doing:

FlushFileBuffers(hFile)

Clears the buffers for the specified file and causes all buffered data
to be written to the file.

Now the wording in both cases isn't very clear, but in the first case
the buffer is internal to the application or the C library, and it gets
pushed out to the OS (so it's like fflush(3)). In the second case the
buffers are the operating system's, and they get pushed out to disk
(like fsync(2)). The first is fast and the second is gawdawful slow.
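
To make the distinction concrete, here is a minimal sketch in plain
Python, not Mercurial code. The helper names append_records and compare
are made up for illustration. It flushes after every write, and
optionally also calls os.fsync(), which on Windows goes through the C
runtime's _commit() and should end up doing the same kind of work as
FlushFileBuffers.

```python
import os
import tempfile
import time


def append_records(path, records, sync_to_disk):
    """Append each record, flushing after every write.

    flush() only hands Python's buffered bytes to the OS (like fflush(3)).
    With sync_to_disk=True we also call os.fsync(), which asks the OS to
    push its own buffers out to the disk (like fsync(2)).
    """
    with open(path, "ab") as f:
        for record in records:
            f.write(record)
            f.flush()
            if sync_to_disk:
                os.fsync(f.fileno())


def compare(count=200):
    """Return (flush_only_seconds, flush_plus_fsync_seconds)."""
    records = [b"x" * 64] * count
    timings = []
    for sync_to_disk in (False, True):
        # Hypothetical scratch file, created only for this comparison.
        fd, path = tempfile.mkstemp()
        os.close(fd)
        try:
            start = time.time()
            append_records(path, records, sync_to_disk)
            timings.append(time.time() - start)
        finally:
            os.remove(path)
    return tuple(timings)
```

Calling compare() on a machine with a real disk will typically show the
second number well above the first, though the exact ratio depends on
the OS, filesystem and hardware.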

So what do we need to do for flush() here? Probably nothing at all. As
far as I can tell, WriteFile() is more or less analogous to write(2) in
UNIX, which doesn't have any application-side buffering, so flush is
unneeded.
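
A rough sketch of that idea, using os.write() as a stand-in for
WriteFile(). This is an illustration under that assumption, not a patch
for posixfile_nt, and the class name DirectWriter is hypothetical. Each
write goes straight to the OS, so there is nothing left for flush() to
push.

```python
import os


class DirectWriter(object):
    """File-like writer with no application-side buffer (illustrative)."""

    def __init__(self, path):
        # O_BINARY only exists on Windows; fall back to 0 elsewhere.
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
        flags |= getattr(os, "O_BINARY", 0)
        self.fd = os.open(path, flags, 0o644)

    def write(self, data):
        # os.write() may accept fewer bytes than offered, so loop until
        # everything has been handed to the OS.
        view = memoryview(data)
        while len(view):
            written = os.write(self.fd, view)
            view = view[written:]

    def flush(self):
        # Nothing is buffered on our side, so there is nothing to do.
        pass

    def close(self):
        os.close(self.fd)
```

Data written this way is visible to other readers of the file right
after write() returns, without any flush and without forcing it to
disk.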

-- 
Mathematics is the supreme nostalgia of our time.


