[PATCH 4 of 4] changegroup: increase write buffer size to 128k

Augie Fackler raf at durin42.com
Mon Oct 17 19:56:31 EDT 2016


On Sun, Oct 16, 2016 at 01:35:37PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1476650123 25200
> #      Sun Oct 16 13:35:23 2016 -0700
> # Node ID 4582f12754622ae049afefee05176ef107d99a7e
> # Parent  9e2f957b05ac5c76595280a6084ba01d7b369a05
> changegroup: increase write buffer size to 128k

This patch seems very reasonable. The others make me a tad nervous for
this late in the cycle - I don't /think/ we'll find weird concurrency
problems (past series from you on revlog handling likely have sniffed
out any potential bad actors around concurrency), but in the name of
paranoia let's land those right after we release 4.0. They look fine
to me as-is, for what it's worth, so please remail them on November 2
with me on the CC line.

>
> By default, Python defers to the operating system for choosing the
> default buffer size on opened files. On my Linux machine, the default
> is 4k, which is really small for 2016.
>
> This patch bumps the write buffer size when writing
> changegroups/bundles to 128k. This matches the 128k read buffer
> we already use on revlogs.
>
> It's worth noting that this only affects writes to an explicit
> file (such as during `hg bundle`). Buffers used when writing bundle
> files via the repo vfs or to a temporary file are not affected.
>
> When producing a none-v2 bundle file of the mozilla-unified repository,
> this change caused the number of write() system calls to drop from
> 952,449 to 29,788. After this change, the most frequent system
> calls are fstat(), read(), lseek(), and open(). There were
> 2,523,672 system calls after this patch (so the net decrease of
> ~920k write() calls is substantial).
>
> This change shows no performance change on my system. But I have a
> high-end system with a fast SSD. It is quite possible this change
> will have a significant impact on network file systems, where
> extra network round trips due to excessive I/O system calls could
> introduce significant latency.
>
> diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
> --- a/mercurial/changegroup.py
> +++ b/mercurial/changegroup.py
> @@ -88,17 +88,19 @@ def writechunks(ui, chunks, filename, vf
>      """
>      fh = None
>      cleanup = None
>      try:
>          if filename:
>              if vfs:
>                  fh = vfs.open(filename, "wb")
>              else:
> -                fh = open(filename, "wb")
> +                # Increase default buffer size because default is usually
> +                # small (4k is common on Linux).
> +                fh = open(filename, "wb", 131072)
>          else:
>              fd, filename = tempfile.mkstemp(prefix="hg-bundle-", suffix=".hg")
>              fh = os.fdopen(fd, "wb")
>          cleanup = filename
>          for c in chunks:
>              fh.write(c)
>          cleanup = None
>          return filename
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
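
For anyone who wants to see the effect of the buffer size without
reaching for strace, here is a minimal sketch in Python 3. It is not the
code in changegroup.py. It wraps a raw file in io.BufferedWriter and
counts how often the buffer is flushed down to the raw write() method,
which stands in for the write() system call. CountingFileIO and
count_raw_writes are hypothetical names, and the 100-byte chunk size is
an arbitrary assumption, not a measurement from a real bundle.

```python
import io
import os
import tempfile


class CountingFileIO(io.FileIO):
    """Raw (unbuffered) file that counts calls to its write() method."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.write_calls = 0

    def write(self, data):
        # Each call here corresponds to one write() on the file descriptor.
        self.write_calls += 1
        return super().write(data)


def count_raw_writes(chunks, buffer_size):
    """Write chunks through a buffer of buffer_size bytes and return the
    number of raw write() calls that were needed."""
    fd, path = tempfile.mkstemp(prefix="hg-bundle-demo-", suffix=".bin")
    os.close(fd)
    try:
        raw = CountingFileIO(path, "wb")
        # Closing the BufferedWriter flushes it and closes the raw file.
        with io.BufferedWriter(raw, buffer_size=buffer_size) as fh:
            for c in chunks:
                fh.write(c)
        return raw.write_calls
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Assumed workload: many small chunks, about 1 MB in total.
    chunks = [b"x" * 100] * 10000
    for size in (4096, 131072):
        print(size, count_raw_writes(chunks, size))
```

The larger buffer should need far fewer raw writes for the same data.
Whether that shows up as wall-clock time depends on the file system, as
the commit message already says.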

