[PATCH] worker: change partition strategy to every Nth element

Augie Fackler raf at durin42.com
Mon Feb 22 17:46:33 EST 2016


On Sat, Feb 20, 2016 at 04:02:11PM -0800, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1456012604 28800
> #      Sat Feb 20 15:56:44 2016 -0800
> # Node ID 81b1e7ce39d837e352e5ff9e799e512600ab4ba6
> # Parent  8449ef66f732a71008dc887ffd6efbfb6dc64ee0
> worker: change partition strategy to every Nth element

queued this, thanks

>
> The only consumer of the worker pool code today is `hg update`.
>
> Previously, the algorithm to partition work to each worker process
> preserved input list ordering. We'd take the first N elements, then
> the next N elements, etc. Measurements on mozilla-central demonstrate
> this isn't an optimal partitioning strategy.
>
> I added debug code to print when workers were exiting. When performing
> a working copy update on a previously empty working copy of
> mozilla-central, I noticed that process lifetimes were all over the
> map. One worker would complete after 7s. Many would complete after
> 12s. And another worker would often take >16s. This behavior occurred
> for many worker process counts and was more pronounced on some than
> others.
>
> What I suspect is happening is some workers end up with lots of
> small files and others with large files. This is because the update
> code passes in actions according to sorted filenames. And, directories
> under tend to accumulate similar files. For example, test directories
> often consist of many small test files and media directories contain
> binary (often larger) media files.
>
> This patch changes the partitioning algorithm to select every Nth
> element from the input list. Each worker thus has a similar composition
> of files to operate on.
>
> The result of this change is that worker processes now all tend to exit
> around the same time. The possibility of a long pole due to being
> unlucky and receiving all the large files has been mitigated. Overall
> execution time seems to drop, but not by a statistically significant
> amount on mozilla-central. However, repositories with directories
> containing many large files will likely show a drop.
>
> There shouldn't be any regressions due to partial manifest decoding
> because the update code already iterates the manifest to determine
> what files to operate on, so the manifest should already be decoded.
>
> diff --git a/mercurial/worker.py b/mercurial/worker.py
> --- a/mercurial/worker.py
> +++ b/mercurial/worker.py
> @@ -147,19 +147,16 @@ def _posixexitstatus(code):
>      elif os.WIFSIGNALED(code):
>          return -os.WTERMSIG(code)
>
>  if os.name != 'nt':
>      _platformworker = _posixworker
>      _exitstatus = _posixexitstatus
>
>  def partition(lst, nslices):
> -    '''partition a list into N slices of equal size'''
> -    n = len(lst)
> -    chunk, slop = n / nslices, n % nslices
> -    end = 0
> -    for i in xrange(nslices):
> -        start = end
> -        end = start + chunk
> -        if slop:
> -            end += 1
> -            slop -= 1
> -        yield lst[start:end]
> +    '''partition a list into N slices of roughly equal size
> +
> +    The current strategy takes every Nth element from the input. If
> +    we ever write workers that need to preserve grouping in input
> +    we should consider allowing callers to specify a partition strategy.
> +    '''
> +    for i in range(nslices):
> +        yield lst[i::nslices]
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list