[PATCH 4 of 5] changegroup: use filecloser for closing filelog handles

Augie Fackler raf at durin42.com
Wed Sep 30 10:50:15 CDT 2015


On Tue, Sep 29, 2015 at 10:36:30PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1443552606 25200
> #      Tue Sep 29 11:50:06 2015 -0700
> # Node ID a8d8f0d70593aaa6fb2b86dc77d29c021d83277d
> # Parent  4ade2d6126d9b0f67ea898ab0d021abb88d157dd
> changegroup: use filecloser for closing filelog handles

Overall, I like where this is headed, but I do wonder if this should
be part of our vfs layer rather than its own thing - that would also
give us a way to ensure close() operations finish before a file is
subsequently reopened (eg if an inline revlog switches to two-file
during a pull). What do you think?

>
> On a large repository like mozilla-central, writing filelogs during
> changegroup application during a clone opens (and closes) over 200,000
> files. On Windows, file closes are slow and block the main thread while
> they are closing.
>
> Moving filelog file closes to a background thread on Windows frees up
> the main thread to process the next filelog before waiting for the close
> to complete. The end result is faster filelog application.
>
> On my Windows 10 machine (not a VM) and with a modern SSD, this patch
> has a drastic impact on unbundling a snapshot of mozilla-central:
>
> Before: ~910s
> After:  ~609s
>
> That's a 301s or 5:01 wall time reduction!
>
> And I'm convinced this isn't the upper limit of Windows performance: I
> made some local changes to measure time spent waiting for the file
> closer thread's queue to unblock and it said we still spend ~34s of main
> thread time waiting on file closes. I also haven't really experimented
> with the various wait intervals and limits in the file closer patch. But
> since we're already looking at a 5 minute performance win, I'm content
> with where things are.
>
> diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
> --- a/mercurial/changegroup.py
> +++ b/mercurial/changegroup.py
> @@ -657,12 +657,14 @@ def changegroup(repo, basenodes, source)
>      # to avoid a race we use changegroupsubset() (issue1320)
>      return changegroupsubset(repo, basenodes, repo.heads(), source)
>
>  def addchangegroupfiles(repo, source, revmap, trp, pr, needfiles):
> -    if True:
> +    closer = util.filecloser(repo.ui)
> +    try:
>          revisions = 0
>          files = 0
>          while True:
> +            closer.verifystate()
>              chunkdata = source.filelogheader()
>              if not chunkdata:
>                  break
>              f = chunkdata["filename"]
> @@ -670,9 +672,9 @@ def addchangegroupfiles(repo, source, re
>              pr()
>              fl = repo.file(f)
>              o = len(fl)
>              try:
> -                if not fl.addgroup(source, revmap, trp):
> +                if not fl.addgroup(source, revmap, trp, closecb=closer.close):
>                      raise util.Abort(_("received file revlog group is empty"))
>              except error.CensoredBaseError as e:
>                  raise util.Abort(_("received delta base is censored: %s") % e)
>              revisions += len(fl) - o
> @@ -687,8 +689,10 @@ def addchangegroupfiles(repo, source, re
>                          raise util.Abort(
>                              _("received spurious file revlog entry"))
>                  if not needs:
>                      del needfiles[f]
> +    finally:
> +        closer.cleanup()
>          repo.ui.progress(_('files'), None)
>
>      for f, needs in needfiles.iteritems():
>          fl = repo.file(f)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list