[PATCH] hgwebdir: return memory to the OS after each request

Matt Mackall mpm at selenic.com
Tue Jul 3 14:09:13 CDT 2012


On Tue, 2012-07-03 at 03:10 +0200, Antoine Pitrou wrote:
> # HG changeset patch
> # User Antoine Pitrou <solipsis at pitrou.net>
> # Date 1341270506 -7200
> # Node ID 41cb7af1d92aa7c898634bfd49fea06388bcf666
> # Parent  f7a2849ef8cdd0ff3662b300702e40d55109d49b
> hgwebdir: return memory to the OS after each request
> 
> hgwebdir doesn't cache repositories, but memory can nevertheless
> build up across requests due to delayed deallocation by Python's
> memory management routines.
> By forcing invalidation of caches and garbage collection after
> each request forwarded to hgweb, we manage to eliminate memory
> retention on hgwebdir instances on a (Python 2.7.3, Linux) setup.
> 
> Average memory consumption is now around 1.5 GB on hg.python.org
> (a mod_wsgi setup with daemon processes running hgwebdir instances),
> while it was around 4 GB before this patch.

> +                                repo.invalidate()
> +                                del repo
> +                                gc.collect()

I'm afraid I think this patch is about three steps backwards.

a) we should eventually share repo objects across requests
b) ..so that we can actually take advantage of their caches
c) we should avoid relying on the cyclic garbage collector

If you've found evidence of a reference cycle on repo objects, that's
newsworthy around here, and something we should properly find and fix,
not paper over.
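
For anyone who wants to hunt for such a cycle, here is a minimal sketch of one
approach. FakeRepo and find_cyclic_garbage are hypothetical names used only
for illustration, not Mercurial code. With gc.DEBUG_SAVEALL set, objects that
only the cyclic collector could free are kept in gc.garbage for inspection
instead of being released.

```python
import gc

class FakeRepo(object):
    """Hypothetical stand-in for a repo object; not Mercurial code."""
    def __init__(self):
        # Storing a bound method on the instance creates a cycle:
        # instance -> attribute -> bound method -> instance
        self.callback = self.invalidate

    def invalidate(self):
        pass

def find_cyclic_garbage(factory):
    """Return the objects from factory() that only the cyclic GC could free."""
    gc.collect()                      # flush garbage that predates the test
    del gc.garbage[:]
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        obj = factory()
        del obj                       # a cycle-free object is freed right here
        gc.collect()
        found = list(gc.garbage)
    finally:
        gc.set_debug(0)
        del gc.garbage[:]
    return found

leaked = find_cyclic_garbage(FakeRepo)
print(sorted(set(type(o).__name__ for o in leaked)))
```

If the list contains the object's own type, something it holds points back at
it, and that back-reference is the thing to track down.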

Are you using any extensions? What does your WSGI configuration look
like? How many threads?

I see that a quick hack to call gc.disable() does indeed quickly show
problems, and that it's worse in more recent versions.
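
The idea behind the gc.disable() check can be sketched roughly as follows,
assuming CPython's reference counting. Node, make_plain, make_cycle and
freed_without_gc are hypothetical names, not anything in Mercurial. With the
cyclic collector off, an object that is part of a cycle stays alive after its
last external reference is dropped, and a weak reference makes that visible.

```python
import gc
import weakref

class Node(object):
    """Hypothetical object used only for this illustration."""
    pass

def make_plain():
    return Node()

def make_cycle():
    n = Node()
    n.me = n          # self-reference: refcount never reaches zero
    return n

def freed_without_gc(make):
    """True if reference counting alone frees the object built by make()."""
    was_enabled = gc.isenabled()
    gc.disable()                      # take the cyclic collector out of play
    try:
        obj = make()
        ref = weakref.ref(obj)
        del obj                       # drop the only external reference
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()

print(freed_without_gc(make_plain))   # True on CPython
print(freed_without_gc(make_cycle))   # False on CPython
gc.collect()                          # clean up the leftover cycle
```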

-- 
Mathematics is the supreme nostalgia of our time.



