When handling a compressed bundle, bundlerepo first uncompress it in a temporary file then overlay it on top of existing revlogs. The temporary file is eventually removed when calling .close() or if the __del__() is run. The problem is it never happens in most cases. bundlerepos are usually created through hg.repository(). The problem is repository class was never meant to be closed once used, so bundlerepo specific close() is almost never called. And __del__ is not guaranteed to be run upon program termination. My suggestion is we change that and add an explicit .close() method to the repository class and use it. Besides fixing the bundlerepo case, I believe having a reliable finalizer on localrepo may be helpful. Consider adding something like "incoming()" to revsets. This finalizer would help us getting rid of the remote bundle once done with it. Here is a test reproducing the issue in the -R case: $ hg init repo $ cd repo $ echo a > a $ hg ci -Am t adding a $ echo b > b $ hg ci -Am t adding b $ hg bundle --base 0 ../test.hg 1 changesets found $ cd .. $ hg clone -r 0 repo repo2 adding changesets adding manifests adding file changes added 1 changesets with 1 changes to 1 files updating to branch default 1 files updated, 0 files merged, 0 files removed, 0 files unresolved $ cd repo2 Before importing the bundle $ ls .hg | grep bundle [1] $ hg -R ../test.hg heads changeset: 1:2c197ebeca99 tag: tip user: test date: Thu Jan 01 00:00:00 1970 +0000 summary: t After importing the bundle $ ls .hg | grep bundle hg-bundle-gn6Twm.hg10un
This is an indicator of a memory leak due to cyclic references, so the actual problem is deeper than forgetting to call .close(). And, what is worse, it is starting to arise quite often now: http://selenic.com/repo/hg/rev/dda4ad7c9ea9 I think it is good to keep relying on __del__() to remove dangling bundle files as it is always a fast indicator of something wrong. There is a small tool called "objgraph" that can help dececting reference cycles: http://mg.pov.lt/objgraph/
Perhaps this is a symptom of something but I disagree that relying on __del__ to do anything is reliable. It is not guaranteed to be run upon interpreter termination, *even* in CPython. Look for __del__ in: http://docs.python.org/reference/datamodel.html
I wasn't aware of the fact that the cycle collector does not free objects with a __del__ method defined. In fact, I wasn't even aware of the existance of a cycle collector. This actually explains the memory leak produced by bookmarks in the first place. As bundlerepo has __del__ defined. So, yes, it's either using explicit .close() (and delete the __del__ method to let the cycle collector do its job) or finding and fixing the cycle/leak.
An explicit close may help with issue1944 too. In this case, sshrepo.__del__ seems to be called in all my tests but who knows.
I just stumbled over this today as well (and entered duplicate issue2578 , which I've now redirected to here). I can confirm that __del__ of the bundlerepo class is not called (as seen on Windows). I agree with pmezard that adding an explicit .close() method to the repository class and use it is probably best.
When running test-bundle.t with python2.6 and 8ae7626d8bf1 (current stable tip) I get the following in my temp dir: /tmp/thomas/tmp7SFC8m /tmp/thomas/tmp7SFC8m/.hg /tmp/thomas/tmp7SFC8m/.hg/store /tmp/thomas/tmp7SFC8m/.hg/00changelog.i /tmp/thomas/tmp7SFC8m/.hg/requires /tmp/thomas/tmp7SFC8m/.hg/hg-bundle-i1qYXM.hg10un /tmp/thomas/tmpPjgBYq /tmp/thomas/tmpPjgBYq/.hg /tmp/thomas/tmpPjgBYq/.hg/store /tmp/thomas/tmpPjgBYq/.hg/00changelog.i /tmp/thomas/tmpPjgBYq/.hg/requires /tmp/thomas/tmpPjgBYq/.hg/hg-bundle-4xEKHi.hg10un
Fixed by http://selenic.com/repo/hg/rev/4d875bb546dc Thomas Arendsen Hein <thomas@intevation.de> clone: always close source repository (issue2491) (please test the fix)
works for me, nothing remaining in $TMP after running tests
But not solved for pmezard's case in the initial post ...
affected: hg pull hg outgoing hg log -r 'outgoing()' hg summary --remote hg bundle foo ../test.hg hg bundle -r 'outgoing()' foo (in this case two copies!) hg export -r 'outgoing() (additionally a traceback, see issue3353) hg id ../test.hg hg push (additionally waiting for lock on self) not affected: hg unbundle hg incoming other commands not interacting with other repos, e.g. hg add not testes: extensions subrepos (can a bundle be a subrepo?)
4d875bb546dc closes srcrepo, but hg.clone() returns it. Currently mq's qclone is the only place where the returned srcrepo is used and it looks like it is used without a good reason: sr = hg.repository() ... if opts.get('patches'): patchespath = ui.expandpath(opts.get('patches')) else: patchespath = patchdir(sr) ... sr, dr = hg.clone() ... opts.get('patches') or patchdir(sr) I'm tempted to only return destrepo and adjust the few callers which read the return value, but it is probably safer to not close the repo in hg.clone, but in all callers.
--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:14 EDT --- This bug was previously known as _bug_ 2491 at http://mercurial.selenic.com/bts/issue2491
Appears fixed.