cvs2hg memory use

Michael Haggerty mhagger at alum.mit.edu
Tue Aug 4 15:31:15 CDT 2009


Greg Ward wrote:
> One oddity: cvs2svn disables garbage collection because it takes pains
> to create no cyclic data structures.  Mercurial presumably takes no
> such pains, and would no doubt benefit from occasional GC.  In fact,
> I'm going to add a gc.collect() call right at the end of the above
> method and see if it helps.

FYI, garbage collection was disabled in cvs2svn because (1) we don't
create any cycles (well, we create cycles in one or two places but we
break them explicitly), and (2) the garbage collector was causing big
pauses in BreakCVSSymbolChangesetLoopsPass while it looped through
memory accomplishing nothing.  I don't remember whether the gc pauses
constituted a significant fraction of the total runtime, but they were
definitely big enough to massively confuse the profiling that I was
doing at the time.
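
To illustrate point (1), here is a small sketch showing that reference counting alone frees a cycle once it has been broken by hand, even with the collector off. It assumes CPython's reference-counting behaviour and modern Python 3 syntax; the Node class and the function name are made up for the example and are not taken from cvs2svn.

```python
import gc
import weakref


class Node(object):
    """Hypothetical object that can point at a peer, forming a cycle."""

    def __init__(self):
        self.peer = None


def freed_by_refcount_alone():
    """Return True if an explicitly broken cycle is freed with gc disabled."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        a, b = Node(), Node()
        a.peer, b.peer = b, a      # a <-> b is a reference cycle
        probe = weakref.ref(a)     # lets us observe when 'a' is freed
        a.peer = None              # break the cycle explicitly
        del a, b                   # refcounts drop to zero, no gc needed
        return probe() is None
    finally:
        # Restore whatever collector state the caller had.
        if was_enabled:
            gc.enable()
```

If the line that breaks the cycle is removed, the two objects stay alive until a collection runs, which is the situation the collector exists for.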

If you are willing to live with the pauses, there is no reason that you
cannot enable garbage collection in cvs2hg.  But then you might also
want to disable the check_for_garbage() mechanism in pass_manager.py,
perhaps by wrapping it in "if not gc.isenabled():" so that the check
runs only while the collector is off.
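
A minimal sketch of such a guard, assuming the goal is to skip the check whenever the collector is switched on. The check_for_garbage() below is a hypothetical stand-in that just forces a collection and reports what it found; it is not the code from pass_manager.py, and the wrapper name is invented for the example.

```python
import gc


def check_for_garbage():
    """Hypothetical stand-in: force a collection and report unreachable objects."""
    unreachable = gc.collect()
    if unreachable:
        print("warning: %d unreachable objects found" % unreachable)
    return unreachable


def check_for_garbage_unless_gc_enabled():
    """Run the check only while automatic collection is off.

    Returns None when the check is skipped, otherwise the number of
    unreachable objects that the forced collection found.
    """
    if gc.isenabled():
        # The collector is already cleaning up cycles; nothing to report.
        return None
    return check_for_garbage()
```

With the collector disabled the wrapper behaves like the plain check; with it enabled the call is a no-op.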

Actually, I expect that cvs2hg only generates garbage in OutputPass, so
you might consider enabling garbage collection only in that pass and
have the best of both worlds.
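
One way to sketch the per-pass idea is a context manager that turns the collector on for a single pass and restores the previous state afterwards. Everything here is an assumption made for illustration: the gc_enabled() helper, the run_passes() driver, the run() method, and selecting the pass by class name are not part of cvs2svn or cvs2hg.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_enabled():
    """Enable the collector for the duration of the block, then restore it."""
    was_enabled = gc.isenabled()
    gc.enable()
    try:
        yield
    finally:
        if not was_enabled:
            gc.collect()   # sweep up any cycles left over from the block
            gc.disable()


def run_passes(passes, gc_pass_names=("OutputPass",)):
    """Run each pass with gc off, except the passes named in gc_pass_names."""
    gc.disable()
    for p in passes:
        if type(p).__name__ in gc_pass_names:
            with gc_enabled():
                p.run()
        else:
            p.run()
```

The final gc.collect() before disabling is a design choice: it makes sure cycles created late in the pass do not linger once automatic collection is off again.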

Michael

