[PATCH 1 of 3] convert: cvsps.py - code to generate changesets from a CVS repository

Matt Mackall mpm at selenic.com
Thu Jun 12 10:43:00 CDT 2008


On Thu, 2008-06-12 at 08:42 +0100, Frank Kingswood wrote:
> >> +    # reusing strings typically saves about 40% of memory
> >>     
> > Very interesting.
> >   
> This is peculiar to what cvsps is trying to do. Changeset log messages 
> will be identical and there might be hundreds of copies in a large 
> changeset. This is even more obvious in the pickle that cvsps stores.

Ahh, right.

> > Try/except turns out to be fairly slow. It's faster and simpler to do:
> >
> > if s not in _scache:
> >     _scache[s] = s
> > return _scache[s]
> >
> > But in this case, we can get by with:
> >
> > return _scache.setdefault(s, s)
> >   
> Good one, thanks. I was just following GvR's assertion that it should be 
> fast enough.

A lot of the Python performance advice out there is a bit dated. The
try/except trick used to be faster, some time pre-2.3. But happily
Python has moved in a good direction here: the above two constructs are
conceptually simpler, shorter, and faster.
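
For illustration, here is a minimal sketch of the string cache written both
ways. Only `_scache` comes from the quoted patch; the function names
`scache_setdefault` and `scache_try` are hypothetical, and the sketch shows
only that the two variants behave the same, not how they compare in speed.

```python
# Assumption: a module-level dict used as the string cache, named as in
# the quoted snippet. The function names below are made up for this sketch.
_scache = {}

def scache_setdefault(s):
    """Return the one shared copy of s, storing s if it is new."""
    # setdefault inserts s -> s when the key is missing and returns
    # the stored value either way.
    return _scache.setdefault(s, s)

def scache_try(s):
    """The older try/except variant, shown only for comparison."""
    try:
        return _scache[s]
    except KeyError:
        _scache[s] = s
        return s

# Two equal log messages built separately collapse to one shared object.
first = ''.join(['fix ', 'typo'])
second = ''.join(['fix ', 'typo'])
shared = scache_setdefault(first)
assert scache_setdefault(second) is shared
assert scache_try(second) is shared
```

Every later copy of the same log message then refers to the one cached
object, which is where the memory saving comes from.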

> >> +        cachefile = ['-'.join(re.findall(r'\w+', s)) for s in cachefile if s]
> >> +        cachefile = os.path.join(cachedir, '.'.join(cachefile))
> >>     
> >
> > No idea what's happening there.

Translation: add a comment.
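
Something along these lines, perhaps. This is a sketch that assumes
`cachefile` starts out as a list of strings (pieces of the CVS root plus the
module name, say); that is not visible in the quoted hunk. The function name
`cache_path` and the sample values are hypothetical.

```python
import os
import re

def cache_path(cachedir, parts):
    """Build a filesystem-safe cache file name from a list of strings."""
    # Skip empty parts. Within each remaining part, keep only the runs of
    # word characters (letters, digits, underscore) and join them with '-',
    # so separators such as '/', ':' and '@' never reach the file name.
    cleaned = ['-'.join(re.findall(r'\w+', s)) for s in parts if s]
    # Join the cleaned parts with '.' and place the file in cachedir.
    return os.path.join(cachedir, '.'.join(cleaned))

# Hypothetical input: pieces of a CVS root followed by a module name.
name = cache_path('cache', ['', 'pserver', 'user@host', '/cvs/root', 'module'])
assert os.path.basename(name) == 'pserver.user-host.cvs-root.module'
```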

-- 
Mathematics is the supreme nostalgia of our time.


