[PATCH STABLE?] tags: create new sortdict for performance reasons
Gregory Szorc
gregory.szorc at gmail.com
Sat Nov 21 18:18:13 CST 2015
On Sat, Nov 21, 2015 at 2:06 PM, Augie Fackler <raf at durin42.com> wrote:
> On Thu, Nov 12, 2015 at 01:17:50PM -0800, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc at gmail.com>
> > # Date 1447362964 28800
> > # Thu Nov 12 13:16:04 2015 -0800
> > # Node ID c541bc4fc379f76d0cf30354942f78aaddd26f04
> > # Parent 150fb782ef80919e1c8cc7d4b047d78f889c7846
> > tags: create new sortdict for performance reasons
>
> squinting at this lightly, I suspect sortdict could go away in favor
> of collections.ordereddict now...
>
Requires 2.7 :/
>
> >
> > sortdict internally maintains a list of keys in insertion order. When a
> > key is replaced via __setitem__, we .remove() from this list. This
> > involves a linear scan and array adjustment. This is an expensive
> > operation.
> >
> > The tags reading code was calling into sortdict.__setitem__ for each tag
> > in a read .hgtags revision. For repositories with thousands of tags or
> > thousands of .hgtags revisions, the overhead from list.remove()
> > noticeable.
> >
> > This patch creates a new sortdict() so __setitem__ calls don't incur a
> > list.remove.
> >
> > This doesn't appear to have any performance impact on my Firefox
> > repository. But that's only because tags reading doesn't show up in
> > profiles to begin with. I'm still waiting to hear from a user with over
> > 10,000 tags and hundreds of heads on the impact of this patch.
> >
> > diff --git a/mercurial/tags.py b/mercurial/tags.py
> > --- a/mercurial/tags.py
> > +++ b/mercurial/tags.py
> > @@ -220,11 +220,15 @@ def _readtags(ui, repo, lines, fn, recod
> > All node ids are binary, not hex.
> > '''
> > filetags, nodelines = _readtaghist(ui, repo, lines, fn,
> recode=recode,
> > calcnodelines=calcnodelines)
> > + # util.sortdict().__setitem__ is much slower at replacing then
> inserting
> > + # new entries. The difference can matter if there are thousands of
> tags.
> > + # Create a new sortdict to avoid the performance penalty.
> > + newtags = util.sortdict()
> > for tag, taghist in filetags.items():
> > - filetags[tag] = (taghist[-1], taghist[:-1])
> > - return filetags
> > + newtags[tag] = (taghist[-1], taghist[:-1])
> > + return newtags
> >
> > def _updatetags(filetags, tagtype, alltags, tagtypes):
> > '''Incorporate the tag info read from one file into the two
> > dictionaries, alltags and tagtypes, that contain all tag
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel at selenic.com
> > https://selenic.com/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20151121/4dfed33a/attachment.html>
More information about the Mercurial-devel
mailing list