[PATCH] py3: handling .iteritems() vs .items()

Martijn Pieters mj at zopatista.com
Mon Jun 6 05:06:47 EDT 2016


On 5 June 2016 at 08:32, Pulkit Goyal <7895pulkit at gmail.com> wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pulkit at gmail.com>
> # Date 1465140428 -19800
> #      Sun Jun 05 20:57:08 2016 +0530
> # Node ID ae89177a49c51f66962202598a44643c1dd1d18f
> # Parent  118a605e3ad9e1d30c4fd8bacc8310167ae1f222
> py3: handling .iteritems() vs .items()
>
> Using dict.items() instead of dict.iteritems() in
> py2 just for py3 compatibility is not a good idea because
> in py2 dict.items() returns a copy of the dictionary's list of
> pairs. This will take a lot of memory when the dictionary is
> large.
>
> So it would be good to use .viewitems() in py2, so that we can iterate
> more than once, and .items() in py3, as the rest are gone. The .items() in
> py3 has an improved implementation.
>
> Importing from the util module adds to the call cost, which can be
> mitigated by using methodcaller. This approach can be used for
> mercurial/* and hgext/* modules.
> In contrib we have to import mercurial/util.py, that will be costly
> so handling it separately will be better. Also using .items() will
> be costly in contrib. So we have to add this hack everywhere or find
> another way out.
>
> I would like to get more advice on what can be improved
> in this approach and how we can handle the contrib section separately.
>
> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -23,6 +23,7 @@
>  import gc
>  import hashlib
>  import imp
> +import operator
>  import os
>  import re as remod
>  import shutil
> @@ -58,6 +59,8 @@
>  # This line is to make pyflakes happy:
>  urlreq = pycompat.urlreq
>
> +methodcaller = operator.methodcaller
> +
>  if os.name == 'nt':
>      from . import windows as platform
>  else:
> @@ -2830,3 +2833,12 @@
>
>  # convenient shortcut
>  dst = debugstacktrace
> +
> +def viewitems(dict):
> +
> +    # using methodcaller avoids having to create another Python call frame.
> +    if safehasattr(dict, 'viewitems'):
> +        viewitems = methodcaller('viewitems')
> +    else:
> +        viewitems = methodcaller('items')
> +    return viewitems(dict)

Making this a function that is called every time rather defeats the
purpose of using a methodcaller object; you now *still* call a Python
function each time. Set viewitems to the right methodcaller object
**once** and call it each time instead of your function:

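    try:
        # Probe the built-in dict type once, at import time.
        dict.viewitems
        viewitems = operator.methodcaller('viewitems')
    except AttributeError:
        # Python 3: viewitems() is gone and items() already returns a view.
        viewitems = operator.methodcaller('items')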

Now `mercurial.util.viewitems(dict)` will use the methodcaller object
directly and avoid creating a new Python function frame. Using a
methodcaller that way is practically as fast as calling
`dict.viewitems()` or `dict.items()` directly.
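
For illustration, here is a minimal standalone sketch of that pattern,
outside of mercurial/util.py. The names `d` and `pairs` are made up for
the example, and `hasattr` stands in for the try/except probe above; it
is only meant to show that the module-level methodcaller returns a live
view rather than a copied list.

```python
import operator

# Decide once, at import time, which method name to call.
# Python 2 dicts have viewitems(); Python 3 dicts only have items(),
# which already returns a view object.
if hasattr(dict, 'viewitems'):
    viewitems = operator.methodcaller('viewitems')
else:
    viewitems = operator.methodcaller('items')

# Hypothetical example data, not taken from Mercurial.
d = {'a': 1, 'b': 2}

# Calling the methodcaller object invokes d.viewitems() / d.items()
# without going through an extra Python-level wrapper function.
pairs = viewitems(d)

# A view reflects later changes to the dictionary and can be
# iterated more than once.
d['c'] = 3
print(sorted(pairs))
print(len(pairs))
```

Callers would then write `util.viewitems(somedict)` wherever they
previously wrote `somedict.iteritems()`.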




-- 
Martijn Pieters

