[PATCH 1 of 4 V2] mercurial: implement diff and join for dicts

Kevin Bullock kbullock+mercurial at ringworld.org
Tue Mar 26 11:31:34 CDT 2013


On 25 Mar 2013, at 7:51 PM, Siddharth Agarwal wrote:

> # HG changeset patch
> # User Siddharth Agarwal <sid0 at fb.com>
> # Date 1364258439 25200
> #      Mon Mar 25 17:40:39 2013 -0700
> # Node ID bc4c228833d0a2bcc7c9a0e3ba4c364fe9157b4e
> # Parent  f0d16e97f0b228468807a23fb2b9dc17d5cc6f52
> mercurial: implement diff and join for dicts
> 
> Given two dicts, diff returns a dict containing all the keys that are present
> in one dict but not the other, or whose values are different between the
> dicts. The values are pairs of the values from the dicts, with missing values
> being represented as an optional argument, defaulting to None.
> 
> Given two dicts, join performs what is known as an outer join in relational
> database land: it returns a dict containing all the keys across both dicts.
> The values are pairs as above, except they aren't compared to see if they're
> the same.
> 
> diff --git a/mercurial/dicthelpers.py b/mercurial/dicthelpers.py
> new file mode 100644
> --- /dev/null
> +++ b/mercurial/dicthelpers.py
> @@ -0,0 +1,35 @@
> +# dicthelpers.py - helper routines for Python dicts
> +#
> +# Copyright 2013 Facebook
> +#
> +# This software may be used and distributed according to the terms of the
> +# GNU General Public License version 2 or any later version.
> +
> +def _diffjoin(d1, d2, default, compare):
> +    res = {}
> +    if d1 is d2 and compare:
> +        # same dict, so diff is empty
> +        return res
> +
> +    for k1, v1 in d1.iteritems():
> +        if k1 in d2:
> +            v2 = d2[k1]
> +            if not compare or v1 != v2:
> +                res[k1] = (v1, v2)
> +        else:
> +            res[k1] = (v1, default)
> +
> +    if d1 is d2:
> +        return res
> +
> +    for k2 in d2:
> +        if k2 not in d1:
> +            res[k2] = (default, d2[k2])
> +
> +    return res
> +
> +def diff(d1, d2, default=None):
> +    return _diffjoin(d1, d2, default, True)
> +
> +def join(d1, d2, default=None):
> +    return _diffjoin(d1, d2, default, False)

I'm a bit uneasy about the names of these, given that they're not closed over their domain: they take dicts of values but return dicts of value pairs. I'd expect 'diff' to be something like a set difference, yielding all the (k, v) pairs in d1 that are not in d2.
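
For illustration, the set-difference reading could look roughly like the sketch below. The name `setdiff` is hypothetical and not something in the patch, and the sketch uses `items()` rather than `iteritems()` so that it also runs on Python 3.

```python
def setdiff(d1, d2):
    """Return the (k, v) pairs of d1 that do not appear in d2.

    Hypothetical helper for illustration only; it is not part of the patch.
    The result maps keys to plain values, the same shape as the inputs.
    """
    res = {}
    for k, v in d1.items():
        # keep the pair unless d2 holds the same key with an equal value
        if k not in d2 or d2[k] != v:
            res[k] = v
    return res

d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'a': 1, 'b': 20, 'd': 4}
print(setdiff(d1, d2))  # {'b': 2, 'c': 3}
```

The result here has the same shape as the inputs (key to value), whereas the patch's diff maps each key to a pair of values.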

'join' is better, since it does result in something like a full outer join (tuples with the corresponding values from each dict, or default/None). Given that, I guess 'diff' is really a left join.
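
To compare the two side by side, here is a rough port of the patch's functions run on a small input. It is a sketch rather than the patch itself: `items()` replaces `iteritems()` so it runs on Python 3, the `d1 is d2` shortcuts are left out (they do not change the result), and the sample dicts are made up.

```python
def _diffjoin(d1, d2, default, compare):
    # Sketch of the patch's helper, adapted for Python 3.
    res = {}
    for k1, v1 in d1.items():
        if k1 in d2:
            v2 = d2[k1]
            # join keeps every shared key; diff keeps only differing values
            if not compare or v1 != v2:
                res[k1] = (v1, v2)
        else:
            # key only in d1: the d2 side is filled with the default
            res[k1] = (v1, default)
    for k2 in d2:
        if k2 not in d1:
            # key only in d2: the d1 side is filled with the default
            res[k2] = (default, d2[k2])
    return res

def diff(d1, d2, default=None):
    return _diffjoin(d1, d2, default, True)

def join(d1, d2, default=None):
    return _diffjoin(d1, d2, default, False)

# made-up sample data
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'a': 1, 'b': 20, 'd': 4}

print(join(d1, d2))
# {'a': (1, 1), 'b': (2, 20), 'c': (3, None), 'd': (None, 4)}
print(diff(d1, d2))
# {'b': (2, 20), 'c': (3, None), 'd': (None, 4)}
```

On this input, diff returns the same pairs as join minus 'a', whose values are equal in both dicts.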

pacem in terris / мир / शान्ति / ‎‫سَلاَم‬ / 平和
Kevin R. Bullock


