[PATCH 1 of 4 V2] mercurial: implement diff and join for dicts
Kevin Bullock
kbullock+mercurial at ringworld.org
Tue Mar 26 11:31:34 CDT 2013
On 25 Mar 2013, at 7:51 PM, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0 at fb.com>
> # Date 1364258439 25200
> # Mon Mar 25 17:40:39 2013 -0700
> # Node ID bc4c228833d0a2bcc7c9a0e3ba4c364fe9157b4e
> # Parent f0d16e97f0b228468807a23fb2b9dc17d5cc6f52
> mercurial: implement diff and join for dicts
>
> Given two dicts, diff returns a dict containing all the keys that are present
> in one dict but not the other, or whose values are different between the
> dicts. The values are pairs of the values from the dicts, with missing values
> being represented as an optional argument, defaulting to None.
>
> Given two dicts, join performs what is known as an outer join in relational
> database land: it returns a dict containing all the keys across both dicts.
> The values are pairs as above, except they aren't compared to see if they're
> the same.
>
> diff --git a/mercurial/dicthelpers.py b/mercurial/dicthelpers.py
> new file mode 100644
> --- /dev/null
> +++ b/mercurial/dicthelpers.py
> @@ -0,0 +1,35 @@
> +# dicthelpers.py - helper routines for Python dicts
> +#
> +# Copyright 2013 Facebook
> +#
> +# This software may be used and distributed according to the terms of the
> +# GNU General Public License version 2 or any later version.
> +
> +def _diffjoin(d1, d2, default, compare):
> +    res = {}
> +    if d1 is d2 and compare:
> +        # same dict, so diff is empty
> +        return res
> +
> +    for k1, v1 in d1.iteritems():
> +        if k1 in d2:
> +            v2 = d2[k1]
> +            if not compare or v1 != v2:
> +                res[k1] = (v1, v2)
> +        else:
> +            res[k1] = (v1, default)
> +
> +    if d1 is d2:
> +        return res
> +
> +    for k2 in d2:
> +        if k2 not in d1:
> +            res[k2] = (default, d2[k2])
> +
> +    return res
> +
> +def diff(d1, d2, default=None):
> +    return _diffjoin(d1, d2, default, True)
> +
> +def join(d1, d2, default=None):
> +    return _diffjoin(d1, d2, default, False)
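For reference, here is a small worked example of what the two functions in the patch return. This is only a sketch: the helper is restated with `items()` in place of `iteritems()` so it runs on Python 3 (an adaptation, not part of the patch), and the `old` and `new` dicts are made-up sample data.

```python
# Python 3 restatement of the patch's helper (items() replaces iteritems()).
def _diffjoin(d1, d2, default, compare):
    res = {}
    if d1 is d2 and compare:
        # same dict, so diff is empty
        return res

    for k1, v1 in d1.items():
        if k1 in d2:
            v2 = d2[k1]
            if not compare or v1 != v2:
                res[k1] = (v1, v2)
        else:
            res[k1] = (v1, default)

    if d1 is d2:
        return res

    for k2 in d2:
        if k2 not in d1:
            res[k2] = (default, d2[k2])

    return res

def diff(d1, d2, default=None):
    return _diffjoin(d1, d2, default, True)

def join(d1, d2, default=None):
    return _diffjoin(d1, d2, default, False)

# Hypothetical sample data, chosen to exercise every branch.
old = {"a": 1, "b": 2, "c": 3}
new = {"b": 2, "c": 30, "d": 4}

diff_result = diff(old, new)  # "b" is omitted because its values match
join_result = join(old, new)  # "b" is kept as (2, 2)
```

With this data, `diff` reports "a", "c" and "d" as pairs of (value in `old`, value in `new`), while `join` reports all four keys.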
I'm a bit uneasy about the names of these, given that they're not closed on the domain. I'd expect 'diff' to be something like a set-difference resulting in all the k,v pairs in d1 not in d2.
'join' is better, since it does result in something like a full outer join (tuples with the corresponding values from each dict, or default/None). Given that, I guess 'diff' is really a left join.
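To make the alternatives concrete, here is a rough sketch of the two other semantics mentioned above: a set-difference on (k, v) pairs and a left join. The names `setdiff` and `leftjoin` are hypothetical and appear nowhere in the patch; the sample dicts are made up as well.

```python
def setdiff(d1, d2):
    """Hypothetical helper: the (k, v) pairs of d1 that are not in d2.

    The result is itself a plain dict of the same shape as d1, i.e. the
    operation is closed on the domain.
    """
    return {k: v for k, v in d1.items() if k not in d2 or d2[k] != v}

def leftjoin(d1, d2, default=None):
    """Hypothetical helper: every key of d1, paired with d2's value or default."""
    return {k: (v, d2.get(k, default)) for k, v in d1.items()}

# Made-up sample data.
old = {"a": 1, "b": 2, "c": 3}
new = {"b": 2, "c": 30, "d": 4}

setdiff_result = setdiff(old, new)
leftjoin_result = leftjoin(old, new)
```

Here `setdiff` keeps only "a" and "c" with their values from `old`, and `leftjoin` keeps "a", "b" and "c" as pairs. For comparison, on the same inputs the patch's `diff` also reports "d", which is present only in `new`.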
pacem in terris / мир / शान्ति / سَلاَم / 平和
Kevin R. Bullock