[PATCH 1 of 3 V2] util: add method to hash nested combination of python data structures

Laurent Charignon lcharignon at fb.com
Mon Jul 6 13:42:40 CDT 2015


# HG changeset patch
# User Laurent Charignon <lcharignon at fb.com>
# Date 1435794507 25200
#      Wed Jul 01 16:48:27 2015 -0700
# Node ID 38cd2c6265855f0a8e65e02e2cc181921df498ca
# Parent  2748bf78a5bf610da4f2d90fd1eea19a3b360c04
util: add method to hash nested combination of python data structures

The goal of this series of patches is to have a method to compute the hash of
config objects. This will enable restarting the command server when the config
change.
This patch adds a method to compute the hash of nested combination of basic
data structure, it will be used to enable computing the hash of a sortdict and
in turn compute the hash of a config.

Implementation modified from:
http://stackoverflow.com/questions/5884066/hashing-a-python-dictionary

diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -2333,6 +2333,50 @@ class dirs(object):
 if safehasattr(parsers, 'dirs'):
     dirs = parsers.dirs
 
+def makehash(x):
+    """makes a hash from a basic data type containing only other hashable types
+
+    >>> a = makehash({3:2, 4:5})
+    >>> b = makehash({4:5, 3:2})
+    >>> c = makehash({4:5, 3:(2,3,4)})
+    >>> d = makehash({4:5, 3:[2,3,4]})
+    >>> e = makehash({4:5, 3:{2:[3,4]}})
+    >>> a == b
+    True
+    >>> c == d
+    True
+    >>> e != a and c != a
+    True
+    >>> x = sortdict({1:2})
+    >>> h1 = makehash(x)
+    >>> x[4] = 5
+    >>> h2 = makehash(x)
+    >>> x[1] = 2
+    >>> h3 = makehash(x)
+    >>> h3 != h2 and h2 != h1
+    True
+    >>> s1 = set(list(range(2000)))
+    >>> s2 = set(list(range(1999,-1,-1)))
+    >>> makehash(s1) == makehash(s2)
+    True
+    """
+    if isinstance(x, sortdict):
+        # creates a list, in order to freeze the order of the element and avoid
+        # sorting them before hashing (see below)
+        return makehash(list(x.iteritems()))
+    elif isinstance(x, dict):
+        # iteritems for a dict does not always return in the same order, thus
+        # the sorting
+        return makehash(sorted(x.iteritems()))
+    elif isinstance(x, (tuple, list)):
+        return hash(tuple(makehash(i) for i in x))
+    elif safehasattr(x, '__iter__'):
+        # this is the case for sets and other containers
+        # for two sets with the same content, __iter__ will not always return
+        # the same order, thus the need to sort
+        return hash(tuple(makehash(i) for i in sorted(x)))
+    return hash(x)
+
 def finddirs(path):
     pos = path.rfind('/')
     while pos != -1:


More information about the Mercurial-devel mailing list