[PATCH] parsers: when available, use a presized dictionary for the file foldmap

Siddharth Agarwal sid0 at fb.com
Wed Apr 15 21:38:52 UTC 2015


# HG changeset patch
# User Siddharth Agarwal <sid0 at fb.com>
# Date 1429133744 25200
#      Wed Apr 15 14:35:44 2015 -0700
# Node ID 8ca386e305d52af5e714126173bc15c884bf16fc
# Parent  c560d8c687916cb70a6d54c2c9ddcb5c9e457be2
parsers: when available, use a presized dictionary for the file foldmap

On a repo with over 300,000 files, this speeds up perffilefoldmap:

before: wall 0.178421 comb 0.180000 user 0.160000 sys 0.020000 (best of 55)
after:  wall 0.164462 comb 0.160000 user 0.140000 sys 0.020000 (best of 59)

diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -205,7 +205,20 @@ static PyObject *make_file_foldmap(PyObj
 		goto quit;
 	}
 
+#if PY_VERSION_HEX >= 0x02060000
+	/* _PyDict_NewPresized expects a minused parameter, but it actually
+	   creates a dictionary that's the nearest power of two bigger than the
+	   parameter. For example, with the initial minused = 1000, the
+	   dictionary created has size 1024. Of course in a lot of cases that
+	   can be greater than the maximum load factor Python's dict object
+	   expects (= 2/3), so as soon as we cross the threshold we'll resize
+	   anyway. So create a dictionary that's 3/2 the size. Also add some
+	   more to deal with additions outside this function. */
+	file_foldmap = _PyDict_NewPresized((PyDict_Size(dmap) / 5) * 8);
+#else
 	file_foldmap = PyDict_New();
+#endif
+
 	if (file_foldmap == NULL)
 		goto quit;
 


More information about the Mercurial-devel mailing list