[PATCH 2 of 4 V2] parsers: write dirstate starting with non-normal entries

Tue Dec 1 10:21:53 CST 2015

On Mon, Nov 30, 2015 at 04:52:44PM -0800, Laurent Charignon wrote:
> # HG changeset patch
> # User Laurent Charignon <lcharignon at fb.com>
> # Date 1448930384 28800
> #      Mon Nov 30 16:39:44 2015 -0800
> # Node ID 5e659a9b2694d155e33286ef9b236e092ce80ad0
> # Parent  a86356722e51056c2bbfd6accae954ca386d92e1
> parsers: write dirstate starting with non-normal entries

I like where this is going, but how do we detect that a dirstate file was
written with the new ordering, so that status can safely depend on it?
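
To make the concern concrete, here is a minimal Python sketch. It is not
Mercurial code: the (filename, state, mtime) tuples and both helper names
are hypothetical, and only the "normal" test is taken from the patch. A
reader that stops at the first normal entry is correct on a file written
non-normal-first, but silently misses entries on a file written in plain
dictionary order.

```python
# Hypothetical model: each entry is (filename, state, mtime).
# "Normal" follows the patch: state == 'n' and mtime != -1.

def is_normal(entry):
    _name, state, mtime = entry
    return state == 'n' and mtime != -1

def nonnormal_prefix(entries):
    """Early-stop reader: collect entries until the first normal one."""
    result = []
    for entry in entries:
        if is_normal(entry):
            break
        result.append(entry)
    return result

def nonnormal_full_scan(entries):
    """Reference reader: inspect every entry, whatever the order."""
    return [e for e in entries if not is_normal(e)]

# Written with the new ordering: non-normal entries come first.
sorted_entries = [("b", "a", -1), ("d", "n", -1),
                  ("a", "n", 100), ("c", "n", 200)]
# Written in arbitrary order, as before the patch.
unsorted_entries = [("a", "n", 100), ("b", "a", -1),
                    ("c", "n", 200), ("d", "n", -1)]

print(nonnormal_prefix(sorted_entries) == nonnormal_full_scan(sorted_entries))
print(nonnormal_prefix(unsorted_entries) == nonnormal_full_scan(unsorted_entries))
```

The two readers agree on the first list and disagree on the second, so
the early stop is only safe if the reader can tell which kind of file it
is looking at.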

>
> Before this patch we were writing the dirstate entries in a "random" way,
> following the *unstable* order of a Python dictionary. This patch changes the
> order in which we write the dirstate entries.
>
> We now start with the non-normal files (those that have changed or are likely
> to have changed) and end with the normal files. This makes the job of hg status
> easier because, in most cases, it only needs to access the non-normal entries of
> the dirstate. The new ordering allows hg status to stop iterating over the
> dirstate after processing those entries.
>
> On our large repos, for hg status, we achieve a 40% improvement.
> On the same repo, the cost of this change is a slowdown for writing the
> dirstate to disk (as we do two passes). I measured the execution time of
> hg debugrebuilddirstate with and without the change and observed a 5% slowdown
> for the overall command (16ms).
>
> diff --git a/mercurial/parsers.c b/mercurial/parsers.c
> --- a/mercurial/parsers.c
> +++ b/mercurial/parsers.c
> @@ -551,7 +551,7 @@ static PyObject *pack_dirstate(PyObject
>       Py_ssize_t nbytes, pos, l;
>       PyObject *k, *v = NULL, *pn;
>       char *p, *s;
> -	int now;
> +	int now, pass;
>
>       if (!PyArg_ParseTuple(args, "O!O!Oi:pack_dirstate",
>                             &PyDict_Type, &map, &PyDict_Type, &copymap,
> @@ -602,7 +602,9 @@ static PyObject *pack_dirstate(PyObject
>       }
>       memcpy(p, s, l);
>       p += 20;
> -	if (0 == 0) {
> +	/* First pass: non-normal files; second pass: normal files. This improves
> +      * status performance, as status generally only needs the non-normal files. */
> +	for (pass = 0; pass <= 1; pass++) {
>               for (pos = 0; PyDict_Next(map, &pos, &k, &v); ) {
>                       dirstateTupleObject *tuple;
>                       char state;
> @@ -610,6 +612,7 @@ static PyObject *pack_dirstate(PyObject
>                       Py_ssize_t len, l;
>                       PyObject *o;
>                       char *t;
> +			int normal;
>
>                       if (!dirstate_tuple_check(v)) {
>                               PyErr_SetString(PyExc_TypeError,
> @@ -622,6 +625,9 @@ static PyObject *pack_dirstate(PyObject
>                       mode = tuple->mode;
>                       size = tuple->size;
>                       mtime = tuple->mtime;
> +			normal = (state == 'n' && mtime != -1);
> +			if (normal != pass)
> +				continue;
>                       if (state == 'n' && mtime == now) {
>                               /* See pure/parsers.py:pack_dirstate for why we do
>                                * this. */
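
For readers following along in Python, the two-pass loop in this hunk can
be sketched as below. This is an illustration under assumptions, not the
pure/parsers.py implementation: order_for_writing is a hypothetical name,
entries are modeled as plain (state, mode, size, mtime) tuples, and the
actual packing of each entry is left out.

```python
def order_for_writing(dmap):
    """Yield (filename, entry) pairs, non-normal entries first.

    Mirrors the C loop: pass 0 keeps entries where normal is false,
    pass 1 keeps entries where normal is true.
    """
    for want_normal in (False, True):
        for name, entry in dmap.items():
            state, _mode, _size, mtime = entry
            normal = (state == 'n' and mtime != -1)
            if normal != want_normal:
                continue
            yield name, entry

# Hypothetical dirstate map with two normal and two non-normal entries.
dmap = {
    "a": ("n", 0o644, 10, 100),  # normal
    "b": ("a", 0, -1, -1),       # added: non-normal
    "c": ("n", 0o644, 5, -1),    # state 'n' but mtime -1: non-normal
    "d": ("n", 0o644, 7, 200),   # normal
}

ordered_names = [name for name, _entry in order_for_writing(dmap)]
print(ordered_names)
```

Within each pass the entries still come out in the dictionary's own
iteration order, so only the split between the two groups is guaranteed,
not the order inside a group.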