[PATCH 1 of 3 rfc] mpath: split the cpython part from the pure C part

Augie Fackler raf at durin42.com
Mon Mar 7 21:29:40 EST 2016


On Mon, Mar 07, 2016 at 05:43:15PM +0200, Maciej Fijalkowski wrote:
> # HG changeset patch
> # User fijal
> # Date 1456135587 -3600
> #      Mon Feb 22 11:06:27 2016 +0100
> # Node ID c41ede700e5cf47d7eb44408fd2aa0a55f21c81d
> # Parent  41dcd754526612c43b9695df8851557c851828ef
> mpath: split the cpython part from the pure C part

Heart's in the right place, I suppose, but I've got some nits on the
implementation.

>
> diff -r 41dcd7545266 -r c41ede700e5c mercurial/bitmanipulation.h
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/mercurial/bitmanipulation.h	Mon Feb 22 11:06:27 2016 +0100
> @@ -0,0 +1,48 @@
> +#pragma once

It's a near-certainty that we're going to hit a broken compiler that
doesn't grok this pragma. Probably better to stick to the orthodox
#ifdef guards.

> diff -r 41dcd7545266 -r c41ede700e5c mercurial/mpatch.c
> --- a/mercurial/mpatch.c	Wed Feb 24 15:55:44 2016 -0600
> +++ b/mercurial/mpatch.c	Mon Feb 22 11:06:27 2016 +0100
> @@ -20,26 +20,15 @@

[elided]

> @@ -206,8 +193,10 @@
>
>       /* assume worst case size, we won't have many of these lists */
>       l = lalloc(len / 12);
> -	if (!l)
> -		return NULL;
> +	if (!l) {
> +            mpatch_error = "no memory";
> +            return NULL;
> +        }

This adhoc errno-type error reporting freaks me out. It's going to
break as soon as threads are in play. I'd prefer (as gross as it is)
an out parameter of some sort (either for the result of the function,
or for the error.)

>
>       lt = l->tail;
>

[rest of file elided - I didn't look /too/ closely, but it looked
about like I expected.]

> diff -r 41dcd7545266 -r c41ede700e5c mercurial/mpatch.h
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/mercurial/mpatch.h	Mon Feb 22 11:06:27 2016 +0100

[elided this file - no unique complaints]

> diff -r 41dcd7545266 -r c41ede700e5c mercurial/mpatch_cpy.c
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/mercurial/mpatch_cpy.c	Mon Feb 22 11:06:27 2016 +0100

Would it make sense for this to be tagged as a copy from mpatch.c?
That way the blame can follow back past where the files were split.


> @@ -0,0 +1,159 @@
> +/* CPython C API interface to mpatch.c, an effiecient patch
> +   manipulation algorithm */
> +
> +#define PY_SSIZE_T_CLEAN
> +
> +#include <Python.h>
> +#include "util.h"
> +
> +static char mpatch_doc[] = "Efficient binary patching.";
> +static PyObject *mpatch_Error;
> +
> +#include "mpatch.h"
> +
> +/* recursively generate a patch of all bins between start and end */
> +static struct flist *fold(PyObject *bins, ssize_t start, ssize_t end)

This function probably needs to be refactored to not use a PyObject,
as it looks pretty mpatch-core-algorithm relevant.

> +{
> +	ssize_t len, blen;
> +	const char *buffer;
> +        struct flist *res;
> +
> +	if (start + 1 == end) {
> +		/* trivial case, output a decoded list */
> +		PyObject *tmp = PyList_GetItem(bins, start);
> +		if (!tmp)
> +			return NULL;
> +		if (PyObject_AsCharBuffer(tmp, &buffer, &blen))
> +			return NULL;
> +		res = decode(buffer, blen);
> +                goto finish;
> +	}
> +
> +	/* divide and conquer, memory management is elsewhere */
> +	len = (end - start) / 2;
> +	res = combine(fold(bins, start, start + len),
> +                    fold(bins, start + len, end));
> + finish:
> +        if (!res && !PyErr_Occurred()) {
> +            if (strncmp(mpatch_error, "no memory", 9) == 0) {
> +                mpatch_error = NULL;
> +                PyErr_NoMemory();
> +            } else {
> +                PyErr_SetString(mpatch_Error, mpatch_error);
> +                mpatch_error = NULL;
> +            }
> +        }
> +        return res;
> +}

[rest of file elided]

> diff -r 41dcd7545266 -r c41ede700e5c mercurial/util.h
> --- a/mercurial/util.h	Wed Feb 24 15:55:44 2016 -0600
> +++ b/mercurial/util.h	Mon Feb 22 11:06:27 2016 +0100

I think it might make sense to split out bitmanipulation.h in a commit
and in that commit just make any users of its functions also include
it instead of picking it up transitively through util.h.


[elided rest of patch, it's fine]


More information about the Mercurial-devel mailing list