[PATCH 5 of 7 V3] sparse-revlog: introduce native (C) implementation of slicechunktodensity

Yuya Nishihara yuya at tcha.org
Tue Nov 20 06:59:51 EST 2018


On Mon, 19 Nov 2018 18:02:16 +0100, Boris FELD wrote:
> On 19/11/2018 13:54, Yuya Nishihara wrote:
> > On Mon, 19 Nov 2018 10:42:17 +0100, Boris Feld wrote:
> >> # HG changeset patch
> >> # User Boris Feld <boris.feld at octobus.net>
> >> # Date 1542276598 -3600
> >> #      Thu Nov 15 11:09:58 2018 +0100
> >> # Node ID b365764b9915e600a5c7c273f244f65e2240e00a
> >> # Parent  f9ce23d5d3aeec9a9e8589ec83d089b2df83da82
> >> # EXP-Topic sparse-perf
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r b365764b9915
> >> sparse-revlog: introduce native (C) implementation of slicechunktodensity
> >> +		Py_ssize_t revstart;
> >> +		Py_ssize_t revsize;
> >> +		if (!index_get_start(self, revs[i], &revstart)) {
> >> +			goto bail;
> > Be aware that Py_ssize_t * != long *. I think Py_ssize_t is the right type
> > here and other places computing offsets or indices.
> >
> > There are a few more callers which passes Py_ssize_t * as long *argument.
> 
> Looking more closely to this, it seems like offset can be full unsigned
> 64 bits integer. Py_ssize_t can be smaller than that. So I think we
> should keep offset, length and segmentspan as explicit uint64_t. And use
> PyLong_AsUnsignedLongLong if conversion is needed.
> 
> Does it seem right to you? shall I update the series in that directory?

Yep, that's probably more correct. I was afraid of using PY_LONG_LONG which
isn't always available, but we appear to depend on it. The tuple_format
contains "K" on 32bit/LLP64 platforms, which wouldn't work if PY_LONG_LONG
weren't there.

> Because we use unsigned type, it is inconvenient to return -1 value.
> That is why I went for the boolean (same as pylong_to_long).

Well, the return value has no relation to the output argument. It can be
0/-1 int.

> Can you confirm the following:
> 1) using uint64 for data offset and length

Nit: use the same type as index_get() at the lowest layer. (i.e.
uint64_t for offset, and int for lengths.) And use the widest type while
computing (e.g. maybe signed int64_t in index_segment_span().) That's just
my opinion.

> 2) using Py_ssize_t for everything else

Seems fine.

> 3) Handle error as booleans (using true/false)

See above. But I'm okay with booleans if you prefer.


More information about the Mercurial-devel mailing list