[PATCH RFC] RFC: allow optional C++ 11 extensions with pybind11 for performance code

Siddharth Agarwal sid at less-broken.com
Fri Feb 19 17:29:31 EST 2016


On 2/19/16 13:06, Matt Mackall wrote:
> On Thu, 2016-02-11 at 15:30 +0000, Laurent Charignon wrote:
>> Hi,
>>
>> For the time being, for the tree manifest, I will use C and reuse code from
>> the lazy manifest.
>> If the performance is not acceptable (because there is no hash table), I will
>> look into hash tables in C that we could add to our project.
>> Writing the code in C seems like a non-controversial way to proceed and have
>> someone review the changes :)
>>
>> In this discussion we all seem to agree on one thing: **we will have to ditch
>> the Python C layer in the near future**.
>> We don't know yet if we should (1) use cffi and ctypes to move toward pypy or
>> (2) use cython.
>> (1) implies rewriting our C layer to decouple it from the Python API.
>> (2) implies ditching our C code and adding type hints to our
>> performance-sensitive Python code, correct?
>>
>> I didn't really follow the discussions around pypy. When are we planning to
>> support it?
>>
>> Matt, what do you think about this discussion?
> I think we're going to find that using anything but the Python C layer is going
> to have performance consequences we're not happy with, especially for building
> large Python-native objects. That's been our usual strategy as it generally
> means less C code and thus less maintenance pain. Prime example: parsing the
> manifest.
>
> But a strategy that we've used in a few places is:
>
> a) build a native C object
> b) wrap it as a Python object that mimics a native type like a list or dict
> c) drop it into a place we use a native type
>
> This requires a LOT more boilerplate but lets us do things like deferred parsing
> and construction of Python's expensive boxed types. We've primarily done this
> for the revlog index. The important thing to note here is that while the old
> strategy gave us bare metal performance, the new strategy is even faster _by not
> doing lots of work_, a situation that's made possible by owning the abstraction
> in C.
>
> Unfortunately, this strategy looks like it doesn't work with cffi/ctypes/cython
> because it can't do the all-important step (b). Which means:
>
> 1) we have to rewrite LOTS of Python code to replace x[foo] with somefunction(x,
> foo)
> 2) we have to add explicit lifetime management of x
> 3) the Python-level overhead of function calls is way higher
> 4) there's probably significantly more type boxing/unboxing overhead
>
> It's also worth mentioning that with the Python C API, we get to directly
> control some significant aspects of Python garbage collection and threading that
> have cut seconds off of runtime but aren't going to be accessible in other
> models.
>
> So I'm basically not thrilled with any of our alternatives here. But if someone
> wants to experiment in this space, I'd recommend converting our index-parsing
> code paths to cffi to get a good feel for the pain involved.

For a possibly less involved example of this, it might be worth looking 
at the code that constructs dirstate tuples. These behave like Python 
tuples, except the individual elements are only materialized when requested.
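
As a rough illustration of the idea only (this is pure Python, not the
actual C implementation, and the record layout, the LazyEntry name and the
decoded counter are all made up for the example), a tuple-like object that
decodes each field only when it is asked for could look something like:

```python
import struct
from collections.abc import Sequence

# Hypothetical packed layout, NOT Mercurial's real format:
# 1-byte state, then mode, size, mtime as big-endian signed 32-bit ints.
_NFIELDS = 4


class LazyEntry(Sequence):
    """Tuple-like view over a packed record; fields are decoded on access."""

    __slots__ = ("_buf", "_offset", "decoded")

    def __init__(self, buf, offset=0):
        self._buf = buf
        self._offset = offset
        self.decoded = 0  # counts decodes, only to make the laziness visible

    def __len__(self):
        return _NFIELDS

    def __getitem__(self, i):
        if isinstance(i, slice):
            return tuple(self[j] for j in range(*i.indices(_NFIELDS)))
        if i < 0:
            i += _NFIELDS
        if not 0 <= i < _NFIELDS:
            raise IndexError("index out of range")
        self.decoded += 1
        if i == 0:
            # The state field is a single byte at the start of the record.
            return self._buf[self._offset:self._offset + 1]
        # Each remaining field is a 4-byte int following the state byte.
        pos = self._offset + 1 + 4 * (i - 1)
        return struct.unpack_from(">i", self._buf, pos)[0]


raw = struct.pack(">ciii", b"n", 0o644, 120, 1455900000)
entry = LazyEntry(raw)
size = entry[2]                   # decodes the size field only
state, mode, _, mtime = entry     # unpacking still works like a tuple
```

Callers keep using x[i], len(x) and unpacking, so nothing at the call sites
has to change, and fields nobody looks at never get boxed. The real thing
does this at the C level, where the saving is much larger than in this
sketch.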

- Siddharth

>
> (C++ obviously has all of the above concerns... plus C++. It's hard enough to get
> the current requisite "free" VS98 compiler in the hands of Windows developers,
> and I'm pretty sure it doesn't do C++11.)
>



More information about the Mercurial-devel mailing list