[PATCH 2 of 5 RFC] rust: iterator bindings to C code

Georges Racinet gracinet at anybox.fr
Sun Sep 30 10:23:15 EDT 2018


On 9/30/18 5:46 AM, Yuya Nishihara wrote:
> On Fri, 28 Sep 2018 15:31:09 +0200, Georges Racinet wrote:
>> # HG changeset patch
>> # User Georges Racinet <gracinet at anybox.fr>
>> # Date 1538059896 -7200
>> #      Thu Sep 27 16:51:36 2018 +0200
>> # Node ID de88c09512565ed1c12e2ff9159e06ed8d762d15
>> # Parent  d8c9571755a64e1fc3429587dfd3949b9862eceb
>> # EXP-Topic rustancestors-rfc
>> rust: iterator bindings to C code
>>
>> In this changeset, still made of Rust code only,
>> we expose the Rust iterator for instantiation and
>> consumption from C code.
>>
>> The idea is that both the index and index_get_parents()
>> will be passed from the C extension, hence avoiding a hard
>> link dependency to parsers.so, so that the crate can
>> still be built and tested independently.
> Just a note. That means parents() can't take hidden revisions into account.
> We might have to filter out hidden revisions from the initial set.
Yes, that's the plan.
>
> Another idea is to scan the revlog index in pure Rust. That's fairly easy
> since each revlog entry is fixed sized unless data is inlined. I'm sure
> there are several revlog parsers written by Mecurial folks to learn Rust.
> I have one. :)

I'd love to see it and agree that hooking onto such a parser could be
seen as a longer term perspective for what we're doing incrementally here.

On the other hand, I dismissed the idea to implement index_get_parents()
in Rust over the index C data structure as a bit too complicated, as it
seems to involve Python objects, and hence would require the help of
rust-cpython or a similar library.

>
>> +use std::boxed::Box;
> Nit: it's included in prelude.
Noted
>
>> +type IndexPtr = *mut c_void;
>> +type IndexParentsFn = extern "C" fn(index: IndexPtr,
>> +                                    rev: ssize_t,
>> +                                    ps: *mut [c_int; 2],
>> +                                    max_rev: c_int)
>> +                                    -> c_int;
> ...
>
>> +impl Graph for Index {
>> +    /// wrap a call to the C extern parents function
>> +    fn parents(&self, rev: i32) -> (i32, i32) {
>> +        let mut res: [c_int; 2] = [0; 2];
>> +        // surprisingly the call below is not unsafe, whereas calling the
>> +        // same extern function directly (not through a pointer) would.
>> +        // Maybe that depends on rustc version, though.
>> +        let code = (self.parents)(
> Perhaps, the IndexParentsFn type has to be labeled as "unsafe fn".
Ah yes, you're right, it works. The fact is, with a function pointer,
the Rust compiler can't know whether it's from external code or just
meant for external usage, should have thought of that.
>
>> +/// Wrapping of AncestorsIterator<CIndex> constructor, for C callers.
>> +///
>> +/// Besides `initrevs`, `stoprev` and `inclusive`, that are converted
>> +/// we receive the index and the parents function as pointers
>> +#[no_mangle]
>> +pub extern "C" fn rustlazyancestors_init(
>> +    index: IndexPtr,
>> +    parents: IndexParentsFn,
>> +    initrevslen: usize,
>> +    initrevs: *mut i64,
>> +    stoprev: i64,
>> +    inclusive: i64,
>> +) -> *mut AncestorsIterator<Index> {
>> +    let v: Vec<i64> =
>> +        unsafe { Vec::from_raw_parts(initrevs, initrevslen, initrevslen) };
> It's probably wrong to construct a Vec from a memory not allocated in Rust.
> Rust uses specialized malloc() variants. IIRC, the default is jemalloc and
> Vec has some other optimizations.

Yes, you're right, it makes much more sense to abide by the rule that
memory allocated by one language should be freed by the same one.

Given that and your earlier comments, I'll probably submit a new version
where rustlazyancestors_init() makes an iterator over revision numbers
(ie, i32) out of the initrevs it receives and pass it to
AncestorsIterator::new(). The raw pointer (initrevs array) will then be
freed from the C caller.

Also need to replace all occurrences of i64 by c_long for platform
independence.

Thanks for the comments,

-- 
Georges Racinet
Anybox SAS, http://anybox.fr
Téléphone: +33 6 51 32 07 27
GPG: B59E 22AB B842 CAED 77F7 7A7F C34F A519 33AB 0A35, sur serveurs publics




More information about the Mercurial-devel mailing list