[PATCH 1 of 5 RFC] rust: pure Rust lazyancestors iterator

Georges Racinet gracinet at anybox.fr
Sun Sep 30 09:47:43 EDT 2018


On 9/30/18 5:46 AM, Yuya Nishihara wrote:
> On Fri, 28 Sep 2018 15:31:08 +0200, Georges Racinet wrote:
>> # HG changeset patch
>> # User Georges Racinet <gracinet at anybox.fr>
>> # Date 1538060596 -7200
>> #      Thu Sep 27 17:03:16 2018 +0200
>> # Node ID d8c9571755a64e1fc3429587dfd3949b9862eceb
>> # Parent  d3d4b4b5f725124ef9e93cf74d779a7a05aa11b7
>> # EXP-Topic rustancestors-rfc
>> rust: pure Rust lazyancestors iterator
> First, thanks for tackling on the oxidization plain. I've quickly scanned
> a first few patches, and I'm generally positive to these.
Thanks for your interest and the positive feedback !
>
> Here's random thoughts regarding the project/module layout:
>
>  - place pure Rust code under rust/ ?
>
>    I think that's the intention why we have the rust/ tree. Perhaps, the
>    pure Rust modules can be packaged into a crate (e.g. hg-core), and the
>    Rust FFI part will depend on it (e.g. hg-ffi).
Ah, I was thinking that the rust/ tree was dedicated to the executable.
>
>    And the compiled CPython module (and maybe source) will be placed at
>    mercurial/rust.
>
>  - HGMODULEPOLICY=rust
>
>    Perhaps, we'll want a separate policy to enable the Rust bindings.
>    Most of the unimplemented functions will be forwarded to cext, just like
>    the cffi -> pure fallback.
I have yet to understand a bit more how way policy build and runtime
works, but this sounds nice too. Would you see this as part of the
patchset or in a subsequent one ?
>
>> +impl<G: Graph> AncestorsIterator<G> {
>> +    /// Constructor.
>> +    ///
>> +    /// We actually expect initrevs to be i64, and we coerce them to the
>> +    /// i32 of our internal representations. The reason is that from C
>> +    /// code, they actually come as such.
>> +    /// Here it would be better to take any iterable over our internal
>> +    /// revision numbers, and have the conversion be made from the caller.
>> +    pub fn new(
>> +        graph: G,
>> +        initrevs: &Vec<i64>,
>> +        stoprev: i64,
>> +        inclusive: bool,
>> +    ) -> Self {
>> +        let stop32 = stoprev as i32;
>> +        // because in iteration, contrary to what happens in ancestor.py, we
>> +        // compare with stoprev before pushing, we have to prefilter in the
>> +        // constructor too.
>> +        let deref_filter = initrevs.iter()
>> +            .map(|&x| x as i32).filter(|x| stop32 <= *x);
> Nit: I think i64->i32 conversion should be done by the caller. The initrevs
> argument can be IntoIterator<Item = i32> instead.
Yes, that's what I had in mind with the comment. I also have to get rid
of these i64 to support 32 bit architectures.
>
>> +    #[inline]
>> +    fn conditionally_push_rev(&mut self, rev: i32) {
>> +        if self.stoprev <= rev && !self.seen.contains(&rev) {
>> +            self.seen.insert(rev);
>> +            self.visit.push(rev);
> We could eliminate the seen set by skipping duplicated revisions in next().
> It's theoretically slower and was actually slower in Python, but might be
> worth trying in Rust.

What would be the procedure for submission here ? Maybe as a subsequent
patch, so that performance comparison would be easier to make, or do you
suggest we'd go in that direction right away ?

The seen set can also be used by a future (although I do have
experimental code) Rust implementation of __contains__, but this would
have positive impact on performance only if the lazyiterator object gets
used for iteration (growing a seen set) and then for membership (reusing
the existing seen set). I suppose this doesn't happen a lot.

Regards,

-- 
Georges Racinet
Anybox SAS, http://anybox.fr
Téléphone: +33 6 51 32 07 27
GPG: B59E 22AB B842 CAED 77F7 7A7F C34F A519 33AB 0A35, sur serveurs publics




More information about the Mercurial-devel mailing list