D5370: rust: core implementation of missingancestors (no bindings)

kevincox (Kevin Cox) phabricator at mercurial-scm.org
Tue Dec 4 19:18:22 EST 2018


kevincox added a comment.


  Overall looks good to me. Just a couple of points.
  
  - Using random numbers for tests without a seed that is logged will create failures which are basically impossible to reproduce.
  - A lot of the comments are comparing to the python, I don't know how useful this is to most readers.

INLINE COMMENTS

> ancestors.rs:155
> +    where
> +        I: IntoIterator<Item = Revision>,
> +    {

I prefer doing `bases: impl  IntoIterator<Item = Revision>` instead of using a where clause.

> ancestors.rs:157
> +    {
> +        let mut bset: HashSet<Revision> = bases.into_iter().collect();
> +        if bset.is_empty() {

If this is going to be stored in the `bases` field I would just call it `bases`.

This will also allow you to do `MissingAncestors { graph, bases }` if you prefer.

> ancestors.rs:159
> +        if bset.is_empty() {
> +            bset.insert(NULL_REVISION);
> +        }

Is always having an item in the set actually saving many corner cases? It seems like you usually check for empty sets anyways.

> ancestors.rs:170
> +            0 => false, // shouldn't happen
> +            1 => *(self.bases.iter().next().unwrap()) != -1,
> +            _ => true,

s/-1/NULL_REVISISON

> ancestors.rs:195
> +    ) -> Result<(), GraphError> {
> +        // using difference() or intersection() would not satisfy borrow rules
> +        revs.retain(|r| !self.bases.contains(r));

I don't think this comment is helpful to the reader.

> ancestors.rs:201
> +            revs.remove(&rev);
> +        }
> +        if revs.is_empty() {

Isn't this loop redundant with the retain call above?

> ancestors.rs:219
> +        // going all the way to the root
> +        let keepcount = revs.iter().filter(|&r| *r > start).count();
> +

I find the pattern match and deref slightly confusing. I would prefer a `**` to make it obvious you are doing a double deref.

> ancestors.rs:235
> +    fn add_parents(&mut self, rev: Revision) -> Result<(), GraphError> {
> +        let parents = self.graph.parents(rev)?;
> +        // No need to bother the set with inserting NULL_REVISION over and

I would prefer `let (p0, p1) = ...`. This makes it obvious exactly what data you are getting.

> ancestors.rs:262
> +        let bases_visit = &mut self.bases;
> +        let mut revs_as_set: HashSet<Revision> = revs
> +            .into_iter()

I would simply call it `revs`.

> ancestors.rs:281
> +            None => NULL_REVISION,
> +        };
> +        let start = max(max_bases, max_revs);

Replace the match with `.unwrap_or(NULL_REVISION)`

> ancestors.rs:286
> +        let mut missing: Vec<Revision> = Vec::new();
> +        for curr in (0..start + 1).map(|i| start - i) {
> +            if revs_visit.is_empty() {

If an inclusive range use `0..=start`.

> ancestors.rs:293
> +                // another path
> +                // TODO optim: Rust's HashSet.remove returns a boolean telling
> +                // if it happened. This will spare us one set lookup

You may as well do this now. Just move the remove into the if.

> ancestors.rs:309
> +                    missing.push(curr);
> +                    revs_visit.remove(&curr);
> +                    true

Do this remove in the if as well.

> ancestors.rs:484
> +    fn test_missing_bases() {
> +        let mut missanc = MissingAncestors::new(Stub, vec![5, 3, 1, 3]);
> +        let mut as_vec: Vec<Revision>;

Since you have used IntoIter you shouldn't need the `vec!`s.

> ancestors.rs:486
> +        let mut as_vec: Vec<Revision>;
> +        as_vec = missanc.get_bases().iter().map(|&r| r).collect();
> +        as_vec.sort();

Do this assignment on the same line as the declaration.

> lib.rs:20
> +
> +    fn parents_iter(
> +        &self,

I think the better solution here is to make `parents` return `[Revision; 2]` (or `&[Revision]` and not return nulls.

> lib.rs:26
> +        Ok(ParentsIterator::new(parents))
> +    }
> +}

Would something like the following be simpler?

  fn parents_iter(
      &self,
      r: Revision,
  ) -> Result<impl Iterator<Item=Revision>, GraphError> {
      let (p0, p1) = self.parents(r)?;
      let iter = [p0, p1].into_iter()
          .map(|r| if r == NULL_REVISION { None} else { Some(r) });
      Ok(iter)
  }

> lib.rs:36
> +/// Instead of NULL_REVISION, None is returned
> +/// Here a macro would probably be more in order for performance
> +impl ParentsIterator {

I don't think this comment is necessary. Also this should be quite transparent to the optimizer.

> lib.rs:52
> +        self.cur += 1;
> +        match match self.cur {
> +            0 => self.parents.0,

I would prefer putting the result in a variable and then doing the second match later.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5370

To: gracinet, #hg-reviewers
Cc: durin42, kevincox, mjpieters, mercurial-devel


More information about the Mercurial-devel mailing list