D7178: [RFC] rust-matchers: add `Matcher` trait and implement `AlwaysMatcher`

Alphare (Raphaël Gomès) phabricator at mercurial-scm.org
Tue Oct 29 16:27:53 UTC 2019


Alphare created this revision.
Herald added subscribers: mercurial-devel, kevincox, durin42.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  In our quest for a faster Mercurial, we have arrived at the point where we need
  to implement the matchers in Rust.
  This RFC is mainly about the `Matcher` trait, to see whether the proposed
  changes feel fine to people with more experience on the matter. While the
  `AlwaysMatcher` implementation is here as a trivial example, it should be the
  first step towards using matchers in Rust, as it is currently the only
  supported one.
  
  Notable changes:
  
  - `exact` is renamed to `exact_match`
  - enums for `visit*` methods with `Recursive` instead of `'all'`, etc.
  - a new `roots`, separate from `file_set`
  - no `bad`, `explicitdir` or `traversedir` functions, as they can be passed to the higher-level functions instead of the matchers
  
  Thanks to Martin for suggesting the last two (most important) changes and for
  reaching out to help a few weeks ago.
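
To make the enum change concrete, here is a minimal sketch of the `VisitDir` / `VisitChildrenSet` mapping documented in the patch below. It is not part of the revision: `to_visit_dir` is a hypothetical helper, and `String` stands in for `HgPathBuf` so that the snippet compiles on its own.

```rust
use std::collections::HashSet;

/// Same variants as `VisitDir` in the patch.
#[derive(Debug, PartialEq)]
pub enum VisitDir {
    Recursive,
    This,
    No,
}

/// Same variants as `VisitChildrenSet` in the patch, with `String`
/// standing in for `HgPathBuf` (assumption made for self-containment).
#[derive(Debug, PartialEq)]
pub enum VisitChildrenSet {
    Recursive,
    Empty,
    Set(HashSet<String>),
    This,
}

/// Hypothetical helper (not in the patch): applies the mapping table from
/// the `visit_children_set` documentation.
pub fn to_visit_dir(children: &VisitChildrenSet) -> VisitDir {
    match children {
        VisitChildrenSet::Empty => VisitDir::No,
        VisitChildrenSet::Recursive => VisitDir::Recursive,
        // Both a set of children and `This` mean "visit this directory".
        VisitChildrenSet::Set(_) | VisitChildrenSet::This => VisitDir::This,
    }
}
```

With enums, the compiler checks that every case is handled, which the string values such as `'all'` could not guarantee.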

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D7178

AFFECTED FILES
  rust/hg-core/src/lib.rs
  rust/hg-core/src/matchers.rs

CHANGE DETAILS

diff --git a/rust/hg-core/src/matchers.rs b/rust/hg-core/src/matchers.rs
new file mode 100644
--- /dev/null
+++ b/rust/hg-core/src/matchers.rs
@@ -0,0 +1,142 @@
+// matchers.rs
+//
+// Copyright 2019 Raphaël Gomès <rgomes at octobus.net>
+//
+// This software may be used and distributed according to the terms of the
+// GNU General Public License version 2 or any later version.
+
+//! Structs and types for matching files and directories.
+
+use crate::utils::hg_path::{HgPath, HgPathBuf};
+use std::collections::HashSet;
+
+pub enum VisitDir {
+    Recursive,
+    This,
+    No,
+}
+pub enum VisitChildrenSet {
+    Recursive,
+    Empty,
+    Set(HashSet<HgPathBuf>),  // Should we implement a `NonEmptyHashSet`?
+    This,
+}
+
+pub trait Matcher {
+    /// Returns whether `filename` is in `file_set`
+    fn exact_match(&self, _filename: impl AsRef<HgPath>) -> bool;
+    fn file_set(&self) -> Option<HashSet<&HgPath>> {
+        None
+    }
+    fn roots(&self) -> Option<HashSet<&HgPath>> {
+        None
+    }
+    /// Returns whether `filename` is matched by this matcher
+    fn match_fn(&self, _filename: impl AsRef<HgPath>) -> bool {
+        true
+    }
+
+    /// Decides whether a directory should be visited based on whether it
+    /// has potential matches in it or one of its subdirectories. This is
+    /// based on the match's primary, included, and excluded patterns.
+    ///
+    /// Returns `VisitDir::Recursive` if the given directory and all
+    /// subdirectories should be visited.
+    /// Otherwise returns `VisitDir::This` or `VisitDir::No` indicating whether
+    /// the given directory should be visited.
+    fn visit_dir(&self, _directory: impl AsRef<HgPath>) -> VisitDir {
+        VisitDir::This
+    }
+
+    /// Decides whether a directory should be visited based on whether it
+    /// has potential matches in it or one of its subdirectories, and
+    /// potentially lists which subdirectories of that directory should be
+    /// visited. This is based on the match's primary, included, and excluded
+    /// patterns.
+    ///
+    /// This function is very similar to `visit_dir`, and the following mapping
+    /// can be applied:
+    ///
+    ///  `VisitDir`  | `VisitChildrenSet`
+    /// -------------+-------------------
+    ///  `No`        | `Empty`
+    ///  `Recursive` | `Recursive`
+    ///  `This`      | `This` OR non-empty `Set` of subdirs -or files- to visit
+    ///
+    /// # Example
+    ///
+    /// Assuming matchers `['path:foo/bar', 'rootfilesin:qux']`, we would
+    /// return the following values (assuming the implementation of
+    /// `visit_children_set` is capable of recognizing this; some
+    /// implementations are not).
+    ///
+    /// ```ignore
+    /// '' -> {'foo', 'qux'}
+    /// 'baz' -> set()
+    /// 'foo' -> {'bar'}
+    /// // Ideally this would be `Recursive`, but since the prefix nature of
+    /// // matchers is applied to the entire matcher, we have to downgrade this
+    /// // to `This` due to the (yet to be implemented in Rust) non-prefix
+    /// // `RootFilesIn`-kind matcher being mixed in.
+    /// 'foo/bar' -> 'this'
+    /// 'qux' -> 'this'
+    /// ```
+    /// # Important
+    ///
+    /// Most matchers do not know if they're representing files or
+    /// directories. They see `['path:dir/f']` and don't know whether `f` is a
+    /// file or a directory, so `visit_children_set('dir')` for most matchers
+    /// will return `HashSet{ HgPath { "f" } }`, but if the matcher knows it's
+    /// a file (like the yet to be implemented in Rust `ExactMatcher` does),
+    /// it may return `VisitChildrenSet::This`.
+    /// Do not rely on the return being a `HashSet` indicating that there are
+    /// no files in this dir to investigate (or equivalently that if there are
+    /// files to investigate in 'dir' that it will always return
+    /// `VisitChildrenSet::This`).
+    fn visit_children_set(
+        &self,
+        _directory: impl AsRef<HgPath>,
+    ) -> VisitChildrenSet {
+        VisitChildrenSet::This
+    }
+    /// Matcher will match everything and `file_set()` will be empty:
+    /// optimization might be possible.
+    fn always(&self) -> bool {
+        false
+    }
+    /// Matcher will match exactly the files in `file_set()`: optimization
+    /// might be possible.
+    fn is_exact(&self) -> bool {
+        false
+    }
+    /// Matcher will match the paths in `file_set()` recursively: optimization
+    /// might be possible.
+    fn prefix(&self) -> bool {
+        false
+    }
+    /// None of `.always()`, `.is_exact()`, and `.prefix()` is true:
+    /// optimizations will be difficult.
+    fn any_pats(&self) -> bool {
+        // TODO rename? It's confusing
+        !self.always() && !self.is_exact() && !self.prefix()
+    }
+}
+
+/// Matches everything.
+#[derive(Debug)]
+pub struct AlwaysMatcher;
+
+impl Matcher for AlwaysMatcher {
+    fn exact_match(&self, _filename: impl AsRef<HgPath>) -> bool {
+        false
+    }
+    fn visit_dir(&self, _directory: impl AsRef<HgPath>) -> VisitDir {
+        VisitDir::Recursive
+    }
+    fn visit_children_set(
+        &self,
+        _directory: impl AsRef<HgPath>,
+    ) -> VisitChildrenSet {
+        VisitChildrenSet::Recursive
+    }
+}
diff --git a/rust/hg-core/src/lib.rs b/rust/hg-core/src/lib.rs
--- a/rust/hg-core/src/lib.rs
+++ b/rust/hg-core/src/lib.rs
@@ -17,6 +17,7 @@
     StateMap, StateMapIter,
 };
 mod filepatterns;
+pub mod matchers;
 pub mod utils;
 
 use crate::utils::hg_path::HgPathBuf;
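
As an illustration of how a caller might consume `visit_dir`, the sketch below walks an in-memory tree and prunes directories. It is not part of the revision and rests on several assumptions: `&str` replaces `HgPath`, the trait is reduced to two methods, and `PrefixMatcher`, `Dir`, `join` and `walk` are hypothetical names. It also assumes that `Recursive` only lets the caller skip further `visit_dir` calls, while files are still checked with `match_fn`.

```rust
/// Simplified copy of the `VisitDir` enum from the patch.
pub enum VisitDir {
    Recursive,
    This,
    No,
}

/// Reduced stand-in for the `Matcher` trait: `&str` replaces `HgPath`.
pub trait Matcher {
    fn match_fn(&self, _filename: &str) -> bool {
        true
    }
    fn visit_dir(&self, _directory: &str) -> VisitDir {
        VisitDir::This
    }
}

pub struct AlwaysMatcher;

impl Matcher for AlwaysMatcher {
    fn visit_dir(&self, _directory: &str) -> VisitDir {
        VisitDir::Recursive
    }
}

/// Hypothetical matcher (not in the patch): matches files under `prefix`.
pub struct PrefixMatcher {
    pub prefix: String,
}

impl Matcher for PrefixMatcher {
    fn match_fn(&self, filename: &str) -> bool {
        filename.starts_with(&format!("{}/", self.prefix))
    }
    fn visit_dir(&self, directory: &str) -> VisitDir {
        let inside = format!("{}/", self.prefix);
        if directory == self.prefix || directory.starts_with(&inside) {
            // At or below the prefix: everything underneath is wanted.
            VisitDir::Recursive
        } else if directory.is_empty()
            || self.prefix.starts_with(&format!("{}/", directory))
        {
            // An ancestor of the prefix: visit it, but keep asking.
            VisitDir::This
        } else {
            VisitDir::No
        }
    }
}

/// Hypothetical in-memory directory tree used instead of the filesystem.
pub struct Dir {
    pub name: String,
    pub files: Vec<String>,
    pub dirs: Vec<Dir>,
}

fn join(path: &str, name: &str) -> String {
    if path.is_empty() {
        name.to_string()
    } else {
        format!("{}/{}", path, name)
    }
}

/// Collects matched files, pruning directories the matcher rejects.
pub fn walk(
    matcher: &impl Matcher,
    dir: &Dir,
    path: &str,
    recursive: bool,
    out: &mut Vec<String>,
) {
    // Once a parent answered `Recursive`, `visit_dir` is not asked again.
    let recursive = recursive
        || match matcher.visit_dir(path) {
            VisitDir::No => return,
            VisitDir::Recursive => true,
            VisitDir::This => false,
        };
    for file in &dir.files {
        let full = join(path, file);
        if matcher.match_fn(&full) {
            out.push(full);
        }
    }
    for sub in &dir.dirs {
        walk(matcher, sub, &join(path, &sub.name), recursive, out);
    }
}
```

A call such as `walk(&AlwaysMatcher, &root, "", false, &mut out)` asks `visit_dir` only once, at the root, which is the kind of shortcut the `Recursive` variant is meant to allow.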



To: Alphare, #hg-reviewers
Cc: durin42, kevincox, mercurial-devel

