D6773: rust-hgpath: add HgPath and HgPathBuf structs to encapsulate handling of paths

Yuya Nishihara yuya at tcha.org
Sat Aug 31 22:14:33 EDT 2019


>   > I found `std::path::Path` has a well-written inline comment about
>   > the unsafe pointer cast. Let's copy it so we won't introduce a
>   > memory issue.
>   
>   I'm not 100% sure which comment you're referring to, sorry.

This.

https://github.com/rust-lang/rust/commit/740f8db85572aef58d0734fc60bc2b54862ebbb0#diff-120c5906285d41e976dc4176265d7cbaR1754

Apparently it was added quite recently.

>   > I'm not sure if this `HgPath` type is designed to be always validated or not.
>   > If yes, these functions can be private, and we'll need a conservative
>   > `HgPath::from_bytes(s) -> Result<&HgPath>` and `unsafe fn from_bytes_unchecked(s)`
>   > functions. If not, maybe `HgPath::new()` shouldn't validate the bytes at all.
>   > I feel `debug_assert()` implies that the condition must always be met,
>   > so the caller has to guarantee that.
>   
>   I feel like both the performance implications and the ergonomics of checking everything are going to be a problem. @kevincox's idea of using `debug_assert!` seemed like a decent enough check to ease development. These functions could also be used during conversions to `OsString` and the like.

@kevincox said "it would be good to [...] add a debug_assert! on
creation/modification to check that everything is as expected."
IIUC, his idea is that the `debug_assert!()` condition must always be met,
which is the guarantee that the HgPath type provides. In order to ensure
that, we'll have to validate all paths read from dirstate, for example,
prior to casting to HgPath type. Otherwise `debug_assert!()` would fail,
meaning the implementation had a bug.

So, my question is, do we really need such strong guarantee by default?

Instead, we could lower the bar to "HgPath is merely a path represented in
platform-specific encoding though it's supposed to be a canonical path",
and add e.g. `is_canonicalized()` to check if the path meets the strong
requirement.

>   >> +impl Index<usize> for HgPath {
>   >> +    type Output = u8;
>   >> +
>   >> +    fn index(&self, i: usize) -> &Self::Output {
>   >> +        &self.inner[i]
>   >> +    }
>   >> +}
>   >
>   > [...]
>   > Better to not implement `path[...]` since it can be considered returning
>   > a path component at the specified index. We can always do
>   > `path.as_bytes()[...]` explicitly.
>   
>   I'd leave `Index<RangeFull>`since it it not ambiguous and is a neat shorthand.

What shorthand would it be used for?

I feel it's weird only `path[..]` is allowed.


More information about the Mercurial-devel mailing list