D6773: rust-hgpath: add HgPath and HgPathBuf structs to encapsulate handling of paths

yuja (Yuya Nishihara) phabricator at mercurial-scm.org
Sat Aug 31 08:20:28 EDT 2019


yuja added a comment.


  (resend to get around Phabricator's bad email processing)
  
  > +    pub fn as_bytes(&self) -> &[u8] {
  > +        unsafe { &*(&self.inner as *const _ as *const [u8]) }
  > +    }
  
  `&self.inner`. No unsafe trick is needed.
  
  > +    /// Checks for errors in the path, short-circuiting at the first one.
  > +    /// Useful to get finer-grained errors. To simply check if the path is
  > +    /// valid, use `is_valid`.
  > +    pub fn check_state(&self) -> Result<(), HgPathError> {
  > +        if self.len() == 0 {
  > +            return Ok(());
  > +        }
  > +        let bytes = self.as_bytes();
  > +        let mut previous_byte = None;
  > +
  > +        if bytes[0] == b'/' {
  > +            return Err(HgPathError::LeadingSlash);
  > +        }
  > +        for (index, byte) in bytes.iter().enumerate() {
  > +            match byte {
  > +                0 => return Err(HgPathError::ContainsNullByte(index)),
  > +                b'/' => {
  > +                    if previous_byte.is_some() && previous_byte == Some(b'/') {
  
  `previous_byte.is_some()` is redundant.
  
  > +                        return Err(HgPathError::ConsecutiveSlashes(index));
  > +                    }
  > +                }
  > +                _ => (),
  > +            };
  > +            previous_byte = Some(*byte);
  > +        }
  > +        Ok(())
  > +    }
  > +    pub fn is_valid(&self) -> bool {
  > +        self.check_state().is_ok()
  > +    }
  
  I'm not sure if this `HgPath` type is designed to be always validated or not.
  If yes, these functions can be private, and we'll need a conservative
  `HgPath::from_bytes(s) -> Result<&HgPath>` and `unsafe fn from_bytes_unchecked(s)`
  functions. If not, maybe `HgPath::new()` shouldn't validate the bytes at all.
  
  I feel `debug_assert()` implies that the condition must always be met,
  so the caller has to guarantee that.
  
  > +impl Index<usize> for HgPath {
  > +    type Output = u8;
  > +
  > +    fn index(&self, i: usize) -> &Self::Output {
  > +        &self.inner[i]
  > +    }
  > +}
  
  [...]
  
  Better to not implement `path[...]` since it can be considered returning
  a path component at the specified index. We can always do
  `path.as_bytes()[...]` explicitly.
  
  > +#[derive(Debug)]
  > +pub struct HgPathBytesIterator<'a> {
  > +    path: &'a HgPath,
  > +}
  
  Can't we simply implement it as a `slice::Iter<>` wrapper?
  
  > +    pub fn as_vec(&self) -> &Vec<u8> {
  > +        &self.inner
  > +    }
  > +    pub fn as_ref(&self) -> &[u8] {
  > +        self.inner.as_ref()
  > +    }
  
  Do we need `as_vec()`? I think `as_ref()` can be used in most cases.
  
  > +impl<T: ?Sized + AsRef<HgPath>> From<&T> for HgPathBuf {
  > +    fn from(s: &T) -> HgPathBuf {
  > +        let new = s.as_ref().to_owned();
  > +        debug_assert_eq!(Ok(()), new.check_state());
  > +        new
  > +    }
  
  No need to do `new.check_state()`?
  `s` should be already _checked_ if the type supports `s.as_ref()`.
  
  > +impl TryInto<PathBuf> for HgPathBuf {
  > +    type Error = std::io::Error;
  > +
  > +    fn try_into(self) -> Result<PathBuf, Self::Error> {
  > +        let os_str;
  > +        #[cfg(unix)]
  > +        {
  > +            use std::os::unix::ffi::OsStrExt;
  > +            os_str = std::ffi::OsStr::from_bytes(&self.inner);
  > +        }
  > +        #[cfg(windows)]
  > +        {
  > +            // TODO: convert from Windows MBCS (ANSI encoding) to WTF8.
  > +            // Perhaps, the return type would have to be Result<PathBuf>.
  > +            unimplemented!();
  > +        }
  > +
  > +        Ok(Path::new(os_str).to_path_buf())
  > +    }
  
  As I said, the encoding of HgPathBuf will be either MBCS or UTF-8 on
  Windows, and the encoding will be selected at runtime. We can't define
  a single conversion between HgPathBuf and PathBuf/OsString.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D6773/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D6773

To: Alphare, #hg-reviewers, kevincox
Cc: yuja, durin42, kevincox, mercurial-devel


More information about the Mercurial-devel mailing list