[PATCH 1 of 5] util: add filestat class to detect ambiguity of file stat

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Tue May 24 12:45:09 EDT 2016


At Tue, 24 May 2016 21:58:11 +0900,
Yuya Nishihara wrote:
> 
> On Thu, 19 May 2016 00:26:27 +0900, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > # Date 1463584837 -32400
> > #      Thu May 19 00:20:37 2016 +0900
> > # Node ID dc731ebd60613cf3f08799a8d8dc48435798665b
> > # Parent  8c8442523eefac2d53e3f10ff1ebf37f4d3c63c3
> > util: add filestat class to detect ambiguity of file stat
> 
> > +class filestat(object):
> > +    """help to exactly detect change of a file
> > +
> > +    'stat' attribute is result of 'os.stat()' if specified 'path'
> > +    exists. Otherwise, it is None. This can avoid preparative
> > +    'exists()' examination on client side of this class.
> > +    """
> > +    def __init__(self, path):
> > +        try:
> > +            self.stat = os.stat(path)
> > +        except OSError as err:
> > +            if err.errno != errno.ENOENT:
> > +                raise
> > +            self.stat = None
> > +
> > +    __hash__ = object.__hash__
> > +
> > +    def __eq__(self, old):
> > +        try:
> > +            # if ambiguity between stat of new and old file is
> > +            # avoided, comparision of size, ctime and mtime is enough
> > +            # to exactly detect change of a file regardless of platform
> > +            return (self.stat.st_size == old.stat.st_size and
> > +                    self.stat.st_ctime == old.stat.st_ctime and
> > +                    self.stat.st_mtime == old.stat.st_mtime)
> > +        except AttributeError:
> > +            return False
> 
> You have to implement __ne__ to avoid troubles. I have no idea about __hash__.
> 
> https://docs.python.org/2/reference/datamodel.html#object.__eq__

Oops, I forgot to copy the __ne__() definition below from posix.cachestat.

    def __ne__(self, other):
        return not self == other

I'll send a patch to add it soon.
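
For reference, here is a minimal self-contained sketch of the quoted class
with __ne__() added. The class body follows the quoted patch; the usage
part at the bottom (temporary file, variable names) is only an
illustration and not part of the patch.

```python
import errno
import os
import tempfile

class filestat(object):
    """Sketch of the quoted class, with __ne__ added."""
    def __init__(self, path):
        try:
            self.stat = os.stat(path)
        except OSError as err:
            if err.errno != errno.ENOENT:
                raise
            self.stat = None  # the file does not exist

    __hash__ = object.__hash__

    def __eq__(self, old):
        try:
            return (self.stat.st_size == old.stat.st_size and
                    self.stat.st_ctime == old.stat.st_ctime and
                    self.stat.st_mtime == old.stat.st_mtime)
        except AttributeError:
            # either side has stat = None (or 'old' is not a filestat)
            return False

    def __ne__(self, other):
        # without this, Python 2 would not derive '!=' from __eq__
        return not self == other

# Illustrative usage on a temporary file (hypothetical names)
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    first = filestat(path)
    second = filestat(path)   # file untouched in between
    same = (first == second)
    differs = (first != second)
finally:
    os.unlink(path)
missing = filestat(path)      # file is gone now, so stat is None
```

With __ne__() defined this way, "==" and "!=" always give opposite
answers, including the case where one side has no stat at all.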


> > +    def isambig(self, old):
> > +        """Examine whether new (= self) stat is ambiguous against old one
> > +
> > +        "S[N]" below means stat of a file at N-th change:
> > +
> > +        - S[n-1].ctime  < S[n].ctime: can detect change of a file
> > +        - S[n-1].ctime == S[n].ctime
> > +          - S[n-1].ctime  < S[n].mtime: means natural advancing (*1)
> > +          - S[n-1].ctime == S[n].mtime: is ambiguous (*2)
> > +          - S[n-1].ctime  > S[n].mtime: never occurs naturally (don't care)
> > +        - S[n-1].ctime  > S[n].ctime: never occurs naturally (don't care)
> 
> Does it work well on Windows? I'm not skeptical about it, I just question
> because ctime is platform dependent.
> 

According to the "Remarks" section of the documentation for the
BY_HANDLE_FILE_INFORMATION structure, from which Python's os.lstat()
gets st_mtime/st_ctime via the GetFileInformationByHandle() API:

  https://msdn.microsoft.com/en-us//library/windows/desktop/aa363788(v=vs.85).aspx
    For example, on a Windows FAT file system, create time has a
    resolution of 10 milliseconds, write time has a resolution of 2
    seconds, and access time has a resolution of 1 day (the access
    date).

I confirmed that ctime is available on a FAT-formatted USB memory device.

If Mercurial is used on a file system other than FAT or NTFS that
doesn't support ctime, ctime is always equal to 0, and this patch
always advances mtime.

    ftCreationTime
        A FILETIME structure that specifies when a file or directory
        is created. If the underlying file system does not support
        creation time, this member is zero (0).

Fortunately, mtime is used only for equality comparison (in the
filestat class, at least). Therefore, "advancing mtime always" should
work as expected, even if advancing makes mtime overflow
(0x7fffffff => 0x00000000).
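
The reasoning above can be sketched roughly as follows. The function
names are hypothetical, and two points are assumptions on my side: that
"ambiguous" is decided by ctime being equal (which matches "ctime is
always 0" leading to "advancing mtime always"), and that the wrap-around
is done by masking with 0x7fffffff.

```python
def isambig(new_ctime, old_ctime):
    # Assumption: equal ctime means a change can't be told apart by ctime
    return new_ctime == old_ctime

def advance_mtime(old_mtime):
    # Assumption: advance by one second, wrapping 0x7fffffff to 0
    return (int(old_mtime) + 1) & 0x7fffffff

# On a file system without ctime support, both ctimes are 0,
# so every change looks ambiguous and mtime is always advanced.
always_ambig = isambig(0, 0)
normal = advance_mtime(1463584837)
wrapped = advance_mtime(0x7fffffff)
```

Because the advanced value is only compared for equality with the old
one, the wrapped value 0 still differs from 0x7fffffff and the change is
detected.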

On the other hand, if neither ctime nor mtime is supported on the
underlying file system, timestamp comparison itself doesn't work
well.

Therefore, when we replace {posix|windows}.cachestat with this
filestat, we should check that both ctime and mtime are available on
the underlying file system as part of the "cacheable" check.
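
Such an availability check might look roughly like the sketch below.
The names (timestamps_usable, FakeStat) are hypothetical, and it assumes
that an unsupported timestamp shows up as 0; the documentation quoted
above states this only for the creation time, so treating mtime the
same way is an assumption.

```python
import collections

# Hypothetical stand-in for an os.stat() result, for illustration only
FakeStat = collections.namedtuple("FakeStat", "st_ctime st_mtime")

def timestamps_usable(st):
    """Return True if both ctime and mtime look supported (non-zero)."""
    return st.st_ctime != 0 and st.st_mtime != 0

supported = timestamps_usable(FakeStat(1463584837, 1463584837))
no_ctime = timestamps_usable(FakeStat(0, 1463584837))
neither = timestamps_usable(FakeStat(0, 0))
```

A file whose timestamp really is 0 (the epoch) would be reported as
unusable by this sketch, which is the safe direction for a "cacheable"
check.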


BTW, the _stat() family of C runtime library functions can't get a
valid ctime on FAT (maybe because of a difference in internal
implementation?)

  https://msdn.microsoft.com/en-us//library/14h5k7ff.aspx

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp

