RFC: exact change detection for non append-only files

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Nov 17 11:58:56 CST 2015



On 11/17/2015 09:10 AM, FUJIWARA Katsunori wrote:
>
> Currently, 'filecache' detects file changes via 'cachestat.__eq__()'
> in posix.py on POSIX platforms, which examines:
>
>    - st_size:
>
>      This works for append-only files (like revlog) as expected in all
>      cases (doesn't it?)
>
>      But some status files (e.g. dirstate, bookmarks and so on) may not
>      change in size, even when their content actually changes.
>
>    - st_mtime:
>
>      For non append-only files, this works as expected in many cases.
>      But 'st_mtime' doesn't have enough resolution for current
>      computing and I/O speeds, even when it is represented as a float
>      (see also issue4836 for more detail).
>
>    - st_ino:
>
>      This can compensate for 'st_mtime', because copy-on-write
>      semantics always changes st_ino.
>
> Therefore, 'st_ino' is the last bastion for change detection of
> dirstate and so on.
>
> But inodes are quickly reused on some filesystems (perhaps for
> performance reasons), which prevents the 'st_ino' check from
> detecting changes as expected.
>
> My initial ideas for detecting changes correctly even in such a
> situation are:
>
>    - ignore this very very rare case :-)
>
>      Because for this issue to occur, the inode previously used for
>      status file X has to be reused for X again.
>
>    - writer: also save a hash of the data when writing it out
>      reader: check the hash if 'st_ino' can't detect changes
>
>      (e.g. '.hg/dirstate.hash' for '.hg/dirstate')
>
>      This requires reading the whole data file to calculate the hash
>      value, which can easily degrade performance.
>
>    - writer: increment and write a "generation id" when writing data out
>      reader: check the "generation id" if 'st_ino' can't detect changes
>
>      (e.g. '.hg/dirstate.genid' for '.hg/dirstate')

Writing two different files will be subject to race conditions. :-/
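
One possible interleaving that shows the problem is sketched below. This
is only an illustration under assumptions: the `Reader` class, the
`write_file` helper and the order in which the writer updates the two
files are hypothetical, not Mercurial code. The writer bumps
'.hg/dirstate.genid' first, a reader refreshes between the two writes,
and the reader then keeps stale data under the new generation id.

```python
import os
import tempfile


def write_file(path, data):
    """Overwrite path with data (hypothetical helper for this sketch)."""
    with open(path, "wb") as f:
        f.write(data)


class Reader:
    """Hypothetical reader that trusts the generation id file alone."""

    def __init__(self, data_path, genid_path):
        self.data_path = data_path
        self.genid_path = genid_path
        self.genid = None
        self.data = None

    def refresh(self):
        # Reload the data file only when the generation id has changed.
        with open(self.genid_path, "rb") as f:
            genid = int(f.read())
        if genid != self.genid:
            with open(self.data_path, "rb") as f:
                self.data = f.read()
            self.genid = genid
        return self.data


def demo_race():
    with tempfile.TemporaryDirectory() as d:
        data_path = os.path.join(d, "dirstate")
        genid_path = data_path + ".genid"
        write_file(data_path, b"old")
        write_file(genid_path, b"1")

        reader = Reader(data_path, genid_path)
        reader.refresh()                 # caches (1, b"old")

        write_file(genid_path, b"2")     # writer, step 1: bump the id
        reader.refresh()                 # reader runs here: caches (2, b"old")
        write_file(data_path, b"new")    # writer, step 2: write the data

        # The id on disk is still 2, so the reader never reloads.
        return reader.refresh()
```

Here `demo_race()` returns b"old" although the data file on disk already
contains b"new". Swapping the two writer steps narrows the window but
does not remove it, since the two files are still not updated atomically.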

(Nice find, I'll be thinking about a workaround here)
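
For reference, the size/mtime/inode comparison described in the quoted
message can be sketched roughly as follows. This is a minimal
illustration, not the actual 'cachestat' code: `StatSnapshot`,
`atomic_write` and `demo` are hypothetical names, and the mtime
collision is simulated with os.utime() rather than provoked by timing.
Inode reuse itself is not reproduced, because it depends on the
filesystem.

```python
import os
import tempfile


class StatSnapshot:
    """Hypothetical stand-in comparing st_size, st_mtime and st_ino."""

    def __init__(self, path):
        st = os.stat(path)
        self.key = (st.st_size, st.st_mtime_ns, st.st_ino)

    def __eq__(self, other):
        return self.key == other.key


def atomic_write(path, data):
    """Write to a temporary file in the same directory, then rename it."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dirstate")
        atomic_write(path, b"aaaa")
        st = os.stat(path)
        before = StatSnapshot(path)

        # Same-size rewrite in place, with mtime forced back to simulate
        # a write that lands within the timestamp resolution.
        with open(path, "r+b") as f:
            f.write(b"bbbb")
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
        missed = StatSnapshot(path) == before

        # Same-size rewrite through a rename: the new file exists next
        # to the old one before the rename, so its inode differs.
        atomic_write(path, b"cccc")
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
        detected = StatSnapshot(path) != before

        return missed, detected
```

On a POSIX filesystem `demo()` returns (True, True): the in-place
rewrite goes unnoticed, while the rename-based rewrite is caught by
'st_ino' alone. The failure discussed in this thread needs one more
replacement, after which the freed inode number may be handed out again.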

-- 
Pierre-Yves David

