[PATCH 1 of 2] journal: add dirstate tracking
Yuya Nishihara
yuya at tcha.org
Sun Jul 3 03:42:26 UTC 2016
On Thu, 30 Jun 2016 13:41:25 +0100, Martijn Pieters wrote:
> # HG changeset patch
> # User Martijn Pieters <mjpieters at fb.com>
> # Date 1467290363 -3600
> # Thu Jun 30 13:39:23 2016 +0100
> # Node ID 09c7b17ab01c202460543f02d9d71ac643f85a20
> # Parent cf092a3d202a11a670f2bbfd514bc07d6cc4cee7
> journal: add dirstate tracking
> +# hooks to record dirstate changes
> +def wrapdirstate(orig, repo):
> + dirstate = orig(repo)
> + if util.safehasattr(repo, 'journal'):
> + dirstate.journalrepo = repo
> + return dirstate
IIRC, it's considered a layering violation to pass the repo to the dirstate. Perhaps
it could be dirstate.journal = repo.journal, or the journal could be carried by a
transaction object.
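A minimal sketch of the first alternative, using stand-in classes (journalstorage and the toy repo/dirstate here are illustrative, not Mercurial's real types): the wrapper attaches only the journal object, so the dirstate never holds a reference back to the repo.

```python
class journalstorage:
    """Stand-in for the journal storage object on the repo."""
    def __init__(self):
        self.entries = []

    def record(self, rtype, name, oldhashes, newhashes):
        self.entries.append((rtype, name, oldhashes, newhashes))

class dirstate:
    """Stand-in dirstate; the real one comes from orig(repo)."""

class repo:
    def __init__(self):
        self.journal = journalstorage()

def wrapdirstate(orig, repo):
    ds = orig(repo)
    if hasattr(repo, 'journal'):
        # attach the journal only, not the whole repo, to avoid the
        # dirstate -> repo layering violation
        ds.journal = repo.journal
    return ds
```

With this shape, recorddirstateparents() can call dirstate.journal.record(...) directly and the repo reference disappears from the dirstate entirely.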
> +def recorddirstateparents(orig, dirstate, dirstatefp):
> + """Records all dirstate parent changes in the journal."""
> + if util.safehasattr(dirstate, 'journalrepo'):
> + old = [node.nullid, node.nullid]
> + nodesize = len(node.nullid)
> + try:
> + # The only source for the old state is in the dirstate file
> + # still on disk; the in-memory dirstate object only contains
> + # the new state.
> + with dirstate._opener(dirstate._filename) as fp:
> + state = fp.read(2 * nodesize)
> + if len(state) == 2 * nodesize:
> + old = [state[:nodesize], state[nodesize:]]
> + except IOError:
> + pass
> +
> + new = dirstate.parents()
> + if old != new:
> + # only record two hashes if there was a merge
> + oldhashes = old[:1] if old[1] == node.nullid else old
> + newhashes = new[:1] if new[1] == node.nullid else new
> + dirstate.journalrepo.journal.record(
> + wdirparenttype, '.', oldhashes, newhashes)
> +
> + return orig(dirstate, dirstatefp)
dirstatefp could be a backup file. Maybe that's okay because the old state
should be identical to the current state for backups, but I'm not entirely sure.
Can you take a look and add a comment?
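To illustrate why backup writes may be harmless: for a backup, the parents read back from disk should equal the in-memory parents, so the old != new guard records nothing. A toy demonstration (20-byte fake ids; maybe_record is a hypothetical stand-in for the guard in recorddirstateparents):

```python
nullid = b'\0' * 20

def maybe_record(ondisk, inmemory, records):
    # first 40 bytes of the dirstate file are the two parent nodes
    old = [ondisk[:20], ondisk[20:40]]
    if old != inmemory:
        records.append((old, inmemory))

records = []
p1 = b'\x01' * 20
p2 = b'\x02' * 20

# backup case: on-disk parents match in-memory parents -> nothing recorded
maybe_record(p1 + nullid, [p1, nullid], records)

# real parent change: p1 -> p2 -> one entry recorded
maybe_record(p1 + nullid, [p2, nullid], records)
```

So even if dirstatefp is a backup, the guard should make the record a no-op; that reasoning is what deserves a comment in the patch.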
> + def _write(self, entry, _wait=False):
> + try:
> + with self.repo.wlock(wait=_wait):
> + version = None
> + # open file in append mode to ensure it is created if missing
> + with self.vfs('journal', mode='a+b', atomictemp=True) as f:
> + f.seek(0, os.SEEK_SET)
> + # Read just enough bytes to get a version number (up to 2
> + # digits plus separator)
> + version = f.read(3).partition('\0')[0]
> + if version and version != str(storageversion):
> + # different version of the storage. Exit early (and
> + # not write anything) if this is not a version we can
> + # handle or the file is corrupt. In future, perhaps
> + # rotate the file instead?
> + self.repo.ui.warn(
> + _("unsupported journal file version '%s'\n") %
> + version)
> + return
> + if not version:
> + # empty file, write version first
> + f.write(str(storageversion) + '\0')
> + f.seek(0, os.SEEK_END)
> + f.write(str(entry) + '\0')
> + except error.LockHeld as lockerr:
> + lock = self.repo._wlockref and self.repo._wlockref()
> + if lock and lockerr.locker == '%s:%s' % (lock._host, lock.pid):
> + # the dirstate can be written out during wlock unlock, before
> + # the lockfile is removed. Re-run the write as a postrelease
> + # function instead.
> + lock.postrelease.append(
> + lambda: self._write(entry, _wait=True))
> + else:
> + # another process holds the lock, retry and wait
> + self._write(entry, _wait=True)
I understand why this is necessary, but it seems wrong to try to take a lock
in lock.releasefn(), which can't guarantee consistency. We could delay
"lock.held = 0" until after releasefn(), since the lock isn't really released
yet, but that would result in two releasefn() calls.
Instead, I think it would be much simpler if the journal had a dedicated lock;
that would also get rid of the cycle between journalstorage and localrepo.
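A rough sketch of the dedicated-lock idea, as a self-contained illustration rather than Mercurial's lockmod API (the journallock class and lock-file name are hypothetical): the journal takes its own lock file, so _write never has to contend with wlock at all.

```python
import os
import time

class JournalLockTimeout(Exception):
    pass

class journallock:
    """Illustrative stand-alone lock: atomically create a lock file,
    spin until it can be created or the timeout expires."""
    def __init__(self, path, timeout=5.0):
        self.path = path
        self.timeout = timeout

    def __enter__(self):
        deadline = time.time() + self.timeout
        while True:
            try:
                # O_CREAT|O_EXCL makes creation atomic: it fails if
                # another process already holds the lock file
                fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                os.close(fd)
                return self
            except FileExistsError:
                if time.time() >= deadline:
                    raise JournalLockTimeout(self.path)
                time.sleep(0.05)

    def __exit__(self, *exc):
        os.unlink(self.path)
```

With a lock like this scoped to the journal file alone, the LockHeld/postrelease dance above disappears: the journal never needs wlock, and journalstorage no longer needs a repo reference just to lock.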
More information about the Mercurial-devel mailing list