[PATCH 1 of 2] journal: add dirstate tracking

Martijn Pieters mj at zopatista.com
Mon Jul 4 06:58:37 EDT 2016


On 3 July 2016 at 04:42, Yuya Nishihara <yuya at tcha.org> wrote:
> On Thu, 30 Jun 2016 13:41:25 +0100, Martijn Pieters wrote:
>> # HG changeset patch
>> # User Martijn Pieters <mjpieters at fb.com>
>> # Date 1467290363 -3600
>> #      Thu Jun 30 13:39:23 2016 +0100
>> # Node ID 09c7b17ab01c202460543f02d9d71ac643f85a20
>> # Parent  cf092a3d202a11a670f2bbfd514bc07d6cc4cee7
>> journal: add dirstate tracking
>
>> +# hooks to record dirstate changes
>> +def wrapdirstate(orig, repo):
>> +    dirstate = orig(repo)
>> +    if util.safehasattr(repo, 'journal'):
>> +        dirstate.journalrepo = repo
>> +    return dirstate
>
> IIRC, it's considered layering violation to pass repo to dirstate. Perhaps,
> it could be dirstate.journal = repo.journal, or carry journal by a transaction
> object.

Good point; all that is needed really is the journal storage, so
`dirstate.journalstorage = repo.journal` should suffice here. I'll
make the change and send a V2.

>> +def recorddirstateparents(orig, dirstate, dirstatefp):
>> +    """Records all dirstate parent changes in the journal."""
>> +    if util.safehasattr(dirstate, 'journalrepo'):
>> +        old = [node.nullid, node.nullid]
>> +        nodesize = len(node.nullid)
>> +        try:
>> +            # The only source for the old state is in the dirstate file
>> +            # still on disk; the in-memory dirstate object only contains
>> +            # the new state.
>> +            with dirstate._opener(dirstate._filename) as fp:
>> +                state = fp.read(2 * nodesize)
>> +            if len(state) == 2 * nodesize:
>> +                old = [state[:nodesize], state[nodesize:]]
>> +        except IOError:
>> +            pass
>> +
>> +        new = dirstate.parents()
>> +        if old != new:
>> +            # only record two hashes if there was a merge
>> +            oldhashes = old[:1] if old[1] == node.nullid else old
>> +            newhashes = new[:1] if new[1] == node.nullid else new
>> +            dirstate.journalrepo.journal.record(
>> +                wdirparenttype, '.', oldhashes, newhashes)
>> +
>> +    return orig(dirstate, dirstatefp)
>
> dirstatefp could be a backup file. Maybe that's okay because the old state
> should be identical to the current state for backups, but I'm not pretty sure.
> Can you take a look and add comments?

I'll take a look, thanks for the pointer.

>> +    def _write(self, entry, _wait=False):
>> +        try:
>> +            with self.repo.wlock(wait=_wait):
>> +                version = None
>> +                # open file in amend mode to ensure it is created if missing
>> +                with self.vfs('journal', mode='a+b', atomictemp=True) as f:
>> +                    f.seek(0, os.SEEK_SET)
>> +                    # Read just enough bytes to get a version number (up to 2
>> +                    # digits plus separator)
>> +                    version = f.read(3).partition('\0')[0]
>> +                    if version and version != str(storageversion):
>> +                        # different version of the storage. Exit early (and
>> +                        # not write anything) if this is not a version we can
>> +                        # handle or the file is corrupt. In future, perhaps
>> +                        # rotate the file instead?
>> +                        self.repo.ui.warn(
>> +                            _("unsupported journal file version '%s'\n") %
>> +                            version)
>> +                        return
>> +                    if not version:
>> +                        # empty file, write version first
>> +                        f.write(str(storageversion) + '\0')
>> +                    f.seek(0, os.SEEK_END)
>> +                    f.write(str(entry) + '\0')
>> +        except error.LockHeld as lockerr:
>> +            lock = self.repo._wlockref and self.repo._wlockref()
>> +            if lock and lockerr.locker == '%s:%s' % (lock._host, lock.pid):
>> +                # the dirstate can be written out during wlock unlock, before
>> +                # the lockfile is removed. Re-run the write as a postrelease
>> +                # function instead.
>> +                lock.postrelease.append(
>> +                    lambda: self._write(entry, _wait=True))
>> +            else:
>> +                # another process holds the lock, retry and wait
>> +                self._write(entry, _wait=True)
>
> I understand why this is necessary, but it seems wrong to try to take a lock
> in lock.releasefn(), which can't guarantee consistency. We could delay
> "lock.held = 0" after releasefn() as the lock wasn't released yet, but it
> would result in two releasefn() calls.
>
> Instead, I think it will be way simpler if the journal has a dedicated lock,
> and we can get rid of the cycle between journalstrage and localrepo.

I still need access to the repo object to support shared repositories,
I think. I'll look into a dedicated lock, that'd make this a lot
simpler probably.

-- 
Martijn Pieters


More information about the Mercurial-devel mailing list