[PATCH] transaction: support for callbacks during abort

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Mon Jul 13 01:44:07 CDT 2015



At Sat, 11 Jul 2015 20:47:00 +0100,
Pierre-Yves David wrote:

> On 07/11/2015 11:31 AM, FUJIWARA Katsunori wrote:
> > At Tue, 06 Jan 2015 22:00:18 -0800,
> > Gregory Szorc wrote:
> >>
> >> # HG changeset patch
> >> # User Gregory Szorc <gregory.szorc at gmail.com>
> >> # Date 1420610193 28800
> >> #      Tue Jan 06 21:56:33 2015 -0800
> >> # Node ID 5affbb31eea2a1034bc73ee20182880adcc9af36
> >> # Parent  b9d06fa10ef29c012d48bd4f3c93fd7bf1347d40
> >> transaction: support for callbacks during abort
> >>
> >> Previous transaction work added callbacks to be called during regular
> >> transaction commit/close. As part of refactoring Mozilla's pushlog
> >> extension (an extension that opens a SQLite database and tries to tie
> >> its transaction semantics to Mercurial's transaction), I discovered that
> >> the new transaction APIs were insufficient to avoid monkeypatching
> >> transaction instance internals. Adding a callback that is called during
> >> transaction abort removes the necessity for monkeypatching and completes
> >> the API.
> >>
> >> diff --git a/mercurial/transaction.py b/mercurial/transaction.py
> >> --- a/mercurial/transaction.py
> >> +++ b/mercurial/transaction.py
> >
> > (snip)
> >
> >> @@ -442,8 +455,10 @@ class transaction(object):
> >>
> >>               self.report(_("transaction abort!\n"))
> >>
> >>               try:
> >> +                for cat in sorted(self._abortcallback):
> >> +                    self._abortcallback[cat](self)
> >>                   _playback(self.journal, self.report, self.opener, self._vfsmap,
> >>                             self.entries, self._backupentries, False)
> >>                   self.report(_("rollback completed\n"))
> >>               except Exception:
> >
> > (also CC-ed to Pierre-Yves as a implementer of 'txnabort' hook)

Thank you for your comments, Gregory and Pierre-Yves.

I'll work to add another callback to know the end of transaction, as
described later. So, some of comments below are just for your
information.


> > Now, I'm working for dirstate consistency around inside and border of
> > transaction, and planning to use this abort callback mechanism to know
> > the end of transaction at failure.
> 
> I'm a bit curious about what code you need to run in such case. If you 
> use the file generator mechanism you do not need to worries about 
> restoring backup.
> 
> However, I'm not sure how rolling back the file would atomatically 
> invalide the in memory object content (probably not). Is this why you 
> need a callback? could we (would it make sense to) integrate that as 
> part of one of our current cache invalidation mechanism?


I'm planning to use callback mechanism for purposes below:

1. to make existing 'dirstate.write()' invocations write in-memory
   changes into appropriate file ('.hg/dirstate' or '.hg/dirstate.pending'),
   'dirstate.write()' should know whether transaction is running or not

   I'm planning changes below for this purpose:

   - invoke (newly added) 'dirstate.begintransaction()' at the
     beginning of transaction in 'localrepo.transaction()'

   - invoke (newly added) 'dirstate.endtransaction()' at the end of
     transaction via "the end of transaction" callback (instead of
     finalize/abort callbacks)


   There are some other ways:

   1-1. instantiate dirstate with 'repo'
        (or weak ref of repo to avoid circular reference ?)

   1-2. make 'dirstate.write()' require 'localrepo' argument

        This seems too large impact (especially for 3rd party
        extensions, even though Mercurial doesn't ensure compatibility
        of internal API at all :-))

   In both cases, 'dirstate.write()' can determine appropriate output
   file by examination of 'repo.currenttransaction()'.


2. in-memory dirstate changes should be discarded at failure of
   transaction, to prevent them from being written out at (outer)
   'wlock.release' or so

   I'm planning to do this via "the end of transaction" callback.


   Using 'dirstate.{begin|end}parentchange()' mechanism as below also
   works similarly.

   - invoke beginparentchange at starting transaction
   - invoke endparentchange at (successfully) closing transaction
   - wlock.release at failure invokes 'dirstate.invalidate()'

   But it seems too optimistic for outside transaction :-), and it
   can't ensure consistency just at the end of transaction.



> > But current implementation doesn't invoke abort callbacks, if there is
> > no change on repository data at aborting transaction: e.g. "hg import"
> > fails to import the first patch.
> 
> The change to dirstate should most probably register himself as change 
> to the transaction. That would trigger it.

I found two problems of 'addfilegenerator()' below. IMHO, the former
is a major problem, and the latter is minor one (but you should
dislike it as later :-)).

And, both of my fixing plans cause omitting invocation of abort
callbacks, because they makes 'transaction._backupentries' empty :-<

- current 'addfilegenerator()' causes forcibly restoring
  '.hg/dirstate' at rollback-ing, even if rollback-ing doesn't change
  parents of the working directory

  This causes unexpected "hg status" output at rollback-ing in such
  situation (aka non-"parentgone" case).

  I'm planning to introduce 'backup' optional argument, which controls
  'addbackup()' invocation in '_generatefiles()' to avoid issue above,
  to 'addfilegenerator()'.

    https://selenic.com/repo/hg/file/tip/mercurial/transaction.py#l267

  According to discussion before, dirstate writes changes into (a)
  '.hg/dirstate' at the beginning of transaction and (b)
  '.hg/dirstate.pending' while transaction running. Then, discarding
  the latter can work as a kind of restoring. So, omitting
  'addbackup()' invocation doesn't cause problem at least for dirstate.

  Or is another trick like "add to _backupentries, but omit backup
  itself" better ?

- current 'addfilegenerator()' causes forcibly writing data out at
  'writepending()', even if data hasn't been changed since last
  writing out

  When external hooks are invoked while transaction, this causes
  additional "transaction abort!" and "rollback completed" output,
  even though dirstate hasn't been changed since the beginning of
  transaction.

  For example, pushing changes/bookmarks causes writing into
  'dirstate.pending' at REMOTE side, even though it never uses and
  changes dirstate.

  I'm planning to introduce 'checkdirty' optional argument, which is
  used to examine whether data has been changed since last writing
  out (= 'genfunc' should be invoked or not), to avoid redundant
  writing data out.


> > So, there is no easy and certain way to know the end of transaction at
> > failure.
> >
> > BTW, this also causes that 'txnabort' hook isn't invoked at aborting
> > transaction in such case, because 'txnabort' hook invocation is
> > achieved by abort callback mechanism. This may need some cleanup for
> > orphan 'pretxnopen' hook assuming that one of 'txnclose' or 'txnabort'
> > will be invoked subsequently.
> >
> > Then, please let me confirm some points below before changing around
> > 'transaction._abort()' for my purpose:
> >
> >    - are there any usecases to expect actual changes to be rollbacked ?
> >
> >      In other words, does it cause problem to invoke abort callbacks
> >      regardless of actual changes ?
> 
> If I remember correctly, it make "transaction abort" "abort complete 
> message to happears in a lot of new place. Basically any place where we 
> open a transaction but eventually do nothing. We can probably not change it

Adding invocation of abort callbacks just before cleaning journal
files up can avoid such additional messages, even though duplicated
code paths are redundant :-)

    https://selenic.com/repo/hg/file/tip/mercurial/transaction.py#l477

    diff --git a/mercurial/transaction.py b/mercurial/transaction.py
    --- a/mercurial/transaction.py
    +++ b/mercurial/transaction.py
    @@ -482,6 +482,8 @@ class transaction(object):
     
             try:
                 if not self.entries and not self._backupentries:
    +                for cat in sorted(self._abortcallback):
    +                    self._abortcallback[cat](self)
                     if self.journal:
                         self.opener.unlink(self.journal)
                     if self._backupjournal:


> >    - are there any usecases to refer changes to be rollbacked in abort
> >      callbacks ?
> >
> >      In other words, does it cause problem to invoke abort callbacks
> >      after rollbacking ?
> 
> I would says yes, but detail could be complicated (to be continued)
> 
> >
> >      BTW, 'txnabort' hooks are invoked without 'pending', and it means
> >      that such changes should be invisible to 'txnabort' hooks, doesn't
> >      it ? > Pierre-Yves
> 
> And here we have proof of (1) people do not rely of it too much (2) 
> definition of the state of the transaciton during abort is far from trivial.
> We should probably keep it with rolled back content/

It means that 'txnabort' hooks should be invoked with 'pending' (=
rolled back content should be visible to them), doen't it ? (sorry for
my poor English, if I misunderstood :-<)


> >
> >      Then, IMHO, such changes should be invisible to abort callbacks,
> >      too.
> >
> > Maybe, should I introduce another callback to know the end of
> > transaction (regardless of the result of it, or only at failure) ?
> 
> Sounds like an awesome idea. Python have:
> 
> try:
>      .... # equivalent to transaction open
> except:
>      .... # equivalent to transaction abort
> else:
>      .... # equivalent to transaction success
> finally:
>      .... # equivalent of your proposal
> 
> Finally is definitly very useful, so go for it.

I'll do so.


> -- 
> Pierre-Yves David
> 
> 
> 
> 
> 
> 
> >
> >
> > ----------------------------------------------------------------------
> > [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
> >
> 
> -- 
> Pierre-Yves David
> 

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp


More information about the Mercurial-devel mailing list