[PATCH 1 of 2 STABLE] hooks: always include HG_PENDING
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Fri Nov 6 10:16:49 CST 2015
On 11/05/2015 02:18 PM, FUJIWARA Katsunori wrote:
> At Wed, 4 Nov 2015 17:11:12 -0800,
> Durham Goode wrote:
>>
>> On 11/4/15 4:26 PM, FUJIWARA Katsunori wrote:
>>> At Tue, 3 Nov 2015 17:19:37 -0800,
>>> Durham Goode wrote:
>>>> # HG changeset patch
>>>> # User Durham Goode <durham at fb.com>
>>>> # Date 1446598693 28800
>>>> # Tue Nov 03 16:58:13 2015 -0800
>>>> # Branch stable
>>>> # Node ID 0a772553285d61aeea3226f74698223b09fae7ac
>>>> # Parent 58b7f3e93bbab749ab16c09df12aae5ba7880708
>>>> hooks: always include HG_PENDING
>>>>
>>>> Previously we would only include HG_PENDING in the hook args if the
>>>> transaction's writepending() actually wrote something. This is a bad criteria,
>>>> since it's possible that a previous call to writepending() wrote stuff and the
>>>> hooks want to still see that.
>>>>
>>>> The solution is to always have hooks execute within the scope of the pending
>>>> changes by always putting HG_PENDING in the environment.
>>>>
>>>> diff --git a/mercurial/hook.py b/mercurial/hook.py
>>>> --- a/mercurial/hook.py
>>>> +++ b/mercurial/hook.py
>>>> @@ -122,7 +122,8 @@ def _exthook(ui, repo, name, cmd, args,
>>>> # make in-memory changes visible to external process
>>>> tr = repo.currenttransaction()
>>>> repo.dirstate.write(tr)
>>>> - if tr and tr.writepending():
>>>> + if tr:
>>>> + tr.writepending()
>>>> env['HG_PENDING'] = repo.root
>>>>
>>>> for k, v in args.iteritems():
>>> 'transaction.writepending()' saves whether there is any pending data
>>> or not into 'self._anypending'.
>>>
>>> https://selenic.com/repo/hg/file/e5a1df51bb25/mercurial/transaction.py#l350
>>>
>>> @active
>>> def writepending(self):
>>> '''write pending file to temporary version
>>>
>>> This is used to allow hooks to view a transaction before commit'''
>>> categories = sorted(self._pendingcallback)
>>> for cat in categories:
>>> # remove callback since the data will have been flushed
>>> any = self._pendingcallback.pop(cat)(self)
>>> self._anypending = self._anypending or any
>>> self._anypending |= self._generatefiles(suffix='.pending')
>>> return self._anypending
>>>
>>> It should return True as expected, if pending data has been written out
>>> once.
>>>
>>> The root cause of this issue seems that 'transaction.writepending()'
>>> removes pending callback invoked once, even if pending callback didn't
>>> actually write changes out yet.
>>>
>>> This removal of callbacks is reasonable, if it is assumed that pending
>>> callback may not be "idempotent" (e.g. writing into the file opened
>>> with "append" mode).
>>>
>>> But on the other hand, this removal may cause similar issue in other
>>> code paths = other hook combinations (or introducing new hooks).
>>>
>>> To avoid such similar issues:
>>>
>>> - invoke 'changelog.delayupdate()' explicitly before
>>> 'transaction.writepending()' for external hook invocation, like
>>> 'dirstate.write(tr)' above
>>>
>>> e.g. in _exthook()@hook.py
>>>
>>> tr = repo.currenttransaction()
>>> repo.dirstate.write(tr)
>>> if tr:
>>> repo.changelog.delayupdate(tr)
>>> if tr.writepending()
>>> env['HG_PENDING'] = repo.root
>>>
>>> - invoke 'changelog.delayupdate()' internally in changelog by
>>> overriding functions below, which open files in 'w'/'a' mode by
>>> 'self.opener'
>>>
>>> - revlog.addgroup
>>> - revlog.addrevision
>>> - revlog.checkinlinesize
>>>
>>> e.g.:
>>>
>>> def addrevision(self, text, transaction, link, p1, p2, cachedelta=None,
>>> node=None)
>>> self.delayupdate(transaction)
>>> return super(changelog, self).addrevision(.....)
>>>
>>> The former still needs 'changelog.delayupdate()' before starting
>>> changes to setup opener (and other internal status) of changelog, but
>>> the latter doesn't.
>>>
>>> I think the latter is reasonable. (yes, the former is just for
>>> comparison :-)) How about these ?
>> I think your second proposal sounds reasonable. And will catch any
>> cases that are currently missing it.
>>>
>>>
>>> BTW, #1 of this series causes that HG_PENDING is always passed to
>>> external hooks while transaction running, even if there is no pending
>>> data. Is this reasonable ? I think that this should be backed out to
>>> avoid unintentional (trial) accessing to pending files.
>
>> Wouldn't we always want hooks to be able to see the current state of the
>> repo? This was originally added because there was a test failure if I
>> didn't add it. I must've changed my implementation of #2 because now
>> there is no test failure if I exclude #1. So I'm ok backing out #1 if
>> you think we should.
>
> At first, as far as I confirmed, #2 can fix issue4934 (= test newly
> added in #2 can run successfully) even without #1.
>
> The root cause of issue4934 is that 'transaction.writepending()'
> doesn't imply 'changelog._writepending()', which writes buffered
> changes out, at the second hook invocation. Lack of HG_PENDING in this
> case is just a side effect of it.
>
> Therefore, it should be specification matter whether HG_PENDING can be
> defined always, IMHO.
>
> Pierre-Yves, does your describing "HG_PENDING must -only- occured in
> the context of a transaction" mean that HG_PENDING can be defined even
> if there is no pending data yet while transaction running ? (I think
> it doesn't)
>
> https://www.mercurial-scm.org/wiki/DirstateTransactionPlan#Detail_about_HG_PENDING_mechanism
I confirm that #1 seems wrong to me:
- writepending will is already properly reporting "True" for all call
once something happened
- asking hooks to read the "pending" data when non-exist is superfluous
and a change in behavior.
I've pushed a backout of #1 to the clowncopter.
I'm currently looking at #2 and related discussion.
--
Pierre-Yves David
More information about the Mercurial-devel
mailing list