[PATCH 3 of 6 V2] repoview: discard filtered changelog if the length differs from unfiltered

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Sun Dec 20 11:20:37 CST 2015


At Mon, 21 Dec 2015 01:06:49 +0900,
FUJIWARA Katsunori wrote:
> 
> 
> At Sun, 20 Dec 2015 20:46:04 +0900,
> Yuya Nishihara wrote:
> > 
> > On Thu, 17 Dec 2015 02:50:51 +0900, FUJIWARA Katsunori wrote:
> > > # HG changeset patch
> > > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > > # Date 1450287754 -32400
> > > #      Thu Dec 17 02:42:34 2015 +0900
> > > # Node ID 6455d3aac05ac9ab6913b162b46a924b9a512121
> > > # Parent  4210557127a602a91ce9958bb957e2a32493856b
> > > repoview: discard filtered changelog if the length differs from unfiltered
> > > 
> > > Before this patch, revisions rollbacked at failure of previous
> > > transaction may be visible at subsequent operations unintentionally,
> > > if repoview object is reused even after failure of transaction:
> > > e.g. command server and HTTP server are typical cases.
> > > 
> > > 'repoview' uses the tuple of values below of unfiltered changelog as
> > > "the key" to examine validity of filtered changelog cache.
> > > 
> > >      - length
> > >      - tip node
> > >      - filtered revisions (as hashed value)
> > >      - '_delayed' field
> > > 
> > > 'repoview' compares between "the key" of unfiltered changelog at
> > > previous caching and now, and reuses filtered changelog cache if no
> > > change is detected.
> > > 
> > > But this comparison indicates only that there is no change between
> > > unfiltered 'repo.changelog' at last caching and now, but not that
> > > filtered changelog cache is valid for current unfiltered one.
> > > 
> > > 'repoview' uses "shallow copy" of unfiltered changelog to create
> > > filtered changelog cache. In this case, 'index' buffer of unfiltered
> > > changelog is also referred by filtered changelog.
> > > 
> > > At failure of transaction, unfiltered changelog itself is invalidated
> > > (= un-referred) on the 'repo' side (see 0a7610758c42 also). But
> > > 'index' of it still contains revisions to be rollbacked at this
> > > failure, and is referred by filtered changelog.
> > > 
> > > Therefore, even if there is no change between unfiltered
> > > 'repo.changelog' at last caching and now, steps below makes rollbacked
> > > revisions visible via filtered changelog unintentionally.
> > > 
> > >   1. instantiate unfiltered changelog as 'repo.changelog'
> > >      (call it CL1)
> > > 
> > >   2. make filtered (= shallow copy of) CL1
> > >      (call it FCL1)
> > > 
> > >   3. cache FCL1 with "the key" of CL1
> > > 
> > >   4. revisions are appended to 'index', which is shared by CL1 and FCL1
> > > 
> > >   5. invalidate 'repo.changelog' (= CL1) at failure of transaction
> > > 
> > >   6. instantiate 'repo.changelog' again at next operation
> > >      (call it CL2)
> > > 
> > >      CL2 doesn't have revisions added at (4), because it is
> > >      instantiated from '00changelog.i', which isn't changed while
> > >      failed transaction.
> > > 
> > >   7. compare between "the key" of CL1 and CL2
> > > 
> > >   8. FCL1 cached at (3) is reused, because comparison at (7) doesn't
> > >      detect change between CL1 at (1) and CL2
> > > 
> > >   9. revisions rollbacked at (5) are visible via FCL1 unintentionally,
> > >      because FCL1 still refers 'index' changed at (4)
> > > 
> > > The root cause of this issue is that there is no examination about
> > > validity of filtered changelog cache against current unfiltered one.
> > > 
> > > This patch discards filtered changelog if the length of it differs
> > > from one of unfiltered.
> > > 
> > > "Difference of the length" means that filtered changelog isn't related
> > > to unfiltered one, because the length of changelog isn't changed by
> > > filtering.
> > > 
> > > This patch uses 'len(cl.index) - 1' directly instead of 'len(cl)', to
> > > follow beda2c9dbbff, which bypasses changelog method for performance
> > > reason.
> > > 
> > > BTW, at the time of this patch, redundant truncation of
> > > '00changelog.i' at failure of transaction (see 0a7610758c42 for
> > > detail) often prevents "hg serve" from making already rollbacked
> > > revisions visible, because updating timestamps of '00changelog.i' by
> > > truncation makes "hg serve" discard old repoview object with invalid
> > > filtered changelog cache.
> > > 
> > > This is reason why this issue is overlooked before this patch, even
> > > though test-bundle2-exchange.t has tests in similar situation: failure
> > > of "hg push" via HTTP by pretxnclose hook on server side doesn't
> > > prevent subsequent commands from looking up outgoing revisions
> > > correctly.
> > > 
> > > But it can't be assumed that this avoidance always works as expected,
> > > because timestamp on the filesystem doesn't have enough resolution for
> > > recent computation power.
> > > 
> > > Therefore, without this patch, this issue may appear occasionally.
> > > 
> > > diff --git a/mercurial/repoview.py b/mercurial/repoview.py
> > > --- a/mercurial/repoview.py
> > > +++ b/mercurial/repoview.py
> > > @@ -308,7 +308,8 @@ class repoview(object):
> > >          revs = filterrevs(unfi, self.filtername)
> > >          cl = self._clcache
> > >          newkey = (unfilen, unfinode, hash(revs), unfichangelog._delayed)
> > > -        if cl is not None and newkey != self._clcachekey:
> > > +        if (cl is not None and
> > > +            (len(cl.index) - 1 != unfilen or newkey != self._clcachekey)):
> > 
> > From your comment, this checks if cl.index is not unfiindex, right?
> 
> Oh, it looks better > 'if cl.index is not unfiindex'
> 
> BTW, it returns immediately without comparison of entries in index
> buffer, if cl.index and unfiindex are separate objects, doesn't it ?
> (I don't know well about fallbacking around __eq__ or so for C
> implementation :-<)

OK, I understood about difference between '==' and 'is'. This is my
first time to apply 'is'/'is not' on other than 'None' :-)

> > Instead of validating lengths, I think _clcache should be nullified when
> > _filecache['changelog'] is deleted. But I'm not sure if it's possible by
> > repoview.
> 
> Yes, it is ideal that invalidating changelog cache at failure of
> transaction implies not only deleting _filecache['changelog'] but also
> clearing _clcache of repoview.
> 
> But there is no reference from repo to repoview, and introducing it
> (or registering a kind of callback to clear 'repoview._clcache' with
> repo) seems to introduce also bad cyclic reference between them.
> 
> 
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp


More information about the Mercurial-devel mailing list