[PATCH 3 of 3 RFC] localrepository: add caches to transaction (issue4255)

Wed May 28 13:37:30 CDT 2014

On 5/28/14, 10:40 AM, Durham Goode wrote:
> On 5/27/14, 11:00 PM, "Gregory Szorc" <gregory.szorc at gmail.com> wrote:
>
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc at gmail.com>
>> # Date 1401253147 25200
>> #      Tue May 27 21:59:07 2014 -0700
>> # Node ID 12c2a7ae1d443e6ea89ea6f8f1032bf2178af15d
>> # Parent  02939296b8052833d4c8bfa6e2e2f1371fe5c517
>> localrepository: add caches to transaction (issue4255)
>>
>> localrepository.transaction() now creates a chained transaction bound
>> to the .hg opener in addition to the .hg/store opener. This new
>> transaction creates backups of most files under .hg/cache.
>>
>> With this change, transaction rollbacks will restore caches, thus
>> preventing cache invalidation and recomputation. This will result
>> in considerable performance wins on repositories where cache
>> generation is slow.
>>
>> On a repository with 12,421 heads and manifests with near 100,000
>> entries, branch caches can take ~17 minutes to generate
>> (259s for base, 68s for immutable, 723s for served). On a server
>> that had hooks causing branch cache updates, transaction rollbacks
>> resulted in branch cache invalidation and an expensive computation.
>> This patch makes the problem go away.
>>
>> THIS PATCH CURRENTLY CAUSES A REGRESSION IN test-fncache.t.
>> THIS PATCH SHOULD ALSO HAVE EXPLICIT TEST COVERAGE THAT CACHES ARE
>> RESTORED.
>>
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -848,8 +848,16 @@ class localrepository(object):
>>
>>      def wwritedata(self, filename, data):
>>          return self._filter(self._decodefilterpats, filename, data)
>>
>> +    def _cachetransactionblacklist(self):
>> +        """Set of cache files that are not safe to include in
>> transactions.
>> +
>> +        Files in .hg/cache/ are included in transactions by default. To
>> +        prevent that from happening, add an entry to this set.
>> +        """
>> +        return set(['storehash'])
>> +
>>      def transaction(self, desc, report=None):
>>          tr = self._transref and self._transref() or None
>>          if tr and tr.running():
>>              return tr.nest()
>> @@ -870,8 +878,27 @@ class localrepository(object):
>>                                       "journal",
>>                                       aftertrans(renames),
>>                                       self.store.createmode,
>>                                       onclose)
>> +
>> +        # Add caches to a chained transaction.
>> +        if self.ui.configbool("format", "usestore", True):
>> +            try:
>> +                def noreport(msg):
>> +                    pass
>> +                tr2 = transaction.transaction(noreport, self.opener,
>> +                                              "journal",
>> +
>> createmode=self.store.createmode)
>> +                blacklist = self._cachetransactionblacklist()
>> +                for filename, mode in self.vfs.readdir("cache"):
>> +                    if filename not in blacklist:
>> +                        tr2.addbackup("cache/%s" % filename,
>> hardlink=False)
>> +                tr.chaintransaction(tr2)
>> +
>> +            except OSError, e:
>> +                if e.errno != errno.ENOENT:
>> +                    raise
>> +
>>          self._transref = weakref.ref(tr)
>>          return tr
>>
>>      def _journalfiles(self):
>
>
> I’m not sure I like the chained transaction concept.  How is the chain
> represented on disk (i.e. if the user ctrl+c’s, how does rollback handle
> it)? What happens if the _abort for one transaction fails or the process
> is killed before all the chained transactions are aborted?  What if the
> same file is added to two transactions (perhaps by a poorly written
> extension)? These caches can be written outside the lock right (like
> during read only operations)? So what happens if someone modifies a cache
> outside the lock while another process is in the middle of transacting it?
>
> I like how elegantly this addresses your problem, but I think the
> transaction model is already fragile and difficult enough to reason about,
> and I think adding multiple simultaneous transactions will make it even
> harder to reason about transaction behavior in the future.  If we want the
> caches to be transacted, I think we need to refactor the existing
> transaction to support them, make sure they are never written outside of a
> transaction, and make sure they are never written during read only
> operations.
>
> I have a local commit where I’ve started doing this for the phase cache,
> but that’s one of the easier ones since it is mostly modified inside locks
> already.

Thank you for the feedback and for confirming some of the concerns I had.

Since it sounds like you are already going down this road, I will hold 
off and build on top of whatever you land.

FWIW, the issue of cache population outside of locks is bothersome. If a 
cache is invalidated, you may have multiple processes (e.g. hgweb) 
rebuilding the cache simultaneously. This can easily bring a server to 
its knees. Perhaps I should work on a patch series to address that?