D5296: store: don't read the whole fncache in memory

pulkit (Pulkit Goyal) phabricator at mercurial-scm.org
Mon Mar 18 08:29:19 EDT 2019


This revision was automatically updated to reflect the committed changes.
Closed by commit rHGa56487081109: store: don't read the whole fncache in memory (authored by pulkit, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D5296?vs=14541&id=14542

REVISION DETAIL
  https://phab.mercurial-scm.org/D5296

AFFECTED FILES
  mercurial/store.py
  tests/test-fncache.t

CHANGE DETAILS

diff --git a/tests/test-fncache.t b/tests/test-fncache.t
--- a/tests/test-fncache.t
+++ b/tests/test-fncache.t
@@ -1,5 +1,19 @@
 #require repofncache
 
+An extension which will set fncache chunksize to 1 byte to make sure that logic
+does not break
+
+  $ cat > chunksize.py <<EOF
+  > from __future__ import absolute_import
+  > from mercurial import store
+  > store.fncache_chunksize = 1
+  > EOF
+
+  $ cat >> $HGRCPATH <<EOF
+  > [extensions]
+  > chunksize = $TESTTMP/chunksize.py
+  > EOF
+
 Init repo1:
 
   $ hg init repo1
diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -8,6 +8,7 @@
 from __future__ import absolute_import
 
 import errno
+import functools
 import hashlib
 import os
 import stat
@@ -23,6 +24,9 @@
 )
 
 parsers = policy.importmod(r'parsers')
+# how many bytes should be read from fncache in one read
+# This is done to prevent loading large fncache files into memory
+fncache_chunksize = 10 ** 6
 
 def _matchtrackedpath(path, matcher):
     """parses a fncache entry and returns whether the entry is tracking a path
@@ -463,7 +467,20 @@
             # skip nonexistent file
             self.entries = set()
             return
-        self.entries = set(decodedir(fp.read()).splitlines())
+
+        self.entries = set()
+        chunk = b''
+        for c in iter(functools.partial(fp.read, fncache_chunksize), b''):
+            chunk += c
+            try:
+                p = chunk.rindex(b'\n')
+                self.entries.update(decodedir(chunk[:p + 1]).splitlines())
+                chunk = chunk[p + 1:]
+            except ValueError:
+                # substring '\n' not found, maybe the entry is bigger than the
+                # chunksize, so let's keep iterating
+                pass
+
         self._checkentries(fp)
         fp.close()
 

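For readers who want to try the chunked-read pattern from the store.py hunk outside of Mercurial, here is a minimal standalone sketch. It is not the committed code: the name read_entries and its chunksize parameter are hypothetical, the decodedir() step is left out, and unlike the patch it also keeps a final entry that has no trailing newline. It uses rfind() where the patch uses rindex() with a ValueError handler.

```python
import functools
import io


def read_entries(fp, chunksize=10 ** 6):
    """Collect newline-separated entries from a binary file object.

    Hypothetical helper: reads at most `chunksize` bytes per read() call
    instead of loading the whole file with a single fp.read().
    """
    entries = set()
    pending = b''  # bytes read so far that do not yet end in a newline
    for block in iter(functools.partial(fp.read, chunksize), b''):
        pending += block
        p = pending.rfind(b'\n')
        if p == -1:
            # no newline yet; the entry may be longer than chunksize,
            # so keep accumulating
            continue
        # everything up to the last newline consists of complete entries
        entries.update(pending[:p + 1].splitlines())
        pending = pending[p + 1:]
    if pending:
        # assumption of this sketch: keep an unterminated final entry
        entries.add(pending)
    return entries


data = b'data/a.i\ndata/b.i\ndata/some/long/path.i\n'
whole = set(data.splitlines())
# A 1-byte chunk size mirrors what the chunksize.py test extension forces.
for size in (1, 2, 7, 10 ** 6):
    assert read_entries(io.BytesIO(data), size) == whole
```

Only the bytes after the last newline are carried from one read to the next, so the buffer stays close to chunksize unless a single entry is longer than that.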


To: pulkit, #hg-reviewers
Cc: indygreg, yuja, mjpieters, mercurial-devel

