[PATCH 2 of 4] changelog: extract changelog parsing into standalone function

Thu Jul 9 19:08:59 CDT 2015

# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1436477224 25200
#      Thu Jul 09 14:27:04 2015 -0700
# Node ID c4c7e9382652ddcbfc53ffc4fd00673c725068c3
# Parent  9403f12629d5c83ddc9d68dc5f8c2f48c6a895e1
changelog: extract changelog parsing into standalone function

An upcoming patch will introduce an additional consumer for parsing
raw revision content into a changelog data structure. We refactor the
common code for parsing raw data into a standalone function.
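
As a rough illustration of the shape of this refactor (a simplified
sketch, not the code in this patch): parse_changeset_text, FakeChangelog,
NULLID and DEFAULT_EXTRA are hypothetical names, the files-per-line layout
is assumed from the parts of the docstring not shown in the hunk, and the
extra field is left undecoded.

```python
from binascii import unhexlify

NULLID = b"\0" * 20                      # hypothetical stand-in for nullid
DEFAULT_EXTRA = {"branch": "default"}    # hypothetical stand-in for _defaultextra


def parse_changeset_text(text):
    """Parse raw changeset text into a 6-tuple (simplified sketch)."""
    if not text:
        return (NULLID, "", (0, 0), [], "", DEFAULT_EXTRA)
    # The header ends at the first blank line; the rest is the description.
    last = text.index("\n\n")
    desc = text[last + 2:]
    lines = text[:last].split("\n")
    manifest = unhexlify(lines[0])       # manifest node, hex encoded
    user = lines[1]
    tdata = lines[2].split(" ", 2)       # "time tz [extra]"
    time, timezone = float(tdata[0]), int(tdata[1])
    # Assumption: extra is kept as the raw string rather than decoded.
    extra = tdata[2] if len(tdata) == 3 else DEFAULT_EXTRA
    files = lines[3:]                    # assumed: one file name per line
    return (manifest, user, (time, timezone), files, desc, extra)


class FakeChangelog:
    """Toy consumer: read() only fetches the text and delegates parsing."""

    def __init__(self, revisions):
        self._revisions = revisions

    def revision(self, node):
        return self._revisions[node]

    def read(self, node):
        return parse_changeset_text(self.revision(node))
```

With parsing split out this way, a second consumer that already holds
the raw text can call the parser directly without going through read().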

A downside of this refactor is that it introduces an extra function
call. However, revsetbenchmark.py doesn't seem to indicate any
significant change in performance. There are reported changes, but very
similar revsets are both faster and slower, so I think the differences
are expected run-to-run variation. Someone else may want to test this
before landing, just to be sure. If this extra function call really does
regress performance, we should consider implementing changeset parsing
in C.
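
For anyone who wants to isolate the cost of the extra call from revset
noise, a micro-benchmark along these lines may help. This is only a
sketch: parse, Inline, Delegating and measure are hypothetical names,
and the parsing body is a trivial placeholder rather than the real
changelog code.

```python
import timeit


def parse(text):
    # Placeholder for the standalone parsing function.
    return text.split("\n")


class Inline:
    def read(self, text):
        # Parsing done directly in the method (before the refactor).
        return text.split("\n")


class Delegating:
    def read(self, text):
        # Parsing delegated to a helper (after the refactor): one extra call.
        return parse(text)


def measure(obj, text, number=200000, repeat=5):
    """Return the best total time in seconds over several repeats."""
    return min(timeit.repeat(lambda: obj.read(text),
                             number=number, repeat=repeat))


if __name__ == "__main__":
    sample = "node\nuser\n0 0\nfile\n\ndescription"
    print("inline:     %.4fs" % measure(Inline(), sample))
    print("delegating: %.4fs" % measure(Delegating(), sample))
```

Taking the minimum over repeats reduces the influence of scheduling
noise, which is the kind of variation suspected in the numbers above.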

diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -307,9 +307,14 @@ class changelog(revlog.revlog):
         if not self._delayed:
             revlog.revlog.checkinlinesize(self, tr, fp)
 
     def read(self, node):
+        return self._newchangelog(self.revision(node))
+
+    def _newchangelog(self, text):
         """
+        Parse raw revision content into a tuple.
+
         format used:
         nodeid\n        : manifest node in ascii
         user\n          : user, no \n or \r allowed
         time tz extra\n : date (time is int or float, timezone is int)
@@ -319,9 +324,8 @@ class changelog(revlog.revlog):
         (.*)            : comment (free text, ideally utf-8)
 
         changelog v0 doesn't use extra
         """
-        text = self.revision(node)
         if not text:
             return (nullid, "", (0, 0), [], "", _defaultextra)
         last = text.index("\n\n")
         desc = encoding.tolocal(text[last + 2:])