[Bug 5533] New: json encoder is too slow
mercurial-bugs at mercurial-scm.org
Tue Apr 11 20:52:47 UTC 2017
https://bz.mercurial-scm.org/show_bug.cgi?id=5533
Bug ID: 5533
Summary: json encoder is too slow
Product: Mercurial
Version: default branch
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: feature
Priority: wish
Component: Mercurial
Assignee: bugzilla at mercurial-scm.org
Reporter: arcppzju+hgbug at gmail.com
CC: mercurial-devel at mercurial-scm.org
I wrote a simple program to compare the performance of the stdlib json
encoder with the json routine we have in core:
from mercurial import encoding
import contextlib
import json
import time

def hgescape(obj):
    s = '{'
    s += ','.join('"%s":"%s"' % (encoding.jsonescape(k),
                                 encoding.jsonescape(v))
                  for k, v in obj.iteritems())
    s += '}'
    return s

@contextlib.contextmanager
def measure(name):
    t1 = time.time()
    yield
    t2 = time.time()
    print('%s: %s' % (name, t2 - t1))

lines = []
with measure('insert 50k lines'):
    for l in xrange(50000):
        lines.append({'author': 'test',
                      'commit': 'fe4713a645e44df4bbaeb8a04ea428a2d1c82a4b',
                      'date': '1999-99-99'})

with measure('stdlib json escape'):
    s = json.dumps(lines)

with measure('hg json escape'):
    s = ','.join([hgescape(l) for l in lines])
I got something like:
insert 50k lines: 0.0199460983276
stdlib json escape: 0.0517330169678
hg json escape: 1.18240094185
So the core hg json escaping is roughly 23x slower (about 1.18s versus 0.05s).
That means things like "annotate -Tjson" can spend noticeable time just doing
the formatting.
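For anyone who wants to reproduce the shape of this comparison without a
Mercurial checkout, here is a rough Python 3 sketch. Note that `naive_escape`,
`naive_dumps` and `timed` are hypothetical names made up for this example, and
`naive_escape` is only a pure-Python stand-in, not Mercurial's
`encoding.jsonescape`, so the absolute numbers will not match the ones above.
It only illustrates per-string escaping done at the Python level against the
stdlib encoder.

```python
import json
import time

# Hypothetical stand-in for a pure-Python escaper; this is NOT Mercurial's
# encoding.jsonescape. It covers the characters JSON requires to be escaped.
_ESCAPES = {'"': '\\"', '\\': '\\\\', '\n': '\\n', '\r': '\\r',
            '\t': '\\t', '\b': '\\b', '\f': '\\f'}


def naive_escape(s):
    out = []
    for ch in s:
        if ch in _ESCAPES:
            out.append(_ESCAPES[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters use the \uXXXX form.
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    return ''.join(out)


def naive_dumps(rows):
    # Build an array of flat string-to-string objects, one per row.
    objs = ('{' + ','.join('"%s":"%s"' % (naive_escape(k), naive_escape(v))
                           for k, v in row.items()) + '}'
            for row in rows)
    return '[' + ','.join(objs) + ']'


def timed(func, *args):
    # perf_counter is a monotonic, high-resolution clock (Python 3.3+).
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    rows = [{'author': 'test',
             'commit': 'fe4713a645e44df4bbaeb8a04ea428a2d1c82a4b',
             'date': '1999-99-99'} for _ in range(50000)]
    _, t_std = timed(json.dumps, rows)
    _, t_naive = timed(naive_dumps, rows)
    print('stdlib json: %.4fs' % t_std)
    print('naive pure-Python: %.4fs' % t_naive)
```

The stand-in output parses back with `json.loads`, so the two code paths are
doing comparable work on the same data.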
I can think of two paths worth a try:
1. Write the json encoding logic in C.
2. Write a general purpose string-like object in C that supports two
operations, `+` and `x.join`, in a zero-copy manner. This will increase the
burden of the GC though.
I'm not sure whether any existing library already does option 2.
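To make the second option a bit more concrete, here is a pure-Python sketch of
the idea (Python 3). `Fragments` is a hypothetical name invented for this
illustration, and a real version would be written in C as suggested above.
`+` and `join` only record references to their operands; the characters are
copied once, when the result is finally turned into a string. Keeping all
those small objects alive until that point is where the extra GC burden would
come from.

```python
class Fragments(object):
    """Hypothetical lazy string: `+` and `join` only store references."""

    __slots__ = ('_parts',)

    def __init__(self, *parts):
        # Each part is either a str or another Fragments instance.
        self._parts = parts

    def __add__(self, other):
        return Fragments(self, other)

    def __radd__(self, other):
        # Handles `'text' + fragments`.
        return Fragments(other, self)

    def join(self, items):
        # Interleave the separator (self) between items without copying.
        parts = []
        for i, item in enumerate(items):
            if i:
                parts.append(self)
            parts.append(item)
        return Fragments(*parts)

    def __str__(self):
        # Iterative depth-first walk, so long `+` chains do not hit the
        # recursion limit. The only copy happens in the final ''.join.
        out = []
        stack = [self]
        while stack:
            node = stack.pop()
            if isinstance(node, Fragments):
                stack.extend(reversed(node._parts))
            else:
                out.append(node)
        return ''.join(out)
```

A caller would build the output from pieces and materialize it once at the end:

```python
comma = Fragments(',')
obj = Fragments('{') + comma.join(['"a":"1"', '"b":"2"']) + '}'
print(str(obj))  # prints {"a":"1","b":"2"}
```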