[PATCH] changelog: keep track of file end in appender (issue5444)

Augie Fackler raf at durin42.com
Thu Dec 15 21:16:02 EST 2016


> On Dec 15, 2016, at 2:01 PM, Durham Goode <durham at fb.com> wrote:
> 
> # HG changeset patch
> # User Durham Goode <durham at fb.com>
> # Date 1481828418 28800
> #      Thu Dec 15 11:00:18 2016 -0800
> # Node ID 6db5cb70cecbd50835f7c8f28770643279001e76
> # Parent  07bcd1bf61514be28956cf32280cc3d442ffc760
> changelog: keep track of file end in appender (issue5444)

Nice fix. Queued.

> 
> Previously, changelog.appender.end() would compute the end of the file by
> joining all the current appended data and checking the length. This is an O(n)
> operation.  e240e914d226 introduced a seek call before every revlog write, which
> means we are hitting this O(n) behavior n times, which causes changelog writes
> during a pull to be n^2.
> 
> In our large repo, this caused pulling 100k commits to go from 17s to 130s. With
> this fix, it's back to 17s.
> 
> diff --git a/mercurial/changelog.py b/mercurial/changelog.py
> --- a/mercurial/changelog.py
> +++ b/mercurial/changelog.py
> @@ -79,9 +79,10 @@ class appender(object):
>         self.fp = fp
>         self.offset = fp.tell()
>         self.size = vfs.fstat(fp).st_size
> +        self._end = self.size
> 
>     def end(self):
> -        return self.size + len("".join(self.data))
> +        return self._end
>     def tell(self):
>         return self.offset
>     def flush(self):
> @@ -121,6 +122,7 @@ class appender(object):
>     def write(self, s):
>         self.data.append(str(s))
>         self.offset += len(s)
> +        self._end += len(s)
> 
> def _divertopener(opener, target):
>     """build an opener that writes in 'target.a' instead of 'target'"""
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel



More information about the Mercurial-devel mailing list