[PATCH 1 of 8] _addrevision: refactor out the selection of candidate revisions
Gregory Szorc
gregory.szorc at gmail.com
Sun Jan 14 15:46:35 EST 2018
On Sun, Jan 14, 2018 at 2:28 AM, Paul Morelle <paul.morelle at octobus.net>
wrote:
> # HG changeset patch
> # User Paul Morelle <paul.morelle at octobus.net>
> # Date 1515668342 -3600
> # Thu Jan 11 11:59:02 2018 +0100
> # Node ID 7526dfca3d32e7c51864c21de2c2f4735c4cade6
> # Parent 4b68ca118d8d316cff1fbfe260e8fdb0dae3e26a
> # EXP-Topic refactor-revlog
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r
> 7526dfca3d32
> _addrevision: refactor out the selection of candidate revisions
>
This refactor seems to preserve the existing behavior AFAICT. The code is
not easy to read. But that was a preexisting problem. Hopefully the end
state of this series improves matters more. If not, that can be done in
follow-ups.
Queued part 1.
Will look at the subsequent parts soon...
Also, please use "revlog:" in the commit message.
>
> The new function will be useful to retrieve all the revisions which will be
> needed to determine the best delta, and parallelize the computation of the
> necessary diffs.
>
> diff -r 4b68ca118d8d -r 7526dfca3d32 mercurial/revlog.py
> --- a/mercurial/revlog.py Thu Jan 11 11:57:59 2018 +0000
> +++ b/mercurial/revlog.py Thu Jan 11 11:59:02 2018 +0100
> @@ -1844,6 +1844,44 @@
>
> return True
>
> + def _getcandidaterevs(self, p1, p2, cachedelta):
> + """
> + Provides revisions that present an interest to be diffed against,
> + grouped by level of easiness.
> + """
>
I'd appreciate a bit more documentation around the intended use of the
generator stream. Essentially, it is emitting iterables of revs and the
first group that yields a rev with a suitable delta is used and the client
stops processing. That's kinda a weird pattern. But it is how the existing
logic works. Using a generator is a clever way to codify that logic.
> + curr = len(self)
> + prev = curr - 1
> + p1r, p2r = self.rev(p1), self.rev(p2)
>
curr, p1r, and p2r are already calculated by the caller. I'd pass them into
this function. Please send a follow-up.
> +
> + # should we try to build a delta?
> + if prev != nullrev and self.storedeltachains:
>
We could change this to `if pre == nullrev or not self.storedeltachains:
return` and dedent the following block. That would make things a bit easier
to read.
> + tested = set()
> + # This condition is true most of the time when processing
> + # changegroup data into a generaldelta repo. The only time it
> + # isn't true is if this is the first revision in a delta chain
> + # or if ``format.generaldelta=true`` disabled
> ``lazydeltabase``.
> + if cachedelta and self._generaldelta and self._lazydeltabase:
> + # Assume what we received from the server is a good choice
> + # build delta will reuse the cache
> + yield (cachedelta[0],)
> + tested.add(cachedelta[0])
> +
> + if self._generaldelta:
> + # exclude already lazy tested base if any
> + parents = [p for p in (p1r, p2r)
> + if p != nullrev and p not in tested]
> + if parents and not self._aggressivemergedeltas:
> + # Pick whichever parent is closer to us (to minimize
> the
> + # chance of having to build a fulltext).
> + parents = [max(parents)]
> + tested.update(parents)
> + yield parents
> +
> + if prev not in tested:
> + # other approach failed try against prev to hopefully
> save us a
> + # fulltext.
> + yield (prev,)
>
Having looked at this code in detail as part of the review, it seems
nonsensical to me to emit prev as a candidate revision when generaldelta is
being used. I'd refactor the code to use separate branches for generaldelta
and non-generaldelta scenarios. But this can be done as follow-up (if it
isn't addressed by later patches in this series).
> def _addrevision(self, node, rawtext, transaction, link, p1, p2,
> flags,
> cachedelta, ifh, dfh, alwayscache=False):
> """internal function to add revisions to the log
> @@ -1943,42 +1981,16 @@
> else:
> textlen = len(rawtext)
>
> - # should we try to build a delta?
> - if prev != nullrev and self.storedeltachains:
> - tested = set()
> - # This condition is true most of the time when processing
> - # changegroup data into a generaldelta repo. The only time it
> - # isn't true is if this is the first revision in a delta chain
> - # or if ``format.generaldelta=true`` disabled
> ``lazydeltabase``.
> - if cachedelta and self._generaldelta and self._lazydeltabase:
> - # Assume what we received from the server is a good choice
> - # build delta will reuse the cache
> - candidatedelta = builddelta(cachedelta[0])
> - tested.add(cachedelta[0])
> + for candidaterevs in self._getcandidaterevs(p1, p2, cachedelta):
> + nominateddeltas = []
> + for candidaterev in candidaterevs:
> + candidatedelta = builddelta(candidaterev)
> if self._isgooddelta(candidatedelta, textlen):
> - delta = candidatedelta
> - if delta is None and self._generaldelta:
> - # exclude already lazy tested base if any
> - parents = [p for p in (p1r, p2r)
> - if p != nullrev and p not in tested]
> - if parents and not self._aggressivemergedeltas:
> - # Pick whichever parent is closer to us (to minimize
> the
> - # chance of having to build a fulltext).
> - parents = [max(parents)]
> - tested.update(parents)
> - pdeltas = []
> - for p in parents:
> - pd = builddelta(p)
> - if self._isgooddelta(pd, textlen):
> - pdeltas.append(pd)
> - if pdeltas:
> - delta = min(pdeltas, key=lambda x: x[1])
> - if delta is None and prev not in tested:
> - # other approach failed try against prev to hopefully
> save us a
> - # fulltext.
> - candidatedelta = builddelta(prev)
> - if self._isgooddelta(candidatedelta, textlen):
> - delta = candidatedelta
> + nominateddeltas.append(candidatedelta)
> + if nominateddeltas:
> + delta = min(nominateddeltas, key=lambda x: x[1])
> + break
> +
> if delta is not None:
> dist, l, data, base, chainbase, chainlen, compresseddeltalen
> = delta
> else:
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20180114/4dc7ef68/attachment.html>
More information about the Mercurial-devel
mailing list