D6255: copies: calculate mergecopies() based on pathcopies()
martinvonz (Martin von Zweigbergk)
phabricator at mercurial-scm.org
Wed Apr 17 15:25:22 EDT 2019
martinvonz added a comment.
In https://phab.mercurial-scm.org/D6255#91019, @martinvonz wrote:
> In https://phab.mercurial-scm.org/D6255#91015, @marmoute wrote:
>
> >
>
>
> The rest somehow didn't make it here, so I'll copy from the email (i.e. the below is from Pierre-Yves, not from me):
>
> >
>
> on related topic, you seems to be fixing a case in
> test-fastannotate-hg.t and test-annotate-hg.t that is probably worth
> mentioning.
Sure, I'll add that.
>
>
>> One drawback of the rewritten code is that we may now make
>> remotefilelog prefetch more files. We used to prefetch files that were
>> unique to either side of the merge compared to the other. We now
>> prefetch files that are unique to either sise of the merge compared to
>
> sise → side
Thanks, done.
>> Some timings for calculating mergecopies between two revisions (all
>> using the common ancestor as base):
>
> Which revisions did you pick?
I've tried to clarify this in the commit message.
>> So it's measurably slower in most cases. Note that merge copies are
>> not calculated when updating with a clean working copy, which is
>> probably the most common case. I therefore think the much simpler code
>> is worth the slowdown.
>
> Do you know where the slowdown comes from ?
One difference is that the new code ends up calling `adjustlinkrev()` (from the `isintroducedafter()` call in `_tracefile()`), which the old code somehow avoids. That's your area of expertise, so maybe you can figure it out? I've already spent several hours on it without understanding much more than I did before.
>
>
>> + # TODO: Handle cases where it was renamed on one side and copied
>> + # on the other side
>
> Is this TODO a regression or some ported limitation.
It's ported (i.e. it existed before this patch). See commit message of https://phab.mercurial-scm.org/D6242.
>
>
>> + elif dsts1:
>> + # copied/renamed only on side 1
>> + if src not in m2:
>> + # deleted on side 2
>> + if src not in m1:
>> + # renamed on side 1, deleted on side 2
>> + renamedelete[src] = dsts1
>> + elif m2[src] != mb[src]:
>> + # modified on side 2
>> + for dst in dsts1:
>> + if dst not in m2:
>> + # dst not added on side 2 (handle as regular
>> + # "both created" case in manifestmerge in that case)
>
> Can you elaborate a bit on what this case means (and how we deal with
> it) especially, what happens if dst is indeed in m2 ?
Oops, I think that was supposed say "otherwise" instead of "in that case". I'll fix.
>
>
>> + copy[dst] = src
>> + elif dsts2:
>> + # copied/renamed only on side 2
>> + if src not in m1:
>> + # deleted on side 1
>> + if src not in m2:
>> + # renamed on side 2, deleted on side 1
>> + renamedelete[src] = dsts2
>> + elif m1[src] != mb[src]:
>> + # modified on side 1
>> + for dst in dsts2:
>> + if dst not in m1:
>> + # dst not added on side 1 (handle as regular
>> + # "both created" case in manifestmerge in that case)
>> + copy[dst] = src
>
> Since this seems the very same code as the previous clause, I wonder if
> we could factor it out. This would help to prevent subtle bug when we
> update it. (the answer might be "no because python is slow).
Yes, I also considered that. I wasn't sure what a good name for the method would be and I gave up. I'll try again.
>
>
>> + renamedeleteset = set()
>> + divergeset = set()
>> + for src, dsts in diverge.items():
>> + divergeset.update(dsts)
>> + for src, dsts in renamedelete.items():
>> + renamedeleteset.update(dsts)
>
> small nits: Is there any reason not to use diverge.values and
> renamedelete.values here ?
This is existing code, but I can send a separate patch for that.
REPOSITORY
rHG Mercurial
REVISION DETAIL
https://phab.mercurial-scm.org/D6255
To: martinvonz, #hg-reviewers
Cc: marmoute, mjpieters, mercurial-devel
More information about the Mercurial-devel
mailing list