D6255: copies: calculate mergecopies() based on pathcopies()

martinvonz (Martin von Zweigbergk) phabricator at mercurial-scm.org
Wed Apr 17 15:25:22 EDT 2019


martinvonz added a comment.


  In https://phab.mercurial-scm.org/D6255#91019, @martinvonz wrote:
  
  > In https://phab.mercurial-scm.org/D6255#91015, @marmoute wrote:
  >
  > >
  >
  >
  > The rest somehow didn't make it here, so I'll copy from the email (i.e. the below is from Pierre-Yves, not from me):
  >
  > >    
  >
  > on related topic, you seems to be fixing a case in 
  >  test-fastannotate-hg.t and test-annotate-hg.t that is probably worth 
  >  mentioning.
  
  
  Sure, I'll add that.
  
  > 
  > 
  >>     One drawback of the rewritten code is that we may now make
  >>     remotefilelog prefetch more files. We used to prefetch files that were
  >>     unique to either side of the merge compared to the other. We now
  >>     prefetch files that are unique to either sise of the merge compared to
  > 
  > sise → side
  
  Thanks, done.
  
  >>     Some timings for calculating mergecopies between two revisions (all
  >>     using the common ancestor as base):
  > 
  > Which revisions did you pick?
  
  I've tried to clarify this in the commit message.
  
  >>     So it's measurably slower in most cases. Note that merge copies are
  >>     not calculated when updating with a clean working copy, which is
  >>     probably the most common case. I therefore think the much simpler code
  >>     is worth the slowdown.
  > 
  > Do you know where the slowdown comes from ?
  
  One difference is that the new code ends up calling `adjustlinkrev()` (from the `isintroducedafter()` call in `_tracefile()`), which the old code somehow avoids. That's your area of expertise, so maybe you can figure it out? I've already spent several hours on it without understanding much more than I did before.
  
  > 
  > 
  >> +            # TODO: Handle cases where it was renamed on one side and copied
  >>  +            # on the other side
  > 
  > Is this TODO a regression or some ported limitation.
  
  It's ported (i.e. it existed before this patch). See commit message of https://phab.mercurial-scm.org/D6242.
  
  > 
  > 
  >> +        elif dsts1:
  >>  +            # copied/renamed only on side 1
  >>  +            if src not in m2:
  >>  +                # deleted on side 2
  >>  +                if src not in m1:
  >>  +                    # renamed on side 1, deleted on side 2
  >>  +                    renamedelete[src] = dsts1
  >>  +            elif m2[src] != mb[src]:
  >>  +                # modified on side 2
  >>  +                for dst in dsts1:
  >>  +                    if dst not in m2:
  >>  +                        # dst not added on side 2 (handle as regular
  >>  +                        # "both created" case in manifestmerge in that case)
  > 
  > Can you elaborate a bit on what this case means (and how we deal with 
  >  it) especially, what happens if dst is indeed in m2 ?
  
  Oops, I think that was supposed say "otherwise" instead of "in that case". I'll fix.
  
  > 
  > 
  >> +                        copy[dst] = src
  >>  +        elif dsts2:
  >>  +            # copied/renamed only on side 2
  >>  +            if src not in m1:
  >>  +                # deleted on side 1
  >>  +                if src not in m2:
  >>  +                    # renamed on side 2, deleted on side 1
  >>  +                    renamedelete[src] = dsts2
  >>  +            elif m1[src] != mb[src]:
  >>  +                # modified on side 1
  >>  +                for dst in dsts2:
  >>  +                    if dst not in m1:
  >>  +                        # dst not added on side 1 (handle as regular
  >>  +                        # "both created" case in manifestmerge in that case)
  >>  +                        copy[dst] = src
  > 
  > Since this seems the very same code as the previous clause, I wonder if 
  >  we could factor it out. This would help to prevent subtle bug when we 
  >  update it. (the answer might be "no because python is slow).
  
  Yes, I also considered that. I wasn't sure what a good name for the method would be and I gave up. I'll try again.
  
  > 
  > 
  >> +    renamedeleteset = set()
  >>  +    divergeset = set()
  >>  +    for src, dsts in diverge.items():
  >>  +        divergeset.update(dsts)
  >>  +    for src, dsts in renamedelete.items():
  >>  +        renamedeleteset.update(dsts)
  > 
  > small nits: Is there any reason not to use diverge.values and 
  >  renamedelete.values here ?
  
  This is existing code, but I can send a separate patch for that.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6255

To: martinvonz, #hg-reviewers
Cc: marmoute, mjpieters, mercurial-devel


More information about the Mercurial-devel mailing list