[PATCH V2] merge: add `internal:dumpjson` tool to `resolve`, which outputs conflict state

Ryan McElroy rm at fb.com
Thu Mar 16 02:20:42 EDT 2017


More nits inline to consider since you're looking into a v3 anyway...


On 3/15/17 10:48 AM, Durham Goode wrote:
>
>
> On 3/7/17 11:40 AM, Phil Cohen wrote:
>> # HG changeset patch
>> # User Phil Cohen <phillco at fb.com>
>> # Date 1488915535 28800
>> #      Tue Mar 07 11:38:55 2017 -0800
>> # Node ID bbce62e3790220f19e7b37160a2f8351b7461272
>> # Parent  91e86a6c61c0c6a3b554eefeba9060000311aa29
>> merge: add `internal:dumpjson` tool to `resolve`, which outputs 
>> conflict state
>>
>> This supersedes a previous change which added a --prep option.
>
> Generally we don't reference previous iterations of the patch in the 
> commit message, since those previous iterations wont' show up in the 
> repo after this is pushed.
>
>> Normally, `hg resolve` takes the user on a whistle-stop tour of each 
>> conflicted
>> file, pausing to launch the editor to resolve the conflicts. This is an
>> alternative workflow for resolving many conflicts in a random-access 
>> fashion. It
>> doesn't change/replace the default behavior.
>>
>> This commit adds `--tool=internal:dumpjson`. It prints, for each 
>> conflict, the
>> "base", "other", and "ours" versions (the contents, as well as their 
>> exec/link
>> flags), and where the user/tool should write a resolved version 
>> (i.e., the
>> working copy) as JSON. The user will then resolve the conflicts at 
>> their leisure
>> and run `hg resolve --mark`.
>
> Overall I think it looks good.  Some minor comments inline, but I'd be 
> ok having this queued (since it's been sitting here a while) and 
> addressing the nits in a follow up.
>
>> diff --git a/mercurial/commands.py b/mercurial/commands.py
>> --- a/mercurial/commands.py
>> +++ b/mercurial/commands.py
>> @@ -33,6 +33,7 @@
>>      error,
>>      exchange,
>>      extensions,
>> +    formatter,
>>      graphmod,
>>      hbisect,
>>      help,
>> @@ -4284,6 +4285,26 @@
>>          fm.end()
>>          return 0
>>
>> +    if opts.get('tool', '') == "internal:dumpjson":
>> +        fm = ui.formatter('resolve', {'template': 'json'})
>> +        ms = mergemod.mergestate.read(repo)

I don't love the short name style hg uses, and I don't see a godo reason 
to use it unless there's already a strong-established precedent in the file.

We've been calling formatters 'fm' for a long time, but 'ms' as 
mergestate seems like a new one (to me at least). I'd love a more 
descriptive variable name since we have the option right now

>> +        m = scmutil.match(repo[None], pats, opts)

Would prefer this be called 'matcher'

>> +        wctx = repo[None]
>> +
>> +        paths = []
>> +        for f in ms:

Then this could become:

for filename in mergestate:

And then it's obvious from just reading the loop what it is doing

>> +            if not m(f):

And this would become:

if not matcher(filename):

Which is also much more clear to me at least.

>> +                continue
>> +
>> +            val = ms.internaldump(f, wctx)
>> +            if val is not None:
>> +                paths.append(val)
>> +
>> +        fm.startitem()
>> +        fm.write('conflicts', '%s\n', paths)
>> +        fm.end()
>> +        return 0
>> +
>>      with repo.wlock():
>>          ms = mergemod.mergestate.read(repo)
>>
>> diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
>> --- a/mercurial/filemerge.py
>> +++ b/mercurial/filemerge.py
>> @@ -567,6 +567,40 @@
>>          "o": " [%s]" % labels[1],
>>      }
>>
>> +def summarize(repo, fcd, fco, fca):

I'm sure we can come up with better parameter names here too.

fca is probably the "file context ancestor", but I have no idea what fcd 
and fco are without diving deep into the code.

>> +    back = None if fcd.isabsent() else \
>> +        scmutil.origpath(repo.ui, repo, repo.wjoin(fcd.path()))

It's unclear to me what 'back' will be used for. Can you add a comment? 
(is it the "back"up -- the .orig file?)

>> +
>> +    def sum(ctx):

I'd name this flags() instead of sum

>> +        return {
>> +            'contents': ctx.data(),
>> +            'isexec': ctx.isexec(),
>> +            'issymlink': ctx.islink(),
>> +        }
>> +
>> +    ours = None

'ours' is a git thing, not a mercurial thing. Let's use consitent 
hg-internal naming if possible:
* local
* other
* base

>> +    if back and util.filestat(back).stat:
>> +        # Replicate the schema of `base` and `other` for `ours`. 
>> Since you can
>> +        # start a merge with a dirty working copy, `ours` must 
>> reflect that,
>> +        # not the underlying commit. That's why we look at .orig 
>> version.
>> +        ours = {
>> +            'path': back,
>> +            'contents': util.readfile(back),
>> +            'isexec': util.isexec(back),
>> +            'issymlink': util.statislink(util.filestat(back).stat)
>> +        }
>> +
>> +    output = sum(fcd)
>> +    output['path'] = repo.wjoin(fcd.path())
>> +
>> +    return {
>> +        'path': fcd.path(),
>> +        'base': sum(fca),
>> +        'other': sum(fco),
>> +        'output': output,
>> +        'ours': ours,

It's even more important not to introduce a new non-hg term in the 
user-visible output.

>> +    }
>> +
>>  def _filemerge(premerge, repo, mynode, orig, fcd, fco, fca, 
>> labels=None):
>>      """perform a 3-way merge in the working directory
>>
>> diff --git a/mercurial/merge.py b/mercurial/merge.py
>> --- a/mercurial/merge.py
>> +++ b/mercurial/merge.py
>> @@ -456,6 +456,24 @@
>>      def extras(self, filename):
>>          return self._stateextras.setdefault(filename, {})
>>
>> +    def _internaldump(self, dfile, wctx):
>> +        if self[dfile] in 'rd':
>> +            return None
>> +        stateentry = self._state[dfile]
>> +        state, hash, lfile, afile, anode, ofile, onode, flags = 
>> stateentry
>> +        octx = self._repo[self._other]
>> +        extras = self.extras(dfile)
>> +        anccommitnode = extras.get('ancestorlinknode')
>> +        if anccommitnode:
>> +            actx = self._repo[anccommitnode]
>> +        else:
>> +            actx = None
>> +        fcd = self._filectxorabsent(hash, wctx, dfile)
>> +        fco = self._filectxorabsent(onode, octx, ofile)
>> +        fca = self._repo.filectx(afile, fileid=anode, changeid=actx)
>> +
>> +        return filemerge.summarize(self._repo, fcd, fco, fca)
>> +

It seems weird to me that _internaldump() (with an underscore) is called 
from outside the module and summarize() (without an underscore) is 
called from inside the module.

>>      def _resolve(self, preresolve, dfile, wctx):
>>          """rerun merge process for file path `dfile`"""
>>          if self[dfile] in 'rd':
>> @@ -543,6 +561,9 @@
>>          Returns whether the merge is complete, and the exit code."""
>>          return self._resolve(True, dfile, wctx)
>>
>> +    def internaldump(self, dfile, wctx):
>> +        return self._internaldump(dfile, wctx)
>
> Why have the _ function and the normal function?
>
>> +
>>      def resolve(self, dfile, wctx):
>>          """run merge process (assuming premerge was run) for dfile
>>
>> diff --git a/tests/test-resolve-prep.t b/tests/test-resolve-prep.t
>> new file mode 100644
>> --- /dev/null
>> +++ b/tests/test-resolve-prep.t
>
> Maybe needs a new name since we're no longer calling this --prep. Same 
> for references to prep inside the test
>
>> @@ -0,0 +1,75 @@
>> +1) Make the repo
>> +  $ hg init
>> +
>> +2) Can't run prep outside a conflict
>> +  $ hg resolve --tool internal:dumpjson
>> +  abort: no files or directories specified
>> +  (use --all to re-merge all unresolved files)
>> +  [255]
>> +
>> +3) Make a simple conflict
>> +  $ echo "Unconflicted base, F1" > F1
>> +  $ echo "Unconflicted base, F2" > F2
>> +  $ hg commit -Aqm "initial commit"
>> +  $ echo "First conflicted version, F1" > F1
>> +  $ echo "First conflicted version, F2" > F2
>> +  $ hg commit -m "first version, a"
>> +  $ hg bookmark a
>> +  $ hg checkout .~1
>> +  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
>> +  (leaving bookmark a)
>> +  $ echo "Second conflicted version, F1" > F1
>> +  $ echo "Second conflicted version, F2" > F2
>> +  $ hg commit -m "second version, b"
>> +  created new head
>> +  $ hg bookmark b
>> +  $ hg log -G -T '({rev}) {desc}\nbookmark: {bookmarks}\nfiles: 
>> {files}\n\n'
>> +  @  (2) second version, b
>> +  |  bookmark: b
>> +  |  files: F1 F2
>> +  |
>> +  | o  (1) first version, a
>> +  |/   bookmark: a
>> +  |    files: F1 F2
>> +  |
>> +  o  (0) initial commit
>> +     bookmark:
>> +     files: F1 F2
>> +
>> +
>> +
>> +  $ hg merge a
>> +  merging F1
>> +  merging F2
>> +  warning: conflicts while merging F1! (edit, then use 'hg resolve 
>> --mark')
>> +  warning: conflicts while merging F2! (edit, then use 'hg resolve 
>> --mark')
>> +  0 files updated, 0 files merged, 0 files removed, 2 files unresolved
>> +  use 'hg resolve' to retry unresolved file merges or 'hg update -C 
>> .' to abandon
>> +  [1]
>> +
>> +5) Get the paths:
>> +  $ hg resolve --tool internal:dumpjson --all
>> +  [
>> +   {
>> +    "conflicts": [{"base": {"contents": "Unconflicted base, F1\n", 
>> "isexec": false, "issymlink": false}, "other": {"contents": "First 
>> conflicted version, F1\n", "isexec": false, "issymlink": false}, 
>> "ours": {"contents": "Second conflicted version, F1\n", "isexec": 
>> false, "issymlink": false, "path": "$TESTTMP/F1.orig"}, "output": 
>> {"contents": 
>> "<<<<<<<workingcopy:13124abb51b9b-test:secondversion,b\nSecondconflictedversion,F1\n=======\nFirstconflictedversion,F1\n>>>>>>> 
>> merge rev:    6dd692b7db4a a - test: first version, a\n", "isexec": 
>> false, "issymlink": false, "path": "$TESTTMP/F1"}, "path": "F1"}, 
>> {"base": {"contents": "Unconflicted base, F2\n", "isexec": false, 
>> "issymlink": false}, "other": {"contents": "First conflicted version, 
>> F2\n", "isexec": false, "issymlink": false}, "ours": {"contents": 
>> "Second conflicted version, F2\n", "isexec": false, "issymlink": 
>> false, "path": "$TESTTMP/F2.orig"}, "output": {"contents": 
>> "<<<<<<<workingcopy:13124abb51b9b-test:secondversion,b\nSecondconflictedversion,F2\n=======\nFirstconflictedversion,F2\n>>>>>>> 
>> merge rev:    6dd692b7db4a a - test: first version, a\n", "isexec": 
>> false, "issymlink": false, "path": "$TESTTMP/F2"}, "path": "F2"}]
>> +   }
>> +  ]
>
> Might be nice to check that passing a path to resolve along with 
> :dumpjson results in it only dumping json for that path.
>
>> +
>> +6) Ensure the paths point to the right contents:
>> +  $ getcontents() { # Usage: getcontents <path> <version>
>> +  >  local script="import sys, json; print 
>> json.load(sys.stdin)[0][\"conflicts\"][$1][\"$2\"][\"contents\"]"
>> +  >  local result=$(hg resolve --tool internal:dumpjson --all | 
>> python -c "$script")
>> +  >  echo "$result"
>> +  > }
>> +  $ echo $(getcontents 0 "base")
>> +  Unconflicted base, F1
>> +  $ echo $(getcontents 0 "other")
>> +  First conflicted version, F1
>> +  $ echo $(getcontents 0 "ours")
>> +  Second conflicted version, F1
>> +  $ echo $(getcontents 1 "base")
>> +  Unconflicted base, F2
>> +  $ echo $(getcontents 1 "other")
>> +  First conflicted version, F2
>> +  $ echo $(getcontents 1 "ours")
>> +  Second conflicted version, F2

I'd like to see tests for more of the "weird" conflicts that we went 
over in the table a couple days ago.

I also think this patch should be split up to make it smaller and easier 
to review: My suggestions:

* a patch introducing summarize
* a patch introducing internaldump (but probably renamed because 
_internaldump really sounds like I shouldn't be calling it)
* a patch introducing the new merge tool and the tests

I'm excited for this behavior!


More information about the Mercurial-devel mailing list