[PATCH V2] merge: add `internal:dumpjson` tool to `resolve`, which outputs conflict state
Ryan McElroy
rm at fb.com
Thu Mar 16 02:20:42 EDT 2017
More nits inline to consider since you're looking into a v3 anyway...
On 3/15/17 10:48 AM, Durham Goode wrote:
>
>
> On 3/7/17 11:40 AM, Phil Cohen wrote:
>> # HG changeset patch
>> # User Phil Cohen <phillco at fb.com>
>> # Date 1488915535 28800
>> # Tue Mar 07 11:38:55 2017 -0800
>> # Node ID bbce62e3790220f19e7b37160a2f8351b7461272
>> # Parent 91e86a6c61c0c6a3b554eefeba9060000311aa29
>> merge: add `internal:dumpjson` tool to `resolve`, which outputs
>> conflict state
>>
>> This supersedes a previous change which added a --prep option.
>
> Generally we don't reference previous iterations of the patch in the
> commit message, since those previous iterations wont' show up in the
> repo after this is pushed.
>
>> Normally, `hg resolve` takes the user on a whistle-stop tour of each
>> conflicted
>> file, pausing to launch the editor to resolve the conflicts. This is an
>> alternative workflow for resolving many conflicts in a random-access
>> fashion. It
>> doesn't change/replace the default behavior.
>>
>> This commit adds `--tool=internal:dumpjson`. It prints, for each
>> conflict, the
>> "base", "other", and "ours" versions (the contents, as well as their
>> exec/link
>> flags), and where the user/tool should write a resolved version
>> (i.e., the
>> working copy) as JSON. The user will then resolve the conflicts at
>> their leisure
>> and run `hg resolve --mark`.
>
> Overall I think it looks good. Some minor comments inline, but I'd be
> ok having this queued (since it's been sitting here a while) and
> addressing the nits in a follow up.
>
>> diff --git a/mercurial/commands.py b/mercurial/commands.py
>> --- a/mercurial/commands.py
>> +++ b/mercurial/commands.py
>> @@ -33,6 +33,7 @@
>> error,
>> exchange,
>> extensions,
>> + formatter,
>> graphmod,
>> hbisect,
>> help,
>> @@ -4284,6 +4285,26 @@
>> fm.end()
>> return 0
>>
>> + if opts.get('tool', '') == "internal:dumpjson":
>> + fm = ui.formatter('resolve', {'template': 'json'})
>> + ms = mergemod.mergestate.read(repo)
I don't love the short name style hg uses, and I don't see a godo reason
to use it unless there's already a strong-established precedent in the file.
We've been calling formatters 'fm' for a long time, but 'ms' as
mergestate seems like a new one (to me at least). I'd love a more
descriptive variable name since we have the option right now
>> + m = scmutil.match(repo[None], pats, opts)
Would prefer this be called 'matcher'
>> + wctx = repo[None]
>> +
>> + paths = []
>> + for f in ms:
Then this could become:
for filename in mergestate:
And then it's obvious from just reading the loop what it is doing
>> + if not m(f):
And this would become:
if not matcher(filename):
Which is also much more clear to me at least.
>> + continue
>> +
>> + val = ms.internaldump(f, wctx)
>> + if val is not None:
>> + paths.append(val)
>> +
>> + fm.startitem()
>> + fm.write('conflicts', '%s\n', paths)
>> + fm.end()
>> + return 0
>> +
>> with repo.wlock():
>> ms = mergemod.mergestate.read(repo)
>>
>> diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
>> --- a/mercurial/filemerge.py
>> +++ b/mercurial/filemerge.py
>> @@ -567,6 +567,40 @@
>> "o": " [%s]" % labels[1],
>> }
>>
>> +def summarize(repo, fcd, fco, fca):
I'm sure we can come up with better parameter names here too.
fca is probably the "file context ancestor", but I have no idea what fcd
and fco are without diving deep into the code.
>> + back = None if fcd.isabsent() else \
>> + scmutil.origpath(repo.ui, repo, repo.wjoin(fcd.path()))
It's unclear to me what 'back' will be used for. Can you add a comment?
(is it the "back"up -- the .orig file?)
>> +
>> + def sum(ctx):
I'd name this flags() instead of sum
>> + return {
>> + 'contents': ctx.data(),
>> + 'isexec': ctx.isexec(),
>> + 'issymlink': ctx.islink(),
>> + }
>> +
>> + ours = None
'ours' is a git thing, not a mercurial thing. Let's use consitent
hg-internal naming if possible:
* local
* other
* base
>> + if back and util.filestat(back).stat:
>> + # Replicate the schema of `base` and `other` for `ours`.
>> Since you can
>> + # start a merge with a dirty working copy, `ours` must
>> reflect that,
>> + # not the underlying commit. That's why we look at .orig
>> version.
>> + ours = {
>> + 'path': back,
>> + 'contents': util.readfile(back),
>> + 'isexec': util.isexec(back),
>> + 'issymlink': util.statislink(util.filestat(back).stat)
>> + }
>> +
>> + output = sum(fcd)
>> + output['path'] = repo.wjoin(fcd.path())
>> +
>> + return {
>> + 'path': fcd.path(),
>> + 'base': sum(fca),
>> + 'other': sum(fco),
>> + 'output': output,
>> + 'ours': ours,
It's even more important not to introduce a new non-hg term in the
user-visible output.
>> + }
>> +
>> def _filemerge(premerge, repo, mynode, orig, fcd, fco, fca,
>> labels=None):
>> """perform a 3-way merge in the working directory
>>
>> diff --git a/mercurial/merge.py b/mercurial/merge.py
>> --- a/mercurial/merge.py
>> +++ b/mercurial/merge.py
>> @@ -456,6 +456,24 @@
>> def extras(self, filename):
>> return self._stateextras.setdefault(filename, {})
>>
>> + def _internaldump(self, dfile, wctx):
>> + if self[dfile] in 'rd':
>> + return None
>> + stateentry = self._state[dfile]
>> + state, hash, lfile, afile, anode, ofile, onode, flags =
>> stateentry
>> + octx = self._repo[self._other]
>> + extras = self.extras(dfile)
>> + anccommitnode = extras.get('ancestorlinknode')
>> + if anccommitnode:
>> + actx = self._repo[anccommitnode]
>> + else:
>> + actx = None
>> + fcd = self._filectxorabsent(hash, wctx, dfile)
>> + fco = self._filectxorabsent(onode, octx, ofile)
>> + fca = self._repo.filectx(afile, fileid=anode, changeid=actx)
>> +
>> + return filemerge.summarize(self._repo, fcd, fco, fca)
>> +
It seems weird to me that _internaldump() (with an underscore) is called
from outside the module and summarize() (without an underscore) is
called from inside the module.
>> def _resolve(self, preresolve, dfile, wctx):
>> """rerun merge process for file path `dfile`"""
>> if self[dfile] in 'rd':
>> @@ -543,6 +561,9 @@
>> Returns whether the merge is complete, and the exit code."""
>> return self._resolve(True, dfile, wctx)
>>
>> + def internaldump(self, dfile, wctx):
>> + return self._internaldump(dfile, wctx)
>
> Why have the _ function and the normal function?
>
>> +
>> def resolve(self, dfile, wctx):
>> """run merge process (assuming premerge was run) for dfile
>>
>> diff --git a/tests/test-resolve-prep.t b/tests/test-resolve-prep.t
>> new file mode 100644
>> --- /dev/null
>> +++ b/tests/test-resolve-prep.t
>
> Maybe needs a new name since we're no longer calling this --prep. Same
> for references to prep inside the test
>
>> @@ -0,0 +1,75 @@
>> +1) Make the repo
>> + $ hg init
>> +
>> +2) Can't run prep outside a conflict
>> + $ hg resolve --tool internal:dumpjson
>> + abort: no files or directories specified
>> + (use --all to re-merge all unresolved files)
>> + [255]
>> +
>> +3) Make a simple conflict
>> + $ echo "Unconflicted base, F1" > F1
>> + $ echo "Unconflicted base, F2" > F2
>> + $ hg commit -Aqm "initial commit"
>> + $ echo "First conflicted version, F1" > F1
>> + $ echo "First conflicted version, F2" > F2
>> + $ hg commit -m "first version, a"
>> + $ hg bookmark a
>> + $ hg checkout .~1
>> + 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
>> + (leaving bookmark a)
>> + $ echo "Second conflicted version, F1" > F1
>> + $ echo "Second conflicted version, F2" > F2
>> + $ hg commit -m "second version, b"
>> + created new head
>> + $ hg bookmark b
>> + $ hg log -G -T '({rev}) {desc}\nbookmark: {bookmarks}\nfiles:
>> {files}\n\n'
>> + @ (2) second version, b
>> + | bookmark: b
>> + | files: F1 F2
>> + |
>> + | o (1) first version, a
>> + |/ bookmark: a
>> + | files: F1 F2
>> + |
>> + o (0) initial commit
>> + bookmark:
>> + files: F1 F2
>> +
>> +
>> +
>> + $ hg merge a
>> + merging F1
>> + merging F2
>> + warning: conflicts while merging F1! (edit, then use 'hg resolve
>> --mark')
>> + warning: conflicts while merging F2! (edit, then use 'hg resolve
>> --mark')
>> + 0 files updated, 0 files merged, 0 files removed, 2 files unresolved
>> + use 'hg resolve' to retry unresolved file merges or 'hg update -C
>> .' to abandon
>> + [1]
>> +
>> +5) Get the paths:
>> + $ hg resolve --tool internal:dumpjson --all
>> + [
>> + {
>> + "conflicts": [{"base": {"contents": "Unconflicted base, F1\n",
>> "isexec": false, "issymlink": false}, "other": {"contents": "First
>> conflicted version, F1\n", "isexec": false, "issymlink": false},
>> "ours": {"contents": "Second conflicted version, F1\n", "isexec":
>> false, "issymlink": false, "path": "$TESTTMP/F1.orig"}, "output":
>> {"contents":
>> "<<<<<<<workingcopy:13124abb51b9b-test:secondversion,b\nSecondconflictedversion,F1\n=======\nFirstconflictedversion,F1\n>>>>>>>
>> merge rev: 6dd692b7db4a a - test: first version, a\n", "isexec":
>> false, "issymlink": false, "path": "$TESTTMP/F1"}, "path": "F1"},
>> {"base": {"contents": "Unconflicted base, F2\n", "isexec": false,
>> "issymlink": false}, "other": {"contents": "First conflicted version,
>> F2\n", "isexec": false, "issymlink": false}, "ours": {"contents":
>> "Second conflicted version, F2\n", "isexec": false, "issymlink":
>> false, "path": "$TESTTMP/F2.orig"}, "output": {"contents":
>> "<<<<<<<workingcopy:13124abb51b9b-test:secondversion,b\nSecondconflictedversion,F2\n=======\nFirstconflictedversion,F2\n>>>>>>>
>> merge rev: 6dd692b7db4a a - test: first version, a\n", "isexec":
>> false, "issymlink": false, "path": "$TESTTMP/F2"}, "path": "F2"}]
>> + }
>> + ]
>
> Might be nice to check that passing a path to resolve along with
> :dumpjson results in it only dumping json for that path.
>
>> +
>> +6) Ensure the paths point to the right contents:
>> + $ getcontents() { # Usage: getcontents <path> <version>
>> + > local script="import sys, json; print
>> json.load(sys.stdin)[0][\"conflicts\"][$1][\"$2\"][\"contents\"]"
>> + > local result=$(hg resolve --tool internal:dumpjson --all |
>> python -c "$script")
>> + > echo "$result"
>> + > }
>> + $ echo $(getcontents 0 "base")
>> + Unconflicted base, F1
>> + $ echo $(getcontents 0 "other")
>> + First conflicted version, F1
>> + $ echo $(getcontents 0 "ours")
>> + Second conflicted version, F1
>> + $ echo $(getcontents 1 "base")
>> + Unconflicted base, F2
>> + $ echo $(getcontents 1 "other")
>> + First conflicted version, F2
>> + $ echo $(getcontents 1 "ours")
>> + Second conflicted version, F2
I'd like to see tests for more of the "weird" conflicts that we went
over in the table a couple days ago.
I also think this patch should be split up to make it smaller and easier
to review: My suggestions:
* a patch introducing summarize
* a patch introducing internaldump (but probably renamed because
_internaldump really sounds like I shouldn't be calling it)
* a patch introducing the new merge tool and the tests
I'm excited for this behavior!
More information about the Mercurial-devel
mailing list