[PATCH] merge: add resolve --prep, which outputs conflicted comparison files
Jun Wu
quark at fb.com
Mon Feb 27 20:59:28 EST 2017
Excerpts from Durham Goode's message of 2017-02-27 17:39:10 -0800:
>
> On 2/27/17 4:58 PM, Jun Wu wrote:
> > Excerpts from Durham Goode's message of 2017-02-27 16:37:56 -0800:
> >> Do you mean outputting the actual file content in the json? I was
> >> thinking of having just the paths to the file contents be in the json.
> >> That way we don't have to json escape a bunch of binary data if we
> >> output conflicting binary files (which is an eventual use case).
> >
> > Yes. Having the content available is better in my opinion:
> >
> > 1. No filesystem race condition
> > 2. Stateless. No need to clean up files
> >
> > For binary files, we can skip them in some way. That's also one reason why
> > the json schema is hard to define, and why I think Python merge functions is
> > a cleaner solution.
>
> K, I'd be fine streaming the raw files as part of the json until we find
> a reason not to (like huge conflicts or binary resolutions or something).
>
> >> For raw vs slice, I think having the four raw files (yours, mine, base,
> >> output file which starts with conflict markers in it) makes the most
> >> sense from the point of view of a merge tool consuming this data. I
> >> don't think we should require the consumer to parse the conflict marker
> >> file. I'm not even sure they could 100% correctly parse the file if the
> >> file contained text that looked like conflict markers but was actually
> >> part of the file.
> >
> > So you are suggesting raw files without markers? That makes sense from a
> > cleaner API perspective. But it's less practical - it requires extra efforts
> > from the upper-layer application to figure out what could be cleanly merged
> > (other and local have the same change).
>
> I think the consumer application is usually going to be a merge tool
> anyway, so I'm not too concerned with them figuring out the conflict.
> But even so, the 'output' file I suggest would initially contain the
> conflict markers so the tool could present that to the user if necessary.
>
> >
> > I never suggested put conflict markers into file contents. The "slices"
> > format is defined precisely by JSON - something similar to the output of
> > google's diff-match-patch library [1], but with 3 sides.
> >
> > [1]: https://github.com/zchee/google-diff-match-patch/wiki/API#initialization
>
> I think starting with the normal other/local/base/output raw text is
> probably sufficient for most resolve use cases, since that's enough for
> the current merge tools. If we come up with a use case, we can add
> internal:dump-json-slices or something fancier.
>
> So, do you think it would be sufficient to make the following changes to
> the current patch?
>
> 1. Remove --prep
> 2. Add internal:dump-json as a tool that uses this code path. 'hg
> resolve --tool internal:dump-json' doesn't actually resolve anything,
> but outputs the json to stdout.
Yes.
> 3. Return raw file contents in the json instead of file paths. So a
> format like:
>
> {
> "conflicts": {
> "path/to/file1": {
> "base_content": "xxxx",
> "original_content": "yyyy",
> "other_content": "zzzz",
> "conflictmarker_content": "aaaa",
Maybe without "conflictmarker_content" - the merge tool could have its own
resolution state (like, entirely in the IDE memory) where the working copy
state is meaningless to it.
> }
> }
I think the following metadata needs to be exposed for base, local, and
other, respectively:
- executable bit
- symlink bit
I'd arrange the JSON a bit differently so keys are fixed, and items are
stored in an array:
[
{'base': {'path': ...,
'content': ...,
'isexec': ...,
'issymlink': ... },
'other': {'path': ...,
'content': ...
'isexec': ...,
'issymlink': ...},
'local': { .... },
'resolved': false },
{'base': ....,
'other': ....,
'local': ....,
'resolved': .... },
....
]
In this way, it's easier to define the schema using json-schema.org
specification [1] if we want it eventually, and the values of 'base',
'other', 'local' could be described using a single schema definition.
[1]: http://json-schema.org/
> That gives us some room to add file paths later with different key names
> if necessary. It also renames "workingcopy" to be
> "conflictmarker_content" since it's now raw content. The caller can know
> where to write it's results based on the "path/to/file1".
More information about the Mercurial-devel
mailing list