[PATCH 2 of 6] merge: introduce a ``mergemarkersscope`` config option

Siddharth Agarwal sid at less-broken.com
Tue Jun 10 17:14:27 CDT 2014


On 06/10/2014 01:50 PM, pierre-yves.david at ens-lyon.org wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at fb.com>
> # Date 1402311202 25200
> #      Mon Jun 09 03:53:22 2014 -0700
> # Node ID a0d2858a720b665e594372fc3953fcdb20d1ce52
> # Parent  2fb39405980681c1bdbc97d898285bc2012ef586
> merge: introduce a ``mergemarkersscope`` config option

So I have misgivings about exposing this via a config option. The 
problem really is the number of sets of conflict markers vs the size of 
the sections. Can we use this heuristic instead:

For each chunk:
- First compute the 'minimal' scope.
- If this has exactly one set of conflict markers, use this.
- Otherwise use the 'plain' scope.

This should produce the best results in all your examples.

There's probably some sort of quality metric we can use to make this 
even better. More sets of markers = bad, but a shorter conflict section 
= good. But that can be done in a follow-up patch.


>
> This option control the amount of information included in markers. It currently
> have two possible values ``minimalist`` and ``plain``. "minimalist" matches the


'minimal', not 'minimalist'.

'plain' doesn't communicate very much information, but I can't come up 
with a better name right now.

> current behavior where conflict markers try to be reduced to strictly different
> lines. ``plain`` exposes whole conflicting chunks (see detailed explanation
> below). The default behavior remains unchanged for now. A third value, "base"
> is expected to be added in a later changeset. It will include the content from
> "base" file.
>
> Detailled explanation
> =====================
>
>
> The ``simplemerge`` code use in ``internal:merge`` has a feature called
> "minimization". It reprocess conflicting chunks to find common changes
> inside them and excludes such common sections from the marker.
>
> This approach seems a significant win at first glance but produces very
> confusing results in some other cases.
>
> Simple example
> --------------
>
> A simple example is enough to show the benefit of this feature.  In this merge,
> both sides change all numbers from letters to digits, but one side is also
> changing some values.
>
>    $ cat << EOF > base
>    > Small Mathematical Series.
>    > One
>    > Two
>    > Three
>    > Four
>    > Five
>    > Hop we are done.
>    > EOF
>
>    $ cat << EOF > local
>    > Small Mathematical Series.
>    > 1
>    > 2
>    > 3
>    > 4
>    > 5
>    > Hop we are done.
>    > EOF
>
>    $ cat << EOF > other
>    > Small Mathematical Series.
>    > 1
>    > 2
>    > 3
>    > 6
>    > 8
>    > Hop we are done.
>    > EOF
>
> In the minimalists case, the markers focus on the disagreement between the two
> sides.
>
>    $ $TESTDIR/../contrib/simplemerge --print local base other
>    Small Mathematical Series.
>    1
>    2
>    3
>    <<<<<<< local
>    4
>    5
>    =======
>    6
>    8
>    >>>>>>> other
>    Hop we are done.
>    warning: conflicts during merge.
>    [1]
>
> In the non minimalist case, the whole chunk is included in the conflict marker.
> Making it harder spot actual differences.
>
>    $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
>    Small Mathematical Series.
>    <<<<<<< local
>    1
>    2
>    3
>    4
>    5
>    =======
>    1
>    2
>    3
>    6
>    8
>    >>>>>>> other
>    Hop we are done.
>    warning: conflicts during merge.
>    [1]

This should be in a test. (I see this is already in a test -- maybe just 
refer to that.)

>
> Practical Advantages of minimalisation: merge of grafted change
> ---------------------------------------------------------------
>
> This feature can be very useful when a change have been grafted in another
> branch and then some change have been made to the grafted code.
>
>    $ cat << EOF > base
>    > # empty file
>    > EOF
>
>    $ cat << EOF > local
>    > def somefunction(one, two):
>    >     some = one
>    >     stuff = two
>    >     are(happening)
>    >     here()
>    > EOF
>
>    $ cat << EOF > other
>    > def somefunction(one, two):
>    >     some = one
>    >     change = two
>    >     are(happening)
>    >     here()
>    > EOF
>
> The minimalist case recognises the grafted content as similar and highlight the
> actual change.
>
>
>    $ $TESTDIR/../contrib/simplemerge --print local base other
>    def somefunction(one, two):
>        some = one
>    <<<<<<< local
>        stuff = two
>    =======
>        change = two
>    >>>>>>> other
>        are(happening)
>        here()
>    warning: conflicts during merge.
>    [1]
>
> Again, the non-minimalist case produces a larger conflict. Making it harder to
> spot the actual conflict.
>
>    $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
>    <<<<<<< local
>    def somefunction(one, two):
>        some = one
>        stuff = two
>        are(happening)
>        here()
>    =======
>    def somefunction(one, two):
>        some = one
>        change = two
>        are(happening)
>        here()
>    >>>>>>> other
>    warning: conflicts during merge.
>    [1]
>
>
> Practical disadvantage: multiple functions on each side
> ---------------------------------------------------------------
>
> So, if this "minimalist" help so much, why introduce a setting to disable it?
>
> The issue is that this minimisation will grab any common lines for breaking
> chunks. This may result in partial context when solving a merge. The most
> simple example is a merge where both side added some (different) functions
> separated by blank lines. The "minimalist" approach will recognise the blank
> line as "common" and over slice the chunks, turning a simple conflict case into
> multiple pairs of conflicting functions.
>
>    $ cat << EOF > base
>    > # empty file
>    > EOF
>
>    $ cat << EOF > local
>    > def function1():
>    >     bla()
>    >     bla()
>    >     bla()
>    >
>    > def function2():
>    >     ble()
>    >     ble()
>    >     ble()
>    > EOF
>
>    $ cat << EOF > other
>    > def function3():
>    >     bli()
>    >     bli()
>    >     bli()
>    >
>    > def function4():
>    >     blo()
>    >     blo()
>    >     blo()
>    > EOF
>
> The minimal case presents each function as a separated context.
>
>    $ $TESTDIR/../contrib/simplemerge --print local base other
>    <<<<<<< local
>    def function1():
>        bla()
>        bla()
>        bla()
>    =======
>    def function3():
>        bli()
>        bli()
>        bli()
>    >>>>>>> other
>
>    <<<<<<< local
>    def function2():
>        ble()
>        ble()
>        ble()
>    =======
>    def function4():
>        blo()
>        blo()
>        blo()
>    >>>>>>> other
>    warning: conflicts during merge.
>    [1]
>
> The non-minimalist approach produces a simpler version with more context in
> each block. Solving such conflicts is usually as simple as dropping the 3 lines
> dedicated to markers.
>
>    $ $TESTDIR/../contrib/simplemerge --prin --no-minimal local base other
>    <<<<<<< local
>    def function1():
>        bla()
>        bla()
>        bla()
>
>    def function2():
>        ble()
>        ble()
>        ble()
>    =======
>    def function3():
>        bli()
>        bli()
>        bli()
>
>    def function4():
>        blo()
>        blo()
>        blo()
>    >>>>>>> other
>    warning: conflicts during merge.
>    [1]

This should be in a test too.


>
> Practical disaster: programing language have a lot of common line
> =================================================================
>
> If only blank lines between function where the only frequent content of a code
> file. But programming language tend to repeat them self much more often. In that
> case, the minimalist approach turns a simple conflict into a massive mess.
>
> Consider this example where two unrelated functions are added on each side.
> Those function shares common programming constructs by chance.
>
>    $ cat << EOF > base
>    > # empty file
>    > EOF
>
>    $ cat << EOF > local
>    > def longfunction():
>    >     if bla:
>    >        foo
>    >     else:
>    >        bar
>    >     try:
>    >        ret = some stuff
>    >     except Exception:
>    >        ret = None
>    >     if ret is not None:
>    >         return ret
>    >     return 0
>    >
>    > def shortfunction(foo):
>    >     goo()
>    >     ret = foo + 5
>    >     return ret
>    > EOF
>
>    $ cat << EOF > other
>    > def otherlongfunction():
>    >     for x in xxx:
>    >        if coin:
>    >            break
>    >        tutu
>    >     else:
>    >        bar()
>    >     baz()
>    >     ret = week()
>    >     try:
>    >        groumpf = tutu
>    >        fool()
>    >     except Exception:
>    >        zoo()
>    >     pool()
>    >     if cond:
>    >         return ret
>    >
>    >     # some big block
>    >     ret ** 6
>    >     koin()
>    >     return ret
>    > EOF
>
> The minimalist approach will hash the whole conflict into small chunks that
> does not match any meaningful semantic and are impossible to solve.
>
>    $ $TESTDIR/../contrib/simplemerge --print local base other
>    <<<<<<< local
>    def longfunction():
>        if bla:
>           foo
>    =======
>    def otherlongfunction():
>        for x in xxx:
>           if coin:
>               break
>           tutu
>    >>>>>>> other
>        else:
>    <<<<<<< local
>           bar
>    =======
>           bar()
>        baz()
>        ret = week()
>    >>>>>>> other
>        try:
>    <<<<<<< local
>           ret = some stuff
>    =======
>           groumpf = tutu
>           fool()
>    >>>>>>> other
>        except Exception:
>    <<<<<<< local
>           ret = None
>        if ret is not None:
>    =======
>           zoo()
>        pool()
>        if cond:
>    >>>>>>> other
>            return ret
>    <<<<<<< local
>        return 0
>    =======
>    >>>>>>> other
>
>    <<<<<<< local
>    def shortfunction(foo):
>        goo()
>        ret = foo + 5
>    =======
>        # some big block
>        ret ** 6
>        koin()
>    >>>>>>> other
>        return ret
>    warning: conflicts during merge.
>    [1]
>
> The non minimalist approach will properly produce a single set of conflict
> markers. Highlighting that the two chunk are unrelated. Such conflict from
> unrelated content added at the same place is usually solved by dropping the
> marker an keeping both content. Something impossible with minimised markers.
>
>
>    $ $TESTDIR/../contrib/simplemerge --prin --no-minimal local base other
>    <<<<<<< local
>    def longfunction():
>        if bla:
>           foo
>        else:
>           bar
>        try:
>           ret = some stuff
>        except Exception:
>           ret = None
>        if ret is not None:
>            return ret
>        return 0
>
>    def shortfunction(foo):
>        goo()
>        ret = foo + 5
>        return ret
>    =======
>    def otherlongfunction():
>        for x in xxx:
>           if coin:
>               break
>           tutu
>        else:
>           bar()
>        baz()
>        ret = week()
>        try:
>           groumpf = tutu
>           fool()
>        except Exception:
>           zoo()
>        pool()
>        if cond:
>            return ret
>
>        # some big block
>        ret ** 6
>        koin()
>        return ret
>    >>>>>>> other
>    warning: conflicts during merge.
>    [1]
>
> diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
> --- a/mercurial/filemerge.py
> +++ b/mercurial/filemerge.py
> @@ -188,11 +188,13 @@ def _premerge(repo, toolconf, files, lab
>               raise error.ConfigError(_("%s.premerge not valid "
>                                         "('%s' is neither boolean nor %s)") %
>                                       (tool, premerge, _valid))
>   
>       if premerge:
> -        r = simplemerge.simplemerge(ui, a, b, c, quiet=True, label=labels)
> +        scope = ui.config('ui', 'mergemarkersscope', 'minimal')
> +        r = simplemerge.simplemerge(ui, a, b, c, quiet=True, label=labels,
> +                                    no_minimal=(scope == 'plain'))
>           if not r:
>               ui.debug(" premerge successful\n")
>               return 0
>           if premerge != 'keep':
>               util.copyfile(back, a) # restore from backup and try again
> @@ -215,11 +217,13 @@ def _imerge(repo, mynode, orig, fcd, fco
>       if r:
>           a, b, c, back = files
>   
>           ui = repo.ui
>   
> -        r = simplemerge.simplemerge(ui, a, b, c, label=labels)
> +        scope = ui.config('ui', 'mergemarkersscope', 'minimal')
> +        r = simplemerge.simplemerge(ui, a, b, c, label=labels,
> +                                    no_minimal=(scope == 'plain'))
>           return True, r
>       return False, 0
>   
>   @internaltool('dump', True)
>   def _idump(repo, mynode, orig, fcd, fco, fca, toolconf, files, labels=None):
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1226,10 +1226,15 @@ User interface controls.
>       marker during merge conflicts. See :hg:`help templates` for the template
>       format.
>       Defaults to showing the hash, tags, branches, bookmarks, author, and
>       the first line of the commit description.
>   
> +``mergemarkersscope``
> +    Set the amount of information included in the markers. The default
> +    ``minimal`` reduce the size of conflicting chunk as much as possible.
> +    ``plain`` can be used to include the whole content of the conflicting chunk.
> +
>   ``portablefilenames``
>       Check for portable filenames. Can be ``warn``, ``ignore`` or ``abort``.
>       Default is ``warn``.
>       If set to ``warn`` (or ``true``), a warning message is printed on POSIX
>       platforms, if a file with a non-portable filename is added (e.g. a file
> diff --git a/tests/test-conflict.t b/tests/test-conflict.t
> --- a/tests/test-conflict.t
> +++ b/tests/test-conflict.t
> @@ -114,5 +114,62 @@ Verify basic conflict markers
>     =======
>     4
>     5
>     >>>>>>> other
>     Hop we are done.
> +
> +Test minimalist config settings
> +
> +("plain" setting)
> +
> +  $ hg up -q --clean .
> +  $ printf "\n[ui]\nmergemarkersscope=plain\n" >> .hg/hgrc
> +
> +  $ hg merge 1
> +  merging a
> +  warning: conflicts during merge.
> +  merging a incomplete! (edit conflicts, then use 'hg resolve --mark')
> +  0 files updated, 0 files merged, 0 files removed, 1 files unresolved
> +  use 'hg resolve' to retry unresolved file merges or 'hg update -C .' to abandon
> +  [1]
> +  $ cat a
> +  Small Mathematical Series.
> +  <<<<<<< local
> +  1
> +  2
> +  3
> +  6
> +  8
> +  =======
> +  1
> +  2
> +  3
> +  4
> +  5
> +  >>>>>>> other
> +  Hop we are done.
> +
> +("minimal" setting)
> +
> +  $ hg up -q --clean .
> +  $ printf "\n[ui]\nmergemarkersscope=minimal\n" >> .hg/hgrc
> +
> +  $ hg merge 1
> +  merging a
> +  warning: conflicts during merge.
> +  merging a incomplete! (edit conflicts, then use 'hg resolve --mark')
> +  0 files updated, 0 files merged, 0 files removed, 1 files unresolved
> +  use 'hg resolve' to retry unresolved file merges or 'hg update -C .' to abandon
> +  [1]
> +  $ cat a
> +  Small Mathematical Series.
> +  1
> +  2
> +  3
> +  <<<<<<< local
> +  6
> +  8
> +  =======
> +  4
> +  5
> +  >>>>>>> other
> +  Hop we are done.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel



More information about the Mercurial-devel mailing list