[PATCH 1 of 1 default] hgweb: column / side-by-side diff functionality
Patrick Mézard
patrick at mezard.eu
Sun Jun 10 11:53:03 CDT 2012
Nice work! Live version here for the curious:
http://mezard.eu/hg/hg-does-it-look-good-for-you/coldiff/e1aa1ed30030/mercurial/revset.py
Le 03/06/12 23:43, wujek srujek a écrit :
> # HG changeset patch
> # User wujek srujek <wujek.srujek at googlemail.com>
> # Date 1338755144 -7200
> # Node ID 2975d6fb8ad6cc24c020ed8507d9bc1c39652703
> # Parent 0a0cf3f26938ff7a084f2dcc9e59152ac6060e1e
> hgweb: column / side-by-side diff functionality
>
> Adds new web command to the core, ``filecoldiff`` (and an alias, ``coldiff``),
I would prefer to use "compare" or "comparison" rather than "diff". Other opinions about this?
filecoldiff -> filecomp
fullcoldiff -> fullcomparison
> which enables colorful side-by-side diff, which for some might be much easier to
> work with than the standard line diff output. The idea how to implement comes
> from the SonicHq extension.
>
> The web interface gets a new link to call the column diff functionality. It
> comes in two flavors: short output (the default), where only subsets (blocks) of
> the files' contents that were actually changed are returned, with 3 lines of
> context; and long output, where the whole files are output. The latter will
> result in much more output most of the time, especially where very few lines
> were changed in big files. The output modes are configurable with the new
> ``web.fullcoldiff`` setting in hgrc (taking values according to the
> ``mercurial.util.parsebool`` function). This hgrc setting can be overridden by
> adding ``fullcoldiff=<bool>`` to the request query string. The diff creates
> addressable lines, so as to enable sending links to appropriate lines, just as
> the standard diff does.
>
> Known limitations:
> * the column diff is done against the first parent, just as the standard diff
> * this change allows examining diffs for single files only (as I am not sure if
> examining the whole changeset in this way would be helpful)
> * syntax highlighting of the output changes is not performed (enabling the
> highlight extension has no influence on it)
> * only the default (paper) style is updated, the rest will come next
> * the number of context lines in short output is not configurable (I'm not sure
> if this is necessary)
>
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1377,6 +1377,11 @@
> ``errorlog``
> Where to output the error log. Default is stderr.
>
> +``fullcoldiff``
> + Whether column / side-by-side diff should output the files' full contents,
> + or just the changed blocks with context around them. Setting it to False
> + results in more concise output. Default is False.
> +
I am not really convinced by the partial mode. Does it bring any information compared to a regular diff, besides a different formatting? For me, this file comparison view is really useful when diff does not give enough context to understand the change, in which case I prefer to have the full file.
Compare:
http://mezard.eu/hg/hg-does-it-look-good-for-you/diff/e1aa1ed30030/mercurial/revset.py
http://mezard.eu/hg/hg-does-it-look-good-for-you/coldiff/e1aa1ed30030/mercurial/revset.py
http://mezard.eu/hg/hg-does-it-look-good-for-you/coldiff/e1aa1ed30030/mercurial/revset.py?fullcoldiff=1
I agree the last one would be better with a mean to navigate between groups.
> ``hidden``
> Whether to hide the repository in the hgwebdir index.
> Default is False.
> diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py
> --- a/mercurial/hgweb/webcommands.py
> +++ b/mercurial/hgweb/webcommands.py
> @@ -8,6 +8,7 @@
> import os, mimetypes, re, cgi, copy
> import webutil
> from mercurial import error, encoding, archival, templater, templatefilters
> +from mercurial import util
> from mercurial.node import short, hex
> from mercurial.util import binary
> from common import paritygen, staticfile, get_contact, ErrorResponse
> @@ -23,6 +24,7 @@
> 'log', 'rawfile', 'file', 'changelog', 'shortlog', 'changeset', 'rev',
> 'manifest', 'tags', 'bookmarks', 'branches', 'summary', 'filediff', 'diff',
> 'annotate', 'filelog', 'archive', 'static', 'graph', 'help',
> + 'filecoldiff', 'coldiff'
> ]
>
> def log(web, req, tmpl):
> @@ -586,6 +588,28 @@
>
> diff = filediff
>
> +def filecoldiff(web, req, tmpl):
> + fctx = webutil.filectx(web.repo, req)
> + rename = fctx and webutil.renamelink(fctx) or []
> + full = web.configbool('web', 'fullcoldiff', False)
> + if 'fullcoldiff' in req.form:
> + full = util.parsebool(req.form['fullcoldiff'][0])
> + coldiff = webutil.coldiff(tmpl, fctx, full)
> + return tmpl('filecoldiff',
> + file=fctx.path(),
> + node=hex(fctx.node()),
> + rev=fctx.rev(),
> + date=fctx.date(),
> + desc=fctx.description(),
> + author=fctx.user(),
> + rename=rename,
> + branch=webutil.nodebranchnodefault(fctx),
> + parent=webutil.parents(fctx),
> + child=webutil.children(fctx),
> + coldiff=coldiff)
> +
> +coldiff = filecoldiff
Do we need this alias?
> +
> def annotate(web, req, tmpl):
> fctx = webutil.filectx(web.repo, req)
> f = fctx.path()
> diff --git a/mercurial/hgweb/webutil.py b/mercurial/hgweb/webutil.py
> --- a/mercurial/hgweb/webutil.py
> +++ b/mercurial/hgweb/webutil.py
> @@ -6,10 +6,12 @@
> # This software may be used and distributed according to the terms of the
> # GNU General Public License version 2 or any later version.
>
> -import os, copy
> +import os, mimetypes, copy
> from mercurial import match, patch, scmutil, error, ui, util
> from mercurial.i18n import _
> from mercurial.node import hex, nullid
> +from mercurial.util import binary
Please do not import symbols from modules. It is unfortunate the two previous lines give a bad example but these are the only ones tolerated in the code base.
> +from difflib import SequenceMatcher
Ditto, just import difflib.
Note we may replace this one with our internal differ, but this could be done after this patch is accepted.
> def up(p):
> if p[0] != "/":
> @@ -220,6 +222,106 @@
> yield tmpl('diffblock', parity=parity.next(), blockno=blockno,
> lines=prettyprintlines(''.join(block), blockno))
>
> +def coldiff(tmpl, fctx, full):
> + '''Generator function that provides column / side-by-side diff data.'''
> +
> + def filelines(f):
> + if binary(f.data()):
if util.binary(f.data()):
> + mt = mimetypes.guess_type(f.path())[0]
> + if not mt:
> + mt = 'application/octet-stream'
> + return [_('(binary file %s, hash: %s)') % (mt, hex(f.filenode()))]
> + return f.data().splitlines()
> +
> + def diffline(type, leftlineno, leftline, rightlineno, rightline):
> + lineid = leftlineno and "l%s" % leftlineno or ''
Maybe wrap:
"l%s" % leftlineno
in parenthesis to avoid playing with operators priority.
> + lineid += rightlineno and "r%s" % rightlineno or ''
> + return tmpl('coldiffline',
> + type=type,
> + lineid=lineid,
> + leftlinenumber="% 6s" % (leftlineno or ''),
> + leftline=leftline or '',
> + rightlinenumber="% 6s" % (rightlineno or ''),
> + rightline=rightline or '')
> +
> + def getblock(opcodes):
> + for type, llo, lhi, rlo, rhi in opcodes:
> + if type in ('equal'):
This is the same than:
if type in 'equal':
which asks if type is a substring of 'equal'. You wanted:
if type in ('equal',):
or just:
if type == 'equal':
> + for i in xrange(lhi - llo):
> + yield diffline(type=type,
> + leftlineno=llo + i + 1,
> + leftline=leftlines[llo + i],
> + rightlineno=rlo + i + 1,
> + rightline=rightlines[rlo + i])
> + elif type == 'delete':
> + for i in xrange(llo, lhi):
> + yield diffline(type=type,
> + leftlineno=i + 1,
> + leftline=leftlines[i],
> + rightlineno=None,
> + rightline=None)
> + elif type == 'insert':
> + for i in xrange(rlo, rhi):
> + yield diffline(type=type,
> + leftlineno=None,
> + leftline=None,
> + rightlineno=i + 1,
> + rightline=rightlines[i])
> + else: # replace
Note this case handles all the others, you can just keep this one.
> + len1 = lhi - llo
> + len2 = rhi - rlo
> + if len1 < len2:
> + count = len1
> + else:
> + count = len2
count = min(len1, len2)
> + for i in xrange(count):
> + yield diffline(type=type,
> + leftlineno=llo + i + 1,
> + leftline=leftlines[llo + i],
> + rightlineno=rlo + i + 1,
> + rightline=rightlines[rlo + i])
> + if len1 > len2:
> + for i in xrange(llo + count, lhi):
> + yield diffline(type=type,
> + leftlineno=i + 1,
> + leftline=leftlines[i],
> + rightlineno=None,
> + rightline=None)
> + elif len2 > len1:
> + for i in xrange(rlo + count, rhi):
> + yield diffline(type=type,
> + leftlineno=None,
> + leftline=None,
> + rightlineno=i + 1,
> + rightline=rightlines[i])
> +
> + parents = fctx.parents()
> + if not parents:
> + leftrev = -1
> + leftnode = nullid
> + leftlines = ()
> + else:
> + pfctx = parents[0]
> + leftrev = pfctx.filerev()
> + leftnode = pfctx.filenode()
> + leftlines = filelines(pfctx)
> +
> + rightlines = filelines(fctx)
> +
> + s = SequenceMatcher(None, leftlines, rightlines, False)
The last "autojunk" argument was added by python 2.7.1. There are ways to test function arguments count, but it would mean a different behaviour between python < 2.7.1 and >= 2.7.1. Also, this could be replaced later by bdiff.blocks(). Removing the last argument completely is probably the way to go.
s = SequenceMatcher(None, leftlines, rightlines)
> + if full:
> + blocks = [tmpl('coldiffblock', lines=getblock(s.get_opcodes()))]
> + else:
> + blocks = (tmpl('coldiffblock', lines=getblock(oc))
> + for oc in s.get_grouped_opcodes())
Thank you for the patch!
--
Patrick Mézard
More information about the Mercurial-devel
mailing list