D3212: patch: implement a new worddiff algorithm
yuja (Yuya Nishihara)
phabricator at mercurial-scm.org
Tue Apr 10 11:00:59 EDT 2018
yuja requested changes to this revision.
yuja added a comment.
This revision now requires changes to proceed.
I have no opinion about the "dim" thingy, but the series generally looks
good to me.
Thanks for tackling the painfully slow `SequenceMatcher.ratio()` issue.
INLINE COMMENTS
> patch.py:53
> tabsplitter = re.compile(br'(\t+|[^\t]+)')
> -_nonwordre = re.compile(br'([^a-zA-Z0-9_\x80-\xff])')
> +wordsplitter = re.compile(br'(\t+| +|[a-zA-Z0-9_\x80-\xff]+|'
> + '[^ \ta-zA-Z0-9_\x80-\xff])')
Nit: `_wordsplitter`, as it is a private constant.
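For reference, a minimal sketch of how the new pattern splits a line, assuming Python 3 bytes literals and the `_wordsplitter` name suggested above. The `tokenize` helper is hypothetical and not part of the patch:

```python
import re

# Pattern from the patch, with the br'' prefix on both fragments.
_wordsplitter = re.compile(br'(\t+| +|[a-zA-Z0-9_\x80-\xff]+|'
                           br'[^ \ta-zA-Z0-9_\x80-\xff])')


def tokenize(line):
    """Split a bytes line into runs of tabs, runs of spaces, words,
    and single other bytes (hypothetical helper for illustration)."""
    return _wordsplitter.findall(line)


tokens = tokenize(b'foo  bar(x)\t\tbaz')
# Every byte matches one of the four alternatives, so joining the
# tokens reproduces the input.
assert b''.join(tokens) == b'foo  bar(x)\t\tbaz'
```

Unlike the old `_nonwordre`, which only matched single non-word bytes, this pattern keeps whitespace runs together as one token.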
> patch.py:54
> +wordsplitter = re.compile(br'(\t+| +|[a-zA-Z0-9_\x80-\xff]+|'
> + '[^ \ta-zA-Z0-9_\x80-\xff])')
>
Missed `br''` here, though the "\t" and "\x" string escapes happen to be compatible with the regexp's.
> patch.py:2536
> + for token in mdiff.splitnewlines(''.join(bl[b1:b2])):
> + btokens.append((changed, token))
> +
Nit: maybe we can sort out the tokens here instead of re-parsing tabs, newlines, and trailing whitespace later.
But I'm not sure if that will make things simpler.
REPOSITORY
rHG Mercurial
REVISION DETAIL
https://phab.mercurial-scm.org/D3212
To: quark, #hg-reviewers, durin42, yuja
Cc: yuja, spectral, mercurial-devel