history at diff blocks level

Denis Laxalde denis.laxalde at logilab.fr
Mon Oct 3 14:38:17 UTC 2016


Hi all,

I've recently been thinking about adding support in Mercurial for
querying repository history based on changes within a given line range in a
file. I think that would be useful in at least two commands:

* log, to restrict history to revisions that modify the specified part(s) of
file(s) and to display only the diff around the specified line range, and

* annotate, to window the annotate result and maybe consider walking
file revisions from tip to base to reduce computation costs.

(The "log" part is more interesting, I think.)


From a UI point of view, the basic idea is to specify a (file name, line
range) pair, and the simplest solution I could find is something like:

   hg log/annotate --line-range fromline,toline FILE

but this does not work well with several files. (Perhaps something like
hg log FILE:fromline:toline would be better.) I also thought about a
"changes(filename, fromline, toline)" revset (or an extension of the
existing "modifies()" revset), but it seems to me that this would not
fit well for both log and annotate. Suggestions welcome.
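
To make the FILE:fromline:toline form concrete, here is a minimal sketch of
how such an argument could be split, so that several files can each carry
their own range. The function name parse_linerange_spec and the 1-based,
inclusive line numbering are assumptions for illustration; nothing like this
exists in Mercurial as far as this proposal goes.

```python
def parse_linerange_spec(spec):
    """Split 'FILE:fromline:toline' into (filename, fromline, toline).

    Hypothetical helper: line numbers are assumed 1-based and inclusive.
    Splitting from the right keeps colons inside the file name intact.
    """
    try:
        fname, fromline, toline = spec.rsplit(':', 2)
        fromline, toline = int(fromline), int(toline)
    except ValueError:
        raise ValueError('expected FILE:fromline:toline, got %r' % spec)
    if not fname or fromline < 1 or toline < fromline:
        raise ValueError('invalid line range in %r' % spec)
    return fname, fromline, toline


# One spec per file, e.g. as positional arguments to "hg log".
specs = ['mercurial/mdiff.py:10:20', 'mercurial/patch.py:5:7']
ranges = [parse_linerange_spec(s) for s in specs]
```

A file literally named "foo:1:2" would be ambiguous with this syntax, which
is one argument in favour of a separate --line-range option.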


From a technical point of view, my idea is to rely on
mdiff.allblocks(<file content at rev1>, <file content at rev2>) (used
in both annotate and log, when the --patch option is given) to:

1. filter blocks depending on whether they are relevant to the specified
line range (e.g., for the log command, a revision is relevant when some
"!" block, i.e. a changed block, overlaps the range), and,

2. track the evolution of the line range across revisions (namely, given
the line range at rev2, find the line range at rev1 in the example above).
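
The two steps above can be sketched as follows. This is only an
illustration under stated assumptions: allblocks() below is a stand-in
built on Python's difflib rather than the real mdiff.allblocks(), it is
assumed to yield ((a1, a2, b1, b2), type) tuples with 0-based, half-open
line indices, and blocks_in_range() is a hypothetical name.

```python
import difflib


def allblocks(text1, text2):
    """Stand-in for mdiff.allblocks(), built on difflib (an assumption).

    Yields ((a1, a2, b1, b2), kind) with 0-based, half-open line indices;
    kind is '=' for an unchanged block and '!' for a changed one.
    """
    lines1 = text1.splitlines(True)
    lines2 = text2.splitlines(True)
    matcher = difflib.SequenceMatcher(None, lines1, lines2, autojunk=False)
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        yield (a1, a2, b1, b2), ('=' if tag == 'equal' else '!')


def blocks_in_range(blocks, rangeb):
    """Return (changed, rangea) for the half-open range `rangeb` in rev2.

    `changed` lists the '!' blocks overlapping the range (step 1);
    `rangea` is the corresponding range in rev1 (step 2).
    """
    lowb, highb = rangeb
    lowa = higha = None
    changed = []
    for (a1, a2, b1, b2), kind in blocks:
        # Skip blocks outside the range. A pure deletion (b1 == b2) only
        # counts when it falls strictly inside the range.
        if not (b1 < highb and b2 > lowb):
            continue
        if kind == '=':
            # Unchanged lines map one-to-one, shifted by a constant offset.
            start = a1 + max(lowb, b1) - b1
            end = a1 + min(highb, b2) - b1
        else:
            # A changed block is not split: keep its whole rev1 side.
            start, end = a1, a2
            changed.append(((a1, a2, b1, b2), kind))
        lowa = start if lowa is None else min(lowa, start)
        higha = end if higha is None else max(higha, end)
    if lowa is None:
        raise ValueError('line range %r is outside the file' % (rangeb,))
    return changed, (lowa, higha)
```

For instance, with rev1 = "a b c d e" and rev2 = "a b X c D e" (one item
per line), asking for lines (3, 5) of rev2 ("c" and "D") gives the single
changed block (3, 4, 4, 5) and the rev1 range (2, 4), i.e. "c" and "d".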

I have something working for this "low level" aspect, but I'm
somewhat stuck when it comes to plugging it into the log command
call. Namely, I need to pass the "line range" information from one
revision to another during iterations of the loop on revisions in
commands.log() [1] and pass this information down to the mdiff.unidiff()
call [2] which would then give me back another line range to push up to
the outer loop on revisions. Given the complexity of the call chain, I
don't think this is a very good idea. So the best alternative I could
come up with is to filter revisions beforehand (as a revset would),
but this would imply keeping track of each file's line range per revision
(in particular, to avoid processing diff blocks twice when the --patch
option is specified). All in all, it's not clear to me which "tool" I
could use to achieve this (I thought about using the "filematcher" built
along with "revs" in commands.log(), but I'm not really sure that's a good
idea). Maybe I just need a data structure that does not exist yet?
I'd appreciate any pointer to move forward.
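
To illustrate the "filter beforehand" option, here is a rough sketch of a
pre-pass that walks a file's history from tip to base and records, for each
revision touching the range, the line range as seen in that revision. It is
not wired into commands.log(); the history is assumed to be a plain list of
(rev, text) pairs, oldest first, difflib stands in for mdiff, and the names
map_range_back and follow_range are hypothetical.

```python
import difflib


def map_range_back(old_lines, new_lines, rng):
    """Map half-open range `rng` of new_lines onto old_lines.

    Returns ((low, high), changed); (None, None) if nothing overlaps.
    """
    low, high = rng
    old_low = old_high = None
    changed = False
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines,
                                      autojunk=False)
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if not (b1 < high and b2 > low):
            continue  # block does not overlap the range
        if tag == 'equal':
            start = a1 + max(low, b1) - b1
            end = a1 + min(high, b2) - b1
        else:
            start, end = a1, a2
            changed = True
        old_low = start if old_low is None else min(old_low, start)
        old_high = end if old_high is None else max(old_high, end)
    return (old_low, old_high), changed


def follow_range(history, rng):
    """Return {rev: range in rev} for revisions that modify the range.

    `history` is a list of (rev, text) pairs, oldest first; `rng` is the
    half-open, 0-based line range in the newest revision.
    """
    touched = {}
    for i in range(len(history) - 1, 0, -1):
        rev, text = history[i]
        parent_text = history[i - 1][1]
        parent_rng, changed = map_range_back(
            parent_text.splitlines(), text.splitlines(), rng)
        if changed:
            touched[rev] = rng
        if parent_rng[0] is None or parent_rng[0] == parent_rng[1]:
            break  # the lines did not exist before this revision
        rng = parent_rng
    else:
        # Reached the first revision: it introduced the remaining lines.
        touched[history[0][0]] = rng
    return touched
```

The returned mapping is the kind of per-revision state mentioned above: a
later --patch pass could look up the range for each revision instead of
recomputing the diff blocks.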

(I'll be at the sprint, so this can also be a working topic if some
people are interested.)


-- 
Denis Laxalde
Logilab         http://www.logilab.fr

[1] https://selenic.com/repo/hg/file/tip/mercurial/commands.py#l5295
[2] https://selenic.com/repo/hg/file/tip/mercurial/patch.py#l2506

