history at diff blocks level

Jun Wu quark at fb.com
Sun Oct 16 15:14:53 EDT 2016


Excerpts from Denis Laxalde's message of 2016-10-03 16:38:17 +0200:
> Hi all,
> 
> I've been recently thinking about adding some support in Mercurial to
> query repository history based on changes within a given line range in a
> file. I think that would be useful in at least two commands:
> 
> * log, to restrict history to revisions that modify specified part(s) of
> file(s) and only display the diff around specified line range and,
> 
> * annotate, to window the annotate result and maybe consider walking
> file revision from tip to base to reduce computation costs.
> 
> (The "log" part is more interesting, I think.)

I've been thinking about this as well for "fastannotate --deleted". That
said, "annotate" is generally easier than "log" in this case: slicing the
annotated lines seems to be enough.

> From UI point of view, the basic idea is to specify a (file name, line
> range) pair and the simplest solution I could find is something like:
> 
>    hg log/annotate --line-range fromline,toline FILE
> 
> but this does not work well with several files. (Perhaps something like
> hg log FILE:fromline:toline would be better.) I also thought about a

+1 for "FILE:fromline:toline". It is intuitive and makes sense. A new
boolean flag (like "--line-ranges") may be needed to enable the syntax
explicitly. Such a flag would avoid conflicts with existing matcher syntax
and make it clear that some commands, like "add", do not support line ranges.
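
A minimal sketch of how such a spec could be parsed, assuming the
"FILE:fromline:toline" form with 1-based inclusive line numbers (the thread
does not settle either point). The helper name "parselinerange" is
hypothetical and not an existing Mercurial function:

```python
def parselinerange(spec):
    """Split 'FILE:fromline:toline' into (filename, fromline, toline).

    Hypothetical helper for illustration only. Splitting from the right
    keeps any colons inside the file name intact.
    """
    try:
        fname, fromline, toline = spec.rsplit(':', 2)
        fromline, toline = int(fromline), int(toline)
    except ValueError:
        # Either fewer than two colons or a non-integer line number.
        raise ValueError('expected FILE:fromline:toline, got %r' % spec)
    # Assumption: lines are 1-based and the range is inclusive.
    if not fname or fromline < 1 or toline < fromline:
        raise ValueError('invalid line range in %r' % spec)
    return fname, fromline, toline
```

Splitting from the right means a file name that itself contains ":" still
parses, which is one reason to keep the line numbers at the end.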

> "changes(filename, fromline, toline)" revset (or an extension of the
> existing "modifies()" revset), but it seems to me that this would not
> fit well for both log and annotate. Suggestions welcome.
> 
> 
> From the technical point of view, my idea is to rely on
> mdiff.allblocks(<file content at rev1>, <file content at rev 2>) (used
> in both annotate and log, when --patch option is given) to:
> 
> 1. filter blocks depending on whether they are relevant to specified
> line range (e.g., for the log command there's some "!" block), and,
> 
> 2. track the evolution of the line range across revisions (namely, given
> the line range at rev2, find the line range at rev1 in the example above).
>
> I have something working concerning this "low level" aspect, but I'm
> somehow getting stuck when it comes to plug things into the log command
> call. Namely, I need to pass the "line range" information from one
> revision to another during iterations of the loop on revisions in
> commands.log() [1] and pass this information down to the mdiff.unidiff()
> call [2] which would then give me back another line range to push up to
> the outer loop on revisions. Given the complexity of the call chain, I
> actually think this is not a very good idea... So the best idea I could
> come up with is to filter revisions beforehand (as would a revset do)
> but this would imply keeping track of files' line ranges per revision
> (to avoid processing diff blocks twice when --patch option is specified
> in particular). All in all, it's not clear to me which "tool" I may use
> to achieve this (I thought about using the "filematcher" built along
> with "revs" in commands.log(), but not really sure it's a good idea).
> Maybe I just need a data structure that does not exist yet?
> I'd appreciate any pointer to move forward.

I think "changeset_printer" and "diffordiffstat" are worth considering.
"diffordiffstat" is currently stateless. A possible direction is to add a
new stateful "diffordiffstat" that tracks the line ranges.
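
A rough sketch of what that state could look like, assuming diff blocks
shaped like ((a1, a2, b1, b2), type) with 0-based half-open offsets and type
"=" or "!". The names "rangeinparent" and "linerangetracker" are made up for
illustration, and the blocks are supplied by the caller rather than computed
here:

```python
def rangeinparent(blocks, lo, hi):
    """Map lines [lo, hi) of the new text back onto the old text.

    Returns (changed, (oldlo, oldhi)), where changed tells whether any
    "!" block touches the range. Hypothetical helper, not Mercurial API.
    """
    oldlo = oldhi = None
    changed = False
    for (a1, a2, b1, b2), kind in blocks:
        if b1 <= lo < b2:
            # Inside "=" we can offset exactly; inside "!" widen to the hunk.
            oldlo = a1 + (lo - b1) if kind == '=' else a1
        if b1 < hi <= b2:
            oldhi = a1 + (hi - b1) if kind == '=' else a2
        if kind == '!':
            if b2 > b1:
                changed = changed or (b1 < hi and lo < b2)
            else:
                # Pure deletion: counts only strictly inside the range.
                changed = changed or (lo < b1 < hi)
    if oldlo is None or oldhi is None:
        raise ValueError('line range exceeds file size')
    return changed, (oldlo, oldhi)


class linerangetracker(object):
    """Remember the followed line range per (path, rev). Hypothetical."""

    def __init__(self):
        self._ranges = {}

    def seed(self, path, rev, lo, hi):
        self._ranges[(path, rev)] = (lo, hi)

    def step(self, path, rev, parentrev, blocks):
        """Record the range at parentrev; return True if rev touches it."""
        lo, hi = self._ranges[(path, rev)]
        changed, parentrange = rangeinparent(blocks, lo, hi)
        self._ranges[(path, parentrev)] = parentrange
        return changed

    def rangeat(self, path, rev):
        return self._ranges[(path, rev)]
```

Keeping the ranges keyed by (path, rev) is one way to let a later --patch
pass look up the range instead of walking the diff blocks a second time.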

If revisions are filtered beforehand, the state could be passed to the new
"diffordiffstat" function to avoid unnecessary calculation.

It seems to me that a high-level diff function like "mdiff.unidiff" could
take an extra parameter "difffunc" which defaults to "allblocks". Then we can
have a "filteredallblocks" passed to "unidiff".
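
To illustrate the shape of that interface only, here is a self-contained
sketch. It does not use Mercurial: "allblocks" below is a stand-in built on
difflib that yields ((a1, a2, b1, b2), type) pairs, "changedlines" is a toy
substitute for a high-level function like "unidiff", and the lo/hi
parameters of "filteredallblocks" are an assumption about how the range
could be passed in:

```python
import difflib
import functools


def allblocks(lines1, lines2):
    """Stand-in for a block-level diff, built on difflib (assumption)."""
    matcher = difflib.SequenceMatcher(None, lines1, lines2, autojunk=False)
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        yield (a1, a2, b1, b2), '=' if tag == 'equal' else '!'


def filteredallblocks(lines1, lines2, lo, hi):
    """Yield only the blocks overlapping lines [lo, hi) of lines2."""
    for block, kind in allblocks(lines1, lines2):
        b1, b2 = block[2], block[3]
        if b1 < hi and lo < b2:
            yield block, kind


def changedlines(lines1, lines2, difffunc=allblocks):
    """Toy high-level diff: emit -/+ lines for every "!" block."""
    out = []
    for (a1, a2, b1, b2), kind in difffunc(lines1, lines2):
        if kind == '!':
            out.extend('-' + l for l in lines1[a1:a2])
            out.extend('+' + l for l in lines2[b1:b2])
    return out


old = ['a', 'b', 'c', 'd', 'e']
new = ['a', 'B', 'c', 'd', 'E']
full = changedlines(old, new)
# Restrict the output to hunks touching the last two lines of "new".
tail = changedlines(old, new,
                    difffunc=functools.partial(filteredallblocks, lo=3, hi=5))
```

The point of the extra parameter is that the high-level function stays
unchanged for existing callers, while a caller that knows the line range can
swap in the filtered generator.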

> 
> (I'll be at the sprint, so this can also be a working topic if some
> people are interested.)
> 

