log: slowpath: do not read the full changelog

Nicolas Dumazet nicdumz at gmail.com
Mon Jul 19 18:11:28 CDT 2010


Hello!

2010/7/20 Matt Mackall <mpm at selenic.com>:
> You pushed this patch to crew:
>
>        log: slowpath: do not read the full changelog
>
>        When in the slowpath, we are examining _all_ changesets in revs.
>        We need to order reads so they happen increasingly for I/O performance.
>
>        Increasing windows were used to read changelog backwards in a windowed manner,
>        reading the changelog forward inside each window. But since no revision range
>        was specified, it was equivalent to reading the full changelog, even if a
>        single revision was passed to the commandline.
>
>        When --removed is used, we _need_ to scan all changesets, but if we're only
>        looking for file patterns, this is not necessary and we can stick to
>        the revspec that was given to us.
>
> And the patch contains the following:
>
> +        if opts.get('removed'):
> +            # --removed wants to yield the changes where the file
> +            # was removed, this means that we have to explore all
> +            # changesets, effectively ignoring the revisions that
> +            # had been passed as arguments
> +            revrange = xrange(nullrev, len(repo) - 1)
>
> Not sure about this: if I say 'hg log --removed -r 1.5:1.6 hg', then I
> want to know about anywhere between 1.5 and 1.6 where hg was modified,
> including removals. It's not necessary to look at anything outside of
> 1.5:1.6. If we do look at changesets outside of that range, it's wasting
> time, and if we report them, it's a bug. But it does mean that I must
> look at every changelog entry in that range and not simply the filelog.

I completely agree.
I blame myself for not having a clearer commit message. The previous
code was buggy and I dont think that I changed the behaviour in this
regard. "Increasing windows" were hiding this buggy behaviour, and
before my commit the full changelog was being read.

While I understand why increasing windows were introduced at first, I
think that they don't help for code comprehension... If I read
correctly the code there should be only one case where such
"moonwalking" is needed.


I wondered yesterday what would be the good behaviour with "--removed
-rA:B foo", or if it was buggy at all, since I only recently
discovered about --removed.
I asked on IRC, and since I did not get feedback I decided to push a
"simple" simplification, to avoid breaking anything I did not know
enough about. I tried to clearly put the behaviour in the open,
mentioning the perhaps-buggy issue, waiting to discuss it properly,
and to find an appropriate fix. Testing --removed in those situations
seems like a must.
Do you think I should have waited for a "full" serie before pushing
it? I so, I think that I can understand this, and try to work in this
sense!

Regards,
-Nicolas.

>
> Now if I use a revspec and say:
>
> hg log --removed -r '1.5::1.6' hg
>
> ...again, the same rules apply. I can't look at just the hg filelog, but
> I needn't (shouldn't!) look at anything outside of 1.5::1.6.
>
> And lastly, if I say:
>
> hg log --removed -r '1.5::1.6 and file(hg)'
>
> ..I've already done all the work of finding the relevant revisions
> before getting here. (Most of this code is obsoleted by revsets.)
>
> --
> Mathematics is the supreme nostalgia of our time.
>
>
>



-- 
Nicolas Dumazet — NicDumZ


More information about the Mercurial-devel mailing list