[PATCH STABLE] revset: have rev() drop out-of-range or filtered rev explicitly (issue4396)

Pierre-Yves David pierre-yves.david at ens-lyon.org
Mon Oct 20 11:46:15 CDT 2014



On 10/20/2014 06:02 AM, Yuya Nishihara wrote:
> On Mon, 20 Oct 2014 03:06:01 -0700, Pierre-Yves David wrote:
>> On 10/19/2014 01:48 AM, Yuya Nishihara wrote:
>>> # HG changeset patch
>>> # User Yuya Nishihara <yuya at tcha.org>
>>> # Date 1413704913 -32400
>>> #      Sun Oct 19 16:48:33 2014 +0900
>>> # Branch stable
>>> # Node ID 76ab4107fb04497dcb9fc6eb9f90272572105963
>>> # Parent  a516f593498da57f5ec9c7967c967ee42626cd15
>>> revset: have rev() drop out-of-range or filtered rev explicitly (issue4396)
>>>
>>> The recent optimization of "and" operation relies on the assumption that
>>> the rhs set does not contain invalid revisions.  So rev() has to remove
>>> invalid revisions.
>>>
>>> This is still faster than using `.filter(lambda r: r == l)`.
>>>
>>> revset #0: rev(25)
>>> 0) wall 0.026341 comb 0.020000 user 0.020000 sys 0.000000 (best of 113)
>>> 1) wall 0.000038 comb 0.000000 user 0.000000 sys 0.000000 (best of 66567)
>>> 2) wall 0.000062 comb 0.000000 user 0.000000 sys 0.000000 (best of 43699)
>>> (0: bbf4f3dfd700^, 1: 3.2-rc, 2: this patch)
>>>
>>> diff --git a/mercurial/revset.py b/mercurial/revset.py
>>> --- a/mercurial/revset.py
>>> +++ b/mercurial/revset.py
>>> @@ -1359,6 +1359,8 @@ def rev(repo, subset, x):
>>>        except (TypeError, ValueError):
>>>            # i18n: "rev" is a keyword
>>>            raise error.ParseError(_("rev expects a number"))
>>> +    if l < 0 or l >= len(repo) or l not in repo:
>>
>> The `for l in repo` should be sufficient. We can switch it to `for l in
>> repo.changelog` for efficiency purpose and I do not believe it matter
>> much for such revset (but does not hurt).
>>
>> However, this playing with both version, I uncover a bug that make
>> repo[len(repo)] return nullrev instead of raisin a lookup error. (the
>> issue is in changectx, calling `changelog.node()` without prior (or
>> post) checking.
>>
>> I've changed it to::
>>
>>       if l not in repo.changelog:
>>
>> And pushed the result to the clowncopter.
>
> Isn't it slower than "l in repo" for large number?
> revlog doesn't have  __contains__().
>
> revset #0: rev(210000)
> 0) wall 0.026161 comb 0.030000 user 0.030000 sys 0.000000 (best of 107)
> 1) wall 0.000045 comb 0.000000 user 0.000000 sys 0.000000 (best of 58973)
> 2) wall 0.000063 comb 0.000000 user 0.000000 sys 0.000000 (best of 42704)
> 3) wall 0.002724 comb 0.000000 user 0.000000 sys 0.000000 (best of 1060)

gasp! I did not realised that. I switched it to `l not in 
fullreposet(repo)` in a hurry. We should probably fix changelog 
containment in 3.3. Thanks alot for double checking this, should not 
have tried to get thing "better" (but I happy to have found a bug).

> (using mozilla-central repo
>   0: bbf4f3dfd700^,
>   1: 3.2-rc,
>   2: original patch,
>   3: 231a4f6c08b6 in clowncopter)
>
>> However, mercurial 2.9 used to allow rev(-1) to select nullrev. So we
>> maybe want to restore that (or to decide for a proper behavior because
>> this could be confusing for users:
>>
>>     "hg log -r -1" → tip
>>     "hg log -r rev("-1")" → nullrev
>
> Really?  I think revset was limited to 0:tip because the initial subset
> was range(len(repo)), list(repo) or spanset(repo).

actually not. I messed up my initial testing. Current state is fine (but 
for the perf regression, pushed a fix).

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list