[PATCH 1 of 7] revset: introduce a _revinterval type and revsubset function

Pierre-Yves David pierre-yves.david at ens-lyon.org
Fri Jun 22 16:30:24 CDT 2012


On 5 juin 2012, at 00:25, Bryan O'Sullivan wrote:

> # HG changeset patch
> # User Bryan O'Sullivan <bryano at fb.com>
> # Date 1338848634 25200
> # Node ID 4f118bc8e20a571cf283695269a625d013f19eb4
> # Parent  3b266a181a5826e3c4b19bab142d1b208bafa7cc
> revset: introduce a _revinterval type and revsubset function
> 
> Very often, the set against which an intermediate revset is filtered
> is a contiguous sequence of revision numbers.  A _revinterval is
> far cheaper for this purpose than a set or list, as the amount of
> data to allocate is small and constant.
> 
> If a _revinterval is not appropriate, revsubset falls back to a
> set.  It also avoids duplication of an existing set if possible.
> 
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -13,6 +13,77 @@ import match as matchmod
> from i18n import _
> import encoding
> 
> +class _revinterval(object):
> +    """Half-closed revision interval: [start, stop)"""
> +    # can't use xrange instead, as it doesn't support slicing or cheap
> +    # __contains__ when reversed
> +
> +    def __init__(self, what=None, start=None, stop=None, reverse=False):
> +        if what is None:
> +            self._start, self._stop, self._reversed = start, stop, reverse
> +            return
> +        try:
> +            if isinstance(what, xrange):
> +                a, b = what[0], what[-1]
> +                self._start, self._stop = sorted((a, b))
> +                self._reversed = a > b
> +            else:
> +                # avoid localrepository.__getitem__
> +                self._start = iter(what).next()
> +                self._stop = self._start + len(what)
> +                self._reversed = False
> +        except (IndexError, StopIteration):
> +            self._start, self._stop, self._reversed = 0, 0, False

When does the except happen? Why can we silently fallback to 0:0 ?

> +    def __contains__(self, rev):
> +        return rev >= self._start and rev < self._stop

Fyi, python supports: return self._start <= rev < self._stop

> +    def __len__(self):
> +        return self._stop - self._start
> +
> +    def reverse(self):
> +        self._reversed = not self._reversed
> +
> +    def _iter(self, reverse):
> +        if reverse:
> +            return iter(xrange(self._stop - 1, self._start - 1, -1))
> +        return iter(xrange(self._start, self._stop))
> +
> +    def __iter__(self):
> +        return self._iter(self._reversed)
> +
> +    def __reversed__(self):
> +        return self._iter(not self._reversed)
> +
> +    def __getitem__(self, i):
> +        l = len(self)
> +        if isinstance(i, slice):
> +            start, stop, step = i.indices(l)
> +            if abs(step) != 1:
> +                raise ValueError('non-unit step not supported')
> +            if self._reversed:
> +                start, stop = self._stop - stop, self._stop - start
> +            else:
> +                start += self._start
> +                stop += self._start
> +            return _revinterval(start=start, stop=stop, reverse=step < 0)
> +        if i < 0:
> +            i += l
> +        if 0 <= i <= l:
> +            if self._reversed:
> +                return self._stop - i - 1
> +            return self._start + i
> +        raise IndexError('interval index out of range')
> +
> +def revsubset(subset):
> +    """Convert to a type that supports fast membership checking"""
> +    if isinstance(subset, (set, _revinterval)):
> +        return subset
> +    if isinstance(subset, list):
> +        # can't be assumed to be contiguous
> +        return set(subset)
> +    return _revinterval(subset)

What kind of object can subject be?

Given What I read, it could be: list, set, xrange or repo. Is there any other case ?

It may worse mention this in a comment near the last return.





More information about the Mercurial-devel mailing list