This page is primarily intended for Mercurial's developers.
Lazy Revsets Project
Revsets in Mercurial are used everywhere. This project is aimed to change the way they are processed and start returning values as soon as possible instead of building the entire revision set up front before returning.
Revsets are expressions used to filter revisions based on different conditions.
The main idea for evaluating a revset is to start with a subset which consists of the entire changelog represented as a list of revision numbers.
With this, filters start being applied based on the specified revset and the mentioned list starts mutating until every condition has been applied.
2. New-Style Revsets
Instead of the old filtering where we had the entire list in memory for every step, we now build a filter tree which consists on different nested structures that represent the specified revset.
Once this is done, we iterate over the entire changelog revision by revision (without actually having the entire list in memory up front) and as soon as a revision goes successfully through all the filters it's yielded. This means that showing the first result now takes much less time than before.
3. Main Classes
All the classes implemented duck-type baseset, which is the base data structure used for this project.
This class is a wrapper for list which has an inner set which is built lazily as soon as it is needed to provide faster membership testing.
The idea for this class is to be an initial wrapper of the structures that will be handled across revset methods when no smarter structure can be used.
This class contains 2 main parameters:
- A subset
- A condition
It provides lazy iteration over the subset checking for every element whether it satisfies the condition or not.
3.2.1. Ordered Lazyset
Lazyset with an ordered subset as underlying structure (in an ascending or descending order). Provides lazy implementations of methods like max, min, sort, ascending and descending
This class represents a revision range and uses the start and end numbers instead of building the entire list in memory.
It takes one mandatory parameter which is the repo object (to be able to work with the changeset evolution extension) and two optional parameters which represent the start and end revisions (which default to 0 and len(repo)).
When representing the entire list of revisions of a repo this structure should be used when possible
m = revset.match(repo.ui, spec) revs = m(repo, revset.spanset(repo))
Provides very fast implementation for methods like contains, reverse, max or min which only involve the start and end parameters.
This class is not a duck-type of baseset and it's not supposed to be used outside revset.py
It's basic use is to wrap a generator (since some revsets return a generator which used to be wrapped in a list/set) and provide lazy iteration and membership while reconstructing a list of it's elements to be able to be iterated more than once.
3.4.1. Asc/Desc Generatorset
Extension of the Generatorset class for generators that yield ordered values.
It overrides the contains method to stop iterating over the elements once it's gone past the value.
4. Operations Between Structures
Every structure has the and, sub, and add methods implemented to be able to return the best possible structure given itself and the structure it's operating with.
s = ...some revset structure... return [r for r in subset if r in s]
Should now be:
s = ...some revset structure... return subset & s
5. The 'filter' Method
Every structure that duck-types baseset should have a method called filter which takes a condition and returns a new structure with itself as the underlying set and the condition.
This new structure will be either a lazyset or an orderedlazyset based on the order of the structure on which filter is being called.
l = lambda x: ...some condition... return [r for r in subset if l(r)]
Should now be:
l = lambda x: ...some condition... return subset.filter(l)