[RFC] revision sets

Bill Barry after.fallout at gmail.com
Mon Apr 19 22:22:37 CDT 2010


Matt Mackall wrote:
> Right now we have a notion of revision ranges, ie:
>
>  hg log -r 0:1.0
>
> Internally, we iterate over this with something like:
>
>  for rev in commandutil.revrange(opts['rev']):
>
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
>
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
>
> Which would be identical to:
>
>  hg log -b foo -k bar -d "mar 1 - apr 1"
>
> but would magically work anywhere -r ranges were accepted (export, push,
> etc.).
>
> Further, we'd be able to add lots of interesting primitives:
>
>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"
>
> Read that as: every cset that is descended from the second parent of
> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> sorted by date in reverse order.
>
> revrange would be replaced by a new revset function that would parse the
> query/queries and build an iterator. Some of the operations, like
> keyword() and author(), would obviously be fairly expensive and many
> would fail to work (at least for now) on remote repos.
>
> I've pitched an idea like this before, usually with a weird
> operator-intensive syntax. This time, I think the right thing is an
> easily-read but more verbose query language.
>
> Steps to get from here to there:
>
> - change all callers of revrange to revset
>   
Why would that even need to be done? Surely this new design could be 
done in place... Backwards compatability need not be broken.
> - design a BNF for the revset query language
>   
meh... I don't think this needs to be done either, it looks to me like 
this dsl is very simple:

"descendant(parent2(1.0)) and ancestor(2.0) and author(george) and sorted(date) and reversed()"

step 1: encapsulate literals
    1.0 => '1.0'
    2.0 => '2.0'
    george => 'george'
    date => 'date'
step1="descendant(parent2('1.0')) and ancestor('2.0') and 
author('george') and sorted('date') and reversed()"

step 2: convert "method calls" to class instance method calls:

step2="self.descendant(self.parent2('1.0')) and self.ancestor('2.0') and self.author('george') and self.sorted('date') and self.reversed()"

step 3: convert and/or into method calls:

step3="self.reversed(self.sorted(self.and(self.and(self.descendant(self.parent2('1.0')), self.ancestor('2.0')), self.author('george')), 'date'))"

 step 4: eval this statement

class revfinder(object):
    def __init__(self, repo, query):
        self.repo=repo
        step1 = re.sub(r'\(([^)]*)\)', "('\1')", query)
        step2 = re.sub(r'(\w+)\(', 'self.\1(', step1)
        def fixor(statement):
            stack = []
            for item in re.split(r'\s+or\s+', step2):
                stack.append(item)
                if stack.count==2:
                    stack = ['self.orf('+', '.join(stack)+')']
            return stack[0]

        andstack=[]
        for item in re.split(r'\s+and\s+', step2):
            if item.startswith('self.sorted('):
                andstack=[item.replace('self.sorted(', 
'self.sorted('+andstack[0]+', ')]
            elif item.startswith('self.reversed('):
                andstack=['self.reversed('+andstack[0]+')']
            else:
                andstack.append(fixor(item))
                if andstack.count==2:
                     andstack = ['self.andf('+', '.join(andstack)+')']
        self.statement=andstack[0]

    def evalstmt(self):
        return eval(self.statement)

    def andf(self, set1, set2):
        pass
    def orf(self, set1, set2):
        pass
    def reversed(self, changes):
        pass
    def sorted(self, changes, key):
        pass
    ... (would need to actually define all these functions)

Bonus points that this class is extensible as normal python; extensions 
can add their own search functions, sort keys, etc; the whole thing is a 
simple dsl that translates right back to normal python; a formal 
language wouldn't have these advantages. In each function, changes is 
either a string or a list of changesets.
as to the code above, I wouldn't exactly trust it (my python fu isn't 
exactly gosu and I wouldn't recommend using any code written in 
thunderbird) but as pseudocode I think it has some merit.
> - build a query parser/"compiler"
> - add filters for the query functions
> - simplify some of the existing options (like -d and -k) by turning them
> into queries internally
>
> Thoughts?
>
>   



More information about the Mercurial-devel mailing list