[PATCH 1 of 3 evolve-ext-V2] evolve: add ordering of the revisions for evolve --rev

Thu Jun 4 12:00:51 CDT 2015

On 06/04/2015 09:54 AM, Laurent Charignon wrote:
>
>> On Jun 4, 2015, at 9:19 AM, Pierre-Yves David <pierre-yves.david at ens-lyon.org> wrote:
>>
>>
>>
>> On 06/03/2015 02:04 PM, Laurent Charignon wrote:
>>> # HG changeset patch
>>> # User Laurent Charignon <lcharignon at fb.com>
>>> # Date 1433283749 25200
>>> #      Tue Jun 02 15:22:29 2015 -0700
>>> # Node ID 16e32f9b706fa06a5a3c88e8a26139bfd128f4bf
>>> # Parent  b4a62d6f03534de6e65962c0e92cadea4b6cbad1
>>> evolve: add ordering of the revisions for evolve --rev
>>>
>>> When running evolve --rev we want to process the revisions in an optimal fashion
>>> to solve the maximum amount of trouble in the minimum number of steps.
>>> This patch adds a step to evolve --rev to order the revision before solving
>>> the troubles. A simple test is added to cover a basic case.
>>
>> We are getting there. I will take a version with fixed docstring, see below (and discuss).
>>
>>> diff --git a/hgext/evolve.py b/hgext/evolve.py
>>> --- a/hgext/evolve.py
>>> +++ b/hgext/evolve.py
>>> @@ -28,6 +28,7 @@
>>>   from StringIO import StringIO
>>>   import struct
>>>   import re
>>> +import collections
>>>   import socket
>>>   import errno
>>>   sha1re = re.compile(r'\b[0-9a-f]{6,40}\b')
>>> @@ -1232,6 +1233,69 @@
>>>       if repo['.'] != startnode:
>>>           ui.status(_('working directory is now at %s\n') % repo['.'])
>>>
>>> +def _singlesuccessor(repo, p):
>>
>> This function is non trivial and need a docstring explaining what is its purpose (returning the one unique latest successors of 'p' or die trying.)
>>
>>> +    if not p.obsolete():
>>> +        return p
>>> +    obs = repo[p]
>>> +    ui = repo.ui
>>> +    newer = obsolete.successorssets(repo, obs.node())
>>> +    # search of a parent which is not killed
>>> +    while not newer or newer == [()]:
>>> +        ui.debug("stabilize target %s is plain dead,"
>>> +                 " trying to stabilize on its parent\n" %
>>> +                 obs)
>>> +        obs = obs.parents()[0]
>>> +        newer = obsolete.successorssets(repo, obs.node())
>>> +    if len(newer) > 1:
>>> +        raise util.Abort(_("conflict rewriting. can't choose destination\n"))
>>> +    return repo[newer[0][0]].rev()
>>> +
>>> +def orderrevs(repo, revs):
>>> +    """ Compute an ordering to solve instability for the given revs
>>> +
>>> +    - Takes revs a list of troubled revisions
>>> +    - Returns the same revisions ordered to solve their instability
>>> +
>>> +    First we compute dependencies between the revs.
>>> +    For each troubled revision we keep track of what instability if any should
>>> +    be resolved in order to resolve it.
>>> +
>>> +    Then we remove the revisions with no dependency(A) and add them to the
>>> +    ordering.
>>> +    Removing these revisions leads to new revisions with no dependency (the one
>>> +    depending on A) that we can remove from the dependency graph and add to the
>>> +    ordering. We progress in a similar fashion until the ordering is build
>>> +    """
>>
>> That docstring is full of good inforation, however it a bit too low level for a docstring. Most of it should be moved as inline comment (#). The goal of the docstring is to carry the 'contract' of the function, its high level goal. The current content is details about the implementation.
>>
>> What this function do is something along the line of
>>
>> - return revision from the bottom to the top of the -final- stack that the stabilization process will produce.
>>
>> And this important because:
>>
>> - This ensure we can stabilize each revision on its final destination. (or phrased the other way. final destination is always available when we process a revision)
>
> Ok
>
>>
>>> +    # dependencies = {3: [6], 6:[]}
>>> +    # Means that: 6 has no dependency, 3 depends on 6 to be solved
>>> +    dependencies = {}
>>> +    # rdependencies is the inverted dict of dependencies
>>> +    rdependencies = collections.defaultdict(set)
>>> +
>>> +    # Step 1: Build the dependency graph
>>> +    for r in revs:
>>> +        dependencies[r] = set()
>>> +        for p in repo[r].parents():
>>> +            succ = _singlesuccessor(repo, p)
>>> +            if succ in revs:
>>> +                dependencies[r].add(succ)
>>> +                rdependencies[succ].add(r)
>>> +
>>> +    # Step 2: Build the ordering
>>> +    solvablerevs = collections.deque([ r for r in sorted(dependencies.keys())\
>>> +                                         if len(dependencies[r]) == 0 ])
>>
>> Small nits:
>> - you could use a heap, they are sorted by definition
>> - You may want to use a stack instead this would give some of the 'one branch after the other' property for free.
>
> I think that it is the best option, let's go for that

I think I would be happy to take this patch as is (with fixed docstring) 
and take code refactor in another patch. To make the topic move forward.

>> - you can use dependencies.items() to get both key value in the same go
>
> Yes but it makes the sorting code more complicated if we don't go with the heap.

does it?

[key for key, value in sorted(dependencies.keys()) if value]

(other nit, not space inside (), [] amd {})

>> - Python style recommendation is to use `if not dep[r]` instead of checking len to 0
>
> Ok
>
>>
>>> +    ordering = []
>>> +    while solvablerevs:
>>> +        rev = solvablerevs.popleft()
>>> +        for dependent in rdependencies[rev]:
>>> +            dependencies[dependent].remove(rev)
>>> +            if len(dependencies[dependent]) == 0:
>>
>> (ditto, small nits, regarding len usage here)
>>
>>> +                solvablerevs.append(dependent)
>>> +        ordering.append(rev)
>>
>> V2 looks simpler and more efficient, nice job.
>>
>> --
>> Pierre-Yves David
>

-- 
Pierre-Yves David