[PATCH] Add script to rewrite manifest to workaround lack of parent deltas
Greg Ward
greg-hg at gerg.ca
Fri Aug 21 16:58:13 CDT 2009
On Thu, Aug 20, 2009 at 7:04 PM, Benoit
Boissinot<benoit.boissinot at ens-lyon.org> wrote:
> On Thu, Aug 20, 2009 at 05:20:24PM -0400, Greg Ward wrote:
>> # HG changeset patch
>> # User Greg Ward <greg-hg at gerg.ca>
>> # Date 1233047576 0
>> # Node ID 7e0bbea3935b1044d3c5acfecae8005941dfc8ec
>> # Parent 2484868cffde3893e3fafb8e515d396346b87e17
>> Add script to rewrite manifest to workaround lack of parent deltas.
>
> Thanks for cleaning it up. Some comments below.
>
>> +
>> +def good_sort(rl):
>
> maybe toposort() ?
Yep. Renamed the other two functions to be more Hg-ish too.
>
> You can directly iterate on the revs:
>
> for i in rl:
Fixed.
> The following is simpler:
>
> children[i] = []
> parents = [p for p in rl.parentrevs(i) if p != -1]
> for p in parents:
> assert p in children
> children[p].append(i)
> if len(parents) == 0:
> root.append(i)
Duh, of course. Thanks. Fixed. Also used node.nullrev as suggested
by Dirkjan.
> Maybe it's cleaner to pop from the end
> i = visit.pop()
[...]
>> + if len(parents_with_child) == 0:
>> + next.append(c)
>> + visit = next + visit
> if you pop from the end, then you can do:
> visit += next
Not only is it cleaner, it massively improves the shrinkage factor on
my test repo: from 8.5x smaller to 16.9x smaller. Specifically,
pop(0) with "visit = next + visit" shrank a 56.1 MB manifest to 5.5
MB; fiddling with the order shrank it to 3.3 MB instead. Wow! I
suppose I should test it for correctness though. ;-)
>> + ret.append(i)
>> + if i not in children:
>> + # this only happens if some node's p1 == p2, which can happen in the
>> + # manifest in certain circumstances
>> + break
>
> break or continue ?
Ooh, good catch. Fixed.
>> + next = []
>> + for c in children.pop(i):
>> + parents_with_child = [p for p in rl.parentrevs(c) if p != -1 and p in children]
>
> parents_unseen is maybe better, we don't care if they have children, but
> we care if we already visited them.
I'll take your word for it. I don't entirely understand the
algorithm, and most of the names are taken from your original code.
So if you think parents_unseen is a better name, I'm all for it.
Done.
>> +def main():
[...]
>> + (tmpfd, tmpindexfn) = tempfile.mkstemp(
>> + dir=repo.join('store'), prefix='00manifest.', suffix='.i')
>
> I found it a bit cleaner to split after at least one arg.
OK, sure. Done.
> Is the non-nested except/try/finally possible with python 2.4 ?
Oops. Darn, that's my favourite 2.5 feature. Reverted to 2.4 syntax.
Thanks for the review! New patch coming soon.
Greg
More information about the Mercurial-devel
mailing list