[PATCH 1 of 5 V5] dirstate: rename the dirstate parsing and packing methods

Laurent Charignon lcharignon at fb.com
Fri Dec 18 21:14:06 CST 2015


> On Dec 18, 2015, at 3:02 PM, Matt Mackall <mpm at selenic.com> wrote:
> 
> On Thu, 2015-12-17 at 15:44 -0800, Laurent Charignon wrote:
>> # HG changeset patch
>> # User Laurent Charignon <lcharignon at fb.com>
>> # Date 1450380353 28800
>> #      Thu Dec 17 11:25:53 2015 -0800
>> # Node ID 44eafafb98c9d24bdff7d6c46213ffe2cf8edc8d
>> # Parent  7e8a883da1711008262150bb6f64131812de3e0b
>> dirstate: rename the dirstate parsing and packing methods
> 
> Explodes my hg on apply. The pure/ code is not available for fallback
> unless the C module doesn't exist at all, fallback doesn't happen on a
> function by function basis.
> 
> Consider this strategy:
> 
> - don't touch parse_dirstate
> - don't touch the read/write methods
> - add a simple Python func (in dirstate, not pure) to build
> nonnormalmap from map
> - attach this to a propertycache so we only build it when needed
> - add a C version of that function in parsers
> - add a clause in Python function to call C function if available

Thanks for the review.

Your solution should work, but it forces us to pick one of the following ways to build the non-normal map:
1) build the non-normal map from map (as you suggest)
2) parse the dirstate file again

Both 1) and 2) come at a performance cost that is prohibitive according to Durham's first email on the topic sent to the list.
The time breakdown he sent was:
"A) parsing the data (370ms)
B) iterating over the dirstate looking for added/removed/lookup files (350ms)
C) 100ms of GC time
D) 60ms of reading the raw data off disk"

The goal of this series was to eliminate the cost of B by increasing the cost of A a little bit.
If we go with 1) or 2), it comes down to cutting ~350ms and adding back either ~350ms (for 1) or ~370ms (for 2), so there is no net gain on the first access.
If I understand correctly, your suggestion would only help commands that iterate over the map several times looking for non-normal entries, since the non-normal map would then be computed once and reused.
Am I understanding this correctly?
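
A rough sketch of what option 1) could look like, to make the cost concrete. Everything here is an assumption for illustration: entries are taken to be (state, mode, size, mtime) tuples, "non-normal" is taken to mean state != 'n' or mtime == -1, functools.cached_property stands in for propertycache, and c_build_nonnormal is a hypothetical hook for a C version, not an existing parsers function.

```python
from functools import cached_property

# Hypothetical hook: a C implementation could be assigned here if available.
c_build_nonnormal = None


def build_nonnormal(dmap):
    """Build the non-normal map with one full pass over dmap.

    Assumes each entry is a (state, mode, size, mtime) tuple and that an
    entry is non-normal when its state is not 'n' or its mtime is -1.
    """
    if c_build_nonnormal is not None:
        return c_build_nonnormal(dmap)
    return {f: e for f, e in dmap.items() if e[0] != 'n' or e[3] == -1}


class DirstateSketch:
    """Toy stand-in for dirstate, only to show the lazy caching."""

    def __init__(self, dmap):
        self._map = dmap
        self.builds = 0  # counts how many full passes were made

    @cached_property
    def nonnormalmap(self):
        # Runs on first access only; later accesses reuse the cached result.
        self.builds += 1
        return build_nonnormal(self._map)
```

The first access still pays for a full pass over the map, which is the step B cost above; only the second and later accesses are free.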

> 
> Something else to consider: it might be better for nonnormalmap to be a
> set?

We can probably do that indeed.
I used a dictionary to avoid having to do another lookup in the dmap if we need to access the entries.
I will check where we would potentially use it to see if it makes sense.
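
To illustrate the trade-off, here is a small sketch with made-up data. The entry layout (state, mode, size, mtime) and the is_nonnormal test are assumptions, not taken from the patch.

```python
def is_nonnormal(entry):
    # Assumed definition: state is not 'n', or mtime is unset (-1).
    return entry[0] != 'n' or entry[3] == -1


# Made-up dirstate map: filename -> (state, mode, size, mtime)
dmap = {
    'a': ('n', 0o644, 10, 100),
    'b': ('a', 0, -1, -1),
    'c': ('n', 0o644, 5, -1),
}

# Dictionary variant: the entry travels with the filename.
nonnormal_dict = {f: e for f, e in dmap.items() if is_nonnormal(e)}
states_from_dict = {f: e[0] for f, e in nonnormal_dict.items()}

# Set variant: only filenames are kept, so reading an entry needs a
# second lookup in dmap.
nonnormal_set = {f for f, e in dmap.items() if is_nonnormal(e)}
states_from_set = {f: dmap[f][0] for f in nonnormal_set}
```

Both variants give the same answer; the set is smaller, and the dictionary saves the extra dmap lookup when callers need the entries themselves.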

Thanks,

Laurent
> 
> -- 
> Mathematics is the supreme nostalgia of our time.
> 


