[PATCH 6 of 6 V2] obscache: use the obscache to compute the obsolete set

Sun May 21 17:52:05 EDT 2017

I chatted a bit with Jun on IRC. He told me to ignore the first two
emails, so I did (I have not read them).

On 05/20/2017 11:00 PM, Jun Wu wrote:
> After examining this area more carefully, my final conclusions are:
>
>   - _fm1readmarkers, _addsuccessors, _addprecursors are painfully slow.
>   - The above slow functions are NOT improved by obscache.
>   - For "hg id", none of the above slow functions are called with obscache.
>     So, "hg id" is faster.
>
>   - "bumped" and "divergent" revset still require slow functions.
>   - For "hg log" with default "troubles" template, slow functions are called
>     and obscache won't help speed it up.

I was a bit confused by this part, so let me clarify for other readers.
Yes, computing troubles currently requires loading the full history,
and doing so is slow. But this is unrelated to the current series.

Right now, obsolescence has a baseline cost that impacts all Mercurial
commands. The cache in this series successfully removes that baseline
impact.

Commands that need to access the obsolescence history (for troubles or
other purposes) still have to pay the obsstore loading cost. We'll have
to improve that eventually, but it is not what this series is about.

> I believe a more general purposed, and better way to solve the perf issue is
> to build indexes, so precursors[x], successors[x] are O(1) instead of
> O(len(allmarkers)). If we go the indexing way, it will solve more problems
> and the obscache approach becomes unnecessary.

On-disk indexes would be useful and we want to have them at some point.
However, they are significantly more complex to build and are not ready
yet. According to our IRC discussion, Jun started poking at indexes, but
this is at an early stage. The cache in this series has been
successfully deployed and used by many people for multiple weeks
already. So moving forward with it for now seems better.

In addition, it is not clear that indexes would remove the need for
other caches entirely:

* For example, the cache in this series uses a straightforward bytearray
to store a rev-indexable flag. This is extremely efficient to read and
use. So that cache might still be useful after indexes land.

* Likewise, even if troubles computation gets faster with indexes, some
troubles are still pretty expensive to compute, so we'll likely want a
cache for them too.
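
To illustrate the first point, here is a minimal sketch of a
rev-indexable flag stored in a bytearray. It is not the code from this
series: the function names (build_obsflags, obsolete_set) are
hypothetical, and it assumes one byte per revision where a non-zero
value means "obsolete".

```python
def build_obsflags(nrevs, obsolete_revs):
    """Return a bytearray with one flag byte per revision.

    Hypothetical helper: flags[rev] is 1 if rev is obsolete, else 0.
    """
    flags = bytearray(nrevs)  # zero-filled, one byte per revision
    for rev in obsolete_revs:
        flags[rev] = 1
    return flags


def obsolete_set(flags):
    """Compute the set of obsolete revisions from the flag array."""
    return frozenset(rev for rev, flag in enumerate(flags) if flag)


flags = build_obsflags(6, [1, 4])
# The flags can be written to disk as-is and read back without parsing.
restored = bytearray(bytes(flags))
print(sorted(obsolete_set(restored)))
```

Checking a single revision is a plain index lookup (flags[rev]), and
building the full set is one linear pass over the array, with no marker
parsing involved.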

Finally, half of this series is about introducing the cache base class.
That class introduces a simple cache key logic (based on what we already
did for changelog) and the associated incremental update capability.
Newer caches in this area will likely reuse that logic. So alternatives
will be able to reuse it to build a potential replacement for the cache
in this series.
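
As a rough illustration of what "cache key plus incremental update"
means, here is a toy sketch. It is not the base class from this series:
the class name, the shape of the key (number of items covered plus a
digest of the last covered item) and the cached value (item lengths)
are all assumptions made for the example, applied to a plain
append-only list of byte strings.

```python
import hashlib


def _digest(item):
    return hashlib.sha1(item).hexdigest()


class IncrementalCache:
    """Toy cache over an append-only list of byte strings.

    Hypothetical key: (items covered, digest of the last covered item).
    """

    def __init__(self):
        self.covered = 0
        self.lastdigest = None
        self.values = []  # one derived value per covered item

    def _stillvalid(self, source):
        if self.covered == 0:
            return True  # an empty cache is valid for any source
        if self.covered > len(source):
            return False  # source was truncated (e.g. a strip)
        return _digest(source[self.covered - 1]) == self.lastdigest

    def update(self, source):
        """Bring the cache up to date; return how many items were processed."""
        if not self._stillvalid(source):
            # key mismatch: drop everything and rebuild from scratch
            self.covered = 0
            self.lastdigest = None
            self.values = []
        new = source[self.covered:]
        for item in new:
            self.values.append(len(item))  # stand-in for real work
        self.covered = len(source)
        if source:
            self.lastdigest = _digest(source[-1])
        return len(new)


cache = IncrementalCache()
log = [b"aa", b"bbb"]
print(cache.update(log))   # first run processes everything
log.append(b"c")
print(cache.update(log))   # later runs only process the new tail
```

When the key still matches, only the items appended since the last run
are processed. When it does not (the source was rewritten or
truncated), the cache falls back to a full rebuild.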

To conclude, I'll be happy to see some index for obsolescence markers,
but the current cache has value on its own and is already complete.

Cheers,

-- 
Pierre-Yves David