Proposal for detecting history rewriting on shared repos

Gregory Szorc gregory.szorc at
Wed Feb 12 21:14:12 CST 2014

On 2/12/14, 4:24 PM, Pierre-Yves David wrote:
> On 02/12/2014 04:20 PM, Gregory Szorc wrote:
>> The share extension and workflow is very fragile. If rewriting occurs on
>> the original repository, there's a good chance shared clones of that
>> repo will get corrupted. While there is a giant warning in the output of
>> `hg help share` to warn you about this, Mercurial currently offers
>> little to no assistance to detect and recover from this.
> […]
>> Thoughts?
> The branch cache has logic to detect non-append-only operations on the
> view. The same kind of logic should be applicable here.
> I know that the current cache key generated by branchcache has some
> weaknesses in some corner cases. If you hit them, feel free to improve it.

I didn't realize that code existed!

It sounds like you are proposing storing a hash of some set of revlog 
data (possibly the revs or nodes of the changelog) as the store ID. I 
think this could work. You're essentially proposing a direct test vs an 
indirect one. The indirect one, while faster, relies on code paths being 
complete or else we miss updates.

The current branch cache code is computing a hash over all filtered 
revs. I /think/ that because the branch cache doesn't care about 
filtered revs, it can get away with hashing just the revs and not 
the nodes.

For the share case, I /think/ we would need to hash nodes so history 
rewriting that doesn't change rev count won't fall through a crack.
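
To illustrate the concern, here is a minimal sketch and not Mercurial's actual code. The helpers `rev_count_key`, `node_key` and `fake_node` are hypothetical names made up for this example, and the "nodes" are just SHA-1 digests of labels. It shows that a key built only from rev numbers stays the same when the tip is rewritten in place, while a key built from nodes changes.

```python
import hashlib


def rev_count_key(nodes):
    """Hypothetical key derived only from rev numbers 0..n-1."""
    h = hashlib.sha1()
    for rev in range(len(nodes)):
        h.update(b"%d;" % rev)
    return h.hexdigest()


def node_key(nodes):
    """Hypothetical key derived from the 20-byte node of every rev."""
    h = hashlib.sha1()
    for node in nodes:
        h.update(node)
    return h.hexdigest()


def fake_node(label):
    """Stand-in for a changeset node: SHA-1 of an arbitrary label."""
    return hashlib.sha1(label).digest()


# Three changesets, then the tip is rewritten (e.g. amended) in place:
# the rev count is unchanged but the last node differs.
before = [fake_node(b"a"), fake_node(b"b"), fake_node(b"c")]
after = [fake_node(b"a"), fake_node(b"b"), fake_node(b"c-amended")]

revs_unchanged = rev_count_key(before) == rev_count_key(after)
nodes_changed = node_key(before) != node_key(after)
```

Here `revs_unchanged` and `nodes_changed` both come out true, which is the crack a rev-only key would leave open.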

If a repo has changed since last open, we'll need to scan at least most 
of the changelog index to get all the nodes. That's a few MB of I/O 
on open assuming a repo with over 100k commits (my Firefox repo has a 12 
MB 00changelog.i). It should hopefully be in the page cache. But still - 
that could add up. Is this acceptable? Paging the changelog index 
doesn't happen normally, does it?
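
For a rough sense of the cost, here is a sketch of scanning an index for nodes. It assumes a non-inline revlog index with fixed 64-byte entries and the node stored as the last 20-byte field before padding. The names `ENTRY`, `iter_nodes`, `store_key` and `fake_index` are hypothetical, the index is a synthetic in-memory one, and real code would go through Mercurial's own changelog API rather than parse the file by hand.

```python
import hashlib
import io
import struct

# Assumed entry layout: offset/flags, two lengths, base rev, link rev,
# two parent revs, 20-byte node, 12 bytes of padding (64 bytes total).
ENTRY = struct.Struct(">Qiiiiii20s12x")


def iter_nodes(fh):
    """Yield the node from each fixed-size entry until the data runs out."""
    while True:
        chunk = fh.read(ENTRY.size)
        if len(chunk) < ENTRY.size:
            return
        yield ENTRY.unpack(chunk)[7]


def store_key(fh):
    """Return (entry count, hex SHA-1 over all nodes) for an index stream."""
    h = hashlib.sha1()
    count = 0
    for node in iter_nodes(fh):
        h.update(node)
        count += 1
    return count, h.hexdigest()


def fake_index(labels):
    """Build a synthetic index whose nodes are SHA-1 digests of labels."""
    out = io.BytesIO()
    for rev, label in enumerate(labels):
        node = hashlib.sha1(label).digest()
        out.write(ENTRY.pack(0, 0, 0, rev, rev, rev - 1, -1, node))
    out.seek(0)
    return out


count, key = store_key(fake_index([b"a", b"b", b"c"]))
_, rewritten_key = store_key(fake_index([b"a", b"b", b"c-amended"]))

# Under the 64-byte assumption, a 12 MiB index holds this many entries.
entries_in_12_mib = (12 * 1024 * 1024) // ENTRY.size
```

Under those assumptions, a 12 MB index works out to a bit under 200k entries, all of which would have to be read to rebuild the key.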
