Proposal for detecting history rewriting on shared repos
gregory.szorc at gmail.com
Wed Feb 12 21:14:12 CST 2014
On 2/12/14, 4:24 PM, Pierre-Yves David wrote:
> On 02/12/2014 04:20 PM, Gregory Szorc wrote:
>> The share extension and workflow is very fragile. If rewriting occurs on
>> the original repository, there's a good chance shared clones of that
>> repo will get corrupted. While there is a giant warning in the output of
>> `hg help share` to warn you about this, Mercurial currently offers
>> little to no assistance to detect and recover from this.
> The branch cache has logic to detect non-append-only operations on the
> view. The same kind of logic should be applicable here.
> I know that the current cache key generated by branchcache has some
> weaknesses in some corner cases. If you hit them, feel free to improve it.
I didn't realize that code existed!
It sounds like you are proposing storing a hash of some set of revlog
data (possibly the revs or nodes of the changelog) as the store ID. I
think this could work. You're essentially proposing a direct test vs an
indirect one. The indirect one, while faster, relies on code paths being
complete or else we miss updates.
The current branch cache code computes a hash over all filtered
revs. I /think/ that because the branch cache doesn't care about
filtered revs, it can get away with hashing just the revs and not
the nodes.
For the share case, I /think/ we would need to hash nodes so history
rewriting that doesn't change rev count won't fall through a crack.
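To make the node-hashing idea concrete, here is a minimal sketch. It is not Mercurial code: `fingerprint`, `record`, `was_rewritten` and `fake_node` are hypothetical names, and plain lists of 20-byte values stand in for the changelog's nodes. The recorded state is the rev count plus a SHA-1 over the nodes in rev order. A later open re-hashes only the first `count` nodes, so a pure append passes while a rewrite that keeps the rev count is caught.

```python
import hashlib


def fingerprint(nodes):
    """Hash an ordered sequence of binary node IDs into one hex digest."""
    h = hashlib.sha1()
    for node in nodes:
        h.update(node)
    return h.hexdigest()


def record(nodes):
    """State to remember at last open: (rev count, digest of all nodes)."""
    return len(nodes), fingerprint(nodes)


def was_rewritten(recorded, nodes):
    """True if the history seen at record() time is no longer a prefix."""
    count, digest = recorded
    if len(nodes) < count:
        # Revisions disappeared (for example after a strip).
        return True
    # Only the previously known prefix is compared, so appends are fine.
    return fingerprint(nodes[:count]) != digest


def fake_node(label):
    """Stand-in for a 20-byte changeset node (illustration only)."""
    return hashlib.sha1(label.encode("ascii")).digest()


original = [fake_node("a"), fake_node("b"), fake_node("c")]
appended = original + [fake_node("d")]
# Same rev count as `original`, but the last changeset was replaced.
rewritten = original[:2] + [fake_node("c-amended")]
stripped = original[:2]

state = record(original)
```

With these inputs, `was_rewritten(state, appended)` is false, while the `rewritten` and `stripped` cases are both flagged. Hashing rev numbers alone would not separate `original` from `rewritten`, since both have revs 0 through 2.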
If a repo has changed since it was last opened, we'll need to scan at
least most of the changelog index to get all the nodes. That's a few MB
of I/O on open, assuming a repo with over 100k commits (my Firefox repo
has a 12 MB 00changelog.i). It should hopefully be in the page cache. But
still - that could add up. Is this acceptable? Paging the changelog index
doesn't happen normally, does it?
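A rough back-of-the-envelope check of those numbers, as a sketch. It assumes 64-byte index entries, which is my understanding of the current revlog index format, and a non-inline 00changelog.i. The helper names are made up for illustration.

```python
# Assumption: each revlog index entry occupies 64 bytes, and the index
# file holds only entries (non-inline revlog).
INDEX_ENTRY_SIZE = 64


def index_bytes(rev_count):
    """Approximate bytes of index to read for rev_count revisions."""
    return rev_count * INDEX_ENTRY_SIZE


def revs_in_index(size_bytes):
    """Approximate number of revisions held in an index of this size."""
    return size_bytes // INDEX_ENTRY_SIZE


io_for_100k = index_bytes(100_000)               # 6,400,000 bytes
firefox_revs = revs_in_index(12 * 1024 * 1024)   # 196,608 revisions
```

Under that assumption, 100k commits cost about 6.4 MB of index reads, and a 12 MB index corresponds to roughly 197k revisions.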