[PATCH 4 of 4] mmapindex: set default to 1MB

Boris FELD boris.feld at octobus.net
Mon Jan 7 03:45:32 EST 2019


On 03/01/2019 09:58, Yuya Nishihara wrote:
> On Wed, 2 Jan 2019 23:40:11 +0100, Boris FELD wrote:
>> On 04/12/2018 12:09, Yuya Nishihara wrote:
>>> On Sun, 02 Dec 2018 17:17:43 +0100, Boris Feld wrote:
>>>> # HG changeset patch
>>>> # User Boris Feld <boris.feld at octobus.net>
>>>> # Date 1542949784 -3600
>>>> #      Fri Nov 23 06:09:44 2018 +0100
>>>> # Node ID 9708243274585f9544c70925eb0b0fa0ec7aba4f
>>>> # Parent  0fff68dfbe48d87dce8b8736c0347ed5aa79030e
>>>> # EXP-Topic mmap
>>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 970824327458
>>>> mmapindex: set default to 1MB
>>> Can you check if strip/rollback properly copy the revlog before truncating it?
>>>
>>> If a mmapped revlog is truncated by another process, the mapped memory could
>>> become invalid. In the worst case, the process would be killed by SIGBUS.
>> Hmm, good catch. Processes reading a repository that is being stripped
>> have always been asking for trouble. However, mmap makes it worse by
>> raising a signal instead of just producing a wonky seek. It also
>> introduces new cases where this can happen.
> mmap isn't worse because of SIGBUS, but because the index data can be updated
> after the index length is determined. Before, a single in-memory revlog index
> was at least consistent.
>
>> What shall we do here, I guess our best bet is to intercept these SIGBUS
>> when reading revlog index?
Yes, but that in-memory index would be inconsistent with the data it was
pointing to, and accessing that data would result in an error too. What
is new is that we can get a SIGBUS while accessing the index data itself,
as you point out.
> I don't think it'll be easy to handle SIGBUS properly. And SIGBUS won't always
> be raised. Perhaps, the easiest solution is to copy the revlog index before
> strip/rollback.

I'm worried about the performance impact: we are potentially talking
about hundreds of MB of index data to copy.

Maybe we can keep the current truncation in normal transaction rollback
and use the slower copies for the hg strip command itself (and rewrite)?
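
For illustration, here is a minimal Python sketch of what the slower
copy-based variant could look like. It is not Mercurial's actual strip
code: the name truncate_by_copy and the temporary-file prefix are made up
for this example, and it assumes a POSIX filesystem where renaming over a
file leaves the old inode alive for processes that still have it mapped.

```python
import os
import shutil
import tempfile


def truncate_by_copy(path, new_size):
    """Shrink `path` to `new_size` bytes without truncating the old inode.

    Hypothetical helper: the kept prefix is copied to a temporary file in
    the same directory, which then atomically replaces `path`. Readers that
    already mmapped the old file keep seeing the old, full-length content.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".truncate-")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            remaining = new_size
            while remaining > 0:
                # Copy in chunks of at most 1 MiB.
                chunk = src.read(min(remaining, 1 << 20))
                if not chunk:
                    break
                dst.write(chunk)
                remaining -= len(chunk)
            dst.flush()
            os.fsync(dst.fileno())
        # mkstemp creates the file with mode 0600; keep the original mode.
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The cost is the one discussed above: every byte that is kept gets
rewritten, which is why this would only be acceptable for rare operations
such as strip.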

However, we do need to come up with a solution for mmap, since we would
like to rely on it more and more.

Maybe we can come up with something catching the SIGBUS? Or maybe we
need to never truncate files and keep an alternative way to track the
maximum offset? Any other ideas?
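
To make the "never truncate" idea a bit more concrete, here is a rough
Python sketch. Everything in it is an assumption for the sake of
discussion: the sidecar file, its 8-byte big-endian format and the
function names do not exist in Mercurial. The index file is never
shrunk; a rollback only lowers the recorded logical length, so an
existing mapping never ends up pointing past the end of the file.

```python
import os
import struct

# Hypothetical sidecar format: one unsigned 64-bit big-endian integer.
_LENGTH = struct.Struct(">Q")


def write_logical_length(sidecar_path, length):
    """Atomically record how many bytes of the index are valid."""
    tmp_path = sidecar_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(_LENGTH.pack(length))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, sidecar_path)


def read_logical_length(index_path, sidecar_path):
    """Return the valid length, falling back to the physical file size."""
    physical = os.path.getsize(index_path)
    try:
        with open(sidecar_path, "rb") as f:
            (length,) = _LENGTH.unpack(f.read(_LENGTH.size))
    except FileNotFoundError:
        return physical
    return min(length, physical)


def rollback(sidecar_path, new_length):
    """Roll back by lowering the logical length; the file is not truncated."""
    write_logical_length(sidecar_path, new_length)


def append(index_path, sidecar_path, data):
    """Write `data` at the logical end, overwriting any stale tail bytes."""
    offset = read_logical_length(index_path, sidecar_path)
    with open(index_path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    write_logical_length(sidecar_path, offset + len(data))
```

Note that this only avoids the SIGBUS. A reader that mapped the index
before the rollback can still see the stale tail being overwritten by a
later append, so the consistency issue remains.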

>
> IIRC, the mmap implementation was highly experimental. I don't know if it's
> widely tested other than in FB where strip is never used.
We have been using it internally, and one of our clients has deployed it
too. It results in significant speed and memory improvements.

