[PATCH 4 of 4] mmapindex: set default to 1MB

Yuya Nishihara yuya at tcha.org
Fri Jan 18 07:32:26 EST 2019


On Wed, 16 Jan 2019 15:23:57 +0100, Boris FELD wrote:
> On 11/01/2019 15:46, Yuya Nishihara wrote:
> > On Fri, 11 Jan 2019 09:59:49 +0100, Boris FELD wrote:
> >> On 11/01/2019 01:05, Gregory Szorc wrote:
> >>> On Thu, Jan 10, 2019 at 9:03 AM Boris FELD <boris.feld at octobus.net
> >>> <mailto:boris.feld at octobus.net>> wrote:
> >>>
> >>>     On 08/01/2019 15:41, Yuya Nishihara wrote:
> >>>     > On Mon, 7 Jan 2019 09:45:32 +0100, Boris FELD wrote:
> >>>     >> On 03/01/2019 09:58, Yuya Nishihara wrote:
> >>>     >>> On Wed, 2 Jan 2019 23:40:11 +0100, Boris FELD wrote:
> >>>     >>>> On 04/12/2018 12:09, Yuya Nishihara wrote:
> >>>     >>>>> On Sun, 02 Dec 2018 17:17:43 +0100, Boris Feld wrote:
> >>>     >>>>>> # HG changeset patch
> >>>     >>>>>> # User Boris Feld <boris.feld at octobus.net
> >>>     <mailto:boris.feld at octobus.net>>
> >>>     >>>>>> # Date 1542949784 -3600
> >>>     >>>>>> #      Fri Nov 23 06:09:44 2018 +0100
> >>>     >>>>>> # Node ID 9708243274585f9544c70925eb0b0fa0ec7aba4f
> >>>     >>>>>> # Parent  0fff68dfbe48d87dce8b8736c0347ed5aa79030e
> >>>     >>>>>> # EXP-Topic mmap
> >>>     >>>>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >>>     >>>>>> #              hg pull
> >>>     https://bitbucket.org/octobus/mercurial-devel/ -r 970824327458
> >>>     >>>>>> mmapindex: set default to 1MB
> >>>     >>>>> Can you check if strip/rollback properly copy the revlog
> >>>     before truncating it?
> >>>     >>>>>
> >>>     >>>>> If a mmapped revlog is truncated by another process, the
> >>>     mapped memory could be
> >>>     >>>>> invalid. The worst case, the process would be killed by SIGBUS.
> >>>     >>>> Hum good catch, process reading a repository being stripped
> >>>     have always
> >>>     >>>> been up for troubles. However, mmap makes it worse by raising
> >>>     a signal
> >>>     >>>> instead of just having wonky seek. It also introduces new
> >>>     cases where
> >>>     >>>> this can happen.
> >>>     >>> mmap isn't worse because of SIGBUS, but because the index data
> >>>     can be updated
> >>>     >>> after the index length is determined. Before, a single
> >>>     in-memory revlog index
> >>>     >>> was at least consistent.
> >>>     >>>
> >>>     >>>> What shall we do here, I guess our best bet is to intercept
> >>>     these SIGBUS
> >>>     >>>> when reading revlog index?
> >>>     >> Yes, but it would be inconsistent with the data it was pointing to.
> >>>     >> Access to this data would result in error too.
> >>>     > Correct.
> >>>     >
> >>>     >> The new thing is that we
> >>>     >> can get SIGBUS while accessing the index data themselves, as
> >>>     you are
> >>>     >> pointing out.
> >>>     > Another new thing is that truncated revisions can be seen as
> >>>     valid changesets
> >>>     > of '000...' hash with 0-length text. If the whole (or maybe
> >>>     dozens of)
> >>>     > revisions were truncated, SIGBUS would be raised. In other
> >>>     cases, the truncated
> >>>     > part would probably be read as zeros.
> >>>     >
> >>>     >>> I don't think it'll be easy to handle SIGBUS properly. And
> >>>     SIGBUS won't always
> >>>     >>> be raised. Perhaps, the easiest solution is to copy the revlog
> >>>     index before
> >>>     >>> strip/rollback.
> >>>     >> I'm afraid at the performance impact, we are talking of potentially
> >>>     >> hundreds of MB of index data to be rolled back.
> >>>     >>
> >>>     >> Maybe we can keep the current truncation in normal transaction
> >>>     rollback
> >>>     >> and use the slower copies for the hg strip command itself (and
> >>>     rewrite)?
> >>>     >>
> >>>     >> However, I'm afraid we need to come up with a solution for mmap
> >>>     as it
> >>>     >> would be useful to use it more and more.
> >>>     >>
> >>>     >> Maybe we can come up with something catching the SIGBUS? Or
> >>>     maybe we
> >>>     >> need to never truncate files and keep an alternative way to
> >>>     track the
> >>>     >> maximum offset? Any other ideas?
> >>>     > I've no idea. My point is that catching SIGBUS wouldn't save us
> >>>     from many
> >>>     > possible failures.
> >>>     >
> >>>     >>> IIRC, the mmap implementation was highly experimental. I don't
> >>>     know if it's
> >>>     >>> widely tested other than in FB where strip is never used.
> >>>     >> We have been using it internally, and one of our clients
> >>>     deployed it
> >>>     >> too. It results in significant speed and memory improvement.
> >>>     > Yup. I'm just afraid of enabling it by default.
> >>>
> >>>     Ok, what do you think we should do for 4.9?
> >>>
> >>>
> >>> We're introducing a new repo requirement in 4.9 for sparse revlogs.
> >> We are not introducing a new repo requirement in 4.9 for sparse-revlogs.
> >> The repo-requirements and its guarantees have been introduced in 4.7.
> >>> I believe this strip problem (along with a ton of other problems) goes
> >>> away if we have reader locks on repos.
> >>>
> >>> Since we are already introducing a new repo requirement for new repos,
> >>> we can introduce yet more repo requirements without any additional
> >>> significant user inconvenience.
> >> (except we are not introducing a new requirement)
> >>> I know it is a stretch since we only have a few days left before the
> >>> 4.9 RC, but is anybody will to implement reader locks (or some other
> >>> repo mutual exclusion mechanism that could mitigate the strip
> >>> problem)? Of course, reader locks break the ability to open a
> >>> repository on a read-only filesystem. So, we shouldn't use them
> >>> lightly! But we could do something like write a repo/transaction
> >>> "generation number" into .hg/store and read it to detect repo
> >>> mutations before certain operations. We could potentially also have
> >>> "mmap mode" require writing a file, socket file, etc into .hg/store
> >>> and have any writer refuse to strip if there is an active reader doing
> >>> mmap.
> >>>
> >>> That's a lot of random ideas, I know. But I think that in order to
> >>> enable mmap by default, we need to lock out mutations on mmap'd file
> >>> segments in order to prevent crashes. This seemingly requires a reader
> >>> lock or transaction changes of some form.
> >> We are a week away from the freeze and reader locks are neither written
> >> nor tested. In addition, reader locks solve the strip issue but come
> >> with their own set of troubles, the first one being that they require
> >> write access.
> >>
> >> I don't think it is realistic to use reader lock to solve the mmap
> >> question right now. Having a new repository format to handle better
> >> handle transaction is something we will need, but also a significant
> >> amount of work and testing.
> > Agreed. I don't think reader locks would be implemented correctly in 5 days.
> > It'll require more careful inspection given our storage codes splattered
> > around various modules. I also think mmap is in the same boat. It isn't
> > a drop-in replacement for read()/write(). IMHO, we should carefully design
> > the data structure and the operations so that mmap-ed data are reliably
> > processed.
> >
> >> I would rather focus on having a "graceful" way to catch SIGBUS (and
> >> potential other errors that Yuya pointed out). The mmap function has
> >> been tested in multiple places without troubles, and it raises large
> >> benefit. Just consolidating the potential crashing corner case seems
> >> good enough to get it shipped to all. We can spend more time later to
> >> narrow these corner case if they prove troublesome.
> >>
> >> In all case, we have the easy fallback of disabling it again in case of
> >> need, it does not affect the on-disk format.
> >>
> >> Yuya, what do you think?
> > My vote is to not enable the mmap by default. I'm okay to move the option
> > out of the experimental so it gets more reachable by experienced users.
> > FWIW, it appears not documented.
> 
> Okay, so let's put mmaped indexes back to disabled by default for this
> release. The potential SIGBUS/zero issues pointed by Yuya seems serious
> enough.
> 
> I'm tempted to move the config back to experimental too until we decide
> what final form the feature will take. Given the issue we need to solve
> we will probably have to change some of the read/write/truncates
> patterns and possibly the on-disk format. This will need a new
> requirement. The simplest option is probably to have strip/rollback copy
> the file over, but this will require careful performance testing.
> 
> Moving mmap back to experimental means backing out 74a9f428227e and
> 875d2af8cb4e.

Can you send backout patches?

Matt Harbison found Windows issue. Perhaps, an mmapped revlog file would
be locked while it's open, and couldn't be updated by another process.

https://groups.google.com/d/msg/thg-dev/X7RLiEzSPm0/qiTMjDRNEAAJ


More information about the Mercurial-devel mailing list