This page is primarily intended for Mercurial's developers.
MMap Usage Plan
Main proponents: Pierre-YvesDavid
This is a speculative project and does not represent any firm decisions on future behavior.
Plan to increase mmap usage to access repository data.
Reading large amounts of data from disk is not very efficient: it requires allocating dedicated memory for the data, which takes time and increases memory consumption. We want to make more use of the mmap feature on both Linux and Windows. However, accessing data through mmap comes with various challenges, as a new class of errors can appear.
2. Detailed description
2.1. The Benefits
Using mmap provides a performance boost for large store files:
running on mozilla try:
    $ hg perfnodemap --rev -10:
    ! wall 0.000024 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
    $ hg perfnodemap --rev -10: --config experimental.mmapindexthreshold=1k
    ! wall 0.000008 comb 0.000000 user 0.000000 sys 0.000000 (best of 37681)
It could also simplify the read pattern for delta chains, since we could rely on the OS paging mechanism.
Last but not least, on a busy server, accessing files through mmap means that all processes read their data from the same memory (the OS filesystem cache), greatly reducing memory usage.
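As a rough illustration of the access pattern, here is a minimal sketch using Python's standard `mmap` module. The helper name `read_slice_mmap` is hypothetical and is not part of Mercurial's API; it only shows that a read-only mapping lets a process slice into a file without first reading the whole file into its own buffer.

```python
import mmap


def read_slice_mmap(path, start, end):
    """Return bytes [start, end) of `path` through a read-only mapping.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as fp:
        # Length 0 maps the whole file. ACCESS_READ is available on both
        # Linux and Windows. Mapping an empty file raises ValueError.
        with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing copies only the requested range out of the mapping;
            # the rest of the file is never read into process memory.
            return m[start:end]
```

The pages behind the mapping come from the OS filesystem cache, so several processes mapping the same file would share them rather than each holding a private copy.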
2.2. The Problems
Using mmap comes with a handful of new constraints:
- truncating an mmapped file can lead to SIGBUS being raised on read
- mmapped files are considered open, which can confuse Windows
2.3. The Solutions
I chatted about this issue with someone at Microsoft: using the right flags during file operations can solve the issue on Windows.
The truncation issue is "quite simple": we can no longer truncate any file in place. Any actual truncation requires the full file to be rewritten (using reflink copies could help here). However, truncations are currently quite common in Mercurial. For example, any aborted transaction can lead to file truncation. To make this viable, we need to reduce the number of actual truncations we make. To do so, we can use an extra "pointer" file that indicates the current range of data to read from any mmap-eligible file. That way, some extra data can be kept at the end of a file without impacting normal operation.
A normal read sequence would be:
1: read the xxx.pointer file; it contains two values (<uuid>, <size>)
2: mmap the xxx.data file
3: check that the <uuid> at the start of the xxx.data file is the same; if not, go to 1
4: bound any access to xxx.data to <size>
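The read sequence could be sketched as follows. This is only an illustration under stated assumptions: the on-disk layout (a 16-byte `<uuid>` followed by a big-endian 64-bit `<size>` in the pointer file, and `<size>` counted as an absolute offset that includes the uuid header), the retry limit, and the names `POINTER`, `UUID_SIZE` and `open_bounded` are all hypothetical and not decided by this plan.

```python
import mmap
import struct

UUID_SIZE = 16                    # assumption: uuid stored as 16 raw bytes
POINTER = struct.Struct(">16sQ")  # assumption: (<uuid>, <size>) layout


def open_bounded(base, retries=10):
    """Return a read-only view of `base`.data limited to <size> bytes."""
    for _ in range(retries):
        # step 1: read (<uuid>, <size>) from the pointer file
        with open(base + ".pointer", "rb") as fp:
            uuid, size = POINTER.unpack(fp.read(POINTER.size))
        # step 2: mmap the data file (the mapping outlives the file object)
        with open(base + ".data", "rb") as fp:
            m = mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
        # step 3: the uuid at the start of the data file must match.
        # The length check is an extra sanity check, not part of the plan.
        if m[:UUID_SIZE] == uuid and len(m) >= size:
            # step 4: the slice makes any access past <size> impossible
            return memoryview(m)[:size]
        m.close()  # a truncation raced with us: start over from step 1
    raise RuntimeError("pointer and data files never agreed on a uuid")
```

Returning a `memoryview` slice is one way to enforce the bound: data written past `<size>` stays in the file but is not reachable through the view.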
A write sequence would be:
1: read the `xxx.pointer` file; it contains two values (`<uuid>`, `<size>`)
2: open `xxx.data` (check `<uuid>` for good measure, the repository could be corrupt)
3: write data starting at `<size>`
4: flush/close `xxx.data`
5: write `xxx.pointer` as (`<uuid>`, `<new size>`), where `<new size>` is `<size>` plus the length of the data written
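A minimal sketch of the write sequence, under the same kind of assumptions: the pointer layout (16-byte uuid plus big-endian 64-bit size, with the size counted from the start of the file), the function name `append_data`, and the choice to update the pointer through a temporary file and a rename are all hypothetical. The caller is assumed to hold the usual repository lock.

```python
import os
import struct

UUID_SIZE = 16                    # assumption: uuid stored as 16 raw bytes
POINTER = struct.Struct(">16sQ")  # assumption: (<uuid>, <size>) layout


def append_data(base, payload):
    """Append `payload` to `base`.data and publish the new size."""
    # step 1: read (<uuid>, <size>) from the pointer file
    with open(base + ".pointer", "rb") as fp:
        uuid, size = POINTER.unpack(fp.read(POINTER.size))
    # step 2: open the data file and check its uuid for good measure
    with open(base + ".data", "r+b") as fp:
        if fp.read(UUID_SIZE) != uuid:
            raise RuntimeError("uuid mismatch: repository may be corrupt")
        # step 3: write at <size>, overwriting any leftover bytes there
        # without ever truncating the file
        fp.seek(size)
        fp.write(payload)
        # step 4: make sure the data is on disk before it becomes visible
        fp.flush()
        os.fsync(fp.fileno())
    # step 5: publish the new size; readers ignore the new bytes until now
    new_size = size + len(payload)
    tmp = base + ".pointer.tmp"
    with open(tmp, "wb") as fp:
        fp.write(POINTER.pack(uuid, new_size))
        fp.flush()
        os.fsync(fp.fileno())
    os.replace(tmp, base + ".pointer")
    return new_size
```

If the process dies between steps 4 and 5, the pointer still holds the old `<size>`, so the partially written data is simply unreachable and no truncation is needed.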
A truncation sequence would be:
1: read the `xxx.pointer` file; it contains two values (`<uuid>`, `<size>`)
2: open `xxx.data` (check `<uuid>` for good measure, the repository could be corrupt)
3: open `xxx.data.tmp` and write `<newuuid>`
4: write the content of `xxx.data` from `sizeof(<uuid>)` to `<truncate>` into `xxx.data.tmp` (or copy then truncate, if reflink is available?)
5: flush/close `xxx.data.tmp`
6: write `xxx.pointer.tmp` as (`<newuuid>`, `<truncate>`)
7: rename `xxx.pointer.tmp` to `xxx.pointer` and `xxx.data.tmp` to `xxx.data`
Note: if we are concerned about the race window when renaming both xxx.pointer.tmp and xxx.data.tmp, we could name data files using the form xxx.<uuid>.data, so that only the pointer file needs to be renamed.
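The truncation sequence could be sketched as below. Again this is an illustration, not a settled design: the pointer layout (16-byte uuid plus big-endian 64-bit size), the use of `uuid.uuid4()` to generate `<newuuid>`, the plain chunked copy (no reflink), and the name `truncate_data` are assumptions. `<truncate>` is taken to be an absolute offset in the data file, uuid header included.

```python
import os
import struct
import uuid as uuidlib

UUID_SIZE = 16                    # assumption: uuid stored as 16 raw bytes
POINTER = struct.Struct(">16sQ")  # assumption: (<uuid>, <size>) layout


def truncate_data(base, truncate):
    """Rewrite `base`.data so that it ends at offset `truncate`."""
    # step 1: read (<uuid>, <size>) from the pointer file
    with open(base + ".pointer", "rb") as fp:
        old_uuid, size = POINTER.unpack(fp.read(POINTER.size))
    if not UUID_SIZE <= truncate <= size:
        raise ValueError("truncation point outside of the valid data")
    new_uuid = uuidlib.uuid4().bytes
    with open(base + ".data", "rb") as src, \
            open(base + ".data.tmp", "wb") as dst:
        # step 2: check the uuid for good measure
        if src.read(UUID_SIZE) != old_uuid:
            raise RuntimeError("uuid mismatch: repository may be corrupt")
        dst.write(new_uuid)                      # step 3
        remaining = truncate - UUID_SIZE         # step 4: copy kept data
        while remaining > 0:
            chunk = src.read(min(remaining, 1 << 20))
            if not chunk:
                raise RuntimeError("data file shorter than <size>")
            dst.write(chunk)
            remaining -= len(chunk)
        dst.flush()                              # step 5
        os.fsync(dst.fileno())
    with open(base + ".pointer.tmp", "wb") as fp:  # step 6
        fp.write(POINTER.pack(new_uuid, truncate))
        fp.flush()
        os.fsync(fp.fileno())
    # step 7: a reader caught between the two renames sees mismatched
    # uuids and retries its read sequence
    os.replace(base + ".pointer.tmp", base + ".pointer")
    os.replace(base + ".data.tmp", base + ".data")
    return new_uuid
```

Because the old data file is replaced rather than modified, processes that already mapped it keep a valid mapping of the old content and never observe a shrinking file.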
All of this would have to be gated behind a new repository requirement.
- introduce a new requirement
- introduce the .pointer/<uuid>.data mechanism
- update the file access call to use the proper API for all files
- Introduce an API to access data using mmap instead of actual reading
- start using the new mechanism for all relevant files
  - changelog index
  - manifest index
  - various cache/index ?
  - data files ?
  - filelog ?
- Windows support:
  - make sure the proper flags are passed for all file operations
  - enable the feature on Windows