[PATCH 3 of 3] help: document known deficiencies with revlog format

Sat Mar 11 20:47:21 EST 2017

At Mon, 27 Feb 2017 12:54:02 -0800,
Gregory Szorc wrote:
> 
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1488228075 28800
> #      Mon Feb 27 12:41:15 2017 -0800
> # Node ID 4ecfe89fa8c1477ab54fa8ef271589a8b20c5497
> # Parent  33cdab923127f930357032db7eb6c4c9441e73ae
> help: document known deficiencies with revlog format
> 

[snip]

> +Integers stored as big-endian
> +   Integers in revlog index entries are stored as big-endian. Most systems
> +   that Mercurial runs on are little-endian.
> +
> +   Having a byte ordering mismatch between index entries and the running
> +   machine means that integers must be converted before they can be
> +   used. This means that a revlog implementation in a systems language on
> +   little-endian machines can't simply cast a memory address as a struct to
> +   read index entries. This adds overhead in the form of data conversion
> +   and extra allocations.
> +
> +   Changing revlog data to little-endian would facilitate 0-copy operations
> +   on most machines that Mercurial runs on and would make things faster.

Just FYI, since I don't have any actual measurements of the benefit
:-)

ZFS provides "adaptive endianness": it stores the "byte-order to be
used" in each data block, to balance performance against
portability.

By default, a data block is written out in the "host" byte order.
Therefore, a host (or any host using the same byte order) can read
data blocks it wrote without any byte-swapping.

When the storage is connected to another host that uses a different
byte order, that host can still read the data blocks correctly by
using the "byte-order to be used" information (byte-swapping is
needed, though).
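
The idea can be sketched in Python as follows. This is only an
illustration of the scheme under my own assumptions: the one-byte flag,
its values, and the function names write_block/read_block are
hypothetical, and ZFS's actual on-disk encoding is different.

```python
import struct
import sys

# Hypothetical one-byte flag values; ZFS's real encoding differs.
FLAG_LITTLE = 0
FLAG_BIG = 1

_PREFIX = {"little": "<", "big": ">"}
_FLAG = {"little": FLAG_LITTLE, "big": FLAG_BIG}
_ORDER = {FLAG_LITTLE: "little", FLAG_BIG: "big"}


def write_block(values, byteorder=sys.byteorder):
    """Pack 32-bit unsigned ints in the writer's byte order.

    The first byte of the block records which order was used.
    """
    body = struct.pack(f"{_PREFIX[byteorder]}{len(values)}I", *values)
    return bytes([_FLAG[byteorder]]) + body


def read_block(block, host=sys.byteorder):
    """Return (values, swapped).

    `swapped` is True when the recorded order differs from the
    reader's, i.e. when byte-swapping was actually required.
    """
    order = _ORDER[block[0]]
    count = (len(block) - 1) // 4
    values = struct.unpack(f"{_PREFIX[order]}{count}I", block[1:])
    return list(values), order != host
```

A block written and read on hosts with the same byte order comes back
with swapped == False; only a reader with the other byte order pays
for the conversion.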

-- 
----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp