
Revlog

A new revlog format was introduced for Mercurial 0.9: see RevlogNG.

A revlog, for example .hg/store/data/somefile.d, is Mercurial's most important data structure and represents all versions of a file in a repository. Each version is stored either compressed in its entirety or as a compressed binary delta (difference) relative to the preceding version in the revlog. Whether to store a full version is decided by how much data would be needed to reconstruct the file. This ensures that Mercurial never needs to read huge amounts of data to reconstruct any version of a file, no matter how many versions are stored.
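The trade-off described above can be sketched as follows. This is a minimal illustration, not Mercurial's actual code: the function name `choose_storage`, the `max_ratio` parameter and its default of 2.0 are assumptions made for the example, as is the use of zlib for compression.

```python
import zlib


def choose_storage(new_text: bytes, delta: bytes, chain_bytes: int,
                   max_ratio: float = 2.0):
    """Decide whether to store a full version or a delta.

    new_text    -- the complete new version of the file
    delta       -- a binary delta against the preceding version
    chain_bytes -- bytes that must already be read to rebuild the
                   preceding version (the current delta chain)
    max_ratio   -- hypothetical threshold: store a full version once the
                   chain would exceed max_ratio times the file size
    """
    compressed_delta = zlib.compress(delta)
    if chain_bytes + len(compressed_delta) > max_ratio * len(new_text):
        # The chain has grown too long: start a new chain with a full copy.
        return "full", zlib.compress(new_text)
    return "delta", compressed_delta
```

Bounding the chain by a multiple of the file size is what keeps the cost of reconstruction proportional to the size of the file rather than to the number of stored versions.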

Reconstruction requires only a single read, provided Mercurial knows where to start reading and how much to read. Each revlog therefore has an index, for example .hg/store/data/somefile.i, containing one fixed-size record for each version. Each record contains:

  • the nodeid of the file version
  • the nodeids of its parents
  • the length of the revision data
  • the offset in the revlog saying where to begin reading
  • the base of the delta chain
  • the linkrev pointing to the corresponding changeset
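Because every record has the same size, the record for revision n can be found by plain arithmetic, without scanning the index. The sketch below illustrates this with the fields listed above. The exact byte layout (four 32-bit big-endian integers followed by three 20-byte hashes) and the names `RECORD`, `IndexEntry` and `read_entry` are assumptions chosen for the example, not a statement of the on-disk format.

```python
import struct
from typing import NamedTuple

# Assumed layout: offset, length, base, linkrev as 32-bit big-endian
# integers, then p1, p2 and nodeid as 20-byte hashes.
RECORD = struct.Struct(">4l20s20s20s")  # 76 bytes per record with this layout


class IndexEntry(NamedTuple):
    offset: int    # where to begin reading in the revlog
    length: int    # length of the revision data
    base: int      # base of the delta chain
    linkrev: int   # revision number of the corresponding changeset
    p1: bytes      # nodeid of the first parent
    p2: bytes      # nodeid of the second parent
    nodeid: bytes  # nodeid of this file version


def read_entry(index_bytes: bytes, rev: int) -> IndexEntry:
    """Fetch the record for `rev` by computing its position directly."""
    start = rev * RECORD.size
    return IndexEntry(*RECORD.unpack_from(index_bytes, start))
```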

Here's an example:

$ hg debugindex .hg/store/data/README.i
   rev    offset  length   base linkrev nodeid       p1           p2
     0         0    1125      0       0 80b6e76643dc 000000000000 000000000000
     1      1125     268      0       1 d6f755337615 80b6e76643dc 000000000000
     2      1393      49      0      27 96d3ee574f69 d6f755337615 000000000000
     3      1442     349      0      63 8e5de3bb5d58 96d3ee574f69 000000000000
     4      1791      55      0      67 ed9a629889be 8e5de3bb5d58 000000000000
     5      1846     100      0      81 b7ac2f914f9b ed9a629889be 000000000000
     6      1946     405      0     160 1d528b9318aa b7ac2f914f9b 000000000000
     7      2351      39      0     176 2a612f851a95 1d528b9318aa 000000000000
     8      2390       0      0     178 95fdb2f5e08c 2a612f851a95 2a612f851a95
     9      2390     127      0     179 fc5dc12f851b 95fdb2f5e08c 000000000000
    10      2517       0      0     182 24104c3ccac4 fc5dc12f851b fc5dc12f851b
    11      2517     470      0     204 cc286a25cf37 24104c3ccac4 000000000000
    12      2987     346      0     205 ffe871632da6 cc286a25cf37 000000000000
...
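One property visible in the listing is that each offset equals the previous offset plus the previous length, so the revision data is packed back to back. A small check of that property, using the first rows of the output above (the helper names `parse` and `offsets_are_contiguous` are made up for this sketch):

```python
# First rows of the debugindex output shown above, without the header line.
LISTING = """\
     0         0    1125      0       0 80b6e76643dc 000000000000 000000000000
     1      1125     268      0       1 d6f755337615 80b6e76643dc 000000000000
     2      1393      49      0      27 96d3ee574f69 d6f755337615 000000000000
     3      1442     349      0      63 8e5de3bb5d58 96d3ee574f69 000000000000
     4      1791      55      0      67 ed9a629889be 8e5de3bb5d58 000000000000
"""


def parse(listing):
    """Turn each line into (rev, offset, length, base, linkrev, nodeid, p1, p2)."""
    rows = []
    for line in listing.splitlines():
        rev, offset, length, base, linkrev, node, p1, p2 = line.split()
        rows.append((int(rev), int(offset), int(length), int(base),
                     int(linkrev), node, p1, p2))
    return rows


def offsets_are_contiguous(rows):
    """True if every entry starts exactly where the previous one ends."""
    return all(prev[1] + prev[2] == cur[1]
               for prev, cur in zip(rows, rows[1:]))
```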

With one read of the index to fetch the record and then one read of the revlog, Mercurial can reconstruct any version of a file in time proportional to the file size.
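A rough sketch of that two-step lookup is shown below. It assumes a simplified index of (offset, length, base) tuples, zlib-compressed chunks, and a caller-supplied `apply_delta` function standing in for Mercurial's binary delta format. All of these are assumptions for illustration, as are the names `read_span` and `reconstruct`.

```python
import zlib


def read_span(index, rev):
    """Return (start, size) of the one contiguous region needed for `rev`.

    `index` is a list of (offset, length, base) tuples, one per revision.
    """
    offset, length, base = index[rev]
    start = index[base][0]
    return start, offset + length - start


def reconstruct(data, index, rev, apply_delta):
    """Rebuild revision `rev` from a single slice of the revlog data."""
    start, size = read_span(index, rev)
    blob = data[start:start + size]  # the single read of the revlog
    base = index[rev][2]

    chunks = []
    for r in range(base, rev + 1):
        off, length, _ = index[r]
        raw = blob[off - start:off - start + length]
        # Zero-length entries carry no data; treat them as empty chunks.
        chunks.append(zlib.decompress(raw) if raw else b"")

    text = chunks[0]            # the full version at the chain base
    for delta in chunks[1:]:    # then each delta, in order
        text = apply_delta(text, delta)
    return text
```

For the listing above, where every entry has base 0, rebuilding revision 3 would read the 1791 bytes starting at offset 0.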

Revlogs and their indices are append-only, so adding a new version requires only O(1) seeks.
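The append-only property can be illustrated with two file-like objects. This sketch uses a deliberately reduced, hypothetical index record (offset, length, base only) and an invented function name `append_revision`. It shows the access pattern, not Mercurial's real write path.

```python
import io
import struct

# Hypothetical reduced record: offset, length, base.
RECORD = struct.Struct(">3l")


def append_revision(data_file, index_file, chunk: bytes, base: int) -> int:
    """Append one revision and return its revision number.

    Existing bytes are never rewritten: one seek to the end of each
    file, followed by one write to each.
    """
    data_file.seek(0, io.SEEK_END)
    offset = data_file.tell()
    data_file.write(chunk)

    index_file.seek(0, io.SEEK_END)
    index_file.write(RECORD.pack(offset, len(chunk), base))
    return index_file.tell() // RECORD.size - 1
```

Since earlier records are never touched, a reader that has already loaded part of the index can keep using it while new revisions are added.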

Revlogs are also used for manifests and changesets.

The contrib directory in the Mercurial sources contains the Python scripts dumprevlog and undumprevlog (see changeset ec5d77eb3431).

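References

  • "Towards a Better SCM: Revlog and Mercurial", Matt Mackall (PDF: attachment Presentations/ols-mercurial-paper.pdf)
  • "Behind the scenes" (http://hgbook.red-bean.com/hgbookch4.html#x8-640004) in "Distributed revision control with Mercurial" (http://hgbook.red-bean.com/hgbook.html), Bryan O’Sullivan

See also: Presentations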


CategoryInternals

Revlog (last edited 2012-02-24 16:15:45 by WagnerBruna)