Differences between revisions 1 and 34 (spanning 33 versions)
Revision 1 as of 2011-05-11 12:24:36
Size: 2108
Editor: cyanite
Comment: initial version
Revision 34 as of 2018-02-10 00:05:58
Size: 2056
Editor: AviKelman
Comment:
Line 1: Line 1:
This page describes the second iteration of the bundle format, tentatively called HG19 (since we hope to include it with Mercurial 1.9).
{{{#!wiki caution
Line 3: Line 3:
Bundles consist of the following sections:
 * A bundle header, describing the version and features present in the data.
 * Changegroup sections for the changelog, the manifest and each relevant filelog.
 * Optionally, a footer containing an index for more efficient random access?
This information was derived by reverse engineering. Some details may be incomplete. Hopefully someone with intimate familiarity with the code can improve it.}}}
Line 8: Line 5:
<<TableOfContents>>
The v2 bundle file format is in practice quite similar to v1 (see BundleFormat), in that it comprises a file header followed by a changegroup, but it differs in a few significant ways.
Line 10: Line 7:
== Sections ==
=== Header ===
The bundle header has the following format:
== Practical differences from v1 bundles ==
 * The file has a more verbose multi-stage ASCII header containing key:value pairs. (more below)
 * Zstandard compression (the new default) is also supported.
 * Uses version 2 deltagroup headers instead of version 1. (see the spec at [[Topic:internals.changegroups|help internals.changegroups]])
 * Everything after the header is shredded into N-byte chunks after it is assembled (N is a parameter defined in the source code).
Line 14: Line 13:
|| '''Offset''' || '''Size''' || '''Type''' || '''Description''' ||
||<)> 0 ||<)> 4 || string || Bundle format version. Always contains "HG19". ||
||<)> 4 ||<)> 2 || string || Compression type. Either "BZ", "GZ" or "UN". ||
||<)> 6 ||<)> a || string || Bundle features (or requirements). A list of comma separated words (lowercase ASCII) describing features present in the bundle. The string is terminated by a newline character. ||
== Reading the header ==
Line 19: Line 15:
=== Changegroup sections ===
The changegroup sections have the following format:
=== stage 1 ===
|| 'HG20' || Compression Chunk || rest of file ||
Line 22: Line 18:
|| '''Offset''' || '''Size''' || '''Type''' || '''Description''' ||
||<)> 0 ||<)> 4 || uint || Number of changelog entries. ||
||<)> 4 ||<)> b || group || Changegroup containing changelog entries. ||
||<)> b + 4 ||<)> 4 || uint || Number of manifest entries. ||
||<)> b + 8 ||<)> c || group || Changegroup containing manifest entries. ||
||<)> b + c + 8 ||<)> 4 || uint || Number of filelog changegroups (note: not the number of entries). ||
The Compression Chunk is either null or contains the ASCII string 'Compression=XX', where XX is a code indicating which decompression to apply to the rest of the file.
Line 29: Line 20:
Then, for each filelog, the following:
=== stage 2 ===
|||| rest of file from stage 1 ||
|| Parameters Chunk || shredded changegroup (and possibly other sections?) ||
Line 31: Line 24:
|| '''Offset''' || '''Size''' || '''Type''' || '''Description''' ||
||<)> 0 ||<)> 4 || uint || Number of filelog entries. ||
||<)> 4 ||<)> 4 || uint || Length of filename. ||
||<)> 8 ||<)> d || string || Filename (unterminated). ||
||<)> d + 8 ||<)> e || group || Changegroup containing filelog entries. ||
The Parameters Chunk contains (among possibly other things?) an indication that the file contains a changegroup (a one-byte length, \x0b, followed by the ASCII name 'CHANGEGROUP'), a null chunk, and then a nested sequence describing two parameter categories. The nested sequence contains, first, the number of key:value pairs in the first category, then the number of pairs in the second category, then, for each pair, the length of its ASCII key followed by the length of its ASCII value. The keys and values themselves follow in the same order.
Line 37: Line 26:
== Changegroups ==
...
Example Parameters Chunk:
|| chunk length |||| description of contents || #section1 parameters || #section2 parameters || len(key1),len(value1) || len(key2),len(value2) || key1 || value1 || key2 || value2||
|| 4 bytes || \x0bCHANGEGROUP || 4 bytes null || \x01 || \x01 || \x07\x02 || \t\x01 || version || 02 || nbchanges || 7 ||

This information was derived by reverse engineering. Some details may be incomplete. Hopefully someone with intimate familiarity with the code can improve it.

The v2 bundle file format is in practice quite similar to v1 (see BundleFormat), in that it comprises a file header followed by a changegroup, but it differs in a few significant ways.

Practical differences from v1 bundles

  • The file has a more verbose multi-stage ASCII header containing key:value pairs. (more below)
  • Zstandard compression (the new default) is also supported.
  • Uses version 2 deltagroup headers instead of version 1. (see the spec at help internals.changegroups)

  • Everything after the header is shredded into N-byte chunks after it is assembled (N is a parameter defined in the source code).
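
The chunking in the last bullet can be pictured with a minimal Python sketch. It assumes that each chunk is preceded by a 4-byte big-endian length and that a zero-length chunk ends the sequence. That framing, the helper names `shred` and `unshred`, and the value of `CHUNK_SIZE` are assumptions made for illustration; none of them is taken from the Mercurial source.

```python
import struct

CHUNK_SIZE = 4096  # hypothetical N; the real value is defined in the Mercurial source


def shred(payload: bytes, n: int = CHUNK_SIZE) -> bytes:
    """Split payload into length-prefixed chunks of at most n bytes."""
    out = []
    for start in range(0, len(payload), n):
        piece = payload[start:start + n]
        out.append(struct.pack(">i", len(piece)) + piece)
    out.append(struct.pack(">i", 0))  # assumed terminator: an empty chunk
    return b"".join(out)


def unshred(data: bytes) -> bytes:
    """Reassemble the original payload from length-prefixed chunks."""
    pieces = []
    pos = 0
    while True:
        (size,) = struct.unpack_from(">i", data, pos)
        pos += 4
        if size == 0:  # assumed end-of-sequence marker
            break
        pieces.append(data[pos:pos + size])
        pos += size
    return b"".join(pieces)
```

Under these assumptions, `unshred(shred(payload))` returns the original payload for any chunk size, so a reader only needs the length prefixes and does not need to know N in advance.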

Reading the header

stage 1

|| 'HG20' || Compression Chunk || rest of file ||

The Compression Chunk is either null or contains the ASCII string 'Compression=XX', where XX is a code indicating which decompression to apply to the rest of the file.
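
A minimal Python sketch of reading stage 1 might look as follows. It assumes the Compression Chunk begins with a 4-byte big-endian length, with a length of zero being the "null" case. The function name `read_stage1` is hypothetical, and 'BZ' in the usage note is only a sample code.

```python
import io
import struct


def read_stage1(stream):
    """Return (compression_code, rest_of_file) for an 'HG20' bundle."""
    magic = stream.read(4)
    if magic != b"HG20":
        raise ValueError("not a v2 bundle: %r" % magic)
    # Assumption: the Compression Chunk starts with a 4-byte big-endian
    # length; a length of zero is the "null" case.
    (size,) = struct.unpack(">i", stream.read(4))
    compression = None
    if size:
        key, _, value = stream.read(size).decode("ascii").partition("=")
        if key == "Compression":
            compression = value
    # Everything left is the input to stage 2 (still compressed if a
    # compression code was given).
    return compression, stream.read()
```

For instance, `read_stage1(io.BytesIO(b"HG20\x00\x00\x00\x0eCompression=BZpayload"))` would yield the code `"BZ"` and the remaining bytes `b"payload"`, while a null chunk yields `None` as the code.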

stage 2

|||| rest of file from stage 1 ||
|| Parameters Chunk || shredded changegroup (and possibly other sections?) ||

The Parameters Chunk contains (among possibly other things?) an indication that the file contains a changegroup (a one-byte length, \x0b, followed by the ASCII name 'CHANGEGROUP'), a null chunk, and then a nested sequence describing two parameter categories. The nested sequence contains, first, the number of key:value pairs in the first category, then the number of pairs in the second category, then, for each pair, the length of its ASCII key followed by the length of its ASCII value. The keys and values themselves follow in the same order.

Example Parameters Chunk:

|| chunk length |||| description of contents || #section1 parameters || #section2 parameters || len(key1),len(value1) || len(key2),len(value2) || key1 || value1 || key2 || value2 ||
|| 4 bytes || \x0bCHANGEGROUP || 4 bytes null || \x01 || \x01 || \x07\x02 || \t\x01 || version || 02 || nbchanges || 7 ||
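
The example above can be decoded with a short Python sketch. It assumes the 4-byte chunk length is a big-endian integer counting the bytes that follow it. The function name `parse_parameters_chunk` is hypothetical, and the four null bytes are skipped because their meaning is not established here.

```python
import struct


def parse_parameters_chunk(data: bytes):
    """Parse one Parameters Chunk laid out as in the example table."""
    # Assumption: big-endian 4-byte length covering the rest of the chunk.
    (chunk_len,) = struct.unpack_from(">i", data, 0)
    body = data[4:4 + chunk_len]
    pos = 0
    name_len = body[pos]  # e.g. 0x0b == len("CHANGEGROUP")
    pos += 1
    name = body[pos:pos + name_len].decode("ascii")
    pos += name_len
    pos += 4  # skip the 4 null bytes
    n_first, n_second = body[pos], body[pos + 1]  # pairs per category
    pos += 2
    total = n_first + n_second
    # One (key length, value length) byte pair per parameter.
    sizes = [(body[pos + 2 * i], body[pos + 2 * i + 1]) for i in range(total)]
    pos += 2 * total
    pairs = []
    for key_len, value_len in sizes:
        key = body[pos:pos + key_len].decode("ascii")
        pos += key_len
        value = body[pos:pos + value_len].decode("ascii")
        pos += value_len
        pairs.append((key, value))
    return name, dict(pairs[:n_first]), dict(pairs[n_first:])


# The bytes of the example row, with the length prefix computed from the body.
example_body = (
    b"\x0bCHANGEGROUP" + b"\x00\x00\x00\x00" + b"\x01\x01"
    + b"\x07\x02" + b"\t\x01" + b"version" + b"02" + b"nbchanges" + b"7"
)
example = struct.pack(">i", len(example_body)) + example_body
print(parse_parameters_chunk(example))
# prints ('CHANGEGROUP', {'version': '02'}, {'nbchanges': '7'})
```

Note that `\t` is byte 0x09, the length of "nbchanges", which is why the second length pair appears as `\t\x01` in the example.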

BundleFormat2 (last edited 2018-02-10 00:05:58 by AviKelman)