This page describes the second iteration of the bundle format, tentatively called HG19 (since we hoped to include it with Mercurial 1.9).

Bundles consist of the following sections:

Nomenclature

For the purposes of this document, the following (otherwise often quite ambiguous) terms are used:

bundle

headerless bundle

chunk

changegroup

Sections

For each section, offsets are given relative to the beginning of the section. Fields of variable length are given the symbolic sizes a, b, c, etc.

The format of the bundle header is described below. Traditionally, the first part of the header (the only part in the existing format) is often left out in internal processing and over the wire. This part consists of the first 6 bytes, up to and including the compression type. In such cases, the bundle is always considered to be uncompressed. It has not yet been decided how this will be handled in the new bundle format.

| Offset | Size | Type | Description |
|---|---|---|---|
| 0 | 4 | string | Bundle format version. Always contains "HG19". |
| 4 | 2 | string | Compression type. Either "BZ", "GZ" or "UN". |
| 6 | 4 | uint | Length of feature string, in bytes. |
| 10 | a | string | Bundle features (or requirements). A list of newline separated strings describing features present in the bundle (unterminated). |
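
The header layout above can be read with a few lines of Python. This is only a sketch under stated assumptions: the page does not specify byte order or text encoding, so big-endian integers (as in the existing Mercurial formats) and ASCII feature names are assumed, and `parse_header` is a hypothetical helper name, not an existing Mercurial function.

```python
import struct

def parse_header(data: bytes):
    """Return (version, compression, features, header_length) for an HG19 header."""
    if len(data) < 10:
        raise ValueError("header truncated")
    version, compression = data[0:4], data[4:6]
    if version != b"HG19":
        raise ValueError("unexpected bundle version: %r" % version)
    if compression not in (b"BZ", b"GZ", b"UN"):
        raise ValueError("unknown compression type: %r" % compression)
    # Assumption: the uint at offset 6 is a big-endian unsigned 32-bit integer.
    (feature_len,) = struct.unpack(">I", data[6:10])
    raw = data[10:10 + feature_len]
    if len(raw) != feature_len:
        raise ValueError("feature string truncated")
    # The feature string is unterminated, so an empty string means no features.
    # Assumption: feature names are ASCII.
    features = raw.decode("ascii").split("\n") if raw else []
    return version, compression, features, 10 + feature_len
```

For example, a header carrying the two made-up feature names `foo` and `bar` is 10 + 7 = 17 bytes long, and the sections that follow start at that offset.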

Changegroups section

The changegroups section has the following format:

| Offset | Size | Type | Description |
|---|---|---|---|
| 0 | 4 | uint | Number of changelog entries. |
| 4 | b | group | Changegroup containing changelog entries. |
| b + 4 | 4 | uint | Number of manifest entries. |
| b + 8 | c | group | Changegroup containing manifest entries. |
| b + c + 8 | 4 | uint | Number of filelog changegroups (note: not the number of entries). |
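
To make the offset arithmetic concrete, the following sketch computes where each field of the changegroups section starts once the group lengths b and c are known. The function name and the dictionary keys are illustrative only, and the `first_filelog` entry assumes the filelog records follow immediately after the 4-byte count.

```python
def section_offsets(b: int, c: int) -> dict:
    """Field offsets within the changegroups section, given group lengths b and c."""
    return {
        "changelog_count": 0,         # 4-byte uint
        "changelog_group": 4,         # b bytes
        "manifest_count": b + 4,      # 4-byte uint
        "manifest_group": b + 8,      # c bytes
        "filelog_count": b + c + 8,   # 4-byte uint
        # Assumption: per-filelog records start right after the count.
        "first_filelog": b + c + 12,
    }
```

With b = 100 and c = 50, the manifest count sits at offset 104 and the filelog count at offset 158.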

Then, for each filelog, the following:

| Offset | Size | Type | Description |
|---|---|---|---|
| 0 | 4 | uint | Number of filelog entries. |
| 4 | 4 | uint | Length of filename, in bytes. |
| 8 | d | string | Filename (unterminated). |
| d + 8 | e | group | Changegroup containing filelog entries. |
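
A per-filelog record could be decoded along these lines. Again this is a sketch, not a reference implementation: big-endian integers are assumed, the filename is kept as raw bytes because no encoding is specified, and `parse_filelog_prefix` is a hypothetical name.

```python
import struct

def parse_filelog_prefix(data: bytes, offset: int = 0):
    """Return (entry_count, filename, group_offset) for one filelog record."""
    # Assumption: both uints are big-endian unsigned 32-bit integers.
    entries, name_len = struct.unpack_from(">II", data, offset)
    name_start = offset + 8
    filename = data[name_start:name_start + name_len]
    if len(filename) != name_len:
        raise ValueError("filename truncated")
    # The changegroup starts at d + 8, relative to the start of the record.
    return entries, filename, name_start + name_len
```

The returned group offset is where the first chunk of that filelog's changegroup begins; the entry count says how many chunks to read from there.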

The changegroup format is described below.

Changegroups

A changegroup consists of a number of chunks describing revisions. Each chunk has the following format:

| Offset | Size | Type | Description |
|---|---|---|---|
| 0 | 4 | uint | Total length of the chunk, including the 104-byte header described here. |
| 4 | 20 | sha-1 hash | Node of this revision. |
| 24 | 20 | sha-1 hash | First parent of this revision. |
| 44 | 20 | sha-1 hash | Second parent of this revision (or 0-bytes). |
| 64 | 20 | sha-1 hash | Link pointer back to the changelog. |
| 84 | 20 | sha-1 hash | Parent for the delta (or 0-bytes for a snapshot). |
| 104 | f | data | Delta or full version snapshot. |

So in the above table, we always have chunk length = f + 104.
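
As an illustration of the chunk layout, the sketch below splits one chunk into its fields and then reads a whole changegroup by consuming a known number of chunks. It assumes a big-endian length field and reads "0-bytes" as twenty zero bytes; `parse_chunk`, `read_group` and the dictionary keys are hypothetical names chosen for this example.

```python
import struct

HEADER_LEN = 104
NULL_ID = b"\0" * 20  # assumption: "0-bytes" means 20 zero bytes

def parse_chunk(data: bytes, offset: int = 0):
    """Return (chunk_dict, next_offset) for the chunk starting at offset."""
    # Assumption: the length is a big-endian unsigned 32-bit integer.
    (length,) = struct.unpack_from(">I", data, offset)
    if length < HEADER_LEN:
        raise ValueError("chunk shorter than its header")
    if offset + length > len(data):
        raise ValueError("chunk truncated")
    node, p1, p2, link, delta_parent = struct.unpack_from(
        ">20s20s20s20s20s", data, offset + 4)
    chunk = {
        "node": node, "p1": p1, "p2": p2, "link": link,
        "delta_parent": delta_parent,
        "is_snapshot": delta_parent == NULL_ID,
        # The payload is the remaining f = length - 104 bytes.
        "data": data[offset + HEADER_LEN:offset + length],
    }
    return chunk, offset + length

def read_group(data: bytes, count: int, offset: int = 0):
    """Read count consecutive chunks; return (chunks, next_offset)."""
    chunks = []
    for _ in range(count):
        chunk, offset = parse_chunk(data, offset)
        chunks.append(chunk)
    return chunks, offset
```

Because each chunk carries its own total length and the entry count precedes every group, a reader can walk a group without any terminator, which is how `read_group` finds where the group ends.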

Further requirements

Additional features have landed in Mercurial since this design. We also wish to support the following data in a bundle: