Note:

This page is primarily intended for developers of Mercurial.

The Mercurial API

A rough introduction to Mercurial's internal API.

{X} Using this API is a strong indication that you're creating a "derived work" subject to the GPL. Before going any further, read the License page.

1. Why you shouldn't use Mercurial's internal API

Mercurial's internals are continually evolving to be simpler, more consistent, and more powerful, a process we hope will continue for the foreseeable future. Unfortunately, this means we will regularly change interfaces in ways that break third-party code, mostly in minor ways.

For the vast majority of third-party code, the best approach is to use Mercurial's published, documented, and stable API: the command line interface. Alternatively, use the CommandServer or the libraries based on it to get a fast, stable, language-neutral interface.
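As a minimal sketch of the command line approach, the following Python wrapper shells out to hg. The helper names hg_argv and run_hg are hypothetical. The sketch assumes an hg executable on the PATH and sets the HGPLAIN environment variable so that output is not affected by user configuration or translations.

```python
import os
import subprocess


def hg_argv(repo_path, *args):
    """Build the argument list for running an hg command in repo_path."""
    return ["hg", "--repository", repo_path] + list(args)


def run_hg(repo_path, *args):
    """Run hg and return its standard output as bytes.

    Raises subprocess.CalledProcessError if hg exits with a non-zero status.
    """
    # HGPLAIN asks Mercurial for plain, script-friendly output.
    env = dict(os.environ, HGPLAIN="1")
    return subprocess.check_output(hg_argv(repo_path, *args), env=env)
```

For example, run_hg(".", "log", "--limit", "1") would return the log entry for the newest changeset of the repository in the current directory.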

{X} There are NO guarantees that third-party code calling into Mercurial's internals won't break from release to release.

/!\ If you do use Mercurial's API for published third-party code, we expect you to test your code before each major Mercurial release (see TimeBasedReleasePlan). This will prevent various bug reports from your users when they upgrade their copy of Mercurial.

2. The high level interface

It is possible to call Mercurial commands directly from within your code. Every Mercurial command corresponds to a function defined in the mercurial.commands module, with the calling signature

    CMD(ui, repo, ...)

Here, ui and repo are the user interface and repository arguments passed into an extension function as standard (see WritingExtensions for more details). If you are not calling the Mercurial command functions from an extension, you will need to create suitable ui and repo objects yourself. The ui object can be instantiated from the ui class in mercurial.ui. The repo object can be a localrepository, an httprepository, an sshrepository or a statichttprepository (each defined in its own module), though it will most often be a localrepository.

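The remainder of the parameters come in two groups: positional parameters, which carry the command's positional arguments such as file names, and keyword parameters, which carry the command's options.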

A reasonably complex example might be hg commit -m "A test" --addremove file1.py file2.py. This would have an equivalent API form

    from mercurial import commands
    commands.commit(ui, repo, 'file1.py', 'file2.py', message="A test", addremove=True)

In practice, some of the options for the commit command are required in a call, and must be included as keyword parameters - adding date=None, user=None, logfile=None would be sufficient in this case. This detail can be ignored for now.
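One way to avoid repeating the required keywords is a small helper that starts from the defaults named above and layers the caller's options on top. This is only a sketch: commit_options is a hypothetical name, and the set of required keywords is taken from the paragraph above and may differ between Mercurial versions.

```python
def commit_options(**overrides):
    """Return keyword arguments for commands.commit().

    Starts from the keywords the text above says are required, then
    applies whatever options the caller supplies.
    """
    options = {"date": None, "user": None, "logfile": None}
    options.update(overrides)
    return options
```

The earlier call would then read commands.commit(ui, repo, 'file1.py', 'file2.py', **commit_options(message="A test", addremove=True)).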

Commands which fail will raise a mercurial.error.Abort exception, with a message describing the problem:

    from mercurial import error
    raise error.Abort("The repository is not local")

Generally, however, you should not use this interface, as it mixes user interface and functionality. If you want to write robust code, you should read the source of the command function, and extract the relevant details. For most commands, this is not as hard as it seems - there is often a "core" function (usually in the cmdutil or hg module) which performs the important work of the command.

3. Setting up repository and UI objects

To get started, you'll often need a ui object and a repository object. The ui object holds the input and output streams and all the relevant configuration (machine-global, user-global, repo-wide, and specific to this invocation); the repository object represents the repository itself. A repository object can be any of a number of types (as enumerated below), but these two lines make it easy to create an appropriate one:

    from mercurial import ui, hg
    repo = hg.repository(ui.ui(), '.')

Here, the '.' is the path to the repository, which could also be something starting with http or ssh, for example. You'll often need these objects to get any work done through the Mercurial API, for example by using the commands as detailed above.

mercurial.ui instances come in two flavors: global and repo. When you instantiate a new ui instance, it automatically reads all of the site-wide and user config files. When you pass a ui instance to hg.repository(), the repo copies it, then reads (adds) its repository configuration. Global ui instances are interchangeable, but once a ui instance has loaded a repository's configuration, don't reuse it for another repository, or settings from the first repository will bleed through into the second.

4. Communicating with the user

Most extensions will need to perform some interaction with the user. This is the purpose of the ui parameter to an extension function. The ui parameter is an object with a number of useful methods for interacting with the user.
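As a rough illustration, the body of an extension function might use the ui object as follows. The function greet and its name parameter are hypothetical. The sketch relies on the ui.write(), ui.status() and ui.warn() methods and passes byte strings, as Mercurial's internals expect.

```python
def greet(ui, repo, name=b""):
    """Hypothetical extension function that talks to the user through ui."""
    if not name:
        # Warnings are for problems the user should notice.
        ui.warn(b"no name given, using a default\n")
        name = b"world"
    # Status messages are informational and are hidden in quiet mode.
    ui.status(b"about to greet %s\n" % name)
    # Regular command output.
    ui.write(b"Hello, %s!\n" % name)
```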

Writing output:

Accepting input:

Useful values:

4.1. Collecting output

Output from a ui object normally goes to the standard output, sys.stdout. However, it is possible to "divert" all output and collect it for processing by your code, using the ui.pushbuffer() and ui.popbuffer() functions. Call ui.pushbuffer() before the code whose output you want to collect, and ui.popbuffer() after it. The popbuffer() call returns all collected output as a string for you to process as you wish; if you just want to edit the output and then send it on, you can pass the result to ui.write().

Here is a sample code snippet adapted from http://selenic.com/pipermail/mercurial/2010-February/030231.html:

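    from mercurial import ui, hg, commands

    u = ui.ui()
    repo = hg.repository(u, "/path/to/repo")

    # Start diverting everything written through u into a buffer.
    u.pushbuffer()
    # Command / function to call, for example:
    commands.log(u, repo)
    # Stop diverting and fetch the collected output as one string.
    output = u.popbuffer()
    assert isinstance(output, str)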
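The "edit the output and then send it on" case can be sketched with two small helpers built only on pushbuffer(), popbuffer() and write(). The names capture and write_filtered are hypothetical. The try/finally is a design choice of this sketch, so that an exception in the called function cannot leave the buffer installed.

```python
def capture(ui, func, *args, **kwargs):
    """Run func(ui, ...) and return everything it wrote through ui."""
    ui.pushbuffer()
    try:
        func(ui, *args, **kwargs)
    finally:
        # Always pop, even if func raised, so later output is not swallowed.
        output = ui.popbuffer()
    return output


def write_filtered(ui, func, keep, *args, **kwargs):
    """Capture func's output and pass on only the lines that keep() accepts."""
    for line in capture(ui, func, *args, **kwargs).splitlines(True):
        if keep(line):
            ui.write(line)
```

With the objects from the snippet above, write_filtered(u, commands.log, lambda line: line.startswith(b"summary:"), repo) would print only the summary lines of the log, assuming the output is a byte string.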

4.2. Reading configuration files

All relevant configuration values should be represented in the UI object -- that is, global configuration (/etc/mercurial/hgrc), user configuration (~/.hgrc) and repository configuration (.hg/hgrc). You can easily read from these using the following methods on the ui object:
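A minimal sketch of reading two values, assuming the ui.config() and ui.configbool() methods, each taking a section name and an option name. The helper read_settings is hypothetical, and the ui.username and ui.verbose options are used only as familiar examples.

```python
def read_settings(ui):
    """Return (username, verbose) as seen through the merged configuration."""
    # config() returns the raw value of an option.
    username = ui.config(b"ui", b"username")
    # configbool() interprets the value as a boolean.
    verbose = ui.configbool(b"ui", b"verbose")
    return username, verbose
```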

5. Repositories

There are a number of different repository types, each defined with its own class name, in its own module. All repository types are subclasses of mercurial.repo.repository.

Protocol       Module                      Class Name
local          mercurial.localrepo         localrepository
http           mercurial.httprepo          httprepository
static-http    mercurial.statichttprepo    statichttprepository
ssh            mercurial.sshrepo           sshrepository
bundle         mercurial.bundlerepo        bundlerepository

Repository objects should be created using module.instance(ui, path, create) where path is an appropriate path/URL to the repository, and create should be True if a new repository is to be created. You can also use the helper method hg.repository(), which selects the appropriate repository class based on the path or URL passed.

Repositories have many methods and attributes, but not all repository types support all of the various options.

Some key methods of (local) repositories:

TODO: Add more details here.

6. Change contexts

A change context is an object which provides convenient access to various data related to a particular changeset. Change contexts can be converted to a string (for printing, etc - the string representation is the short ID), tested for truth value (false is the null revision), compared for equality, and used as keys in a dictionary. They act as containers for filenames - all of the following work:
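The behaviours described in this paragraph can be sketched with a helper that uses only string conversion, truth testing and filename containment. The name describe_change is hypothetical, and obtaining the change context itself is left to the caller.

```python
def describe_change(ctx, filename):
    """Summarize whether filename is present in the changeset behind ctx."""
    # A change context for the null revision tests as false.
    if not ctx:
        return "null revision"
    # Change contexts act as containers for filenames.
    relation = "contains" if filename in ctx else "does not contain"
    # Converting the context to a string yields the short ID.
    return "%s %s %r" % (ctx, relation, filename)
```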

Some informational methods on change context objects:

7. File contexts

A file context is an object which provides convenient access to various data related to a particular file revision. File contexts can be converted to a string (for printing, etc - the string representation is the "path@shortID"), tested for truth value (False is "nonexistent"), compared for equality, and used as keys in a dictionary.

Some informational methods on file context objects:

8. Revlogs

Revlogs are the storage backend for Mercurial. They are not fully documented here, as it is unlikely that extension code will require detailed access to revlogs. However, a couple of key methods which may be generally useful are:

9. Unicode and user data

{X} Don't pass Unicode strings to Mercurial APIs!

All Mercurial internals pass byte strings exclusively. The vast majority of these are encoded and manipulated in the "local" encoding (as set in 'encoding.encoding'). Code that passes Unicode objects will almost certainly break as soon as it's used with non-ASCII data. The 'encoding.fromlocal()' and 'encoding.tolocal()' functions handle transcoding between the "local" encoding and UTF-8 byte strings.
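To make the direction of each conversion concrete, here is a plain-Python sketch of what transcoding between a local encoding and UTF-8 means for byte strings. The functions fromlocal_sketch and tolocal_sketch are hypothetical stand-ins, not Mercurial's implementation, and the latin-1 default is an assumption chosen only for the example.

```python
def fromlocal_sketch(data, local_encoding="latin-1"):
    """Convert a byte string in the local encoding to a UTF-8 byte string."""
    return data.decode(local_encoding).encode("utf-8")


def tolocal_sketch(data, local_encoding="latin-1"):
    """Convert a UTF-8 byte string to a byte string in the local encoding.

    In this sketch, a character the local encoding cannot represent
    raises UnicodeEncodeError.
    """
    return data.decode("utf-8").encode(local_encoding)
```

Both functions take and return byte strings; Unicode objects exist only briefly inside each helper.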

/!\ Don't transcode non-metadata!

Mercurial aims to preserve users' project data (filenames and file contents) byte-for-byte, so converting such data to Unicode and back is potentially destructive. Only metadata such as usernames and changeset descriptions are considered to be in a known encoding (stored as UTF-8 internally). See Encoding Strategy.

10. See also


MercurialApi (last edited 2016-03-17 17:14:20 by AugieFackler)