Note:

This page is primarily intended for developers of Mercurial.

Error reporting cleanup

Status: Early draft

Main proponents: RodrigoDamazio

/!\ This is a speculative project and does not represent any firm decisions on future behavior.

This page proposes cleaning up how Mercurial failures are reported, both to the user and to any extensions that gather or upload failure information.

1. Goal

Currently, Mercurial uses a small set of exceptions to report errors. Notably, a large number of failure cases use error.Abort indiscriminately, without regard to the nature of the failure. As long as error messages are descriptive, the end user may not care about the distinction, but deployments that aggregate this data (such as real-time monitoring/alerting) need to tell the different kinds of failure apart.

The goal is to separate errors into categories that also indicate their likely culprit.

2. Detailed description

2.1. Categorization

I propose these categories of errors:

| Category | Description | Example | Return code | HTTP equivalents |
|---|---|---|---|---|
| Input | The failure was caused by invalid user input | Tried to check out non-existent revision | 10 | 400, 404 |
| State | The failure was caused by the repository or environment being in an invalid state for the requested operation | There are unresolved conflicts | 20 | 402, 409, 412, 418, 422, 423, 428, 451 |
| Configuration | The failure was caused by configuration | Failure to parse config file, unmet requires | 30 | N/A (always local) |
| Storage | The failure was caused by interacting with storage | IOError, file corruption | 50 | N/A (always local) |
| Remote | The remote (e.g. server) caused the failure | Disconnected, invalid payload received, required capability missing | 100 | other 5xx, other 4xx |
| Security | The failure was caused by some aspect of security | Bad server credentials, expired local credentials for network filesystem, mismatched GPG signature, DoS protection | 150 | 401, 403, 407, 420, 425, 429, 450, 463, 495-498, 511, 525, 526 |
| Intervention required | Not a final failure, but rather indicative that user intervention is required before hg can continue | Merge conflict resolution required | 240 | N/A (always local) |
| Cancelled | The user cancelled the operation | KeyboardInterrupt, aborting the editor, etc. | 250 | 499? |
| Internal | Unexpected crashes or unexpected internal state | Any unexpected Python exception | 254 | 500, 520 |
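As a rough illustration of how the categories above could surface in code, here is a minimal sketch. The names ErrorCategory, CategorizedError, InputError, StateError and exit_code are hypothetical and are not part of Mercurial's error module; only the numeric codes come from the table.

```python
import enum


class ErrorCategory(enum.IntEnum):
    """Hypothetical enum; the values are the return codes from the table."""
    INPUT = 10
    STATE = 20
    CONFIGURATION = 30
    STORAGE = 50
    REMOTE = 100
    SECURITY = 150
    INTERVENTION_REQUIRED = 240
    CANCELLED = 250
    INTERNAL = 254


class CategorizedError(Exception):
    """Hypothetical base class; subclasses set their own category."""
    category = ErrorCategory.INTERNAL


class InputError(CategorizedError):
    category = ErrorCategory.INPUT


class StateError(CategorizedError):
    category = ErrorCategory.STATE


def exit_code(exc):
    """Map an exception to a process exit code per the table."""
    if isinstance(exc, KeyboardInterrupt):
        # The table lists KeyboardInterrupt as an example of Cancelled.
        return int(ErrorCategory.CANCELLED)
    if isinstance(exc, CategorizedError):
        return int(exc.category)
    # Any other Python exception is unexpected, hence Internal.
    return int(ErrorCategory.INTERNAL)
```

With this sketch, exit_code(InputError("unknown revision")) evaluates to 10, and an uncategorized exception such as ValueError falls through to 254.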

2.2. Remote error categorization

Remote errors can be caused by most of the above issues. The Remote category will only be applied when the underlying cause is not known, or the remote is clearly at fault.

For the HTTP transport, HTTP errors will be mapped to the above categories per the table above. For the SSH transport, the mapping is TBD.
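One possible shape for the HTTP mapping is a lookup built from the "HTTP equivalents" column, with every other 4xx or 5xx status falling back to Remote. This is only a sketch: the function name categorize_http_status and the table layout are assumptions, and the 499 entry is tentative because the table itself marks it with a question mark.

```python
# Return codes from the category table.
INPUT, STATE, REMOTE, SECURITY, CANCELLED, INTERNAL = 10, 20, 100, 150, 250, 254

# Statuses listed explicitly in the "HTTP equivalents" column.
_EXPLICIT = {
    INPUT: (400, 404),
    STATE: (402, 409, 412, 418, 422, 423, 428, 451),
    SECURITY: (401, 403, 407, 420, 425, 429, 450, 463,
               495, 496, 497, 498, 511, 525, 526),
    CANCELLED: (499,),  # listed as "499?" in the table, so tentative
    INTERNAL: (500, 520),
}

_HTTP_TO_CODE = {
    status: code
    for code, statuses in _EXPLICIT.items()
    for status in statuses
}


def categorize_http_status(status):
    """Return the category code for an HTTP status, or None if not an error."""
    if status in _HTTP_TO_CODE:
        return _HTTP_TO_CODE[status]
    if 400 <= status <= 599:
        # "other 5xx, other 4xx" map to Remote.
        return REMOTE
    return None
```

Keeping the explicit statuses in one dictionary makes it easy to compare the code against the table when either changes.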

2.3. Status codes

There'll be a static mapping from the above basic categories to error codes, per the table above.

Specializations of the categories in extensions (usually as exception subclasses), when required, may override the error code. The spacing of the return codes in the table is meant to allow grouping of related codes.
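To make the override mechanism concrete, here is a small sketch in which an extension subclasses a category and picks a neighbouring code. Both class names and the value 21 are hypothetical; the proposal does not assign any specialized codes.

```python
class StateError(Exception):
    """Hypothetical base class for the State category (code 20)."""
    exit_code = 20


class UnfinishedOperationError(StateError):
    """Hypothetical extension-defined specialization.

    21 is an arbitrary value chosen to stay within the gap after 20,
    so the code still groups with the State category.
    """
    exit_code = 21


def exit_code_for(exc):
    """Read the exit code from the exception, defaulting to Internal (254)."""
    return getattr(exc, "exit_code", 254)
```

Because the subclass still derives from StateError, code that catches the base category keeps working while monitoring sees the more specific code.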

The status codes 1 and 255 are explicitly avoided so that reporting or monitoring systems can distinguish an older version of Mercurial (which will use one of those) from a version that uses these new codes.
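A monitoring system could take advantage of both the spacing and the avoided codes roughly as follows. This is a sketch under the assumption that a specialized code belongs to the nearest base code at or below it; the proposal does not spell out the grouping rule, and classify_exit_code is a hypothetical name.

```python
import bisect

# Base codes and category names from the table, in ascending order.
_BASE_CODES = [10, 20, 30, 50, 100, 150, 240, 250, 254]
_NAMES = ["input", "state", "configuration", "storage", "remote",
          "security", "intervention-required", "cancelled", "internal"]

# Codes used by older Mercurial versions, deliberately left unassigned.
_LEGACY_CODES = {1, 255}


def classify_exit_code(code):
    """Classify an hg exit code for aggregation purposes."""
    if code == 0:
        return "success"
    if code in _LEGACY_CODES:
        return "legacy"
    # Assumption: a specialized code groups with the closest base code below it.
    index = bisect.bisect_right(_BASE_CODES, code) - 1
    if index < 0:
        return "unknown"
    return _NAMES[index]
```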

2.4. High-level code changes

Most high-level code, such as the implementation of commands, will be updated to use the appropriate exception classes according to the categorization above.

2.5. Low-level library changes (e.g. revlog)

In many cases, the code generating the error is not at the high level where the user's intentions are better known, but rather inside low-level libraries.

For instance, one can use scmutil.revsingle() to fetch a revision, and that will raise an exception if the revision is not found, without knowledge of whether that revision was requested by the user or calculated/expected by some internal logic. These instances will be updated to raise internal-only exceptions, and call sites will be required to catch them and translate them into the appropriate type, with more context. This also gives the higher-level code the opportunity to implement any additional logic that helps determine the cause of the failure, and hopefully to provide more helpful messages to users.
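The translation step might look like the sketch below. Everything here is a stand-in: lookup_revision only imitates the role of a low-level helper such as scmutil.revsingle() and does not use its real signature, and InternalLookupError and InputError are hypothetical class names.

```python
class InternalLookupError(Exception):
    """Hypothetical internal-only exception raised by low-level code."""


class InputError(Exception):
    """Hypothetical user-facing exception for the Input category."""
    exit_code = 10


def lookup_revision(known_revisions, rev):
    """Stand-in for a low-level helper; it cannot know who asked for rev."""
    if rev not in known_revisions:
        raise InternalLookupError(rev)
    return known_revisions[rev]


def checkout_command(known_revisions, user_rev):
    """Command-level code knows the revision came from the user."""
    try:
        return lookup_revision(known_revisions, user_rev)
    except InternalLookupError as exc:
        # Translate to the Input category and add user-facing context.
        raise InputError(
            "unknown revision %r given on the command line" % user_rev
        ) from exc
```

Chaining with "from exc" keeps the low-level exception available for debugging while the user only sees the categorized error.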

If any such internal exception ever reaches the user, a develwarn will also be printed, and it will otherwise be treated as the Internal type from the table.
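A top-level handler for that fallback could be sketched as follows. The names InternalOnlyError and run are hypothetical, and the warn callable is merely a stand-in for however develwarn ends up being invoked.

```python
INTERNAL = 254  # return code for the Internal category


class InternalOnlyError(Exception):
    """Hypothetical base class for exceptions that should never reach users."""


def run(command, warn):
    """Run a command and return its exit code.

    warn is a callable taking a message; it stands in for develwarn.
    """
    try:
        command()
    except InternalOnlyError as exc:
        # A call site forgot to translate this exception: tell developers,
        # then treat the failure as Internal.
        warn("untranslated internal exception: %r" % (exc,))
        return INTERNAL
    return 0
```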

This is likely to be a large and repetitive code change, but one that promotes good code health in error handling.

2.6. Cleanup

error.Abort will be mercilessly deleted once all its uses are gone. Its memory will not be honored.

2.7. Future improvements



ErrorCategoriesPlan (last edited 2020-06-13 00:08:49 by RodrigoDamazio)