[PATCH hgweb-help-followup] help: teach topic symbols how to dedent

Gregory Szorc gregory.szorc at gmail.com
Tue Feb 10 02:12:29 CST 2015


On Mon, Feb 9, 2015 at 3:45 PM, Augie Fackler <raf at durin42.com> wrote:

> On Mon, Feb 09, 2015 at 03:06:03PM -0800, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc at gmail.com>
> > # Date 1423522744 28800
> > #      Mon Feb 09 14:59:04 2015 -0800
> > # Node ID bd6a8cf1710d95b593babf3cad3224adff6064f9
> > # Parent  1f4fd12991f2b32b1277bd3e047132d5f087b914
> > help: teach topic symbols how to dedent
> >
> > When using docstrings for documenting symbols such as revsets,
> > templates, or hgweb commands, documentation likely has leading
> > whitespace corresponding to the indentation from the Python source
> > file.
> >
> > Up until this point, the help system stripped all leading and
> > trailing whitespace and replaced it with 2 spaces of leading
> > whitespace. There were a few bad side-effects. First, sections
> > could not be used in docstrings because they would be indented
> > and the rst parser would fail to parse them as sections. Also,
> > any rst elements that required indentation would lose their
> > indentation, again causing them to be parsed and rendered
> > incorrectly.
> >
> > In this patch, we teach the topic symbols system how to dedent
> > text properly. I argue this mode should be enabled by default.
> > However, I stopped short of changing that because it would cause
> > a lot of documentation reformatting to occur. I'm not sure if
> > people are relying on or wanting indentation. So, dedenting has
> > only been turned on for hgweb symbols. This decision should be
> > scrutinized.
>
> This looks reasonable, as does the conservative approach. A good
> followup series (IMO) would be to figure out what can't handle
> dedentation and fix that up so we can just turn this on globally.
>

I've looked into this and it is pain and more pain.

First, using textwrap.dedent() won't work universally. e.g.

def foo():
    """This is a summary.

    This is a new line."""

foo.__doc__ is "This is a summary.\n\n    This is a new line." The first
line is not indented, so textwrap.dedent() no-ops, since its job is to look
for universal indenting, ignoring empty lines.

PEP-257 recommends a parsing algorithm that does a custom strip based on
minimal leading whitespace in lines[1:]. Looks sane to me. We'll want to
implement that.

So I did.

But that took me to the next hurdle.

After implementing the recommended docstring normalization method, docs
weren't rendering properly at all. Why? Well, a lot of our docstrings (like
revsets) have the form:

def func():
    """``func()``
    This is a description.
    """

Our existing docstring normalizer normalizes this to "``func()``\n  This is
a description" (note the leading 2 spaces on subsequent lines). That
leading whitespace is important. Without it, a rst parser will assume the
next line is a line
continuation and it will merge the contents into a single paragraph. This,
of course, completely ruins the help documentation formatting.

On one hand, Mercurial's docstrings for things like revsets are in
violation of the recommendations in PEP-257 because there isn't a blank
line between the summary and description lines. We should look into
correcting that. On the other hand, this has been the convention for ages.
We can't be naive and assume extensions haven't cargo culted this
convention (I know I have). So, we'll probably want a mechanism to insert
an empty line between lines[0] and lines[1] in docstrings before we can
even talk about dedenting docstrings.

If we dedent doc strings, that replaces the 2 space indentation with...
nothing. Unfortunately, that 2 space indent is hard-coded in a number of
help pages (see the revset operators docs). So, we're talking about scope
creep to reformat a lot of help pages as well to match the new conventions.

If you get this far, you find that losing indentation makes the docs not
very readable. So we'll probably also want to replace "``foo``" patterns
with rst sections. Again, we'll likely want to update docstrings in source
*and* teach the help formatter how to do it so existing extensions don't
blow up help formatting. This will restore indentation and readability to
the plain text docs. As a bonus, it probably makes the HTML docs a little
easier to read too.

I think I know what needs to be done. But this rabbit hole runs deeper than
I initially thought. But good docs is worth the trouble. Let me see if I
can find time to crank out a patch series.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150210/2bf1fdc7/attachment.html>


More information about the Mercurial-devel mailing list