Bug 2567 - NUL in description breaks rss and atom feed once
Summary: NUL in description breaks rss and atom feed once
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial
Version: unspecified
Hardware: All All
Importance: normal bug
Assignee: Bugzilla
URL:
Keywords: easy
Depends on:
Blocks:
 
Reported: 2010-12-24 04:56 UTC by Ton
Modified: 2012-10-26 13:30 UTC
CC List: 4 users

See Also:
Python Version: ---


Description Ton 2010-12-24 04:56 UTC
Somehow I managed to create a changeset whose description incorporates a
linefeed; this is no problem for Mercurial itself.
However, when using the Atom *or* RSS feeds for that repository with
hgwebdir and Apache, the linefeed ends up misplaced once.
Here's the report from Opera on the Atom feed:
------------
133:   <content type="xhtml">
134:    <div xmlns="http://www.w3.org/1999/xhtml">
135:     <pre xml:space="preserve">Issue 66
136: </pre>
137:    </div>
138:   </content>
139:  </entry>
------------
(notice the linefeed after 'Issue 66' before the </pre> on line 135)

Somehow a refresh of the feed returns a valid stream.

Here's the rss-feed:
--------------------
 60:     <title>Issue 66</title>
 61:     <guid isPermaLink="true">http://code.tcplomp.nl/mercurial/rev/40f7a013741d</guid>
 62:     <description><![CDATA[Issue 66<br/>
 63: ]]></description>
 64:     <author>&#116;&#111;&#110;&#32;&#60;&#116;&#99;&#112;&#108;&#111;&#109;&#112;&#64;&#103;&#109;&#97;&#105;&#108;&#46;&#99;&#111;&#109;&#62;</author>
 65:     <pubDate>Tue, 07 Dec 2010 19:53:46 +0100</pubDate>
 66: </item>
----------------
(notice the linefeed after the <br/> on line 62)

Ton
Comment 1 kiilerix 2010-12-24 07:38 UTC
Right. The escape filter doesn't work in cdata sections.

I guess the best solution is a change like this:

--- a/mercurial/templatefilters.py
+++ b/mercurial/templatefilters.py
@@ -196,6 +196,7 @@
     "basename": os.path.basename,
     "stripdir": stripdir,
     "age": age,
+    "cdata": lambda x: '<![CDATA[' + x.replace(']]>', ']]]]><![CDATA[>') + ']]>',
     "date": lambda x: util.datestr(x),
     "domain": domain,
     "email": util.email,

--- a/mercurial/templates/rss/changelogentry.tmpl
+++ b/mercurial/templates/rss/changelogentry.tmpl
@@ -1,7 +1,7 @@
 <item>
     <title>{desc|strip|firstline|strip|escape}</title>
     <guid isPermaLink="true">{urlbase}{url}rev/{node|short}</guid>
-    <description><![CDATA[{desc|strip|escape|addbreaks|nonempty}]]></description>
+    <description>{desc|strip|escape|addbreaks|nonempty|cdata}</description>
     <author>{author|obfuscate}</author>
     <pubDate>{date|rfc822date}</pubDate>
 </item>

Can you confirm that it solves your problem?
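
For anyone who wants to try the proposed filter outside Mercurial, here is a minimal standalone sketch in Python 3. The name `cdata` mirrors the filter name in the patch above; the `roundtrip` helper and the check with `xml.etree.ElementTree` are assumptions added for illustration and are not part of the patch.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in a CDATA section.

    Any ']]>' inside the text is split across two sections so that it
    cannot terminate the section early.
    """
    return '<![CDATA[' + text.replace(']]>', ']]]]><![CDATA[>') + ']]>'


def roundtrip(text):
    """Hypothetical helper: parse the wrapped text back out of an element."""
    return ET.fromstring('<description>' + cdata(text) + '</description>').text


if __name__ == '__main__':
    print(cdata('Issue 66<br/>'))
    # The parser should hand back exactly the text that went in.
    print(roundtrip('a ]]> b') == 'a ]]> b')
```

This only covers a literal `]]>` in the text; it does nothing about characters that XML forbids outright.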
Comment 2 Matt Mackall 2010-12-24 11:08 UTC
This report is wrong:

   <div xmlns="http://www.w3.org/1999/xhtml">
    <pre xml:space="preserve">Issue 66
^@</pre>
   </div>

The mysterious problematic character is a NUL, not a linefeed. XML and Atom
should have no issue with a linefeed, but XML doesn't like NULs. This also
explains why strip in the filter list wasn't removing it.
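
Both points can be checked with a short sketch. It assumes Python 3 and the standard library's `xml.etree.ElementTree` rather than Mercurial's own template code (which used Python 2 byte strings at the time), and `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Description as it appears in the feed: text, a linefeed, then a NUL.
DESC = 'Issue 66\n\x00'


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == '__main__':
    # strip() only removes whitespace; the trailing NUL is not whitespace,
    # so nothing is removed and the linefeed before it stays too.
    print(repr(DESC.strip()))
    # A linefeed alone is fine, a NUL is not.
    print(is_well_formed('<pre>Issue 66\n</pre>'))
    print(is_well_formed('<pre>' + DESC + '</pre>'))
```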

It's quite mysterious that this behavior isn't repeatable for you. It's also
rather worrisome that your web server takes between 10 seconds and 100
seconds to return a page. Compare with the performance of
http://hg.intevation.org/
Comment 3 kiilerix 2012-05-11 05:38 UTC
So what should happen here?

"Don't do that, then"?

Or should a new or existing templating filter remove NULs?
Comment 4 Bugzilla 2012-05-12 09:15 UTC

--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:15 EDT  ---

This bug was previously known as _bug_ 2567 at http://mercurial.selenic.com/bts/issue2567
Comment 5 Matt Mackall 2012-07-29 18:50 UTC
We should probably filter out nulls in the escape filter.
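
A rough sketch of what that could look like, assuming Python 3 and the standard library's `html.escape` as a stand-in for whatever the real escape filter calls. The function name `escape_without_nul` is hypothetical and this is not the code that was committed.

```python
from html import escape as html_escape


def escape_without_nul(text):
    """Drop NUL characters, then escape &, <, > and quotes for markup output."""
    return html_escape(text.replace('\0', ''), quote=True)


if __name__ == '__main__':
    print(repr(escape_without_nul('Issue 66\n\x00')))
    print(escape_without_nul('a < b & c\x00'))
```

Removing the NUL before escaping means every template that already pipes through the escape filter would get the fix without a template change.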
Comment 6 HG Bot 2012-10-16 15:12 UTC
Fixed by http://selenic.com/repo/hg/rev/823a7d79ef82
Siddharth Agarwal <sid0@fb.com>
hgweb: make the escape filter remove null characters (issue2567)

(please test the fix)