xml style doesn't generate valid xml
Haszlakiewicz, Eric
EHASZLA at transunion.com
Tue Nov 23 15:19:34 CST 2010
>-----Original Message-----
>From: Matt Mackall [mailto:mpm at selenic.com]
>
>> However, for the issue with xmlescape turning things into spaces, I
>> think that's because there's an explicit line of code in xmlescape
>> that does that! In templatefilters.py, the last line of xmlescape is:
>> return re.sub('[\x00-\x08\x0B\x0C\x0E-\x1F]', ' ', text)
>
>Ugh, who ordered that.
>
Well, I poked around a bit and came up with this as a better implementation of xmlescape. What do you think:
def xmlescape(text):
text = (text
.replace('&', '&')
.replace('<', '<')
.replace('>', '>')
.replace('"', '"')
.replace("'", ''')) # ' invalid in HTML
moretoencode = True
new_s = ""
while moretoencode:
try:
text.decode("UTF-8", "strict")
new_s += text
moretoencode = False
except UnicodeDecodeError, inst:
preerror = text[0:inst.start]
new_s += preerror
escaped = "&#" + "%d" % ord(text[inst.start]) + ";"
new_s += escaped
text = text[inst.start+1:]
def fixupcontrols(matchobj):
return "&#" + "%d" % ord(matchobj.group(0)) + ";"
return re.sub('[\x00-\x08\x0B\x0C\x0E-\x1F]', fixupcontrols, new_s)
eric
More information about the Mercurial
mailing list