[Adium-devl] Unicode support in AIHTMLDecoder
Ryan Govostes
rgovostes at gmail.com
Tue Jul 15 11:39:17 UTC 2008
I've noticed that pasted text with line breaks, especially when
copied from Safari or Word, sometimes shows up in the message view
stripped of all breaks. More interestingly, though the problem shows
up in the message view, it does not appear in the logs or in Growl
notifications.
Some playing around shows that the following HTML, rendered by Safari
and pasted into Adium, triggers the issue:
<p>
This is on the first line.<br />
This is on the second line.
</p>
However, remove those paragraph tags and the resulting paste doesn't
show the issue. And render it in Firefox and the paste will work fine
too.
As it turns out, the paste is of type NSRTFPboardType, and the problem
arises when Safari copies the line breaks as Unicode (U+2028, line
separator) instead of ASCII (0x0A, line feed). It ends up bypassing
AIHTMLDecoder's substitution routines, which are only set up to
recognize \n and \r:
- When sending the message to AIM's server, thingsToInclude.nonASCII =
false, so it does a very rudimentary find/replace of \r\n, \r, and \n.
- When sending the message to the message view,
thingsToInclude.nonASCII = true, so around line 620 the character ends
up being escaped as a numeric HTML entity for U+2028 instead of being
turned into <br>.
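To make the two paths above concrete, here is a rough sketch in Python
(not Adium's Objective-C, and not the actual AIHTMLDecoder code). The
function names are made up for illustration, and the decimal entity
format is an assumption about how the non-ASCII escaping is written:

```python
# Illustrative stand-ins for the two paths described above.
# Names and the entity format are assumptions, not AIHTMLDecoder's code.

LINE_SEPARATOR = "\u2028"  # what Safari puts on the pasteboard


def replace_ascii_breaks(text: str) -> str:
    """Rudimentary find/replace of \\r\\n, \\r and \\n only."""
    for newline in ("\r\n", "\r", "\n"):
        text = text.replace(newline, "<br>")
    return text


def escape_non_ascii(text: str) -> str:
    """Escape every non-ASCII character as a decimal entity (assumed format)."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


pasted = (
    "This is on the first line."
    + LINE_SEPARATOR
    + "This is on the second line."
)

# The ASCII-only replacement never sees U+2028, so no <br> is produced.
print("<br>" in replace_ascii_breaks(pasted))

# With non-ASCII escaping on, the separator becomes an entity instead.
print(escape_non_ascii(replace_ascii_breaks(pasted)))
```

Either way the line separator never becomes a <br>, which would match
the breaks going missing in the message view.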

Ideally this code would be updated to properly replace all Unicode
line breaks with <br>. Wikipedia has the exhaustive list taken from the
Unicode Standard 4.0 guidelines:
http://en.wikipedia.org/wiki/Newline#Unicode
Unfortunately, Apple's character sets and Unicode utilities don't seem
to include all of those (strangely enough).
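As a minimal sketch of the kind of replacement I have in mind (again in
Python for brevity, with a hypothetical function name), assuming the
set of newline characters is LF, VT, FF, CR, CR+LF, NEL (U+0085), LS
(U+2028) and PS (U+2029), and that every one of them should become a
single <br>:

```python
import re

# Newline functions assumed from the Unicode guidelines:
# CR+LF, LF, VT, FF, CR, NEL (U+0085), LS (U+2028), PS (U+2029).
# CR+LF is listed first so the pair collapses into one <br>.
UNICODE_LINE_BREAK = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")


def line_breaks_to_br(text: str) -> str:
    """Replace every recognized line break with a single <br>."""
    return UNICODE_LINE_BREAK.sub("<br>", text)


sample = "This is on the first line.\u2028This is on the second line."
print(line_breaks_to_br(sample))
```

This would have to run before any non-ASCII escaping, otherwise U+0085,
U+2028 and U+2029 are already entities by the time it looks for them.
Whether a paragraph separator deserves something other than a plain
<br> is a separate question that this sketch ignores.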
Regards,
Ryan Govostes