March 18, 2011, 11:47 p.m.
posted by pythonics
Special Character Encoding
For the most part, characters within documents that are not part of a tag are rendered as is by the browser. However, some characters have special meaning and are not directly rendered, and other characters can't be typed into the source document from a conventional keyboard. Special characters need either a special name or a numeric character encoding for inclusion in a document.
As the discussion and examples leading up to this section have shown, three characters in source documents have very special meaning: the less-than sign (<), the greater-than sign (>), and the ampersand (&). These characters delimit tags and special character references. They'll confuse a browser if left dangling alone or used with improper tag syntax, so you have to go out of your way to include their actual, literal characters in your documents.
Similarly, you have to use special encoding to include double quotation mark characters within a quoted string, or when you want to include a special character that doesn't appear on your keyboard but is part of the ISO Latin-1 character set that most browsers implement and support.
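If you generate markup from a program instead of typing it by hand, a library routine can usually do this encoding for you. The snippet below is a minimal sketch, assuming Python and its standard html module, neither of which is part of the original discussion. The sample strings and variable names are made up for illustration.

```python
import html

# Text containing the three characters that delimit tags and references.
raw_text = "if a < b && b > c"
escaped_text = html.escape(raw_text, quote=False)
print(escaped_text)  # if a &lt; b &amp;&amp; b &gt; c

# An attribute value that itself contains double quotation marks.
# With quote=True, html.escape also encodes quotation marks.
raw_attr = 'say "hello"'
escaped_attr = html.escape(raw_attr, quote=True)
print(f'<input value="{escaped_attr}">')  # <input value="say &quot;hello&quot;">
```

The ampersand is replaced first, so the ampersands that begin the newly created references are not encoded a second time.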
Inserting Special Characters
To include a special character in your document, write a leading ampersand, then either the character's standard entity name or a pound sign (#) followed by its numeric position in the Latin-1 standard character set, and then an ending semicolon, without any spaces in between. Whew. That's a long explanation for what is really a simple thing to do, as the following examples illustrate. The first example shows how to include a greater-than sign in a snippet of code by using the character's entity name. The second demonstrates how to include a greater-than sign in your text by referencing its Latin-1 numeric value:
if a &gt; b, then t = 0
if a &#62; b, then t = 0
Both examples cause the text to be rendered as follows:
if a > b, then t = 0
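To check that the named and numeric forms really are interchangeable, you can decode both in a few lines of code. This is a rough sketch, assuming Python's standard html module. The helper numeric_reference is a hypothetical name introduced here for illustration, not something the text defines.

```python
import html

def numeric_reference(ch: str) -> str:
    """Build a decimal numeric character reference: ampersand, pound sign,
    the character's numeric position, and a closing semicolon."""
    return f"&#{ord(ch)};"

# The greater-than sign sits at position 62 in Latin-1.
named = "if a &gt; b, then t = 0"
numeric = f"if a {numeric_reference('>')} b, then t = 0"
print(numeric)  # if a &#62; b, then t = 0

# Decoding either form yields the same rendered text.
decoded_named = html.unescape(named)
decoded_numeric = html.unescape(numeric)
print(decoded_named)    # if a > b, then t = 0
print(decoded_numeric)  # if a > b, then t = 0

# The same rule covers characters missing from most keyboards,
# such as the copyright sign at Latin-1 position 169.
copyright_ref = numeric_reference("\u00a9")
print(copyright_ref)  # &#169;
```

Here html.unescape stands in for what the browser does when it renders the page.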
The complete set of character entity values and names appears in Appendix F. You could write an entire document using character encodings, but that would be silly.