These pages display all of the characters that can be included in an HTML 4.01 document as a named entity. To the best of my knowledge, every named entity is included. I am also working on a system to display various combinations of combining characters (such as accents) with various Latin letters by generating these with a CGI script.
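As a rough illustration of what a named entity is, the sketch below looks up a few entity names and prints the code point each one stands for. It assumes Python's standard `html.entities.name2codepoint` table, which Python documents as covering the HTML 4 entity names; it is not the W3C entity files these pages were built from, and the three sample names are chosen arbitrarily.

```python
from html.entities import name2codepoint

# A few arbitrarily chosen HTML 4 entity names (illustrative only).
sample_names = ("eacute", "euro", "alpha")

for name in sample_names:
    code_point = name2codepoint[name]
    # Show the entity as it would be typed, its code point, and the character.
    print(f"&{name};", f"U+{code_point:04X}", chr(code_point))
```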
All of these documents are declared as HTML 4.01. As such, any HTML 4.01 compliant browser should assume they use the character set ISO 10646 (a.k.a. Unicode), since no other character set is specified. The first 256 ISO 10646 characters are identical to ISO 8859-1 (a.k.a. Latin 1), which is the character set any HTML 3.2 compliant browser should assume for this page. (Incidentally, the first 128 ISO 8859-1 characters are identical to US-ASCII.)
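The overlap between the three character sets can be checked with a short sketch. It assumes Python's built-in `latin-1` and `ascii` codecs faithfully implement ISO 8859-1 and US-ASCII; the variable names are mine.

```python
# Decode every possible byte value as ISO 8859-1 and compare each resulting
# character's Unicode code point with the original byte value.
latin1_text = bytes(range(256)).decode("latin-1")
latin1_matches_unicode = all(ord(ch) == i for i, ch in enumerate(latin1_text))

# The lower half (0-127) should decode identically as US-ASCII and as Latin 1.
low_bytes = bytes(range(128))
ascii_matches_latin1 = low_bytes.decode("ascii") == low_bytes.decode("latin-1")

print(latin1_matches_unicode, ascii_matches_latin1)
```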
With some combinations of browsers and platforms, the ability to render some characters will be affected by which font your browser is currently using. For example, Microsoft has released various "web fonts" which are more likely to contain glyphs for all these characters. A good default to use on Microsoft Windows is the TrueType font Verdana, which was designed for viewing on a computer screen.
The named character entities on these pages are based on the W3C's actual SGML entity files. Every named character entity in HTML 4.01 is included. In order to test browser compliance, the first file also contains all 256 Latin 1 characters. The other file includes translation tables from Microsoft's CP1252 (a superset of ISO Latin 1) to Unicode 3.0. For images of the various glyphs, consult the Unicode Consortium.
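A CP1252-to-Unicode translation table of this kind can be approximated as follows. This is only a sketch: it relies on Python's built-in `cp1252` codec, whose mapping may not match the Unicode 3.0-era table used on these pages in every detail, and the function name is hypothetical. Only bytes 0x80 to 0x9F are listed, since that is the range where CP1252 differs from Latin 1.

```python
def cp1252_to_unicode_table():
    """Map CP1252 bytes 0x80-0x9F to Unicode code points (hypothetical helper)."""
    table = {}
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's codec leaves a few positions in this range unassigned.
            continue
        table[byte] = ord(ch)
    return table


for byte, code_point in cp1252_to_unicode_table().items():
    print(f"0x{byte:02X} -> U+{code_point:04X} {chr(code_point)}")
```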
The numeric entities in the first file are given in both decimal and hexadecimal in order to test browser compliance. Those in the others are just hexadecimal, although some browsers which can render these characters may only recognize them in decimal.
All Unicode characters may be included in an HTML 4.01 document by means of a numeric entity. In practice, languages written in non-Latin letters usually use a specialized 8 or 16 bit character set (such as KOI8-R or Shift_JIS, respectively), instead of specifying the 32 bit Unicode character directly. However, it may occasionally be useful for web page authors to include a few characters outside of the usual Latin-1 range in documents written without a character set specified. One example is U+2153, which could be written as &#x2153; to render ⅓ (the fraction one third), for which there is no Latin-1 equivalent.
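To make the two numeric forms concrete, here is a minimal sketch that builds the decimal and hexadecimal entity for a code point and decodes both back. The helper name `numeric_entities` is my own; `html.unescape` is Python's standard decoder and merely stands in for what a browser would do.

```python
import html


def numeric_entities(code_point):
    """Return the (decimal, hexadecimal) numeric entities for a code point."""
    return f"&#{code_point};", f"&#x{code_point:X};"


decimal_form, hex_form = numeric_entities(0x2153)
print(decimal_form, hex_form)

# Both spellings should decode to the same character.
print(html.unescape(decimal_form) == html.unescape(hex_form))
```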
The following page allows you to request a range of Unicode characters from a script I wrote. You'll see the character number, the numeric entity, the character itself if your browser can render it, and the Unicode Consortium's description of the character.
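The kind of listing such a script produces can be sketched as below. This is not the CGI script itself, just an approximation under stated assumptions: the function name is hypothetical, and the descriptions come from Python's `unicodedata` module, which reflects whatever Unicode version ships with the interpreter.

```python
import unicodedata


def describe_range(start, end):
    """Return (number, entity, character, description) rows for start..end inclusive."""
    rows = []
    for code_point in range(start, end + 1):
        ch = chr(code_point)
        # Unassigned code points have no name; fall back to a placeholder.
        name = unicodedata.name(ch, "<no name>")
        rows.append((f"U+{code_point:04X}", f"&#x{code_point:X};", ch, name))
    return rows


for row in describe_range(0x2153, 0x2155):
    print("\t".join(row))
```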
Comments welcome. I will credit suggestions on my log of changes and corrections (last updated: 10 May 2002).
Todo: allow image glyphs on Unicode pages, write test for combining characters