The Unicode Solution

To resolve this problem, a computer industry consortium was founded to create a second generation of ASCII, one that addressed multiple languages and alphabets as well as the numerous special symbols used in scientific and technical writing. This standard, called Unicode, specifies the use of 16-bit characters to represent most of the known characters and languages of modern writing.
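
As an illustration only (the code values come from the Unicode standard, but the program and its variable names are invented for this sketch), a fixed-width 16-bit character might be held in C roughly as follows:

    #include <stdio.h>
    #include <stdint.h>

    /* A minimal sketch of the fixed-width 16-bit character model:
     * each character, whether Latin, Greek, or Japanese, occupies a
     * single 16-bit code value.  The code values are taken from the
     * Unicode standard; the variable names are ours. */
    int main(void)
    {
        uint16_t latin_a     = 0x0041;  /* 'A', same value as in ASCII */
        uint16_t greek_alpha = 0x0391;  /* Greek capital letter alpha  */
        uint16_t hiragana_a  = 0x3042;  /* Hiragana letter 'a'         */

        printf("U+%04X U+%04X U+%04X\n",
               (unsigned)latin_a, (unsigned)greek_alpha, (unsigned)hiragana_a);
        return 0;
    }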

The Unicode standard defines a character as the representation within a computer or on storage media of the letters, punctuation, and other signs that make up natural-language, mathematical, or scientific text. The character is not what you see; glyphs appear on the screen or paper as representations of one or more characters. A complete set of glyphs makes up a font. These definitions will be used throughout the rest of this paper.

As a solution to the problems users face when communicating between computers, however, Unicode has one fatal flaw: current computer systems communicate in 7-bit or 8-bit bytes, while Unicode characters are 16 bits wide. Although support for Unicode is growing, especially in the PC, Macintosh, and UNIX workstation markets, the vast majority of computer users must still communicate using 8-bit character sets.
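
To make the flaw concrete, the following sketch (again our own example, not part of the standard) forces a 16-bit character through an 8-bit channel by splitting it into two bytes. For an ordinary ASCII letter, one of those bytes is zero, which byte-oriented software commonly treats as the end of a string:

    #include <stdio.h>
    #include <stdint.h>

    /* Splitting a 16-bit Unicode character into the two 8-bit bytes an
     * existing byte stream can carry.  For 'A' (0x0041) the high-order
     * byte is 0x00, which C string routines treat as a terminator, and
     * a 7-bit channel would strip the top bit of any byte >= 0x80. */
    int main(void)
    {
        uint16_t ch = 0x0041;                 /* 'A' as a 16-bit value */
        unsigned char hi = (ch >> 8) & 0xFF;  /* high-order byte: 0x00 */
        unsigned char lo = ch & 0xFF;         /* low-order byte:  0x41 */

        printf("16-bit 0x%04X becomes bytes 0x%02X 0x%02X\n",
               (unsigned)ch, (unsigned)hi, (unsigned)lo);
        return 0;
    }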