Code Page Overview
One of the challenges in supporting multiple languages in computer software is
the amazing array of phrases and mechanisms used to describe something that
seems so simple: When I press a key on the keyboard, how does the computer know
which character to draw on the screen?
To draw text on the screen, the computer uses a character set. A character
set consists of a font (the set of symbols you see on the screen or printed
page) and a character encoding, which assigns a numerical value to each letter
or punctuation mark in the language.
To a user trying to exchange data between two computers, this can have
either no effect at all or dire consequences. If both computers agree on the
character set, no difficulties arise. If one computer is configured to
work in Français and the other in US English, problems occur: the standard US
ASCII character set defines only 95 printable characters and does not include the
ç character required to display the word Français!
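The limitation can be checked directly. The following is a minimal sketch in Python, used here only for illustration; the helper name can_encode is hypothetical, and code page 850 is chosen simply as one example of a code page that contains ç.

```python
# Sketch: US ASCII cannot represent the word "Français",
# while a code page that includes the letter ç can.
word = "Français"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The printable ASCII characters are codes 0x20 (space) through 0x7E (~).
printable_ascii = [chr(code) for code in range(0x20, 0x7F)]

ascii_ok = can_encode(word, "ascii")   # ç is not in US ASCII
cp850_ok = can_encode(word, "cp850")   # code page 850 includes ç

print(len(printable_ascii), ascii_ok, cp850_ok)
```

A receiving computer that understands only US ASCII has no value to assign to ç, so the character is lost or replaced unless both sides agree on a larger character set.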
The MS-DOS (and PC-DOS) operating systems provide several different character
sets for customers with different language needs. These character sets, which
are supported by PC hardware in the keyboard and video display, are known as
code pages. Each code page supports 256 characters. Many similarities exist
between the various code pages, but no two are identical. In some cases, the PC
code page commonly used in a country does not agree with that country's national
standard character set.
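To illustrate how two code pages can be similar without being identical, the sketch below decodes the same byte values under code pages 437 and 850. It assumes Python's built-in cp437 and cp850 codecs as stand-ins for the PC code pages; the variable names are illustrative only.

```python
# Sketch: the same byte value can map to different characters
# depending on which code page is in effect.
raw = bytes([0x9B])

as_cp437 = raw.decode("cp437")  # cent sign in code page 437
as_cp850 = raw.decode("cp850")  # o with stroke in code page 850

# Compare all 256 byte values across the two code pages.
differing = [
    value
    for value in range(256)
    if bytes([value]).decode("cp437") != bytes([value]).decode("cp850")
]

# The lower half (0x00-0x7F) is shared; differences appear in the upper half.
all_in_upper_half = all(value >= 0x80 for value in differing)

print(as_cp437, as_cp850, len(differing), all_in_upper_half)
```

In this sketch the two code pages agree on the lower 128 values and diverge only among the upper 128, which is why plain English text often survives a code page mismatch while accented letters and symbols do not.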
In many cases, the different national and industry standards for character
encoding overload the meaning of character codes: the same numerical value
represents a different character in the different encodings. A simple example of
this is character code 35 (hexadecimal 23), which in US ASCII is the # character.
In the United Kingdom's national variant of the 7-bit standard, this code
represents £, which is not present in US ASCII.
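The effect of this overloading can be sketched as follows. Python has no built-in codec for the United Kingdom 7-bit variant, so the override table and the function decode_uk_variant below are hypothetical, simplified stand-ins that model only the code 35 difference.

```python
# Sketch: one stored byte sequence, two readings.
# UK_OVERRIDES is a simplified, assumed table covering only code 35;
# it is not a complete definition of the UK national character set.
UK_OVERRIDES = {0x23: "£"}


def decode_uk_variant(data: bytes) -> str:
    """Decode 7-bit data, reading code 35 as the pound sign."""
    # decode("ascii") rejects any byte above 0x7F, keeping the data 7-bit.
    return data.decode("ascii").translate(UK_OVERRIDES)


stored = b"Cost: #5"

us_reading = stored.decode("ascii")      # read as US ASCII
uk_reading = decode_uk_variant(stored)   # read as the UK variant

print(ord("#"), us_reading, uk_reading)
```

The bytes never change; only the interpretation does. A price typed on one system can therefore appear with a different symbol on another without any transmission error having occurred.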