ASCII Table — Characters, Codes & Encodings

What is the ASCII Table?

ASCII (American Standard Code for Information Interchange) defines 128 characters using 7-bit codes (0–127). It covers control characters (0–31), printable characters including letters, digits and punctuation (32–126), and the delete character (127). ASCII forms the foundation of virtually every modern text encoding.
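
As a rough illustration of the three ranges just described, the sketch below classifies a code by range. The function name `ascii_category` is hypothetical and chosen only for this example.

```python
def ascii_category(code: int) -> str:
    """Classify a 7-bit ASCII code using the ranges described above."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII codes are limited to 0-127")
    if code <= 31:
        return "control"      # 0-31: control characters
    if code == 127:
        return "delete"       # 127: DEL
    return "printable"        # 32-126: letters, digits, punctuation, space


for ch in ("\t", "A", "~", "\x7f"):
    print(ord(ch), ascii_category(ord(ch)))
```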

Supported Encodings

Windows-1252

The most widely used single-byte encoding on Windows. It extends ASCII into the 0x80–0xFF range, adding the Euro sign (€), curly quotes, em dashes, and common European accented characters. Bytes 0x80–0x9F hold printable characters rather than the C1 control codes used by ISO-8859-1, although five positions in that range (0x81, 0x8D, 0x8F, 0x90, 0x9D) are unassigned.
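
The difference in the 0x80–0x9F range can be seen by decoding the same bytes both ways. This is a minimal sketch using Python's built-in `cp1252` and `latin-1` codecs; the sample bytes are chosen only for illustration.

```python
# Four bytes from the 0x80-0x9F range.
raw = bytes([0x80, 0x93, 0x94, 0x97])

as_cp1252 = raw.decode("cp1252")    # Euro sign, curly quotes, em dash
as_latin1 = raw.decode("latin-1")   # C1 control code points U+0080..U+009F

print(as_cp1252)
print([f"U+{ord(ch):04X}" for ch in as_latin1])
```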

ISO-8859-1 (Latin-1)

An international standard for Western European languages. Bytes 128–159 are reserved C1 control characters, and bytes 160–255 map directly to the corresponding Unicode code points (U+00A0–U+00FF). Most HTML pages historically defaulted to Latin-1 before UTF-8 became universal.
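
The one-to-one mapping between Latin-1 bytes and Unicode code points can be checked with a short sketch like the following, which relies only on Python's built-in `latin-1` codec.

```python
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")

# Every byte value should decode to the code point with the same number.
identity = all(ord(ch) == byte for ch, byte in zip(decoded, all_bytes))
print(identity)
```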

ISO-8859-15 (Latin-9)

A revision of Latin-1 that replaces eight less-used characters to add the Euro sign (€), Š/š, Ž/ž, Œ/œ, and Ÿ — improving support for French, Finnish, and other Western European languages.
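
One way to see exactly which eight positions changed is to compare the two encodings byte by byte. The helper `differing_bytes` below is a hypothetical name for this sketch; it uses Python's built-in `latin-1` and `iso8859_15` codecs.

```python
def differing_bytes(enc_a: str, enc_b: str) -> dict[int, tuple[str, str]]:
    """Return {byte: (char in enc_a, char in enc_b)} for bytes that decode differently."""
    diffs = {}
    for b in range(256):
        raw = bytes([b])
        a = raw.decode(enc_a, errors="replace")
        c = raw.decode(enc_b, errors="replace")
        if a != c:
            diffs[b] = (a, c)
    return diffs


latin9_changes = differing_bytes("latin-1", "iso8859_15")
for byte, (old, new) in latin9_changes.items():
    print(f"0x{byte:02X}: {old!r} -> {new!r}")
```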

ISO-8859-9 (Latin-5 / Turkish)

Identical to ISO-8859-1 except for six positions (0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE) which are replaced by Turkish-specific letters: Ğ/ğ (G with breve), İ/ı (dotted/dotless I), and Ş/ş (S with cedilla). The de-facto standard for Turkish text before UTF-8.
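
The six replaced positions can be confirmed by comparing the decoded output of both encodings. This sketch assumes Python's built-in `latin-1` and `iso8859_9` codecs; the variable name `turkish_changes` is illustrative.

```python
# Map each differing byte to (Latin-1 character, Latin-5 character).
turkish_changes = {}
for b in range(256):
    raw = bytes([b])
    latin1_char = raw.decode("latin-1")
    latin5_char = raw.decode("iso8859_9")
    if latin1_char != latin5_char:
        turkish_changes[b] = (latin1_char, latin5_char)

for byte, (old, new) in turkish_changes.items():
    print(f"0x{byte:02X}: {old} -> {new}")
```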

ISO-8859-7 (Greek)

The standard 8-bit encoding for the modern Greek alphabet. The ASCII range (0–127) is unchanged; bytes 0xB4–0xFE cover the Greek alphabet, including accented vowels, the standalone tonos and dialytika-tonos marks, and the final sigma (ς), with a few punctuation characters interleaved. Byte 0xFF is unassigned. Bytes 0xA4 and 0xA5 map to the Euro (€) and Drachma (₯) signs in the 2003 revision.
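
A short round trip shows Greek letters, an accented vowel, and the final sigma occupying one byte each. This is a sketch using Python's built-in `iso8859_7` codec; the sample word is chosen for illustration. Whether 0xA4 decodes to the Euro sign depends on which revision a given codec implements, so the sketch does not rely on it.

```python
# lambda, omicron with tonos, gamma, omicron, final sigma
greek_bytes = bytes([0xEB, 0xFC, 0xE3, 0xEF, 0xF2])

word = greek_bytes.decode("iso8859_7")
round_trip = word.encode("iso8859_7")

print(word, len(greek_bytes))

# The ASCII range is untouched by the encoding.
ascii_part = b"abc".decode("iso8859_7")
```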

ISO-8859-5 (Cyrillic)

An ISO standard for Cyrillic script used by Russian, Bulgarian, Serbian and related languages. Bytes 0xB0–0xEF map sequentially to the 64 basic Cyrillic letters А–я (U+0410–U+044F), with supplementary characters like Ё/ё at 0xA1/0xF1 and the numero sign (№) at 0xF0.
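
The sequential layout means the 64 letters can be recovered by decoding a plain byte range. A minimal sketch, assuming Python's built-in `iso8859_5` codec:

```python
# Decode the contiguous letter range and compare it with U+0410..U+044F.
block = bytes(range(0xB0, 0xF0)).decode("iso8859_5")
expected = "".join(chr(cp) for cp in range(0x0410, 0x0450))

# Supplementary characters mentioned above: 0xA1, 0xF1, 0xF0.
extras = bytes([0xA1, 0xF1, 0xF0]).decode("iso8859_5")

print(block == expected, extras)
```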

Windows-1251 (Cyrillic)

The dominant Cyrillic encoding on Windows and the web before UTF-8. Like Windows-1252 it repurposes bytes 0x80–0x9F for useful characters (typographic quotes, em dash, €, and additional Cyrillic letters) rather than C1 controls. 0xC0–0xFF map contiguously to А–я, making it compact and easy to use.
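
The contiguous letter range and the repurposed 0x80–0x9F bytes can both be checked with Python's built-in `cp1251` codec. The sample word below is only an illustration.

```python
# 0xC0-0xFF should decode to the same 64 letters, U+0410..U+044F.
letters = bytes(range(0xC0, 0x100)).decode("cp1251")

# A sample Russian word ("Privet") takes one byte per letter.
encoded = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("cp1251")

# 0x88 is the Euro sign and 0x97 the em dash in this code page.
specials = bytes([0x88, 0x97]).decode("cp1251")

print(encoded.hex(" "), specials)
```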

ISO-8859-6 (Arabic)

The ISO standard for Arabic script. Much of the 0xA0–0xFF range is undefined. Bytes 0xC1–0xDA and 0xE1–0xEA carry the Arabic letters, 0xE0 is the tatweel (kashida), and 0xEB–0xF2 hold the vowel marks (harakat). The Arabic comma (،) sits at 0xAC, the Arabic semicolon (؛) at 0xBB, and the Arabic question mark (؟) at 0xBF. Text is stored in logical order; right-to-left rendering is left to the display layer and is not encoded in the byte stream.
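
The gaps in the upper range can be found by attempting to decode each byte and noting which ones fail. This is a sketch that assumes Python's built-in `iso8859_6` codec rejects unassigned bytes with `UnicodeDecodeError`; the helper name `defined_bytes` is hypothetical.

```python
def defined_bytes(encoding: str, start: int = 0xA0, stop: int = 0x100) -> list[int]:
    """List the byte values in [start, stop) that the codec can decode."""
    found = []
    for b in range(start, stop):
        try:
            bytes([b]).decode(encoding)
        except UnicodeDecodeError:
            continue  # unassigned position
        found.append(b)
    return found


arabic_defined = defined_bytes("iso8859_6")
punctuation = bytes([0xAC, 0xBF]).decode("iso8859_6")

print(len(arabic_defined), "of 96 upper-range bytes are assigned")
```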

ISO-8859-8 (Hebrew)

The ISO standard for modern Hebrew. Bytes 0xE0–0xFA map to the 27 Hebrew letters (Alef–Tav including final forms). The range 0xA0–0xBF largely mirrors ISO-8859-1 but substitutes the multiplication sign (×) at 0xAA and division sign (÷) at 0xBA. Bytes 0xFD/0xFE carry Left-to-Right and Right-to-Left marks.
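
The letter range and the two substituted symbols can be inspected with Python's built-in `iso8859_8` codec, as in this minimal sketch.

```python
# 0xE0-0xFA: the 27 letters from Alef (U+05D0) to Tav (U+05EA).
hebrew = bytes(range(0xE0, 0xFB)).decode("iso8859_8")
expected = "".join(chr(cp) for cp in range(0x05D0, 0x05EB))

# 0xAA and 0xBA: multiplication and division signs.
symbols = bytes([0xAA, 0xBA]).decode("iso8859_8")

print(len(hebrew), hebrew[0], hebrew[-1], symbols)
```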

CP437 (DOS/OEM)

The original IBM PC character set. Code points 1–31 display as graphical glyphs (smiley faces, card suits, arrows) when rendered directly, although the same byte values still act as control characters in many contexts. Code points 128–255 include accented letters, Greek letters, mathematical symbols, and the iconic box-drawing characters used throughout classic DOS software.
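
Decoding a few upper-range bytes shows the box-drawing and Greek characters. This sketch uses Python's built-in `cp437` codec, which (as an implementation detail of that codec) maps the low bytes to control code points rather than to the graphical glyphs.

```python
# Top edge of a double-line box: corner, two horizontals, corner.
box = bytes([0xC9, 0xCD, 0xCD, 0xBB]).decode("cp437")

# Greek alpha and pi from the upper range.
greek = bytes([0xE0, 0xE3]).decode("cp437")

# Python's codec keeps 0x01 as the control code point U+0001.
low = bytes([0x01]).decode("cp437")

print(box, greek, repr(low))
```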

Understanding Character Codes

Decimal: The standard base-10 code point used in most documentation (e.g., 65 for 'A')
Hexadecimal: Base-16 representation preferred in programming (e.g., 0x41)
Octal: Base-8, used in Unix file permissions and older systems (e.g., 101)
Binary: The raw 8-bit value as stored in memory (e.g., 01000001)
HTML Entity: Named or numeric reference for safe inclusion in HTML markup
Unicode: The universal standard code point (U+0041) — Latin-1 characters map 1:1 to Unicode
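
The representations listed above can all be derived from a single code point. The following is a minimal sketch, not the page's own implementation; the function name `representations` is hypothetical, and the named entities come from Python's standard `html.entities` table.

```python
from html.entities import codepoint2name


def representations(ch: str) -> dict[str, str]:
    """Return the common notations for one character's code point."""
    code = ord(ch)
    name = codepoint2name.get(code)  # None when no named entity exists
    return {
        "decimal": str(code),
        "hex": f"0x{code:02X}",
        "octal": f"{code:o}",
        "binary": f"{code:08b}",  # padded to 8 bits; longer above 255
        "html": f"&{name};" if name else f"&#{code};",
        "unicode": f"U+{code:04X}",
    }


print(representations("A"))
```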

View Modes

Table: Full data table showing all representations — ideal for looking up a specific character
Grid: Visual card grid sorted by code — best for browsing and copying by appearance
Compact: Classic 16-column reference matrix — the traditional format used in programming manuals
Chart: Category distribution bar chart and heatmap — shows the structure of the encoding at a glance

Common Use Cases

Debugging encoding issues: Identify the decimal or hex value of a character that looks wrong in a string — essential when tracking down byte-order marks, non-breaking spaces, or invisible control characters.
Low-level programming: Look up the numeric code for a character to use in comparisons without magic numbers — 'A' = 65 = 0x41 = 0b01000001.
HTML entities: Find the HTML entity or numeric reference for any special character to include safely in markup without escaping issues.
Legacy system migration: Map Windows-1252 or CP437 code points to their Unicode equivalents when migrating data from older systems.
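
Two of these use cases can be sketched in a few lines: listing the code points of a string to expose invisible characters, and re-encoding legacy bytes as UTF-8. The helper name `describe` and the sample data are assumptions made for this example.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each character's code point with its Unicode name."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# A string hiding a byte-order mark and a non-breaking space.
suspicious = "\ufeffprice:\u00a0100"
for code, name in describe(suspicious):
    print(code, name)

# Migrating Windows-1252 bytes: decode with the legacy codec, then encode as UTF-8.
legacy = b"caf\xe9 \x80"
migrated = legacy.decode("cp1252").encode("utf-8")
```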

Frequently Asked Questions

What is the difference between ASCII and Unicode?: ASCII is a 7-bit standard covering 128 characters (0–127). Unicode is a universal standard covering more than 150,000 characters from the world's writing systems. The first 128 Unicode code points (U+0000–U+007F) are identical to ASCII.
What is the difference between ASCII and UTF-8?: ASCII defines code points 0–127. UTF-8 encodes Unicode as byte sequences. For the ASCII range, UTF-8 is byte-for-byte identical — one byte per character. Characters above 127 use 2–4 bytes in UTF-8.
Why does Windows-1252 differ from ISO-8859-1?: Both extend ASCII to 256 characters but differ in the 0x80–0x9F range. ISO-8859-1 reserves those bytes for C1 control codes. Windows-1252 maps them to useful printable characters including the Euro sign (€), curly quotes, and em dashes.
What are ASCII control characters?: Code points 0–31 are non-printable control characters: NUL (0), TAB (9), LF (10), CR (13), ESC (27), and others; DEL (127) is also a control character. They were originally designed for teletype control, but tab, line feed, and carriage return remain in daily use today.
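
The UTF-8 byte counts mentioned in the answers above can be verified directly. In this minimal sketch the sample characters are chosen for illustration, one for each encoded length.

```python
# "A", e-acute, Euro sign, and an emoji: 1, 2, 3 and 4 bytes respectively.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, size in utf8_lengths.items():
    print(f"U+{ord(ch):04X} -> {size} byte(s)")

# For pure ASCII text the two encodings produce identical bytes.
same_bytes = "ASCII text".encode("ascii") == "ASCII text".encode("utf-8")
```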

Privacy & Security

This is a reference table — no data is entered or transmitted. All character lookups happen locally in your browser.