Character Encoding Standards - Markdown
Character encoding standards specify how characters are represented as digital data in computers. Schemes include ASCII, UTF-8, UTF-16, Shift_JIS, and EUC-JP, each with its own character repertoire, byte structure, and degree of ASCII compatibility. UTF-8 is now the de facto standard for the web and for data interchange, but legacy encodings such as Shift_JIS and EUC-JP are still found in Japanese computing environments.
character encoding
Unicode
UTF-8
ASCII
Shift_JIS
EUC-JP
charset
internationalization
| code | slug | name | description | asciiCompatible | byteStructure | japaneseSupport | maxCharacters | usage | yearIntroduced |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ASCII | ascii | ASCII | American Standard Code for Information Interchange. Represents alphanumeric characters and symbols in 7 bits. | true | Fixed-length (1 byte, 7 bits used) | false | 128 | legacy | 1963 |
| UTF-8 | utf-8 | UTF-8 | Variable-length Unicode encoding. ASCII-compatible and represents characters worldwide. | true | Variable-length (1 to 4 bytes) | true | 1114112 | standard | 1993 |
| UTF-16 | utf-16 | UTF-16 | Variable-length Unicode encoding built on 16-bit code units. Widely used in Windows and Java. | false | Variable-length (2 or 4 bytes) | true | 1114112 | system | 1996 |
| UTF-32 | utf-32 | UTF-32 | Fixed-length 32-bit Unicode encoding. Used for internal processing. | false | Fixed-length (4 bytes) | true | 1114112 | internal | 1996 |
| Shift_JIS | shift-jis | Shift_JIS | Legacy Japanese encoding. Widely used in Windows systems. | false | Variable-length (1 to 2 bytes) | true | 10000 | legacy | 1978 |
| EUC-JP | euc-jp | EUC-JP | Japanese encoding used in Unix/Linux. ASCII-compatible. | true | Variable-length (1 to 3 bytes) | true | 11000 | legacy | 1988 |
| ISO-2022-JP | iso-2022-jp | ISO-2022-JP | 7-bit escape sequence Japanese encoding. Used in email. | true | 7-bit variable-length (uses escape sequences) | true | 10000 | legacy | 1983 |
| GB2312 | gb2312 | GB2312 | National standard encoding for Simplified Chinese. | false | Variable-length (1 to 2 bytes) | false | 7445 | legacy | 1980 |
| Big5 | big5 | Big5 | Traditional Chinese encoding used in Taiwan and Hong Kong. | false | Variable-length (1 to 2 bytes) | false | 13000 | legacy | 1984 |
| Windows-1252 | windows-1252 | Windows-1252 | Western European encoding used in Windows. | true | Fixed-length (1 byte) | false | 256 | legacy | 1992 |
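
The byte structures in the table can be observed directly by encoding the same character under several schemes. The following is a minimal Python sketch, not a definitive implementation: the `CODECS` mapping from table codes to Python standard-library codec names, and the helper names `byte_length` and `report`, are assumptions introduced here for illustration. UTF-16 and UTF-32 use the little-endian variants so that no byte order mark is counted.

```python
# Python codec names below are this sketch's assumption about how the
# table's encodings map onto the standard library.
CODECS = {
    "ASCII": "ascii",
    "UTF-8": "utf-8",
    "UTF-16": "utf-16-le",  # little-endian, no byte order mark
    "UTF-32": "utf-32-le",  # little-endian, no byte order mark
    "Shift_JIS": "shift_jis",
    "EUC-JP": "euc_jp",
    "ISO-2022-JP": "iso2022_jp",
    "Windows-1252": "cp1252",
}


def byte_length(text, codec):
    """Return the encoded length in bytes, or None if text is not representable."""
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


def report(text):
    """Map each table code to the byte length of text in that encoding."""
    return {code: byte_length(text, codec) for code, codec in CODECS.items()}


if __name__ == "__main__":
    # U+3042 is HIRAGANA LETTER A; written as an escape to keep the source ASCII.
    for sample in ("A", "\u3042"):
        print(repr(sample), report(sample))
```

For the hiragana sample, UTF-8 needs 3 bytes while Shift_JIS and EUC-JP need 2, and ASCII and Windows-1252 cannot represent it at all, which is why `byte_length` returns `None` for them.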