TSV
Character Encoding Standards - TSV
Character encoding standards specify how characters are represented as digital data. Many schemes exist, including ASCII, UTF-8, UTF-16, Shift_JIS, and EUC-JP, and they differ in character repertoire, byte structure, and compatibility. UTF-8 is now the de facto international standard, but legacy encodings such as Shift_JIS and EUC-JP remain in use in Japanese computing environments.
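The difference in byte structure can be seen by encoding one string several ways. The following is a minimal sketch, assuming Python's built-in codec names `utf-8`, `utf-16-le`, `shift_jis`, and `euc_jp`; the sample string and the helper name `encoded_lengths` are illustrative only.

```python
# Encode the same text with several encodings and compare byte lengths.
# The sample text (one ASCII letter plus three kanji) is an arbitrary choice.
text = "A日本語"


def encoded_lengths(s, encodings=("utf-8", "utf-16-le", "shift_jis", "euc_jp")):
    """Return {codec name: number of bytes needed to encode s}."""
    return {enc: len(s.encode(enc)) for enc in encodings}


lengths = encoded_lengths(text)
for enc, n in lengths.items():
    print(f"{enc}: {n} bytes")

# ASCII has no code for kanji, so encoding the same text fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("ASCII can encode the text:", ascii_ok)
```

In this sample the kanji take three bytes each in UTF-8 but two bytes each in UTF-16, Shift_JIS, and EUC-JP, while the ASCII letter takes one byte everywhere except UTF-16.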
character encoding
Unicode
UTF-8
ASCII
Shift_JIS
EUC-JP
charset
internationalization
code	slug	name	description	asciiCompatible	byteStructure	japaneseSupport	maxCharacters	usage	yearIntroduced
ASCII	ascii	ASCII	American Standard Code for Information Interchange. Represents alphanumeric characters and symbols in 7 bits.	true	Fixed-length (1 byte, 7 bits used)	false	128	legacy	1963
UTF-8	utf-8	UTF-8	Variable-length Unicode encoding. ASCII-compatible and represents characters worldwide.	true	Variable-length (1-4 bytes)	true	1114112	standard	1993
UTF-16	utf-16	UTF-16	Unicode encoding built on 16-bit code units. Widely used in Windows and Java.	false	Variable-length (2 or 4 bytes)	true	1114112	system	1996
UTF-32	utf-32	UTF-32	Fixed-length 32-bit Unicode encoding. Used for internal processing.	false	Fixed-length (4 bytes)	true	1114112	internal	1996
Shift_JIS	shift-jis	Shift_JIS	Legacy Japanese encoding. Widely used in Windows systems.	false	Variable-length (1-2 bytes)	true	10000	legacy	1978
EUC-JP	euc-jp	EUC-JP	Japanese encoding used in Unix/Linux. ASCII-compatible.	true	Variable-length (1-3 bytes)	true	11000	legacy	1988
ISO-2022-JP	iso-2022-jp	ISO-2022-JP	7-bit Japanese encoding that uses escape sequences. Used in email.	true	7-bit variable-length (uses escape sequences)	true	10000	legacy	1983
GB2312	gb2312	GB2312	National standard encoding for Simplified Chinese.	false	Variable-length (1-2 bytes)	false	7445	legacy	1980
Big5	big5	Big5	Traditional Chinese encoding used in Taiwan and Hong Kong.	false	Variable-length (1-2 bytes)	false	13000	legacy	1984
Windows-1252	windows-1252	Windows-1252	Western European encoding used in Windows.	true	Fixed-length (1 byte)	false	256	legacy	1992
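The table can be loaded and filtered by its columns. The following is a rough sketch, assuming the fields are tab-separated and that the boolean columns hold the literal strings `true` and `false`; the helper name `load_encodings` is hypothetical, and the sample rows are adapted from the table with abbreviated descriptions.

```python
import csv
import io

# Column names as they appear in the table header.
HEADER = ["code", "slug", "name", "description", "asciiCompatible",
          "byteStructure", "japaneseSupport", "maxCharacters", "usage",
          "yearIntroduced"]

# A few rows adapted from the table (descriptions shortened for the sketch).
ROWS = [
    ["ASCII", "ascii", "ASCII", "7-bit alphanumerics and symbols.", "true",
     "Fixed-length (1 byte, 7 bits used)", "false", "128", "legacy", "1963"],
    ["UTF-8", "utf-8", "UTF-8", "Variable-length Unicode encoding.", "true",
     "Variable-length (1-4 bytes)", "true", "1114112", "standard", "1993"],
    ["Shift_JIS", "shift-jis", "Shift_JIS", "Legacy Japanese encoding.", "false",
     "Variable-length (1-2 bytes)", "true", "10000", "legacy", "1978"],
    ["EUC-JP", "euc-jp", "EUC-JP", "Japanese encoding used in Unix/Linux.", "true",
     "Variable-length (1-3 bytes)", "true", "11000", "legacy", "1988"],
]
SAMPLE_TSV = "\n".join("\t".join(r) for r in [HEADER, *ROWS]) + "\n"


def load_encodings(tsv_text):
    """Parse tab-separated text into dicts with typed boolean and integer fields."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        for key in ("asciiCompatible", "japaneseSupport"):
            row[key] = row[key] == "true"
        for key in ("maxCharacters", "yearIntroduced"):
            row[key] = int(row[key])
        records.append(row)
    return records


records = load_encodings(SAMPLE_TSV)

# Encodings that are both ASCII-compatible and able to represent Japanese.
candidates = [r["code"] for r in records
              if r["asciiCompatible"] and r["japaneseSupport"]]
print(candidates)
```

Converting the flag and count columns to `bool` and `int` at load time keeps later filters free of string comparisons.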