Character Encoding Standards - CSV

Character encoding standards specify how characters are represented as bytes in computers. Schemes include ASCII, UTF-8, UTF-16, Shift_JIS, and EUC-JP, which differ in character repertoire, byte structure, and ASCII compatibility. UTF-8 is now the most widely adopted encoding internationally, but legacy encodings such as Shift_JIS and EUC-JP are still used in Japanese computing environments.
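To make the differences in byte structure concrete, the sketch below encodes the same text under several of these schemes and reports the byte length of each result. It is only an illustration: the helper name `encoded_lengths` and the sample strings are assumptions, not part of the dataset. It uses Python's built-in codecs, and the `-le` variants of UTF-16 and UTF-32 are chosen so that no byte order mark is added to the count.

```python
# Compare how many bytes the same text occupies under different encodings.
# The codec names are Python's standard codec aliases; "utf-16-le" and
# "utf-32-le" are used so that no byte order mark inflates the length.
ENCODINGS = ["ascii", "utf-8", "utf-16-le", "utf-32-le",
             "shift_jis", "euc_jp", "iso2022_jp"]


def encoded_lengths(text):
    """Return {encoding: byte length}, or None where text is not representable."""
    result = {}
    for name in ENCODINGS:
        try:
            result[name] = len(text.encode(name))
        except UnicodeEncodeError:
            # The encoding has no representation for this text.
            result[name] = None
    return result


if __name__ == "__main__":
    # "A" is an ASCII letter; "あ" is a hiragana character.
    for sample in ["A", "あ"]:
        print(sample, encoded_lengths(sample))
```

Run as a script, this prints one dictionary per sample. The ASCII letter takes a single byte in ASCII, UTF-8, Shift_JIS, and EUC-JP, while the hiragana character cannot be encoded in ASCII at all and takes a different number of bytes in each of the other schemes.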

character encoding Unicode UTF-8 ASCII Shift_JIS EUC-JP charset internationalization
code,slug,name,description,asciiCompatible,byteStructure,japaneseSupport,maxCharacters,usage,yearIntroduced
ASCII,ascii,ASCII,American Standard Code for Information Interchange. Represents alphanumeric characters and symbols in 7 bits.,true,Fixed-length (1 byte using 7 bits),false,128,legacy,1963
UTF-8,utf-8,UTF-8,Variable-length Unicode encoding. ASCII-compatible and represents characters worldwide.,true,Variable-length (1-4 bytes),true,1114112,standard,1993
UTF-16,utf-16,UTF-16,16-bit Unicode encoding. Widely used in Windows and Java.,false,Variable-length (2 or 4 bytes),true,1114112,system,1996
UTF-32,utf-32,UTF-32,Fixed-length 32-bit Unicode encoding. Used for internal processing.,false,Fixed-length (4 bytes),true,1114112,internal,1996
Shift_JIS,shift-jis,Shift_JIS,Legacy Japanese encoding. Widely used in Windows systems.,false,Variable-length (1-2 bytes),true,10000,legacy,1978
EUC-JP,euc-jp,EUC-JP,Japanese encoding used in Unix/Linux. ASCII-compatible.,true,Variable-length (1-3 bytes),true,11000,legacy,1988
ISO-2022-JP,iso-2022-jp,ISO-2022-JP,7-bit escape sequence Japanese encoding. Used in email.,true,7-bit variable-length (uses escape sequences),true,10000,legacy,1983
GB2312,gb2312,GB2312,National standard encoding for Simplified Chinese.,false,Variable-length (1-2 bytes),false,7445,legacy,1980
Big5,big5,Big5,Traditional Chinese encoding used in Taiwan and Hong Kong.,false,Variable-length (1-2 bytes),false,13000,legacy,1984
Windows-1252,windows-1252,Windows-1252,Western European encoding used in Windows.,true,Fixed-length (1 byte),false,256,legacy,1992