YAML

Character Encoding Standards - YAML

Character encoding standards are specifications for representing characters as digital data in computers. Various schemes exist, including ASCII, UTF-8, UTF-16, Shift_JIS, and EUC-JP, each with a different character set, byte structure, and degree of ASCII compatibility. UTF-8 is now the dominant encoding internationally, but legacy encodings such as Shift_JIS and EUC-JP are still used in Japanese computing environments.
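
To make the differences in byte structure concrete, here is a minimal Python sketch that encodes the same text under several of the encodings listed in this file and reports the byte count. The helper name `encoded_length` and the sample strings are illustrative assumptions, not part of the data set; the codec names are Python's standard codec aliases, which differ slightly from the `code` values used in the records.

```python
from typing import Optional

# Python codec aliases for some of the encodings in this file.
# "utf-16-le" and "utf-32-le" are used so that no byte order mark is added.
CODECS = [
    "ascii", "utf-8", "utf-16-le", "utf-32-le",
    "shift_jis", "euc_jp", "iso2022_jp", "cp1252",
]


def encoded_length(text: str, codec: str) -> Optional[int]:
    """Return the number of bytes `text` occupies in `codec`.

    Returns None when the codec cannot represent the text at all
    (for example, Japanese text in ASCII or Windows-1252).
    """
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    # One ASCII letter and one hiragana character (illustrative samples).
    for codec in CODECS:
        print(codec, encoded_length("A", codec), encoded_length("あ", codec))
```

Running the script prints one line per codec. Codecs without Japanese support report `None` for the hiragana sample, which corresponds to `japaneseSupport: false` in the records.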

character encoding Unicode UTF-8 ASCII Shift_JIS EUC-JP charset internationalization
- code: "ASCII"
  slug: "ascii"
  name: "ASCII"
  description: "American Standard Code for Information Interchange. Represents alphanumeric characters and symbols in 7 bits."
  yearIntroduced: 1963
  byteStructure: "Fixed-length (1 byte, 7 bits used)"
  maxCharacters: 128
  asciiCompatible: true
  japaneseSupport: false
  usage: "legacy"
- code: "UTF-8"
  slug: "utf-8"
  name: "UTF-8"
  description: "Variable-length Unicode encoding. ASCII-compatible and represents characters worldwide."
  yearIntroduced: 1993
  byteStructure: "Variable-length (1 to 4 bytes)"
  maxCharacters: 1114112
  asciiCompatible: true
  japaneseSupport: true
  usage: "standard"
- code: "UTF-16"
  slug: "utf-16"
  name: "UTF-16"
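  description: "Unicode encoding built on 16-bit code units; characters outside the Basic Multilingual Plane use two units (a surrogate pair). Widely used in Windows and Java."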
  yearIntroduced: 1996
  byteStructure: "Variable-length (2 or 4 bytes)"
  maxCharacters: 1114112
  asciiCompatible: false
  japaneseSupport: true
  usage: "system"
- code: "UTF-32"
  slug: "utf-32"
  name: "UTF-32"
  description: "Fixed-length 32-bit Unicode encoding. Used for internal processing."
  yearIntroduced: 1996
  byteStructure: "Fixed-length (4 bytes)"
  maxCharacters: 1114112
  asciiCompatible: false
  japaneseSupport: true
  usage: "internal"
- code: "Shift_JIS"
  slug: "shift-jis"
  name: "Shift_JIS"
  description: "Legacy Japanese encoding. Widely used in Windows systems."
  yearIntroduced: 1978
  byteStructure: "Variable-length (1 to 2 bytes)"
  maxCharacters: 10000
  asciiCompatible: false
  japaneseSupport: true
  usage: "legacy"
- code: "EUC-JP"
  slug: "euc-jp"
  name: "EUC-JP"
  description: "Japanese encoding used in Unix/Linux. ASCII-compatible."
  yearIntroduced: 1988
  byteStructure: "Variable-length (1 to 3 bytes)"
  maxCharacters: 11000
  asciiCompatible: true
  japaneseSupport: true
  usage: "legacy"
- code: "ISO-2022-JP"
  slug: "iso-2022-jp"
  name: "ISO-2022-JP"
  description: "7-bit Japanese encoding that switches character sets with escape sequences. Used in email."
  yearIntroduced: 1983
  byteStructure: "7-bit variable-length (uses escape sequences)"
  maxCharacters: 10000
  asciiCompatible: true
  japaneseSupport: true
  usage: "legacy"
- code: "GB2312"
  slug: "gb2312"
  name: "GB2312"
  description: "National standard encoding for Simplified Chinese."
  yearIntroduced: 1980
  byteStructure: "Variable-length (1 to 2 bytes)"
  maxCharacters: 7445
  asciiCompatible: false
  japaneseSupport: false
  usage: "legacy"
- code: "Big5"
  slug: "big5"
  name: "Big5"
  description: "Traditional Chinese encoding used in Taiwan and Hong Kong."
  yearIntroduced: 1984
  byteStructure: "Variable-length (1 to 2 bytes)"
  maxCharacters: 13000
  asciiCompatible: false
  japaneseSupport: false
  usage: "legacy"
- code: "Windows-1252"
  slug: "windows-1252"
  name: "Windows-1252"
  description: "Western European encoding used in Windows."
  yearIntroduced: 1992
  byteStructure: "Fixed-length (1 byte)"
  maxCharacters: 256
  asciiCompatible: true
  japaneseSupport: false
  usage: "legacy"