World Languages Details - YAML
World Languages Details provides information about the major languages spoken worldwide, based on linguistic sources such as Ethnologue. Each entry includes the language family classification, the writing system (script), and the number of native (L1) speakers, second-language (L2) speakers, and total speakers. Although more than 7,000 languages are spoken in the world today, this dataset focuses on major languages with 50 million or more speakers. It can be used in fields such as linguistic research, international business, education, and translation services.
language
language family
writing system
speakers
linguistics
multilingual
international communication
- code: "1"
slug: "english"
name: "English"
description: "The most widely spoken language in the world by total number of speakers, functioning as an international lingua franca."
languageFamily: "Indo-European (Germanic)"
writingSystem: "Latin alphabet"
totalSpeakers: 1520000000
nativeSpeakers: 380000000
secondLanguageSpeakers: 1140000000
- code: "2"
slug: "mandarin-chinese"
name: "Mandarin Chinese"
description: "The language with the most native speakers worldwide, primarily used in mainland China, Taiwan, and Singapore."
languageFamily: "Sino-Tibetan (Sinitic)"
writingSystem: "Chinese characters (Simplified/Traditional)"
totalSpeakers: 1180000000
nativeSpeakers: 940000000
secondLanguageSpeakers: 240000000
- code: "3"
slug: "hindi"
name: "Hindi"
description: "A major language of India belonging to the Indo-Aryan branch."
languageFamily: "Indo-European (Indo-Aryan)"
writingSystem: "Devanagari script"
totalSpeakers: 610000000
nativeSpeakers: 350000000
secondLanguageSpeakers: 260000000
- code: "4"
slug: "spanish"
name: "Spanish"
description: "A Romance language widely used in Spain and Latin America."
languageFamily: "Indo-European (Romance)"
writingSystem: "Latin alphabet"
totalSpeakers: 560000000
nativeSpeakers: 490000000
secondLanguageSpeakers: 70000000
- code: "5"
slug: "french"
name: "French"
description: "A Romance language with a growing number of speakers, particularly in Africa."
languageFamily: "Indo-European (Romance)"
writingSystem: "Latin alphabet"
totalSpeakers: 320000000
nativeSpeakers: 80000000
secondLanguageSpeakers: 240000000
- code: "6"
slug: "arabic"
name: "Arabic"
description: "An Afro-Asiatic language widely used in the Middle East and North Africa."
languageFamily: "Afro-Asiatic (Semitic)"
writingSystem: "Arabic script"
totalSpeakers: 340000000
nativeSpeakers: 320000000
secondLanguageSpeakers: 20000000
- code: "7"
slug: "bengali"
name: "Bengali"
description: "An Indo-Aryan language used in Bangladesh and eastern India."
languageFamily: "Indo-European (Indo-Aryan)"
writingSystem: "Bengali script"
totalSpeakers: 280000000
nativeSpeakers: 230000000
secondLanguageSpeakers: 50000000
- code: "8"
slug: "portuguese"
name: "Portuguese"
description: "A Romance language used in the Lusophone world including Brazil."
languageFamily: "Indo-European (Romance)"
writingSystem: "Latin alphabet"
totalSpeakers: 270000000
nativeSpeakers: 230000000
secondLanguageSpeakers: 40000000
- code: "9"
slug: "russian"
name: "Russian"
description: "A Slavic language widely used in former Soviet countries."
languageFamily: "Indo-European (Slavic)"
writingSystem: "Cyrillic script"
totalSpeakers: 260000000
nativeSpeakers: 150000000
secondLanguageSpeakers: 110000000
- code: "10"
slug: "urdu"
name: "Urdu"
description: "An Indo-Aryan language and the national language of Pakistan."
languageFamily: "Indo-European (Indo-Aryan)"
writingSystem: "Perso-Arabic script"
totalSpeakers: 250000000
nativeSpeakers: 70000000
secondLanguageSpeakers: 180000000
- code: "11"
slug: "indonesian"
name: "Indonesian"
description: "An Austronesian language and the official language of Indonesia."
languageFamily: "Austronesian"
writingSystem: "Latin alphabet"
totalSpeakers: 250000000
nativeSpeakers: 40000000
secondLanguageSpeakers: 210000000
- code: "12"
slug: "german"
name: "German"
description: "A Germanic language widely used in Central Europe."
languageFamily: "Indo-European (Germanic)"
writingSystem: "Latin alphabet"
totalSpeakers: 135000000
nativeSpeakers: 80000000
secondLanguageSpeakers: 55000000
- code: "13"
slug: "japanese"
name: "Japanese"
description: "A language spoken in Japan and the main member of the Japonic family, which has no demonstrated relationship to other language families."
languageFamily: "Japonic"
writingSystem: "Kanji, Hiragana, and Katakana"
totalSpeakers: 125000000
nativeSpeakers: 124000000
secondLanguageSpeakers: 1000000
- code: "14"
slug: "nigerian-pidgin"
name: "Nigerian Pidgin"
description: "An English-based creole widely used in Nigeria."
languageFamily: "English-based Creole"
writingSystem: "Latin alphabet"
totalSpeakers: 120000000
nativeSpeakers: 5000000
secondLanguageSpeakers: 115000000
- code: "15"
slug: "egyptian-arabic"
name: "Egyptian Arabic"
description: "A variety of Arabic widely used in Egypt."
languageFamily: "Afro-Asiatic (Semitic)"
writingSystem: "Arabic script"
totalSpeakers: 120000000
nativeSpeakers: 100000000
secondLanguageSpeakers: 20000000
- code: "16"
slug: "marathi"
name: "Marathi"
description: "An Indo-Aryan language of India, primarily used in Maharashtra state."
languageFamily: "Indo-European (Indo-Aryan)"
writingSystem: "Devanagari script"
totalSpeakers: 99000000
nativeSpeakers: 83000000
secondLanguageSpeakers: 16000000
- code: "17"
slug: "telugu"
name: "Telugu"
description: "A Dravidian language widely used in southern India."
languageFamily: "Dravidian"
writingSystem: "Telugu script"
totalSpeakers: 96000000
nativeSpeakers: 83000000
secondLanguageSpeakers: 13000000
- code: "18"
slug: "turkish"
name: "Turkish"
description: "A Turkic language primarily used in Turkey and Cyprus."
languageFamily: "Turkic"
writingSystem: "Latin alphabet"
totalSpeakers: 91000000
nativeSpeakers: 84000000
secondLanguageSpeakers: 7000000
- code: "19"
slug: "tamil"
name: "Tamil"
description: "A classical Dravidian language used in southern India and Sri Lanka."
languageFamily: "Dravidian"
writingSystem: "Tamil script"
totalSpeakers: 86000000
nativeSpeakers: 75000000
secondLanguageSpeakers: 11000000
- code: "20"
slug: "cantonese"
name: "Cantonese (Yue Chinese)"
description: "A Chinese variety primarily used in Hong Kong, Macau, and Guangdong Province."
languageFamily: "Sino-Tibetan (Sinitic)"
writingSystem: "Chinese characters"
totalSpeakers: 86000000
nativeSpeakers: 85000000
secondLanguageSpeakers: 1000000
- code: "21"
slug: "vietnamese"
name: "Vietnamese"
description: "An Austroasiatic language spoken in Vietnam and written in a Latin-based alphabet."
languageFamily: "Austroasiatic (Viet-Muong)"
writingSystem: "Latin alphabet (Quoc Ngu)"
totalSpeakers: 97000000
nativeSpeakers: 85000000
secondLanguageSpeakers: 12000000
- code: "22"
slug: "wu-chinese"
name: "Wu Chinese (Shanghainese)"
description: "A Chinese variety used in the Yangtze River Delta region centered on Shanghai."
languageFamily: "Sino-Tibetan (Sinitic)"
writingSystem: "Chinese characters"
totalSpeakers: 83000000
nativeSpeakers: 83000000
secondLanguageSpeakers: 0
- code: "23"
slug: "tagalog"
name: "Tagalog (Filipino)"
description: "An Austronesian language and, in its standardized form Filipino, an official language of the Philippines."
languageFamily: "Austronesian"
writingSystem: "Latin alphabet"
totalSpeakers: 83000000
nativeSpeakers: 30000000
secondLanguageSpeakers: 53000000
- code: "24"
slug: "korean"
name: "Korean"
description: "A language spoken on the Korean Peninsula with its own writing system called Hangul."
languageFamily: "Koreanic (language isolate)"
writingSystem: "Hangul"
totalSpeakers: 82000000
nativeSpeakers: 80000000
secondLanguageSpeakers: 2000000
- code: "25"
slug: "farsi"
name: "Persian (Farsi)"
description: "An Indo-European language of the Iranian branch, spoken primarily in Iran."
languageFamily: "Indo-European (Iranian)"
writingSystem: "Perso-Arabic script"
totalSpeakers: 79000000
nativeSpeakers: 55000000
secondLanguageSpeakers: 24000000
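
In every record above, totalSpeakers equals nativeSpeakers plus secondLanguageSpeakers. The following is a minimal sketch of how that relationship could be checked, and how an L2 share could be derived, once the records have been parsed. It assumes the records are already loaded as Python dictionaries; the two sample records are copied from the data above, while the function names `inconsistent_slugs` and `second_language_share` are hypothetical and not part of the dataset.

```python
# Two sample records copied from the dataset, assumed to be already parsed
# from YAML into plain dictionaries (the parsing step is not shown here).
records = [
    {
        "slug": "english",
        "totalSpeakers": 1520000000,
        "nativeSpeakers": 380000000,
        "secondLanguageSpeakers": 1140000000,
    },
    {
        "slug": "wu-chinese",
        "totalSpeakers": 83000000,
        "nativeSpeakers": 83000000,
        "secondLanguageSpeakers": 0,
    },
]


def inconsistent_slugs(records):
    """Return slugs whose total does not equal native + second-language speakers."""
    return [
        r["slug"]
        for r in records
        if r["totalSpeakers"] != r["nativeSpeakers"] + r["secondLanguageSpeakers"]
    ]


def second_language_share(record):
    """Return the fraction of a language's total speakers who are L2 speakers."""
    return record["secondLanguageSpeakers"] / record["totalSpeakers"]


if __name__ == "__main__":
    print("Inconsistent records:", inconsistent_slugs(records))
    for r in records:
        print(r["slug"], "L2 share:", second_language_share(r))
```

A check like this is mainly useful after editing the data by hand, since the three speaker fields are redundant and can drift apart.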