World Languages Details - CSV
World Languages Details provides reference information about major languages spoken worldwide, based on linguistic sources such as Ethnologue. For each language, it lists the language family, the writing system (script), the number of native (L1) speakers, the number of second-language (L2) speakers, and the total number of speakers. Of the more than 7,000 languages in the world, this dataset focuses primarily on major languages with 50 million or more total speakers. It can be used in fields such as linguistic research, international business, education, and translation services.
language
language family
writing system
speakers
linguistics
multilingual
international communication
code,slug,name,description,languageFamily,nativeSpeakers,secondLanguageSpeakers,totalSpeakers,writingSystem
1,english,English,"The most widely spoken language in the world, functioning as an international lingua franca.",Indo-European (Germanic),380000000,1140000000,1520000000,Latin alphabet
2,mandarin-chinese,Mandarin Chinese,"The language with the most native speakers worldwide, primarily used in mainland China, Taiwan, and Singapore.",Sino-Tibetan (Sinitic),940000000,240000000,1180000000,Chinese characters (Simplified/Traditional)
3,hindi,Hindi,A major language of India belonging to the Indo-Aryan branch.,Indo-European (Indo-Aryan),350000000,260000000,610000000,Devanagari script
4,spanish,Spanish,A Romance language widely used in Spain and Latin America.,Indo-European (Romance),490000000,70000000,560000000,Latin alphabet
5,french,French,"A Romance language with a growing number of speakers, particularly in Africa.",Indo-European (Romance),80000000,240000000,320000000,Latin alphabet
6,arabic,Arabic,An Afro-Asiatic language widely used in the Middle East and North Africa.,Afro-Asiatic (Semitic),320000000,20000000,340000000,Arabic script
7,bengali,Bengali,An Indo-Aryan language used in Bangladesh and eastern India.,Indo-European (Indo-Aryan),230000000,50000000,280000000,Bengali script
8,portuguese,Portuguese,A Romance language used in the Lusophone world including Brazil.,Indo-European (Romance),230000000,40000000,270000000,Latin alphabet
9,russian,Russian,A Slavic language widely used in former Soviet countries.,Indo-European (Slavic),150000000,110000000,260000000,Cyrillic script
10,urdu,Urdu,An Indo-Aryan language and the national language of Pakistan.,Indo-European (Indo-Aryan),70000000,180000000,250000000,Perso-Arabic script
11,indonesian,Indonesian,An Austronesian language and the official language of Indonesia.,Austronesian,40000000,210000000,250000000,Latin alphabet
12,german,German,A Germanic language widely used in Central Europe.,Indo-European (Germanic),80000000,55000000,135000000,Latin alphabet
13,japanese,Japanese,"A language spoken in Japan; it belongs to the small Japonic family and is sometimes described as a language isolate.",Japonic,124000000,1000000,125000000,"Kanji, Hiragana, and Katakana"
14,nigerian-pidgin,Nigerian Pidgin,An English-based creole widely used in Nigeria.,English-based Creole,5000000,115000000,120000000,Latin alphabet
15,egyptian-arabic,Egyptian Arabic,A variety of Arabic widely used in Egypt.,Afro-Asiatic (Semitic),100000000,20000000,120000000,Arabic script
16,marathi,Marathi,"An Indo-Aryan language of India, primarily used in Maharashtra state.",Indo-European (Indo-Aryan),83000000,16000000,99000000,Devanagari script
17,telugu,Telugu,A Dravidian language widely used in southern India.,Dravidian,83000000,13000000,96000000,Telugu script
18,turkish,Turkish,A Turkic language primarily used in Turkey and Cyprus.,Turkic,84000000,7000000,91000000,Latin alphabet
19,tamil,Tamil,A classical Dravidian language used in southern India and Sri Lanka.,Dravidian,75000000,11000000,86000000,Tamil script
20,cantonese,Cantonese (Yue Chinese),"A Chinese variety primarily used in Hong Kong, Macau, and Guangdong Province.",Sino-Tibetan (Sinitic),85000000,1000000,86000000,Chinese characters
21,vietnamese,Vietnamese,An Austroasiatic language spoken primarily in Vietnam and written in a Latin-based alphabet.,Austroasiatic (Viet-Muong),85000000,12000000,97000000,Latin alphabet (Quoc Ngu)
22,wu-chinese,Wu Chinese (Shanghainese),A Chinese variety used in the Yangtze River Delta region centered on Shanghai.,Sino-Tibetan (Sinitic),83000000,0,83000000,Chinese characters
23,tagalog,Tagalog (Filipino),"An Austronesian language whose standardized form, Filipino, is the national language of the Philippines.",Austronesian,30000000,53000000,83000000,Latin alphabet
24,korean,Korean,A language spoken on the Korean Peninsula with its own writing system called Hangul.,Koreanic (language isolate),80000000,2000000,82000000,Hangul
25,farsi,Persian (Farsi),An Indo-European language spoken in Iran.,Indo-European (Iranian),55000000,24000000,79000000,Perso-Arabic script
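
The rows above can be read with any CSV parser that honors quoted fields, since several descriptions and writing systems contain commas. The following is a minimal Python sketch, not an official loader for this dataset: it embeds three rows copied from the data above instead of opening a file (the file name is not given here), and the helper names `load_languages`, `inconsistent_rows`, and `second_language_share` are hypothetical. It assumes that `totalSpeakers` is meant to equal `nativeSpeakers` plus `secondLanguageSpeakers`, which holds for the rows shown.

```python
import csv
import io

# Three rows copied from the dataset above. In practice you would read the
# full CSV file instead; its name is not stated here, so none is assumed.
SAMPLE = '''code,slug,name,description,languageFamily,nativeSpeakers,secondLanguageSpeakers,totalSpeakers,writingSystem
1,english,English,"The most widely spoken language in the world, functioning as an international lingua franca.",Indo-European (Germanic),380000000,1140000000,1520000000,Latin alphabet
2,mandarin-chinese,Mandarin Chinese,"The language with the most native speakers worldwide, primarily used in mainland China, Taiwan, and Singapore.",Sino-Tibetan (Sinitic),940000000,240000000,1180000000,Chinese characters (Simplified/Traditional)
22,wu-chinese,Wu Chinese (Shanghainese),A Chinese variety used in the Yangtze River Delta region centered on Shanghai.,Sino-Tibetan (Sinitic),83000000,0,83000000,Chinese characters
'''

NUMERIC_COLUMNS = ("nativeSpeakers", "secondLanguageSpeakers", "totalSpeakers")


def load_languages(text):
    """Parse CSV text into a list of dicts with integer speaker counts."""
    rows = []
    # csv.DictReader handles the quoted fields that contain commas.
    for row in csv.DictReader(io.StringIO(text)):
        for column in NUMERIC_COLUMNS:
            row[column] = int(row[column])
        rows.append(row)
    return rows


def inconsistent_rows(rows):
    """Return slugs whose totalSpeakers differs from native + second-language."""
    return [
        row["slug"]
        for row in rows
        if row["nativeSpeakers"] + row["secondLanguageSpeakers"] != row["totalSpeakers"]
    ]


def second_language_share(row):
    """Fraction of a language's total speakers who use it as a second language."""
    return row["secondLanguageSpeakers"] / row["totalSpeakers"]


languages = load_languages(SAMPLE)
for language in languages:
    print(language["name"], f"{second_language_share(language):.0%}")
print("Inconsistent rows:", inconsistent_rows(languages))
```

Running the same check over the full file is a quick way to catch rows where the three speaker columns disagree after an update.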