World Languages Details - TSV
World Languages Details describes the major languages spoken worldwide, drawing on linguistic sources such as Ethnologue. For each language it gives the language family, the writing system (script), the number of native (L1) speakers, the number of second-language (L2) speakers, and the total number of speakers. Of the more than 7,000 languages in the world, this dataset covers mainly those with 50 million or more speakers. It can be used in linguistic research, international business, education, and translation services.
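As a rough illustration of the record structure described above, one row could be modeled as follows. The class name `LanguageRecord` and its `totals_consistent` method are hypothetical and not part of the dataset; the fields mirror the dataset's column header, and the sample values are copied from the English row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageRecord:
    """One row of the dataset (hypothetical helper, not shipped with the data)."""
    code: int
    slug: str
    name: str
    description: str
    language_family: str
    native_speakers: int           # L1 speakers
    second_language_speakers: int  # L2 speakers
    total_speakers: int
    writing_system: str

    def totals_consistent(self) -> bool:
        # Checks that the total equals L1 + L2 for this record.
        return (self.native_speakers + self.second_language_speakers
                == self.total_speakers)


english = LanguageRecord(
    code=1,
    slug="english",
    name="English",
    description=("The most widely spoken language in the world, "
                 "functioning as an international lingua franca."),
    language_family="Indo-European (Germanic)",
    native_speakers=380_000_000,
    second_language_speakers=1_140_000_000,
    total_speakers=1_520_000_000,
    writing_system="Latin alphabet",
)
print(english.totals_consistent())
```

A check like `totals_consistent` is one cheap way to catch transcription errors when the speaker counts are updated.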
language
language family
writing system
speakers
linguistics
multilingual
international communication
code	slug	name	description	languageFamily	nativeSpeakers	secondLanguageSpeakers	totalSpeakers	writingSystem
1	english	English	The most widely spoken language in the world by total speakers, functioning as an international lingua franca.	Indo-European (Germanic)	380000000	1140000000	1520000000	Latin alphabet
2	mandarin-chinese	Mandarin Chinese	The language with the most native speakers worldwide, primarily used in mainland China, Taiwan, and Singapore.	Sino-Tibetan (Sinitic)	940000000	240000000	1180000000	Chinese characters (Simplified/Traditional)
3	hindi	Hindi	A major language of India belonging to the Indo-Aryan branch.	Indo-European (Indo-Aryan)	350000000	260000000	610000000	Devanagari script
4	spanish	Spanish	A Romance language widely used in Spain and Latin America.	Indo-European (Romance)	490000000	70000000	560000000	Latin alphabet
5	french	French	A Romance language with a growing number of speakers, particularly in Africa.	Indo-European (Romance)	80000000	240000000	320000000	Latin alphabet
6	arabic	Arabic	An Afro-Asiatic language widely used in the Middle East and North Africa.	Afro-Asiatic (Semitic)	320000000	20000000	340000000	Arabic script
7	bengali	Bengali	An Indo-Aryan language used in Bangladesh and eastern India.	Indo-European (Indo-Aryan)	230000000	50000000	280000000	Bengali script
8	portuguese	Portuguese	A Romance language used in the Lusophone world including Brazil.	Indo-European (Romance)	230000000	40000000	270000000	Latin alphabet
9	russian	Russian	A Slavic language widely used in former Soviet countries.	Indo-European (Slavic)	150000000	110000000	260000000	Cyrillic script
10	urdu	Urdu	An Indo-Aryan language and the national language of Pakistan.	Indo-European (Indo-Aryan)	70000000	180000000	250000000	Perso-Arabic script
11	indonesian	Indonesian	An Austronesian language and the official language of Indonesia.	Austronesian	40000000	210000000	250000000	Latin alphabet
12	german	German	A Germanic language widely used in Central Europe.	Indo-European (Germanic)	80000000	55000000	135000000	Latin alphabet
13	japanese	Japanese	The language of Japan, classified in the Japonic family and sometimes described as a language isolate.	Japonic	124000000	1000000	125000000	Kanji, Hiragana, and Katakana
14	nigerian-pidgin	Nigerian Pidgin	An English-based creole widely used in Nigeria.	English-based Creole	5000000	115000000	120000000	Latin alphabet
15	egyptian-arabic	Egyptian Arabic	A variety of Arabic widely used in Egypt.	Afro-Asiatic (Semitic)	100000000	20000000	120000000	Arabic script
16	marathi	Marathi	An Indo-Aryan language of India, primarily used in Maharashtra state.	Indo-European (Indo-Aryan)	83000000	16000000	99000000	Devanagari script
17	telugu	Telugu	A Dravidian language widely used in southern India.	Dravidian	83000000	13000000	96000000	Telugu script
18	turkish	Turkish	A Turkic language primarily used in Turkey and Cyprus.	Turkic	84000000	7000000	91000000	Latin alphabet
19	tamil	Tamil	A classical Dravidian language used in southern India and Sri Lanka.	Dravidian	75000000	11000000	86000000	Tamil script
20	cantonese	Cantonese (Yue Chinese)	A Chinese variety primarily used in Hong Kong, Macau, and Guangdong Province.	Sino-Tibetan (Sinitic)	85000000	1000000	86000000	Chinese characters
21	vietnamese	Vietnamese	An Austroasiatic language of Vietnam, written in a Latin-based alphabet.	Austroasiatic (Viet-Muong)	85000000	12000000	97000000	Latin alphabet (Quoc Ngu)
22	wu-chinese	Wu Chinese (Shanghainese)	A Chinese variety used in the Yangtze River Delta region centered on Shanghai.	Sino-Tibetan (Sinitic)	83000000	0	83000000	Chinese characters
23	tagalog	Tagalog (Filipino)	An Austronesian language whose standardized form, Filipino, is the national language of the Philippines.	Austronesian	30000000	53000000	83000000	Latin alphabet
24	korean	Korean	A language spoken on the Korean Peninsula with its own writing system called Hangul.	Koreanic (language isolate)	80000000	2000000	82000000	Hangul
25	farsi	Persian (Farsi)	An Indo-European language spoken in Iran.	Indo-European (Iranian)	55000000	24000000	79000000	Perso-Arabic script
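
One way to load and summarize the table is sketched below, assuming the file is tab-separated and starts with the header row shown above. The names `HEADER`, `ROWS`, `SAMPLE`, and `speakers_by_family` are illustrative only; three rows are copied inline so the snippet runs on its own, whereas a real script would open the TSV file instead.

```python
import csv
import io
from collections import defaultdict

HEADER = ["code", "slug", "name", "description", "languageFamily",
          "nativeSpeakers", "secondLanguageSpeakers", "totalSpeakers",
          "writingSystem"]

# Three rows copied from the table above (illustrative subset).
ROWS = [
    ["4", "spanish", "Spanish",
     "A Romance language widely used in Spain and Latin America.",
     "Indo-European (Romance)", "490000000", "70000000", "560000000",
     "Latin alphabet"],
    ["8", "portuguese", "Portuguese",
     "A Romance language used in the Lusophone world including Brazil.",
     "Indo-European (Romance)", "230000000", "40000000", "270000000",
     "Latin alphabet"],
    ["17", "telugu", "Telugu",
     "A Dravidian language widely used in southern India.",
     "Dravidian", "83000000", "13000000", "96000000", "Telugu script"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"


def speakers_by_family(tsv_text: str) -> dict[str, int]:
    """Sum the totalSpeakers column for each languageFamily value."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    totals: dict[str, int] = defaultdict(int)
    for row in reader:
        totals[row["languageFamily"]] += int(row["totalSpeakers"])
    return dict(totals)


totals = speakers_by_family(SAMPLE)
for family, count in sorted(totals.items()):
    print(f"{family}: {count:,}")
```

The result is a plain column sum per family. People who speak more than one of the listed languages are counted once for each, so the figures should not be read as distinct-person counts.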