Here’s a quick brain teaser: what do the following languages all have in common? Hindi (हिन्दी), Farsi (فارسی), Russian (русский язык), Spanish (español), and English. They’re diverse major languages that are written in several different scripts and have quite different grammar. In spite of that, these languages all descended from the same language, a mother language spoken ~5,000 years ago!
The language family
In linguistics, that’s what we call a “language family”. By studying the grammar and history of different languages, scholars can come up with a pretty good guess about which languages descended from the same ancestral language. Those languages are then considered part of the same “language family”. “Indo-European” is the common name given to the language family that English and these other languages descended from. If you’re interested in what these languages have in common culturally, you can check out more about that here (link).
Indo-European definitely isn’t the only language family in the world though! Some of the other widely spoken families include Sino-Tibetan (like Burmese and all the varieties of Chinese), Niger-Congo (like Swahili and most of the languages in sub-Saharan Africa), and Afroasiatic (like Arabic and Hebrew). See below for a more complete list of language families for languages spoken today.
Language Family | Example Languages | # speakers (approx.) |
---|---|---|
North America | ||
Algic | Cree, Ojibwe | 200,000 |
Caddoan | Caddo, Arikara | 50 |
Dené-Yeniseian* | Navajo, Apache, Chipewyan | 200,000 |
Eskimo-Aleut* | Greenlandic, Inuktitut, Yup’ik | 100,000 |
Hokan | Chontal, Havasupai-Hualapai | 10,000 |
Iroquoian | Mohawk, Cherokee | 6,000 |
Macro-Chibchan* | Miskito, Guaymí, Kuna | 500,000 |
Mayan | Kʼicheʼ, Yucatec Maya | 6,000,000 |
Muskogean | Choctaw, Muscogee/Creek | 20,000 |
Oto-Manguean | Zapotec, Mixtec | 2,000,000 |
Penutian | Tsimshian, Sahaptin | 2,000 |
Salishan | Okanagan, Lillooet | 2,000 |
Siouan | Sioux, Crow | 40,000 |
Tanoan | Jemez, Tewa, Southern Tiwa | 7,000 |
Totozoquean | Totonac, Mixe | 400,000 |
Uto-Aztecan | Nahuatl, Tarahumara, Shoshoni | 2,000,000 |
Wakashan | Nuu-chah-nulth, Kwakʼwala | 1,000 |
South America | ||
Araucanian | Mapuche, Huilliche | 300,000 |
Arawakan | Wayuu, Garifuna | 700,000 |
Aymaran | Aymara, Jaqaru | 2,000,000 |
Barbacoan | Coconucan, Awa Pit, Cha’palaa | 50,000 |
Chapacuran | Wari’, Itene | 3,000 |
Choco | Emberá, Wounaan | 10,000 |
Duho | Ticuna, Carabayo | 60,000 |
Harákmbut–Katukinan | Kulina, Harákmbut | 8,000 |
Je–Tupi–Carib | Paraguayan Guarani | 7,000,000 |
Macro-Andean | Aguaruna, Shuar | 200,000 |
Macro-Panoan | Shipibo, Chimane | 50,000 |
Mataco–Guaicuru | Toba Qom, Wichí Lhamtés Vejoz | 100,000 |
Nambikwaran | Nambikwara, Mamaindê | 1,000 |
Piaroa–Saliban | Piaroa, Saliba | 20,000 |
Puinave–Maku | Puinave, Hup | 6,000 |
Quechuan | Cuzco Quechua, Ancash Quechua | 8,000,000 |
Tucanoan | Cubeo, Tucano, Koreguaje | 30,000 |
Yanomaman | Yanomamö, Sanumá | 30,000 |
Zamucoan | Ayoreo, Chamacoco | 5,000 |
Africa | ||
Afroasiatic | Arabic, Berber, Hausa, Oromo | 500,000,000 |
Khoe | Khoekhoe, Naro | 200,000 |
Kxʼa | !Kung, ǂʼAmkoe | 15,000 |
Niger-Congo | Yoruba, Swahili, Fula | 700,000,000 |
Nilo-Saharan | Luo, Kanuri, Songhay | 50,000,000 |
Tuu | Taa, Nǁng | 3,000 |
Eurasia | ||
Austroasiatic | Vietnamese, Khmer, Mon | 100,000,000 |
Austronesian | Malay, Malagasy, Fijian | 400,000,000 |
Chukotko-Kamchatkan | Chukchi, Koryak, Itelmen | 7,000 |
Dravidian | Telugu, Tamil, Tulu | 300,000,000 |
Great Andamanese | Aka-Jeru, Aka-Cari | 3 |
Hmong–Mien | Hmong, Hmu, Iu Mien | 10,000,000 |
Indo-European | English, Hindustani, Spanish, Russian | 3,200,000,000 |
Japonic | Japanese, Okinawan | 100,000,000 |
Kartvelian | Georgian, Mingrelian, Svan | 5,000,000 |
Koreanic | Korean, Jeju | 100,000,000 |
Kra-Dai | Thai, Lao, Kam | 100,000,000 |
Mongolic | Mongolian, Oirat, Santa | 6,000,000 |
Northeast Caucasian | Chechen, Avar, Lezgian | 4,000,000 |
Ongan | Jarawa, Sentinelese | 500 |
Sino-Tibetan | Mandarin, Burmese | 1,400,000,000 |
Tungusic | Xibe, Evenki | 80,000 |
Turkic | Turkish, Kazakh, Chuvash | 200,000,000 |
Uralic-Yukaghir | Hungarian, Finnish, Mari | 30,000,000 |
Australia | ||
Bunuban | Guniyandi, Bunuba | 200 |
Greater Pama–Nyungan | Arrernte, Dhuwal, Warlpiri | 40,000 |
Jarrakan | Gija, Miriwoong | 300 |
Macro-Gunwinyguan | Bininj Gun-Wok, Anindilyakwa, Burarra | 5,000 |
Mirndi | Wambaya, Jaminjung | 100 |
Nyulnyulan | Nyigina, Yawuru | 100 |
Southern Daly | Murrinh-patha, Ngan’gi | 2,000 |
Western Daly | Marranj, Marrithiyel | 100 |
Worrorran | Ngarinyin, Wunambal | 100 |
New Guinea | ||
Arai–Samaia | Iteri, Amto | 3,000 |
Binanderean–Goilalan | Orokaiva, Fuyug, Purari | 100,000 |
Bulaka River | Yelmek, Maklew | 500 |
Central Solomons | Bilua, Savosavo | 20,000 |
East Geelvink Bay | Nisa-Anasi, Baropasi | 6,000 |
East New Britain | Qaqet, Mali, Tauli | 10,000 |
Eleman | Orokolo, Toaripi | 80,000 |
Fas | Fas, Baibai | 3,000 |
Foja Range | Kwerba, Gresi, Orya | 30,000 |
Kaure–Kosare | Kaure, Kosare | 700 |
Lakes Plain | Kâte, Iau, Fayu | 30,000 |
Madang–Upper Yuat | Waskia, Kalam, Haruai | 200,000 |
Mairasi | Mairasi, Semimi | 4,000 |
North Bougainville | Rotokas, Rapoisi | 10,000 |
Northwest Papuan-Border | Sentani, Amanab, Vanimo | 60,000 |
Papuan Gulf | Gogodala, Dadibi, Foi | 70,000 |
Pauwasi | Emem, Biksi-Yetfa | 6,000 |
Ramu–Lower Sepik | Kambot, Angoram, Rao | 70,000 |
Senagi | Dera, Angor | 3,000 |
Senu River | Nai, Kwomtari, Yalë | 2,000 |
Sepik | Abelam, Kwanga, Abau | 200,000 |
South Bougainville | Terei, Naasioi | 70,000 |
Torricelli | Bukiyip, Olo | 80,000 |
Trans-Fly | Wipi, Tabo | 10,000 |
Trans–New Guinea | Enga, Western Dani, Melpa, Makasae | 3,000,000 |
West New Britain | Anêm, Ata | 3,000 |
West Papuan | Ternate, Tehit, Yawa | 200,000 |
Yam | Nama, Yei, Arammba | 10,000 |
Yuat | Mundugumor, Mekmek | 7,000 |
For me, it’s incredible to see just how many different types of languages there are out there. Compared to the other regions in the table, the island of New Guinea is tiny, but it’s home to a huge diversity of completely different languages! Presumably, languages from different families on New Guinea are just as different from each other as English, Arabic, and Mandarin Chinese are. And there’s also huge diversity in the native languages of the Americas and Australia!
There are about 100 language families listed here. You’ll see that some of them are spoken by many more people than others. In fact, just 7 of these families account for the native languages of more than 90% of the world’s people! See the chart below for reference. As it says, the largest families are Indo-European, Sino-Tibetan, Niger-Congo, Afroasiatic, Austronesian, Dravidian, and Turkic. And more than half of the “Other” category consists of just four more families: Japonic, Austroasiatic, Kra-Dai, and Koreanic.
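If you’d like to check those proportions yourself, here’s a rough Python sketch. It uses the approximate speaker counts from the table above. To keep it short, it only includes the families listed at a million speakers or more; that cutoff is an assumption made for this sketch, and the smaller families add comparatively little to the total. The names `SPEAKERS`, `TOP_SEVEN`, and `NEXT_FOUR` are made up for the example.

```python
# Approximate speaker counts copied from the table above.
# Assumption: families listed under 1,000,000 speakers are left out for brevity.
SPEAKERS = {
    "Indo-European": 3_200_000_000,
    "Sino-Tibetan": 1_400_000_000,
    "Niger-Congo": 700_000_000,
    "Afroasiatic": 500_000_000,
    "Austronesian": 400_000_000,
    "Dravidian": 300_000_000,
    "Turkic": 200_000_000,
    "Japonic": 100_000_000,
    "Austroasiatic": 100_000_000,
    "Kra-Dai": 100_000_000,
    "Koreanic": 100_000_000,
    "Nilo-Saharan": 50_000_000,
    "Uralic-Yukaghir": 30_000_000,
    "Hmong–Mien": 10_000_000,
    "Quechuan": 8_000_000,
    "Je–Tupi–Carib": 7_000_000,
    "Mayan": 6_000_000,
    "Mongolic": 6_000_000,
    "Kartvelian": 5_000_000,
    "Northeast Caucasian": 4_000_000,
    "Trans–New Guinea": 3_000_000,
    "Oto-Manguean": 2_000_000,
    "Uto-Aztecan": 2_000_000,
    "Aymaran": 2_000_000,
}

TOP_SEVEN = [
    "Indo-European", "Sino-Tibetan", "Niger-Congo", "Afroasiatic",
    "Austronesian", "Dravidian", "Turkic",
]
NEXT_FOUR = ["Japonic", "Austroasiatic", "Kra-Dai", "Koreanic"]

# Total speakers across every family included in this sketch.
total = sum(SPEAKERS.values())

# Speakers of the seven largest families, and everyone else ("Other").
top_seven_total = sum(SPEAKERS[name] for name in TOP_SEVEN)
other_total = total - top_seven_total

# Speakers of the four families that dominate the "Other" category.
next_four_total = sum(SPEAKERS[name] for name in NEXT_FOUR)

top_seven_share = top_seven_total / total
next_four_share = next_four_total / other_total

print(f"Top seven families: {top_seven_share:.1%} of listed speakers")
print(f"Next four families: {next_four_share:.1%} of the 'Other' category")
```

With these rounded figures, the seven largest families come out at a bit under 93% of the total, and the next four make up roughly three quarters of what’s left. Since every count here is a rough approximation, treat the result as a ballpark rather than a precise statistic.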
Language isolates
As cool as language families are, not every language can be organized into one. Some languages are completely different from every other language around them. They can be so different that linguists can’t find any evidence that they descended from the same ancestral language as any other language. In that case, the language is called a “language isolate”. Here’s a list of isolates still spoken today, using the same region scheme as before:
Language Isolate | # speakers (approx.) |
---|---|
North America | |
Haida | 50 |
Huave | 20,000 |
Keres | 10,000 |
Kutenai | 300 |
Purepecha | 100,000 |
Zuni | 9,000 |
South America | |
Aikanã | 200 |
Camsá | 4,000 |
Chipaya | 2,000 |
Cofán | 2,000 |
Irantxe | 400 |
Itonama | 5 |
Kanoê | 5 |
Kawésqar | 5 |
Kwaza | 50 |
Movima | 1,000 |
Mura | 300 |
Páez | 60,000 |
Puinave | 3,000 |
Trumai | 50 |
Waorani | 2,000 |
Warao | 30,000 |
Yaruro | 8,000 |
Yuracaré | 3,000 |
Africa | |
Bangime | 4,000 |
Hadza | 1,000 |
Laal | 800 |
Eurasia | |
Ainu | 5 |
Basque | 700,000 |
Burushaski | 100,000 |
Kusunda | 90 |
Australia | |
Laragiya | 10 |
Malak-Malak | 10 |
Tiwi | 2,000 |
Wadjiginy | 5 |
New Guinea | |
Abinomn | 300 |
Maybrat | 20,000 |
Mpur | 5,000 |
Sulka | 2,000 |
Yele | 4,000 |
Many of the language isolates are rare languages that are barely spoken anymore. Some, like Basque, still have a thriving and vibrant speaking population. But others are on the brink of going extinct. I live in the United States of America, and my ancestors are primarily European. I feel horrified and devastated to think how my ancestors contributed to wiping so many languages off the face of the earth. In these lists, I only included language families and isolates that are still spoken by some people today. Some of the “isolates” I listed are actually the last survivors of a language family whose other languages have all gone extinct.
You might wonder: why would it be important to preserve different languages of the world? There are now more than 2 billion people in the world who understand English, at least to some extent. That’s more than a quarter of the world’s population. Admittedly, it is sort of convenient to have so many people who speak the same language.
But each different language, each different language family, is also a slightly different way of looking at and understanding the world. Every time one of these languages, or even a whole language family, goes extinct, we lose some of that unique perspective on language and the world. It’s a sad thing to see. Quite a few of these dying languages also have communities of people trying to learn the language and keep it alive. I wish them the best and hope for their success.
Some subdividing
Generally, language families can be broken down and classified into smaller language groups. These groups get smaller as they focus on languages more similar to each other. Often this also means the languages have a more recent common ancestral language. As an example of this, let’s look at the subgroups within two of the families: Indo-European and Niger-Congo.
Indo-European
As mentioned before, the Indo-European family is the language family with the most speakers in the world now. It can be subdivided into eight groups:
Language group | # speakers (approx.) |
---|---|
Albanian | 8,000,000 |
Armenian | 10,000,000 |
Balto-Slavic | 300,000,000 |
Celtic | 2,000,000 |
Germanic | 500,000,000 |
Hellenic | 10,000,000 |
Indo-Iranian | 1,500,000,000 |
Italic (Romance) | 900,000,000 |
Some of these divisions basically account for a single language that’s quite different from all the other Indo-European languages. Some examples of that are Albanian, Armenian, and Greek (Hellenic). Others account for larger groups of languages. The combined Balto-Slavic grouping was debated for a long time; Baltic and Slavic languages were often classified separately, but they share some unique features, so linguists now usually treat them as a single group.
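As a quick sanity check, the group figures in the Indo-European table should add up to roughly the total listed for Indo-European in the first table (3,200,000,000). Here’s a small Python sketch of that check. The nested-dictionary layout and the names `FAMILY`, `groups_total`, `difference`, and `largest_group` are assumptions made for illustration, not a standard way of storing this kind of data.

```python
# A language family as a small tree: the family at the top, its groups below.
# All figures are the approximate speaker counts from the tables in this article.
FAMILY = {
    "name": "Indo-European",
    "speakers": 3_200_000_000,  # total from the language family table
    "groups": {
        "Albanian": 8_000_000,
        "Armenian": 10_000_000,
        "Balto-Slavic": 300_000_000,
        "Celtic": 2_000_000,
        "Germanic": 500_000_000,
        "Hellenic": 10_000_000,
        "Indo-Iranian": 1_500_000_000,
        "Italic (Romance)": 900_000_000,
    },
}

# Add up the groups and compare against the family-level total.
groups_total = sum(FAMILY["groups"].values())
difference = abs(groups_total - FAMILY["speakers"]) / FAMILY["speakers"]

# Find the group with the most speakers.
largest_group = max(FAMILY["groups"], key=FAMILY["groups"].get)

print(f"Sum of groups: {groups_total:,}")
print(f"Relative difference from family total: {difference:.2%}")
print(f"Largest group: {largest_group}")
```

The groups sum to 3,230,000,000, which is within about 1% of the family total. That’s about as close as you can expect when every number is rounded.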
Niger-Congo
The idea of a single Niger-Congo language family is a fairly recent one. Some of the subgroups within the family used to be considered their own language families. According to more recent linguistic research, these languages may all have descended from a single ancestral language. Since this classification is pretty new, the groups are less agreed upon and standardized. Here’s one possible grouping:
Language group | # speakers (approx.) |
---|---|
Bantoid | 400,000,000 |
Dogon | 600,000 |
Ijoid | 300,000 |
Katla–Rashad | 80,000 |
Kru | 2,000,000 |
Kwa | 40,000,000 |
Mande | 40,000,000 |
Platoid | 20,000,000 |
Savannas | 40,000,000 |
Senufo | 2,000,000 |
Volta-Niger | 70,000,000 |
West Atlantic | 30,000,000 |
You’ll notice that the Bantoid group is by far the largest. It includes a wide variety of languages, some of which are somewhat mutually intelligible. Probably the best-known language in this group is Swahili, though there are many other languages as well, spanning a huge geographic swath of central and southern Africa.
A note on pidgin/creole languages
One kind of language that isn’t included here is pidgin and creole languages. A “pidgin” is a simplified language that pieces together vocabulary and grammar from several other languages, usually so that groups without a shared language can communicate. A “creole” starts out the same way, but has further evolved into a full language of its own, with native speakers. Some major examples include Tok Pisin in New Guinea (based on English and local languages), Haitian Creole in Haiti (based on French plus some West African languages), and Papiamento in the Dutch Caribbean (based on Portuguese with some West African influence).
Because they are often formed as a mix of languages from different families, pidgins and creoles can be hard to classify, which is why I left them out of the lists above. But languages like these are still widely spoken around the world, and have great regional cultural significance. Some of these pidgin and creole languages show up in one of my other articles, “Best foreign languages to learn” (link).