Masakhane African Languages Hub

Countries
Kenya
Principal Investigator(s)
Chenai Chair
Innovation Networks, Language

Artificial intelligence is only as inclusive and reliable as the data that underpins it. Today, Africa's more than 2,000 languages, spoken by over one billion people, are largely missing from the foundational datasets that shape modern AI. As a result, large language models, speech technologies and generative AI systems systematically underperform in African contexts. This exclusion is not neutral: it reinforces digital inequities and disproportionately marginalizes rural communities, women, and speakers of indigenous and unwritten languages, limiting who can benefit from AI-driven services and innovation.

Emerging from the AI for Development Funders Collaborative and grounded in the Masakhane community, this flagship initiative will create high-quality, open-source, culturally grounded multimodal datasets spanning text, speech and vision across 40 African languages, with a deliberate focus on gender balance, cultural and regional diversity, and bias mitigation. These datasets will enable the development of inclusive AI tools tailored to African realities, such as speech-based services in health care, multilingual education platforms, and local-language assistants in agriculture. By anchoring this work in Masakhane, a grassroots, expert-led organization that champions participatory methods, the project aims to shift Africa from being a data-poor consumer of AI technologies to a globally recognized contributor to ethical and inclusive AI innovation.
