5 Minutes with Tsosheletso Chidi, Ph.D.

Tsosheletso Chidi

The Capacity Accelerator Network (CAN) is building a workforce of purpose-driven data and AI practitioners to unlock the power of data for social impact. Dr. Tsosheletso Chidi is a linguistic researcher, multilingual writer, poet, and literary curator. Tsosheletso was one of the first Africa Low-Resource Language fellows.

In this rapidly evolving AI landscape, what was the “aha moment” when you realized the opportunity and the necessity to train AI on low-resource languages to unlock and accelerate Africa’s AI potential?

My “aha moment” came when I realised that my intensive background in the cultural and creative sectors actually qualified me for this opportunity. For so long, many of us working in language and the arts believed that AI belonged solely to engineers and data scientists. We excluded ourselves from conversations that deeply affect the futures of our languages and cultures. But then I recognised that our absence was the gap and our inclusion is the opportunity. Working with indigenous African languages, I saw how AI systems often mistranslate, misrepresent, or ignore them entirely. Training AI on these languages isn’t just a technical task — it’s a cultural necessity. Without it, Africa’s digital future risks being shaped by systems trained on foreign values. Inclusive AI can empower communities to define themselves in digital spaces not as data points, but as agents of meaning.

Working with indigenous African languages, I saw how AI systems often mistranslate, misrepresent, or ignore them entirely. Training AI on these languages isn’t just a technical task — it’s a cultural necessity.

Tsosheletso Chidi, Ph.D., Research Commons Officer, University of Pretoria

How does your work with low-resource languages move the needle for data and AI for social impact work? What are some of the biggest challenges you have faced in doing so?

My work with low-resource African languages advances AI for social impact by centering people, not just data. I come from a literary and linguistic background, and I approach this work by asking: What’s the best way to engage with these languages meaningfully? That question continues to guide me. One of my biggest challenges is holding deep conversations with data scientists and asking hard questions like: Who is this for? My role is making sure African communities are not reduced to data sources, that our cultural nuances are respected, and that this work is not treated as a niche for profit. I see myself as a bridge helping to facilitate relationships between communities and AI practitioners. For me, social impact in AI means ensuring that African languages and the people who speak them are central to the design and purpose of these systems.

What are the diverse, interdisciplinary skills that are required to do this work effectively? Which one surprised you the most?

Linguistic expertise, community engagement, ethical research practices, technical literacy, machine translation, project management, advocacy, and policy awareness are diverse interdisciplinary skills required to do this work effectively. Linguistic and cultural knowledge is foundational, especially when working with indigenous languages that carry deep histories and nuanced meanings. At the same time, you need the technical ability to navigate the language of AI, machine translation, and data ethics — even if you’re not building the models yourself.

The skill that surprised me the most was community engagement. I had underestimated how central it would be to the success of AI projects involving low-resource languages. Building trust, working ethically with people, and communicating across power dynamics are not side tasks — they are the core of the work. Without community participation, even the most accurate models fall flat in impact and relevance. This work doesn’t sit neatly in one discipline. It thrives in the space between them, and that’s where I’ve found my purpose. Being able to connect the dots, sit at multiple tables, and bridge knowledge systems is what allows me to push for more inclusive, culturally grounded AI in Africa.

What key responsible practices should AI practitioners prioritize when developing and training AI systems in African — or other low-resource — languages?

Key responsible practices include transparency about how data will be used, co-designing projects with language speakers, and ensuring that communities benefit from the tools being developed. AI practitioners must also avoid extractive data collection, where languages are sourced for model training with little regard for who owns, controls, or understands the outcomes. Community trust isn’t just important — it’s essential. Without it, you may get data, but not meaning. Communities need to see themselves reflected in the process, have access to the outputs, and feel respected in how their languages and stories are handled. This is especially true in African contexts where colonial histories have left deep scars around knowledge extraction. Guardrails should include ethical review processes tailored to cultural contexts, open dialogue between technologists and language practitioners, and mechanisms to track and respond to potential harm. Inclusion must be more than representation; it must be active collaboration. Ultimately, AI systems built for low-resource languages will only be sustainable if they are built with the people who speak them.

Communities need to see themselves reflected in the process, have access to the outputs, and feel respected in how their languages and stories are handled. This is especially true in African contexts where colonial histories have left deep scars around knowledge extraction.

Tsosheletso Chidi, Ph.D., Research Commons Officer, University of Pretoria

What is the importance of cross-sector collaborations in building inclusive AI? What advice would you offer to people interested in this work?

Cross-sector collaboration is essential to building inclusive AI because language equity cannot be solved by one field alone. Technologists bring the tools, but linguists, cultural workers, educators, and communities bring the context. Without that blend, we risk building systems that are technically impressive but socially disconnected. In my work, I have seen how the most meaningful AI projects emerge when people from different sectors come together to listen, challenge assumptions, and co-create new approaches. To those interested in AI language equity, my advice is simple: start where you are, and bring your full skillset. You don’t need to be a coder to matter. You need curiosity, humility, and a deep respect for the languages and people you’re working with. Learn to speak across disciplines. Ask hard questions about ethics, power, and access. And most importantly, remember that inclusion is not just about who’s in the room, but about who gets to shape the outcome.