Natural Language Processing for Tigrinya

Improving the coverage of the Tigrinya language in the digital world.


We are all familiar with applications like automatic translation, speech recognition, voice assistants such as Siri, optical character recognition (OCR), and other language technologies. We know how useful they are in our daily lives. Unfortunately, they are not readily available for minority languages like Tigrinya.

Relying on commercial companies like Google and Microsoft to provide these services in Tigrinya could mean a long wait. Tigrinya may not offer the financial return that these companies are looking for, at least not yet. This means Tigrinya speakers have to work a little harder to enjoy these technologies. Our software developers, linguistic experts and translators need to do most of the work that is normally done by companies like Google.

The good news is that the majority of these products are built on a branch of artificial intelligence called deep learning, which is based on neural networks. Until recently, deep learning was used only by university graduates and specially trained people. But now, thanks to open source projects like [Hugging Face](https://huggingface.co/transformers/), the technology has become easily accessible to people with basic knowledge of the domain. We can make use of these projects for Tigrinya with no, or minimal, code changes. All we need is training data (a lot of it).
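Since training data is the main requirement, a first practical step is often cleaning raw text before it is used for training. The snippet below is a minimal sketch of that idea and not part of any existing project code: it keeps only lines that are mostly written in Ethiopic (Ge'ez) script, using the Unicode ranges for that script. The function names, the `min_ratio` threshold and the sample sentences are assumptions made for illustration. Note that the Ethiopic script is shared with other languages such as Amharic and Tigre, so this filter selects the script, not Tigrinya specifically.

```python
# Sketch: keep only lines that are mostly written in Ethiopic (Ge'ez) script.
# All names and the 0.8 threshold are illustrative assumptions.

# Unicode blocks that contain Ethiopic characters.
ETHIOPIC_RANGES = [
    (0x1200, 0x137F),  # Ethiopic
    (0x1380, 0x139F),  # Ethiopic Supplement
    (0x2D80, 0x2DDF),  # Ethiopic Extended
    (0xAB00, 0xAB2F),  # Ethiopic Extended-A
]


def is_ethiopic(ch):
    """Return True if the character falls in one of the Ethiopic blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ETHIOPIC_RANGES)


def ethiopic_ratio(line):
    """Fraction of non-whitespace characters that are Ethiopic."""
    chars = [ch for ch in line if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(is_ethiopic(ch) for ch in chars) / len(chars)


def filter_corpus(lines, min_ratio=0.8):
    """Keep stripped lines whose Ethiopic ratio is at least min_ratio."""
    return [line.strip() for line in lines if ethiopic_ratio(line) >= min_ratio]


# Hypothetical sample: one Ethiopic line, one English line, one mixed line.
sample = ["ሰላም ዓለም", "Hello world", "ትግርኛ NLP", ""]
cleaned = filter_corpus(sample)
print(cleaned)
```

Lines that pass the filter could then be written to a plain text file and used as input for tokenizer or model training with a library such as Hugging Face Transformers. Telling Tigrinya apart from other languages written in the same script would need an additional language identification step.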

On this page we document what we are doing in this area. The intention is to coordinate our efforts and to give anyone a starting point for implementing these technologies in Tigrinya.


This page is an initiative of the SERET Foundation language team. Anyone is free to use it without any limitation.