Development of Stemmer for Afar-af text: A Hybrid Approach

Kelil Ali Ebrahim

Abstract

Most natural language processing systems use a stemmer as a distinct module in their architecture. It is especially crucial for developing machine translators, speech recognizers, and search engines. In linguistic morphology, stemming is the process of reducing inflected (or sometimes derived) words to their root, stem, or base form.


In this article, a stemming system for Afar-af is presented. The system takes a word or term as input and removes its affixes (prefixes and suffixes) according to a rule-based algorithm. Rules alone are not adequate to describe every pattern of Afar-af word formation. Consequently, in the hybrid approach of this stemmer, an N-gram method is combined with the rules to handle the cases that the rules do not cover. The algorithm follows the well-known Porter algorithm for English and is adapted to the grammatical rules of the Afar language.
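
As a rough illustration of the hybrid idea described above, the following Python sketch strips affixes by rule first and falls back to character-bigram matching only when no rule applies. It is a minimal sketch under stated assumptions, not the stemmer evaluated in this article: the affix lists, the sample words, the minimum stem length, the similarity threshold, and the use of the Dice coefficient are all hypothetical choices made for illustration and do not represent actual Afar-af morphology.

```python
# Hypothetical affix lists for illustration only; these are NOT real Afar-af rules.
PREFIXES = ("ma",)
SUFFIXES = ("itte", "wa", "ta")
MIN_STEM = 3  # assumed minimum stem length, to avoid over-stripping


def rule_stem(word):
    """Strip at most one prefix and one suffix. Returns (stem, changed)."""
    stem = word
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM:
            stem = stem[len(prefix):]
            break
    # Try longer suffixes first so the longest match wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM:
            stem = stem[:-len(suffix)]
            break
    return stem, stem != word


def bigrams(word):
    """Set of character bigrams of a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a, b):
    """Dice coefficient over character bigrams (an assumed similarity measure)."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def hybrid_stem(word, known_stems, threshold=0.6):
    """Apply the rules first; if no rule fires, fall back to N-gram matching."""
    stem, changed = rule_stem(word)
    if changed:
        return stem
    best = max(known_stems, key=lambda s: dice(word, s), default=None)
    if best is not None and dice(word, best) >= threshold:
        return best
    return word  # leave the word unchanged when nothing matches well


# Toy vocabulary of made-up stems, used only to exercise the fallback.
stems = ["saro", "bilu"]
print(hybrid_stem("sarowa", stems))  # handled by the suffix rule
print(hybrid_stem("saroy", stems))   # handled by the bigram fallback
print(hybrid_stem("xyz", stems))     # no rule and no close stem
```

In this sketch the N-gram step only runs when the rules leave a word untouched, which is one plausible reading of "cases that are not covered by rule"; other ways of combining the two components are equally possible.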


Afar-af morphology was studied and defined in order to model the language and develop an automatic procedure for conflation. The inflectional and derivational morphologies of the language are discussed.


Afar-af words are morphologically rich and require an effective stemming algorithm that can handle the diverse morphological patterns associated with words.


An evaluation indicates that the system performs better than earlier stemming algorithms for Afar-af, achieving an accuracy of 98.73 percent. Furthermore, possible extensions of the proposed work and further evaluation approaches are briefly reviewed.
