2nd Option Title Research
CONCEPT NOTE
SPELLING CORRECTOR FOR AFAN OROMO WORDS USING DEEP
LEARNING
MATTU, ETHIOPIA
DECEMBER 2023
1. Introduction
Spelling correction is the process of detecting incorrectly spelled words and suggesting
corrections for them, here for text written in Afaan Oromoo. It supports several applications,
such as post-correction of digitized handwritten text and correction of user queries in the
retrieval process. This thesis will describe the design, implementation and testing of a model
that detects and corrects misspelled Afaan Oromoo words. The main focus of the study is to design
a spelling corrector for Afaan Oromoo that relies on the spelling error patterns of the language
and on the context given by the sequence of words in the input sentence. The technique used for
spelling correction is an unsupervised statistical approach. Such an approach works from a
collected corpus, which suits an under-resourced language such as Afaan Oromoo, for which manually
tagged data sets are scarce. Spelling correction is carried out in three major phases: error
detection, candidate suggestion, and ranking of the candidate suggestions. Error detection is
based on the dictionary look-up method and bigram analysis.
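
The detection phase can be sketched in Python as follows. This is a minimal illustration, not the study's implementation. The function name `detect_errors`, the flag labels, the rule that a word pair never seen in the corpus is suspicious, and the toy tokens are all assumptions made for the example.

```python
def detect_errors(tokens, dictionary, bigram_counts):
    """Return a list of (kind, position, text) flags for one tokenized sentence.

    kind is "non_word" when the token is missing from the dictionary, and
    "suspect_bigram" when both words are known but the pair was never
    observed in the corpus (an assumed threshold of zero occurrences).
    """
    flags = []
    for i, token in enumerate(tokens):
        if token not in dictionary:
            # Dictionary look-up: an unknown token is a non-word error.
            flags.append(("non_word", i, token))
            continue
        if i > 0 and tokens[i - 1] in dictionary:
            # Bigram analysis: two known words that never co-occur.
            pair = (tokens[i - 1], token)
            if bigram_counts.get(pair, 0) == 0:
                flags.append(("suspect_bigram", i, " ".join(pair)))
    return flags


# Toy data, for illustration only (not taken from the study's corpus).
dictionary = {"mana", "nama", "barumsaa"}
bigram_counts = {("mana", "barumsaa"): 5}

print(detect_errors(["mana", "barumsaa"], dictionary, bigram_counts))
print(detect_errors(["nama", "barumsaa"], dictionary, bigram_counts))
print(detect_errors(["mana", "barumza"], dictionary, bigram_counts))
```

A suspect bigram only narrows the error down to a pair of words. Either word may be the wrong one, so the candidate suggestion phase has to consider both positions.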
2. Problem statement
Poor spelling can hinder communication between authors and readers (Treiman and Kessler, 2005).
As a result, authors cannot convey their knowledge clearly to readers, and poor spelling also
influences the development of a language. This problem occurs in the Afaan Oromoo writing system:
because of misspelled words, the Oromia government has taken corrective measures against
government institutions, non-governmental organizations and business centers over misspelled
words written on banners and posted publicly.
To date, no research has been conducted to design a spelling corrector for the Afaan Oromoo
writing system. A rule-based Afaan Oromoo grammar checker has been designed to check that the
words in an Afaan Oromoo sentence are well arranged (Tesfaye, 2011). However, it cannot solve the
problem of misspelling, since a grammar checker focuses on the construction of the sentence,
whereas the proposed spelling checker uses context to check spelling errors at the word level.
Therefore, the aim of this research is to design a prototype context-based spell checker to
address misspelling in the Afaan Oromoo writing system.
3. Purpose of the Research
The main focus of this study is to design a spelling corrector for Afaan Oromoo that relies on
the spelling error patterns of the language and on the context given by the sequence of words in
the input sentence. The thesis work targets in particular real-word errors involving nouns, place
and gender.
The researcher will collect data from different sources and prepare the dictionary and bigram
model for spelling correction. For non-word errors, candidate generation is based on the
similarity between the misspelled word and the tokens in the dictionary; similarity is measured
using the Levenshtein edit distance to each dictionary token, and candidates are ranked
accordingly. For real-word errors, bigram frequency will be used to detect the error and bigram
probability will be computed to correct the misspelled word.
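
Candidate generation and ranking for non-word errors can be sketched as below. This is a hedged illustration under stated assumptions: the function names `levenshtein` and `suggest`, the cut-off `max_distance=2`, the limit `top_n=5`, and the toy dictionary are hypothetical choices for the example, not parameters given in this note.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def suggest(word, dictionary, max_distance=2, top_n=5):
    """Return dictionary tokens within max_distance, closest first.

    Ties on distance are broken alphabetically so the ranking is deterministic.
    """
    scored = [(levenshtein(word, entry), entry) for entry in dictionary]
    scored = [pair for pair in scored if pair[0] <= max_distance]
    scored.sort()
    return [entry for _, entry in scored[:top_n]]


# Toy dictionary, for illustration only.
toy_dictionary = {"barumsa", "bishaan", "mana"}
print(suggest("barumza", toy_dictionary))
```

Scanning the whole dictionary for every misspelled word is the simplest option and is adequate for a small lexicon. A larger lexicon would probably need an index or a length filter before computing distances.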
An error correction algorithm aims at finding candidate corrections for the erroneous word.
In this study, the correction module will use Levenshtein edit distance to correct non-word
errors and bigram probability to correct real-word errors.
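
The bigram step for real-word errors can be illustrated with the standard maximum-likelihood estimate P(w | prev) = count(prev w) / count(prev). The sketch below is an assumption-laden example, not the study's implementation: the function names, the choice to multiply the left-context and right-context probabilities, the absence of smoothing, and the toy corpus are all hypothetical.

```python
from collections import Counter


def bigram_probability(prev, word, unigram_counts, bigram_counts):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev is unseen."""
    prev_count = unigram_counts.get(prev, 0)
    if prev_count == 0:
        return 0.0
    return bigram_counts.get((prev, word), 0) / prev_count


def best_in_context(prev, candidates, nxt, unigram_counts, bigram_counts):
    """Pick the candidate that best fits its left and right neighbours.

    prev or nxt may be None at a sentence boundary; that side is then skipped.
    """
    def score(candidate):
        value = 1.0
        if prev is not None:
            value *= bigram_probability(prev, candidate, unigram_counts, bigram_counts)
        if nxt is not None:
            value *= bigram_probability(candidate, nxt, unigram_counts, bigram_counts)
        return value

    # Highest score wins; ties are broken alphabetically for determinism.
    return min(candidates, key=lambda c: (-score(c), c))


# Toy corpus of already tokenized sentences, for illustration only.
sentences = [["mana", "barumsaa"], ["mana", "barumsaa"], ["nama", "gaarii"]]
unigram_counts = Counter(token for s in sentences for token in s)
bigram_counts = Counter(pair for s in sentences for pair in zip(s, s[1:]))

# The first word of "nama barumsaa" is suspect; compare it with a close alternative.
print(best_in_context(None, ["nama", "mana"], "barumsaa", unigram_counts, bigram_counts))
```

With unsmoothed estimates, any candidate whose bigram is absent from the corpus scores zero. If the collected corpus is small, a smoothing scheme may be needed; this note does not specify one.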
The research will employ a dictionary-based strategy that looks up input strings in a
dictionary, lexicon, corpus, or a combination of lexicons and corpora. The data sets or lexicon
files for the Afaan Oromoo language will be compiled with the help of linguistic experts from
various genres so that the corpora and/or lexicon are balanced. After the balanced corpora are
collected from the different genres, text preprocessing mechanisms will be applied. Text
preprocessing, spell checking, and spelling suggestion are the three key aspects.
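
A minimal sketch of the preprocessing step, and of building the dictionary and bigram counts from a collected corpus, is given below. The specific choices are assumptions made for illustration: lowercasing all text, keeping word-internal apostrophes as part of a token, dropping digits and punctuation, and the names `tokenize` and `build_model`.

```python
import re
from collections import Counter

# One or more letters, optionally joined by internal apostrophes (e.g. "ba'e").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def tokenize(text):
    """Lowercase, normalize curly apostrophes, and split into word tokens."""
    text = text.lower().replace("\u2019", "'")
    return TOKEN_RE.findall(text)


def build_model(sentences):
    """Return (dictionary, unigram_counts, bigram_counts) for raw sentences.

    Bigrams are counted within a sentence only, never across sentences.
    """
    unigram_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        unigram_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))
    return set(unigram_counts), unigram_counts, bigram_counts


# Toy sentences, for illustration only (not from the study's corpus).
corpus = ["Mana barumsaa.", "mana barumsaa jira"]
dictionary, unigram_counts, bigram_counts = build_model(corpus)
print(sorted(dictionary))
print(bigram_counts[("mana", "barumsaa")])
```

Lowercasing merges sentence-initial and mid-sentence forms of the same word, at the cost of losing the capitalization of proper nouns. Whether that trade-off suits place names, which this study targets, would need to be checked against the collected data.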
5. Research Questions
This study seeks to answer the following research questions in order to arrive at a solution
to the misspelling problem through context-based spelling correction for Afaan Oromoo:
1. How can the errors committed in spelling Afaan Oromoo words be handled?
2. How can a dictionary-based n-gram approach be applied to correct misspellings in
Afaan Oromoo?
3. To what extent is a dictionary-based n-gram approach effective for spelling error
correction?
6. References
[1] Henok Dawit Danie, "Context Based Afaan Oromo Language Spell Checker for Handheld
Device," June 2022.
[2] C. Patil, R. Rodrigues and R. Ron, "Auto-Spelling Checker using Natural Language Processing,"
Xavier Institute of Engineering, Mumbai, Maharashtra, India, Vol. 07, No. 08, Aug. 2020.
[3] T. Debela, "A rule-based Afaan Oromo Grammar Checker," International Journal of Advanced
Computer Science and Applications, Vol. 2, No. 8, pp. 126–130, 2011.