Morphological analysis
Shaalan, K., M. Magdy, and A. Fahmy,
"Analysis and Feedback of Erroneous Arabic Verbs",
Journal of Natural Language Engineering, vol. 21, issue 2, pp. 271-323, 2015.
Abstract Arabic is strongly structured and is considered one of the most highly inflected and
derivational languages. Learning Arabic morphology is a basic step for language learners to
develop language skills such as listening, speaking, reading, and writing. Arabic morphology
is non-concatenative and allows a large number of affixes to be attached to each
root or stem, which yields a combinatorial increase in possible inflected words. As such, Arabic
lexical (morphological and phonological) rules may be confusing for second language learners.
Our study indicates that research and development efforts on spelling and grammar checking
do not provide adequate interpretations of second language learners’
errors. In this paper we address issues related to error diagnosis and feedback for second
language learners of Arabic verbs and how they impact the development of a web-based
intelligent language tutoring system. The major aim is to develop an Arabic intelligent
language tutoring system that solves these issues and helps second language learners to
improve their linguistic knowledge. Learners are encouraged to produce input freely in
various situations and contexts, and are guided to recognize by themselves the erroneous
functions of their misused expressions. Moreover, we proposed a framework that allows
for the individualization of the learning process and provides the intelligent feedback that
conforms to the learner’s expertise for each class of error. Error diagnosis is not possible with
current Arabic morphological analyzers. So constraint relaxation and edit distance techniques
are successfully employed to provide error-specific diagnosis and adaptive feedback to learners.
We demonstrated the capabilities of these techniques in diagnosing errors related to Arabic
weak verbs formed using complex morphological rules. As a proof of concept, we have
implemented the components that diagnose learner’s errors and generate feedback which
have been effectively evaluated against test data acquired from a real teaching environment.
The experimental results were satisfactory, and the performance achieved was 74.34 percent
in terms of recall rate.
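The edit distance technique named in this abstract can be illustrated with a minimal sketch. It computes the standard Levenshtein distance and ranks known verb forms by closeness to a learner's input. The function names, the `max_distance` threshold, and the transliterated example forms are assumptions made for illustration; they are not taken from the paper, and the sketch does not cover constraint relaxation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_forms(learner_input, known_forms, max_distance=2):
    """Return (distance, form) pairs within max_distance, nearest first.

    max_distance is a hypothetical cut-off chosen for this sketch.
    """
    scored = [(edit_distance(learner_input, f), f) for f in known_forms]
    return sorted((d, f) for d, f in scored if d <= max_distance)


# Hypothetical transliterated forms, used only to exercise the functions.
print(closest_forms("qAltu", ["qAla", "qultu", "yaquwlu"]))
```

A diagnosis component could then compare the learner's input with the nearest form to see which characters differ, although how the paper maps such differences to error classes is not stated in the abstract.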
Shaalan, K., and M. Attia,
"Handling Unknown Words in Arabic FST Morphology",
The 10th edition of the International Workshop on Finite State Methods and Natural Language Processing (FSMNLP 2012), San Sebastian, Spain, 23 July, 2012.
Abstract A morphological analyser only recognizes words that it already knows in the lexical database. It needs, however, a way of sensing significant changes in the language in the form of newly borrowed or coined words with high frequency. We develop a finite-state morphological guesser in a pipelined methodology for extracting unknown words, lemmatizing them, and giving them a priority weight for inclusion in a lexicon. The processing is performed on a large contemporary corpus of 1,089,111,204 words and passed through a machine-learning-based annotation tool. Our method is tested on a manually-annotated gold standard of 1,310 forms and yields good results despite the complexity of the task. Our work shows the usability of a highly non-deterministic finite state guesser in a practical and complex application.
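The pipeline outlined in this abstract (extract unknown words, lemmatize them, assign a priority weight) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function `rank_unknown_lemmas`, the caller-supplied `lemmatize` callback, and the use of relative frequency as the weight are hypothetical stand-ins, not the finite-state guesser or the weighting scheme used in the paper.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set, Tuple


def rank_unknown_lemmas(
    tokens: Iterable[str],
    lexicon: Set[str],
    lemmatize: Callable[[str], str],
) -> List[Tuple[str, float]]:
    """Rank lemmas of out-of-lexicon tokens by relative frequency.

    The weight is an assumption for this sketch: the share of unknown
    tokens that map to each lemma.
    """
    unknown = [t for t in tokens if t not in lexicon]  # step 1: extract
    counts = Counter(lemmatize(t) for t in unknown)    # step 2: lemmatize
    total = sum(counts.values())
    if total == 0:
        return []
    weighted = ((lemma, n / total) for lemma, n in counts.items())  # step 3
    # Highest weight first; ties broken alphabetically for stable output.
    return sorted(weighted, key=lambda pair: (-pair[1], pair[0]))


# Toy English data and a toy suffix-stripping lemmatizer, both hypothetical.
def strip_plural(word: str) -> str:
    return word[:-1] if word.endswith("s") else word


tokens = ["blog", "blogs", "tweet", "book", "blogs"]
print(rank_unknown_lemmas(tokens, {"book"}, strip_plural))
```

Passing the lemmatizer in as a parameter keeps the ranking logic independent of how lemmas are actually guessed.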
Shaalan, K., M. Magdy, and A. Fahmy,
"Morphological Analysis of Ill-formed Arabic Verbs for Second Language Learners",
Applied Natural Language Processing: Identification, Investigation and Resolution, Hershey, PA, USA, IGI Global, pp. 1-659, 2012.
Abstract Arabic is a language of rich and complex morphology. The nature and peculiarity of Arabic make its morphological and phonological rules confusing for second language learners (SLLs). The conjugation of Arabic verbs is central to the formulation of an Arabic sentence because of its richness of form and meaning. In this research, we address issues related to the morphological analysis of ill-formed Arabic verbs in order to identify the source of errors and provide informative feedback to SLLs of Arabic. The edit distance and constraint relaxation techniques are used to demonstrate the capability of the proposed system in generating all possible analyses of erroneous Arabic verbs written by SLLs. Filtering mechanisms are applied to exclude the irrelevant constructions and determine the target stem, which is used as the base for constructing the feedback to the learner. The proposed system has been developed and effectively evaluated using real test data. It achieved satisfactory results in terms of the recall rate.
Shaalan, K. F., M. Magdy, and A. Fahmy,
"Morphological Analysis of Ill-Formed Arabic Verbs in Intelligent Language Tutoring Framework",
The 23rd International Florida Artificial Intelligence Research Society Conference (FLAIRS-23), Florida, USA, FLAIRS, pp. 277–282, May 2010.
Abstract Arabic is a language of rich and complex morphology. The nature and peculiarity of Arabic make its morphological and phonological rules confusing for second language learners (SLLs). The conjugation of Arabic verbs is central to the formulation of an Arabic sentence because of its richness of form and meaning. In this paper, we address issues related to the morphological analysis of ill-formed Arabic verbs in order to identify the source of errors and provide informative feedback to SLLs of Arabic. The edit distance and constraint relaxation techniques are used to demonstrate the capability of the proposed approach in generating all possible analyses of erroneous Arabic verbs written by SLLs. Filtering mechanisms are applied to exclude the irrelevant constructions and determine the target stem. A morphological analyzer has been developed and effectively evaluated using real test data. It achieved satisfactory results in terms of the recall rate.
Shaalan, K., and E. Othman,
"Issues in the Morphological Analysis of the Arabic Passive Verb",
The Seventh Conference on Language Engineering, Egyptian Society of Language Engineering (ELSE), Cairo, Egypt, Ain Shams University, December 2007.
Abstract Arabic is a strongly structured and highly derivational language. Arabic morphology and syntax allow a large number of affixes to be added to each word, which yields a combinatorial increase in possible words. In Arabic, the passive voice is used as a writing style when: 1) the subject is unknown, 2) the subject is not important enough to be mentioned, or 3) the author wants to highlight the object. In this paper, we address the issues related to the recognition of Arabic passive verbs that impact the automated understanding of Arabic sentences. An experiment using the Buckwalter Arabic morphological analyzer, one of the mature Arabic morphological analyzers, was conducted to highlight the limitations in the analysis of Arabic passive verbs. Results indicated that the problems related to the morphological analysis of passive verbs need to be handled in order to improve the recognition accuracy of Arabic words.
Rafea, A., and K. Shaalan,
"Lexical Analysis of Inflected Arabic words using Exhaustive Search of an Augmented Transition Network",
Software Practice and Experience, vol. 23, no. 6, New York, NY, USA, John Wiley & Sons, Inc., pp. 567–588, 1993.
Abstract