Publications

2006

Shaalan, K., and H. Talhami, "Error analysis and handling in Arabic ICALL systems", IASTED International Conference on Artificial Intelligence and Applications (AIA 2006), Innsbruck, Austria, ACTA Press, pp. 109–114, February, 2006.

Arabic is a Semitic language that is rich in its morphology and syntax. The very numerous and complex grammar rules of the language could be confusing even for Arabic native speakers. Many Arabic intelligent computer-assisted language-learning (ICALL) systems have neither deep error analysis nor sophisticated error handling. In this paper, we report an attempt at developing an error analyzer and error handler for Arabic as an important part of the Arabic ICALL system. In this system, the learners are encouraged to construct sentences freely in various contexts and are guided to recognize by themselves the errors or inappropriate usage of their language constructs. We used natural language processing (NLP) tools such as a morphological analyzer and a syntax analyzer for error analysis and to give feedback to the learner. Furthermore, we propose a mechanism of correction by the learner, which allows the learner to correct the typed sentence independently. This will result in the learner being able to figure out what the error is. Examples of error analysis and error handling will be given and will illustrate how the system works.

Ezzat, M., K. Shaalan, and A. Fahmy, "Component Composition Analysis for Arabic Natural Language Processing", the 6th Conference on Language Engineering, Egyptian Society of Language Engineering (ELSE), Cairo, Egypt, Dec., 2006.

Building NLP applications from scratch is a difficult task that takes a lot of time and requires acquiring a lot of NLP knowledge. For a rich language like Arabic, the difficulties increase significantly. In this paper, we investigate how to build a tool that helps NLP application developers build applications rapidly and robustly. It involves two steps. First, common NLP tools are built using COM object technology. Second, NLP applications are built on top of these tools, which they can access either locally or remotely. We demonstrate the capabilities of COM objects by developing NLP tools such as a morphological analyzer and using it to build two Arabic NLP applications.

Shaalan, K., and H. Talhami, "Arabic Error Feedback in an Online Arabic Learning System", Advances in Natural Language Processing, Research in Computing Science (RCS) Journal, vol. 18, pp. 203-212, 2006.

Arabic is a Semitic language that is rich in its morphology and syntax. The very numerous and complex grammar rules of the language could be confusing even for Arabic native speakers. Many Arabic intelligent computer-assisted language-learning (ICALL) systems have neither deep error analysis nor sophisticated error handling. In this paper, we report an attempt at developing an error analyzer and error handler for Arabic as an important part of the Arabic ICALL system. In this system, the learners are encouraged to construct sentences freely in various contexts and are guided to recognize by themselves the errors or inappropriate usage of their language constructs. We used natural language processing (NLP) tools such as a morphological analyzer and a syntax analyzer for error analysis and to give feedback to the learner. Furthermore, we propose a mechanism of correction by the learner, which allows the learner to correct the typed sentence independently. This will result in the learner being able to figure out what the error is. Examples of error analysis and error handling will be given and will illustrate how the system works.

Talhami, H., and K. Shaalan, "An Arabic/English switch for audio indexing and dialogue management", IASTED International Conference on Internet and Multimedia Systems and Applications (EuroIMSA 2006), Innsbruck, Austria, ACTA Press, pp. 189–192, 2006.

This paper presents a technique for automatic switching between Arabic and English which has been developed for audio indexing and dialogue management applications. It classifies utterances and sub-utterances as either U.S. English or Modern Standard Arabic (MSA) in a closed system. The approach extends the work of Zissman and Singer [1] to the problem of Arabic/English language identification. Two sets of acoustic phoneme models (English and Arabic HMMs) and two language models (phone bigrams) per acoustic model set are used. Four Large Vocabulary Continuous Speech Recognition (LVCSR) passes are performed (one for each HMM + language model set), using a phone loop grammar. The four path scores are fed into a Bayesian classifier (a multi-layer perceptron) which classifies each utterance as either English or Arabic. The technique demonstrated high accuracy on test data unseen by the system during the modelling process. The language switch has been used successfully as a front-end processor in an audio indexing and retrieval system as well as a dialogue management system.

Al Shamsi, S., H. Talhami, and K. Shaalan, "Teaching Children with Down Syndrome Pronunciations Using Speech Recognition", The IASTED International conference on Computers and Advanced Technology in Education (CATE 2006), Lima, Peru, IASTED, pp. 146–153, 2006.

Several applications that are based on speech recognition have been developed to assist people with special needs to perform their daily tasks. For example, people who are physically challenged can enter data and issue commands by dictating to a computer. Visually impaired people are able to listen to what is written on the screen by using text-to-speech. However, applications that have been developed for children with special needs (for example, Speaking for Myself [16]) do not provide any feedback to the children. This paper proposes and compares two new approaches for teaching children with Down Syndrome (DS) pronunciations using speech recognition. These approaches make use of the major speech characteristic of children with DS to develop an educational tool that assists them in overcoming their speech communication difficulties. The tool recognizes the spoken words and provides feedback. The two proposed approaches are a word-based approach that handles any phonological process and a phone-based approach that handles one phonological process at a time. The phone-based approach is more accurate than the word-based approach. However, both approaches can be improved through tuning the speech recognition parameters and using single utterance recognition confidence scores.

2005

Shaalan, K., H. Talhami, and I. Kamel, "A Morphological Generator for the Indexing of Arabic Audio", the Proceedings of The IASTED International Conference on Artificial Intelligence and Soft Computing (ASC), Benidorm, Spain, ACTA Press, pp. 307–312, September, 2005.

This paper presents a novel Arabic morphological generator (AMG) for Modern Standard Arabic (MSA) which is designed and implemented using Prolog. The AMG is used to generate inflected forms of words used for the indexing of Arabic audio. These words are also the relevant terms in the Arab authority system (library information retrieval system) used in this study. The AMG generates inflected Arabic words from the root according to pre-specified morphological features that can be extended as needed. The Arabic word is represented as a feature structure which is handled through unification during the morphological generation process. The inflected forms can then be inserted automatically into a speech recognition grammar which is used to identify these words in an audio sequence or utterance.
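The abstract above says the Arabic word is represented as a feature structure handled through unification. The following is a minimal Python sketch of that general idea only; the AMG itself is implemented in Prolog, and the feature names and values used here (`root`, `cat`, `agr`, `case`) are hypothetical and not taken from the paper.

```python
from typing import Optional


def unify(a: dict, b: dict) -> Optional[dict]:
    """Unify two feature structures given as (possibly nested) dicts.

    Returns a new merged structure, or None if the two structures
    assign conflicting values to the same feature.
    """
    result = dict(a)  # shallow copy so the inputs are not modified
    for key, b_val in b.items():
        if key not in result:
            result[key] = b_val
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            merged = unify(a_val, b_val)  # recurse into nested features
            if merged is None:
                return None
            result[key] = merged
        elif a_val != b_val:
            return None  # atomic values clash: unification fails
    return result


# Hypothetical feature structures, for illustration only.
word = {"root": "ktb", "cat": "noun", "agr": {"number": "plural"}}
requested = {"agr": {"number": "plural", "gender": "masc"}, "case": "nom"}
clash = {"agr": {"number": "singular"}}

print(unify(word, requested))  # compatible features are merged
print(unify(word, clash))      # conflicting number feature
```

Unification succeeds when the two structures agree on every shared feature, which is one way a generator could check that the requested morphological features are compatible with a lexical entry before producing an inflected form.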

Nabhan, A., A. Rafea, and K. Shaalan, "Enhancing Phrase Extraction from Word Alignments Using Morphology", The 5th Conference on Language Engineering, Egyptian Society of Language Engineering (ELSE), Cairo, Egypt, Ain Shams University, pp. 57–65, Sep., 2005.

We propose a technique for effective extraction of bilingual phrases from word alignments using morphological processing. Morphological processing increases the frequency of words in the corpus and consequently reduces the Alignment Error Rate (AER). Intuitively, better word alignments enhance the quality of the bilingual phrases extracted. Using alignments of a stemmed corpus for phrase extraction, instead of alignments of a raw one, shows significant improvements in translation quality, especially with small corpora.

Chang, C.-H., M. Kayed, M. Girgis, and K. Shaalan, "Criteria for Evaluating Information Extraction Systems", The 3rd International Conference on Informatics and Systems (INFOS2005), Cairo, Egypt, Faculty of Computers and Information, Mar., 2005.

The Internet presents a huge amount of useful information which is usually formatted for its human users, which makes it difficult to extract relevant data from various sources. Therefore, the availability of robust, flexible Information Extraction (IE) systems that transform Web pages into program-friendly structures will become a great necessity. Although many approaches for data extraction from Web pages have been developed, there has been limited effort to compare such tools. In addition to briefly surveying the major data extraction approaches described in the literature, the paper presents three classes of criteria for qualitatively analyzing these approaches. The criteria of the first class are concerned with the difficulties of an IE task, so these criteria are capable of determining why an IE system fails to handle some Web sites of particular structures. The criteria of the second class are concerned with the effort made by the user in the training process, so these criteria are capable of measuring the degree of automation for IE systems. The criteria of the third class are concerned with the techniques used in IE tasks, so these criteria are capable of measuring the performance of IE systems.

Shaalan, K., "Arabic GramCheck: a grammar checker for Arabic", Software Practice and Experience, vol. 35, no. 7, New York, NY, USA, John Wiley & Sons, Inc., pp. 643–665, 2005.

Arabic is a Semitic language that is rich in its morphology and syntax. The very numerous and complex grammar rules of the language may be confusing for the average user of a word processor. In this paper, we report our attempt at developing a grammar checker program for Modern Standard Arabic, called Arabic GramCheck. Arabic GramCheck can help the average user by checking his/her writing for certain common grammatical errors; it describes the problem for him/her and offers suggestions for improvement. The use of the Arabic grammatical checker can increase productivity and improve the quality of the text for anyone who writes Arabic. Arabic GramCheck has been successfully implemented using SICStus Prolog on an IBM PC. The current implementation covers a well-formed subset of Arabic and focuses on people trying to write in a formal style. Successful tests have been performed using a set of Arabic sentences. Comparing the results with the output of a commercially available Arabic grammar checker, we conclude that the approach is promising.

Shaalan, K., "An Intelligent Computer Assisted Language Learning System for Arabic Learners", Computer Assisted Language Learning, vol. 18, no. 1-2: Routledge, part of the Taylor & Francis Group, pp. 81-109, 2005.

This paper describes the development of an intelligent computer-assisted language learning (ICALL) system for learning Arabic. This system could be used for learning Arabic by students at primary schools or by learners of Arabic as a second or foreign language. It explores the use of Natural Language Processing (NLP) techniques for learning Arabic. The learners are encouraged to produce sentences freely in various situations and contexts and are guided to recognize by themselves the erroneous or inappropriate functions of their misused expressions. In this system, we use NLP tools (including a morphological analyzer and a syntax analyzer) and an error analyzer to issue feedback to the learner. Furthermore, we propose a mechanism of correction by the learner, which allows the learner to correct the typed sentence independently and to realize what the error is.

2004

Othman, E., K. Shaalan, and A. Rafea, "Towards resolving ambiguity in understanding Arabic sentence", International Conference on Arabic Language Resources and Tools, Cairo, Egypt, NEMLAR, pp. 118–122, Sep., 2004.

Ambiguity is a major reason why computers do not yet understand natural language. We have made great strides towards developing morphological and syntactic analyzers for Arabic in recent years. The absence of diacritics, which represent most vowels, in written text creates ambiguity which hinders the development of Arabic natural language processing applications. Thus, ambiguity increases the range of possible interpretations of natural language. In this paper, we give a road map of solutions to common ambiguity problems inherent in parsing Arabic sentences.

Shaalan, K., M. El-Badry, and A. Rafea, "A multiagent approach for diagnostic expert systems via the internet", Expert Systems with Applications, vol. 27, no. 1, Tarrytown, NY, USA, Pergamon Press, Inc., pp. 1–10, Jul., 2004.

In recent years there has been considerable interest in the possibility of building complex problem-solving systems as groups of co-operating experts. This has led us to develop a multiagent expert system capable of running on servers that can support a large group of users (clients) who communicate with the system over the network. The system provides an architecture to coordinate the behavior of several specific agent types. Two types of agents are involved. One type works on the server computer and the other type works on the client computers. The society of agents in our system consists of expert system agents (diagnosis agents and a treatment agent) working on the server side, each of which contains an autonomous knowledge-based system. Typically, agents will have expertise in distinct but related domains. The whole system is capable of solving problems which require the cumulative expertise of the agent community. In addition, a user interface agent, which employs an intelligent data collector (the so-called communication model in KADS), works on the client side. We took advantage of successful pre-existing expert systems, developed at CLAES (Central Laboratory for Agricultural Expert Systems, Egypt), to construct an architecture of a community of cooperating agents. This paper describes our experience with decomposing the diagnosis expert systems into a multi-agent system. Experiments on a set of test cases from real agricultural expert systems were performed. The expert system agents are implemented in the Knowledge Representation Object Language (KROL) and Java using the KADS knowledge engineering methodology on the WWW platform.

El-Beltagy, S., M. Said, and K. Shaalan, "A Framework for Information Extraction, Storage and Retrieval", International Computer Engineering Conference: New Technologies for the Information Society (ICENCO'2004), Cairo, Egypt, Faculty of Engineering, Dec., 2004.

This paper presents a set of tools that were developed in order to facilitate and speed up the process of building information extraction and retrieval systems for documents that exhibit a set of predefined characteristics. Specifically, the work presents a simple framework for extracting information found in publications or documents that are issued in large volumes and which cover similar concepts or issues within a given domain. The paper presents a simple model for defining background knowledge and for using that to automatically augment segments of input documents with metadata in order to assist users in easily locating information within these documents through a structured front end. The model presented makes use of both document structure and dynamically acquired background knowledge to achieve its goals.

Abo-Khozium, M., H. Hassan, K. Shaalan, and M. Riad, "A Prototype for an Intelligent Information System for Jamming and Anti-jamming Applications of Electromagnetic Spectrum", Egyptian Informatics Journal, vol. 5, no. 2: Faculty of Computers and Information, pp. 116–135, Dec., 2004.

As the pace of modern battle has increased, headquarters and Electronic Warfare (EW) staff need to process increasing volumes of information in a decreasing amount of time. Assistance in this critical task is proposed by developing the Electronic Warfare Intelligent Information System (EWIIS), which deals with the processing of electronic warfare, communications, radar, maps, war missions, etc. This system is aimed at achieving the best performance with a friendly system in spite of the existence of hostile actions. EWIIS deals with different sources of data. It helps visualize mission scenarios and suggests the best combination of weapons to successfully complete the mission with minimum loss.

Rafea, A., S. Shafik, and K. Shaalan, "An Interactive System for Association Rule Discovery for Life Assurance", International Conference on Computer, Communication and Control Technologies (CCCT '04), Texas, USA, pp. 32–37, Aug., 2004.

Knowledge discovery systems in financial organizations have been built and operated mainly to support decision making, using knowledge as a strategic factor. In this paper, we investigate the use of association rule mining as an underlying technology for knowledge discovery in the insurance business. Existing association rule algorithms and their extensions are inefficient at mining association rules from data with such characteristics. We introduce algorithms for discovering knowledge in the form of association rules that are suitable for these data characteristics. The proposed data mining technique is a hybrid of clustering partitioning and multi-level rule induction. The proposed tool is managed by a repository meta model instantiated by meta-data libraries specific to the insurance domain. It is implemented on a PC running MS Windows 2000. Samples of life data are extracted from different geographical locations of an Egyptian insurance company, covering ten years. By using the induced rules, the decision-maker can define the horizontal expansion of marketing activities into new geographical areas, or vertically empower the marketing forces in existing geographical areas. Keywords: insurance data characteristics, macro association rules, clustering partitioning, preprocessing & transformation, OLAP aggregation, ontology, data warehouse
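For readers unfamiliar with association rules, the sketch below computes the two standard measures, support and confidence, for a rule over a handful of records. It illustrates only these textbook definitions, not the hybrid clustering and multi-level induction algorithms the paper proposes; the policy attributes (`urban`, `term_life`, and so on) are invented for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0:
        return 0.0  # the rule never applies in this data
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / antecedent_support


# Hypothetical policy records; each record is a set of attribute labels.
records = [
    frozenset({"urban", "term_life", "age_30_40"}),
    frozenset({"urban", "term_life"}),
    frozenset({"rural", "whole_life"}),
    frozenset({"urban", "whole_life"}),
]

rule_support = support({"urban", "term_life"}, records)
rule_confidence = confidence({"urban"}, {"term_life"}, records)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A rule is usually reported only when both measures exceed user-chosen thresholds; how those thresholds are set for insurance data is outside the scope of this sketch.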

Shaalan, K., M. Rizk, Y. Abdelhamid, and R. Bahgat, "An expert system for the best weight distribution on ferryboats", Expert Systems with Applications, vol. 26, no. 3, pp. 397-411, Apr., 2004.

There are some problems that need expertise in order to get a satisfactory solution. A ferryboat carries goods, fresh water, diesel oil, luggage and storing rooms up to its permissible draft in order to maintain safety according to the international safety regulations. The best weight distribution on a ferryboat needs human expertise to handle many variables, such as the amount of bunker and fresh water that allows more rooms to be used for charging in order to maximize the profit. This sort of problem can be classified as a configuration problem. In this paper, we address the development of a ferryboat expert system (WDFB) using the CommonKADS knowledge engineering methodology. We propose a reusable problem-solving approach, which is an enhancement of the structure-oriented approach, capable of solving the ferryboat configuration problem. The proposed model includes heuristics that make the search for a suitable configuration more efficient, taking into consideration the transformation knowledge and the optimality criteria. The results of testing the system on real-world data from the National Navigation Company, Suez, Egypt, were satisfactory.

Shaalan, K., A. Rafea, A. Abdel-Moneim, and H. Baraka, "Machine Translation of English Noun Phrases into Arabic", The International Journal of Computer Processing of Oriental Languages, vol. 17, no. 2, pp. 121–134, 2004.

The present work reports our attempt at automating the translation of English noun phrases (NPs) into Arabic. Translating NPs is a very important step toward sentence translation, since NPs form the majority of the textual content of scientific and technical documents. The system is implemented in Prolog and the parser is written in the DCG formalism. The paper also describes our experience with the developed MT system and reports results of its application to real titles of theses from the computer science domain.

2003

Othman, E., K. Shaalan, and A. Rafea, "A chart parser for analyzing modern standard Arabic sentence", MT Summit IX Workshop on Machine Translation for Semitic Languages: Issues and Approaches, New Orleans, Louisiana, USA, ACL, pp. 37–44, September, 2003.

Parsing Arabic sentences is a necessary prerequisite for many natural language processing applications such as machine translation and information retrieval. In this paper we report our attempt to develop an efficient chart parser for analyzing Modern Standard Arabic (MSA) sentences. From a practical point of view, the parser is able to satisfy syntactic constraints, reducing parsing ambiguity. Lexical semantic features are also used to disambiguate the sentence structure. We also describe an Arabic morphological analyzer based on the ATN technique. Both the Arabic parser and the Arabic morphological analyzer are implemented in Prolog. The linguistic rules were acquired from a set of MSA sentences in the agriculture domain.

Abdel-Monem, A., K. Shaalan, A. Rafea, and H. Baraka, "A Proposed Approach for Generating Arabic from Interlingua in a Multilingual Machine Translation System", Language Engineering conference, Cairo, Egypt, Ain Shams University, pp. 197–206, Oct, 2003.

Interlingua (meaning) representation has been successfully used in multilingual machine translation. This paper reports our attempt to generate Arabic sentences from an interlingua. The proposed system will be compatible with the NESPOLE consortium. In NESPOLE, an interlingua called interchange format (IF), designed for travel planning, is used. Our approach describes how to generate grammatically correct Arabic sentences from the interlingua. It involves two main components: a mapper for converting the interlingua into a syntactic structure (feature structure) and a generator for generating the target Arabic sentence that represents the intended meaning. A translation example is provided to explain the inner workings of the system.

Shaalan, K., A. Allam, and A. Gomah, "Towards automatic spell checking for Arabic", Proceedings of the Fourth Conference on Language Engineering, Egyptian Society of Language Engineering (ELSE), Egypt: Faculty of Engineering, pp. 240–247, Oct., 2003.

Arabic's rich morphology (word construction) and complex orthography (writing system) present unique challenges for automatic spell checking. An Arabic checker attempts to find a dictionary word that might be the correct spelling of the misspelled or misrecognized word. In this paper, we report our attempt at developing an Arabic spelling checker program for solving this problem. Our approach is heuristic and involves developing an Arabic morphological analyzer, techniques for spelling checking and spelling correction, and efficient methods for lexicon operations. The developed Arabic spell checker is able to recognize common spelling errors for standard Arabic and Egyptian dialects.
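The idea of finding a dictionary word that might be the intended spelling can be illustrated with a plain edit-distance lookup. This is a generic sketch and an assumption on our part: the abstract does not say which correction technique the paper uses, and the real system also relies on a morphological analyzer, which is omitted here. The tiny lexicon and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(word: str, lexicon: list, max_distance: int = 1) -> list:
    """Return lexicon words within max_distance edits, closest first."""
    scored = [(edit_distance(word, entry), entry) for entry in lexicon]
    return [entry for dist, entry in sorted(scored) if dist <= max_distance]


# Hypothetical four-word lexicon; a real checker would use a full dictionary.
lexicon = ["كتاب", "كاتب", "مكتب", "كتب"]
print(suggest("كتاص", lexicon))  # misspelling with a wrong final letter
```

Scanning the whole lexicon is linear in its size, so a production checker would need an indexed lookup; the abstract's mention of efficient lexicon operations points in that direction.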

Khaled Shaalan

Professor of Computer Science
