Machine Learning

El-Sedeek, M., K. Shaalan, S. Mabrouk, and A. Rafea, "A Hybrid Analogical Learning System and its Application in Employment Accidents Domain", Scientific Bulletin Part III: Electrical Engineering, vol. 38, no. 2: Faculty of Engineering, pp. 445–467, jun, 2003.

This paper presents a set of tools developed to facilitate and speed up the building of information extraction and retrieval systems for documents that exhibit a set of predefined characteristics. Specifically, the work presents a simple framework for extracting information from publications or documents that are issued in large volumes and cover similar concepts or issues within a given domain. The paper presents a simple model for defining background knowledge and for using it to automatically augment segments of input documents with metadata, so that users can easily locate information within these documents through a structured front end. The model uses both document structure and dynamically acquired background knowledge to achieve its goals.

Rafea, A., S. Shafik, and K. Shaalan, "An Interactive System for Association Rule Discovery for Life Assurance", International Conference on Computer, Communication and Control Technologies (CCCT '04), Texas, USA, pp. 32–37, aug, 2004.

Knowledge discovery systems in financial organizations have been built and operated mainly to support decision making, using knowledge as a strategic factor. In this paper, we investigate the use of association rule mining as an underlying technology for knowledge discovery in the insurance business. Existing association rule algorithms and their extensions are inefficient at mining association rules from data with such characteristics. We introduce algorithms for discovering knowledge in the form of association rules that are suited to these data characteristics. The proposed data mining technique is a hybrid of clustering partitioning and multi-level rule induction. The proposed tool is managed by a repository meta-model instantiated by metadata libraries specific to the insurance domain. It is implemented on a PC running MS Windows 2000. Samples of life insurance data covering ten years are extracted from different geographical locations of an Egyptian insurance company. Using the induced rules, the decision maker can plan the horizontal expansion of marketing activities into new geographical areas, or vertically strengthen the marketing force in existing geographical areas. Keywords: insurance data characteristics, macro association rules, clustering partitioning, preprocessing & transformation, OLAP aggregation, ontology, data warehouse
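
For readers unfamiliar with association rules, the following minimal Python sketch computes the two standard measures, support and confidence, that rule mining in general relies on. It is not the hybrid algorithm described in the abstract; the function names and the tiny set of policy records are hypothetical and invented purely for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


# Hypothetical policy records (illustrative only, not data from the paper).
policies = [
    frozenset({"urban", "term_life"}),
    frozenset({"urban", "term_life", "rider"}),
    frozenset({"rural", "whole_life"}),
    frozenset({"urban", "whole_life"}),
]

rule_support = support(policies, {"urban", "term_life"})
rule_confidence = confidence(policies, {"urban"}, {"term_life"})
```

A rule such as "urban -> term_life" is usually kept only if both values exceed thresholds chosen by the analyst.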

Hossny, A., K. Shaalan, and A. Fahmy, "Automatic Morphological Rule Induction for Arabic", The sixth international conference on Language Resources and Evaluation (LREC'08) workshop on HLT & NLP within the Arabic world: Arabic Language and local languages processing: Status Updates and Prospects, Marrakech, Morocco, LREC, pp. 97–101, may, 2008.

In this paper, we introduce an algorithm for morphological rule induction using meta-rules for Arabic morphology, based on inductive logic programming. The processing resources are a set of example pairs (stem and inflected form) with their feature vectors, each marked as positive or negative, together with linguistic background knowledge from the Arabic morphological analysis domain. Each example pair consists of two words, each of which is analyzed vocally into consonants and vowels. The algorithm applies two levels of mapping: one between the vocal representations of the two words (stem, morphed) and one between their feature vectors. It distinguishes between the two mappings in order to deduce accurately which changes in the word structure led to which changes in its features. The paper also addresses the issues of irregularity, productivity, and model consistency. We have developed an Arabic morphological rule induction system (AMRIS). An evaluation showed that the system's performance was satisfactory.

Hossny, A., K. Shaalan, and A. Fahmy, "Machine translation model using inductive logic programming", the 2009 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE’09), Dalian, China, pp. 1–8, sep, 2009.

Rule-based machine translation systems face several challenges in building the translation model in the form of transfer rules. Stating the rules and keeping them consistent requires enormous human effort: different human linguists write different rules for the same sentence, and a human linguist states rules to be understood by humans rather than by machines. The proposed translation model (from Arabic to English) tackles this problem of building the translation model. It employs Inductive Logic Programming (ILP) to learn the language model from a set of example pairs acquired from parallel corpora, and represents the language model in a rule-based format that maps Arabic sentence patterns to English sentence patterns. When tested on a small data set, the model generated translation rules at a logarithmic growth rate and achieved a word error rate of 11%.
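
The word error rate quoted above is conventionally the word-level edit distance between a system output and a reference translation, divided by the number of reference words. The sketch below shows that standard computation in Python. It assumes whitespace tokenization and is not taken from the paper, which does not state how its figure was computed; the function name and example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one substituted word out of five.
example_wer = word_error_rate("the boy read the book", "the boy reads the book")
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.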

Khaled Shaalan

Professor of Computer Science
