Salama, M.,
Data Mining for Medical Informatics,
Cairo, Cairo University, 2012.
Abstract: The work presented in this thesis investigates the nature of real-life data, mainly in the medical field, and the problems that conventional data mining techniques face in handling such data. Accordingly, a set of alternative techniques is proposed to handle medical data in the three stages of the data mining process. In the first stage, preprocessing, an interval-based feature evaluation technique is proposed. It rests on the hypothesis that the smaller the overlap between the intervals of values an attribute takes for each class label, the more important that attribute is. This technique handles continuous data attributes without requiring discretization of the input, and it is supported by comparing its results with those of other attribute evaluation and selection techniques. Also in the preprocessing stage, the negative effect of applying a normalization algorithm before conventional PCA is investigated, along with how avoiding normalization improves the resulting classification accuracy. Finally in the preprocessing stage, an experimental analysis demonstrates the ability of the rough set methodology to classify data successfully without applying a feature reduction technique. It shows that the overall classification accuracy of the employed rough set approach is high compared with other machine learning techniques, including Support Vector Machine, Hidden Naive Bayesian network, Bayesian network, and others.
In the machine learning stage, a frequent pattern-based classification technique is proposed; it depends on detecting the variation of attributes among objects of the same class. This technique does not require preprocessing of the data such as standardization, normalization, discretization, or feature reduction, which improves running time and keeps the original data undistorted. Another contribution in the machine learning stage concerns the support vector machine and fuzzy c-means clustering techniques: the Euclidean space calculations are enhanced by applying fuzzy logic to them. This enhancement uses the ChiMerge feature evaluation technique to apply fuzzification at the level of features. These enhanced techniques are compared with classical data mining techniques, and the results show that the classical models suffer from low classification accuracy because they depend on presumptions that do not hold.
Finally, in the visualization stage, a technique is proposed to visualize continuous data using Formal Concept Analysis in a way that avoids the complications resulting from scaling algorithms.
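The interval-overlap hypothesis from the preprocessing stage of the abstract above can be illustrated with a minimal sketch. This is not the thesis's actual formula, which the abstract does not give. The function name `interval_overlap_score`, the use of per-class [min, max] intervals, and the averaging over class pairs are all assumptions made for illustration.

```python
from itertools import combinations


def interval_overlap_score(values, labels):
    """Score one continuous feature; 1.0 means the class intervals are disjoint.

    Hypothetical scoring rule (an assumption, not the thesis's definition):
    1 minus the mean pairwise overlap of per-class [min, max] intervals,
    with each overlap normalized by the feature's total range.
    """
    # Build the [min, max] interval of the feature for every class label.
    intervals = {}
    for v, y in zip(values, labels):
        lo, hi = intervals.get(y, (v, v))
        intervals[y] = (min(lo, v), max(hi, v))

    total_range = max(values) - min(values)
    if total_range == 0 or len(intervals) < 2:
        return 0.0  # constant feature or single class: nothing to separate

    # Overlap length for every pair of class intervals, as a fraction of the range.
    overlaps = []
    for (lo_a, hi_a), (lo_b, hi_b) in combinations(intervals.values(), 2):
        overlap = max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
        overlaps.append(overlap / total_range)

    return 1.0 - sum(overlaps) / len(overlaps)


separated = interval_overlap_score([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
mixed = interval_overlap_score([1, 5, 9, 2, 6, 10], [0, 0, 0, 1, 1, 1])
```

Under this sketch, the feature whose class intervals do not overlap scores higher than the one whose intervals largely coincide, and no discretization of the continuous values is needed.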
Ahmed Ibrahim Hafez, N. Ghali, A. E. Hassanien, and A. Fahmy,
"Genetic Algorithms for Multi-Objective Community Detection in Complex Networks ",
IEEE International Conference on Intelligent Systems Design and Applications (ISDA), Kochi, India, pp. 460-465, Nov. 27-29, 2012.
Abstract: Community detection in complex networks has attracted a lot of attention in recent years. Community detection can be viewed as an optimization problem in which the objective function to be optimized captures the intuition of a community as a group of nodes with better internal connectivity than external connectivity. Many single-objective optimization techniques have been used to solve the problem; however, those approaches have drawbacks, since they optimize only one objective function and this results in a solution with one particular community structure property. More recently, researchers have viewed the problem as a multi-objective optimization problem, and many approaches have been proposed to solve it. However, which objective functions can be used together is still under debate, since many objective functions have been proposed over the past years and most of them are somewhat similar in definition. In this paper we use the Genetic Algorithm (GA) as an effective optimization technique to solve the community detection problem as both a single-objective and a multi-objective problem. We use the most popular objectives proposed over the past years, and we show how those objectives correlate with each other, how they perform when used in the single-objective Genetic Algorithm and the Multi-Objective Genetic Algorithm, and which community structure properties they tend to produce.
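To make the idea of an objective function for community detection concrete, the sketch below computes Newman's modularity for a given partition of an undirected graph. The abstract does not name the objectives the paper uses, so the choice of modularity is an assumption for illustration, as are the function name `modularity` and the edge-list and dictionary input format. A genetic algorithm would use a score like this as the fitness of a candidate partition.

```python
def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    Q = sum over communities of (internal_edges / m) - (degree_sum / (2m))^2
    """
    m = len(edges)
    if m == 0:
        return 0.0

    internal = {}    # number of edges with both endpoints in the community
    degree_sum = {}  # sum of node degrees inside the community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1

    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
```

Splitting the graph into its two triangles scores higher than placing every node in one community, which matches the intuition of better internal than external connectivity. A multi-objective variant would evaluate several such functions per candidate instead of one.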