Bakry, M. E., S. Safwat, and O. Hegazy,
"Big Data Classification using Fuzzy K-Nearest Neighbor",
International Journal of Computer Applications, vol. 132, issue 10, pp. 8-13, 2015.
Abstract: Because of the massive increase in data size, effective analysis with traditional techniques has become difficult. Big data poses many challenges owing to characteristics such as volume, velocity, variety, variability, value, and complexity. Today there is a need not only for efficient data mining techniques to process large volumes of data, but also for a means to meet the computational requirements of processing them. The objective of this paper is to classify big data using the Fuzzy K-Nearest Neighbor classifier and to provide a comparative study between the results of the proposed system and the method reviewed in the literature. In this paper we implemented the Fuzzy K-Nearest Neighbor method using the MapReduce paradigm to process big data. Results on different data sets show that the proposed Fuzzy K-Nearest Neighbor method outperforms the method reviewed in the literature.
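The abstract does not describe how the classifier is split across map and reduce tasks, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes one common decomposition: each map task finds the k nearest neighbors within its own data partition, and a single reduce step merges those candidates, keeps the global k nearest, and computes inverse-distance fuzzy class memberships in the style of the standard fuzzy k-NN rule with crisp training labels. All function names, the fuzzifier `m`, and the smoothing constant `eps` are hypothetical choices made for this example.

```python
import heapq
import math
from collections import defaultdict


def map_local_knn(partition, query, k):
    """Map step (assumed design): k nearest (distance, label) pairs in one partition."""
    candidates = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, candidates)


def reduce_fuzzy_vote(local_results, k, m=2.0, eps=1e-9):
    """Reduce step (assumed design): merge candidates and compute fuzzy memberships."""
    # Keep only the k globally nearest neighbors across all partitions.
    merged = heapq.nsmallest(k, (pair for result in local_results for pair in result))
    weights = defaultdict(float)
    for distance, label in merged:
        # Inverse-distance weight; eps avoids division by zero for exact matches.
        weights[label] += 1.0 / (distance ** (2.0 / (m - 1.0)) + eps)
    total = sum(weights.values())
    # Normalize so the memberships over all classes sum to 1.
    return {label: w / total for label, w in weights.items()}


def fuzzy_knn_classify(partitions, query, k=3, m=2.0):
    """Run the map step on every partition, then reduce to a label and memberships."""
    local_results = [map_local_knn(p, query, k) for p in partitions]
    memberships = reduce_fuzzy_vote(local_results, k, m)
    return max(memberships, key=memberships.get), memberships
```

Unlike a crisp k-NN vote, the returned memberships indicate how strongly the query belongs to each class, so a borderline point can be distinguished from a clear-cut one. In a real MapReduce deployment the list comprehension over partitions would be replaced by distributed map tasks; here it only simulates them in a single process.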
EL-Bakry, M., S. Safwat, and O. Hegazy,
"‘Fuzzy’ vs ‘Non-Fuzzy’ Classification in Big Data",
Proceedings of Second International Conference on Digital Information Processing, Data Mining, and Wireless Communications (DIPDMWC2015), Dubai, UAE, pp. 23-32, 2015.
Abstract: Due to the huge increase in data size, efficient analysis with traditional techniques has become difficult. Big data poses many challenges owing to characteristics such as volume, velocity, variety, variability, value, and complexity. Today, there is a need not only for efficient data mining techniques to process large volumes of data, but also for a means to meet the computational requirements of processing them. The objective of this research is to compare fuzzy and non-fuzzy algorithms for the classification of big data, and to provide a comparative study between the results of this study and the methods reviewed in the literature. In this paper, we implemented the Fuzzy K-Nearest Neighbor method as a fuzzy technique and the Support Vector Machine as a non-fuzzy technique, using the MapReduce paradigm to process big data. Results on different data sets show that the proposed Fuzzy K-Nearest Neighbor method outperforms both the Support Vector Machine and the method reviewed in the literature.