Plenary Lecture

Synthetic Over-Sampling and Understandable Data Mining Models

Professor Hyontai Sug
Division of Computer Engineering
Dongseo University
Korea
E-mail: sht@gdsu.dongseo.ac.kr

Abstract: Understandable data mining models such as decision trees tend to give higher priority to classes that have more training instances and better purity, because doing so yields more accurate classification overall. Because of this property, data belonging to a minority class are often neglected, yet these are often the data we are most interested in. Over-sampling has been considered a good technique for overcoming this problem and classifying the minor class better. The synthetic minority over-sampling technique supplies additional synthetic instances of a minor class so that better classification models can be built for that class. However, when we build a data mining model from a training data set, some instances are classified wrongly. There are two reasons for the wrong classification: the limitation of the data mining algorithm itself and the imperfection of the data set itself. As a way to build better data mining models for a minority class without sacrificing overall accuracy, we select good synthetic data instances for our data mining. By checking whether the synthetic data instances are classified correctly, and supplying only the good ones to build the target data mining model, such as a decision tree, we can build a better data mining model for the minor class. Several examples will be shown.
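The selection step described in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: the abstract does not say which classifier checks the synthetic instances, so a small k-nearest-neighbour vote stands in for it here, and the function names (`smote_like`, `knn_predict`, `select_good_synthetic`) and the toy data are hypothetical. Synthetic points are generated by interpolating between minority instances, and only those that the checking classifier labels as minority are kept.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def knn_predict(train, point, k=3):
    """Assumed stand-in checker: majority vote of the k nearest labelled points."""
    nearest = sorted(train, key=lambda item: dist2(item[0], point))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def select_good_synthetic(train, synthetic, minority_label, k=3):
    """Keep only synthetic points that the checker classifies as minority."""
    return [s for s in synthetic if knn_predict(train, s, k) == minority_label]

# Toy data (hypothetical): a majority cluster near the origin, a minority
# cluster near (3, 3), and one minority instance inside the majority region.
majority = [((0.0, 0.0), "maj"), ((0.5, 0.2), "maj"), ((0.2, 0.6), "maj"),
            ((0.4, 0.4), "maj"), ((0.1, 0.3), "maj"), ((0.6, 0.5), "maj")]
minority_points = [(3.0, 3.0), (3.2, 2.8), (2.9, 3.3), (0.3, 0.3)]
train = majority + [(p, "min") for p in minority_points]

synthetic = smote_like(minority_points, n_new=20, k=3, rng=random.Random(42))
good = select_good_synthetic(train, synthetic, "min")

# Only the accepted synthetic instances are added to the training set.
augmented = train + [(s, "min") for s in good]
```

The augmented set would then be passed to the target learner, such as a decision tree; that final step is omitted here. Synthetic points that fall inside the majority region are rejected by the check, which is the intuition behind supplying only the good instances.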

Brief Biography of the Speaker: Dr. Hyontai Sug received a BS degree in computer science and statistics from Busan National University, Korea, in 1983; an MS degree in applied computer science from Hankuk University of Foreign Studies, Korea, in 1986, majoring in natural language processing; and a Ph.D. degree in computer and information science and engineering from the University of Florida, USA, in 1998, majoring in data mining. He was a researcher at the Agency for Defense Development, Korea, from 1986 to 1992, and a full-time lecturer at Pusan University of Foreign Studies, Korea, from 1999 to 2001. He has been a professor at Dongseo University, Korea, since 2001. He has published several notable articles in the field of data mining and as a result has been listed in Marquis Who's Who in the World since 2006. His research interests include data mining, especially decision trees and association rules, and he is also interested in database application development.


