Plenary Lecture
Knowledge Discovery in Remote Access Databases

Assistant Professor Zakaria Suliman Zubi
Computer Science Department
Al-Tahadi University
Serit Post Office, P.O. Box 727
Serit, Libya
E-mail: zszubi@yahoo.com
Abstract:
Data mining is an emerging field whose main goal is to discover
useful patterns hidden in large databases. Because of its
interdisciplinary nature, it draws on a wide variety of
techniques and methods from diverse disciplines, such as
statistics, databases, machine learning, knowledge
representation, and visualization. The Knowledge Discovery in
Databases (KDD) process is modeled as an iterative process
composed of several phases, each of which contains many
obstacles, open problems, and research questions needing to be
investigated and resolved. These reflect the current limitations
of both humans and machines in generating, analyzing, and
interpreting knowledge from large databases. Improving the data
mining process requires strong theoretical and empirical
research that involves, mainly, the creation of better
interfaces to database systems, new strategies to simplify the
preprocessing stage, optimization and tuning of inductive
learning algorithms and the creation of new ones, and better
techniques to interpret and evaluate patterns produced by data
mining methods.
In this lecture, I will focus on two research problems within
the KDD process: algorithm selection and algorithm engineering.
Currently, the selection of a data mining algorithm that
performs well in solving a data mining problem is rather
subjective. It may lead users and data analysts to make wrong
decisions about the most appropriate technique for the problem
being solved, or to spend a significant amount of time and
effort trying to apply a technique that is not well suited to
the problem. Thus, this course of research will
introduce a set of heuristics to guide the user in the selection
of the most appropriate methods for searching for patterns in a
data set, for a particular problem or data mining task. In
addition, other issues arise when the selected data mining
algorithm is applied to a training dataset to induce a model.
One possible data mining model is a remote access KDD model. One
such model, which we call ODBC_KDD (2), was proposed by us. The
methodology of this model begins when an end user submits a
query. This query is then reformulated in what we call the
knowledge discovery query language (KDQL). To meet the KDD
process requirements, the classical user query must carry some
extra parameters or rules to extract the hidden information or
patterns in the databases.
Many data mining algorithms rely on several parameters that the
user must set, and that significantly affect the quality of the
generated patterns. Generating these patterns requires logical
investigation, in the form of data mining, to find the
association rules that we want to discover or mine. These rules
help us to discover many associations in one particular
database, and they can help us understand the behavior of our
databases. The need to discover association rules in our
databases leads us to look for a strong query language that can
express more complex questions than classical SQL. Such a
language is called a data mining query language (DMQL).
Commonly, the user is forced to
explore a huge parameter space without clues about which
parameter settings are more convenient to induce an appropriate
model for the dataset being explored. Also, when the induced
model is used to predict new cases, it is fundamental that the
model be represented in a form that lets the user understand how
the model actually makes decisions, and then explore alternative
models based on the query language and on the databases to be
retrieved. On the database side, we implement a database concept
called the i-extended database. Its main aim is to extract all
the useful information from classical databases and store it in
a standard form suitable for use with the knowledge discovery
query language (KDQL). On the query language side, we implement
a data mining query language named the knowledge discovery query
language (KDQL). The syntax of KDQL derives from the Structured
Query Language (SQL), since several extensions to SQL have been
proposed to serve as a data mining query language (DMQL).
However, they do not sufficiently address how to visualize query
results. I will investigate the
requirements for a SQL that describes the graphical
representation of Knowledge Discovery Query (KDQ) results from
the perspective of a large database system with frequent map
output, and assess several SQL extensions with respect to their
treatment of the graphical representation. The lecture concludes
that SQL + DM (rules) = is the appropriate form for this task at
the user interface. DM rules are based on the association rules
used to interact with the i-extended database. The i-extended
database can access other types of databases, such as relational
databases. The association rules will be obtained by the use of
KDQL rules, and then graphically represented in 2D and 3D
charts. The KDQL syntax will be presented as well. The syntax
has been used in practice and showed great results. The lecture
provides some practical scripts from the KDQL program and
displays some retrieval results with charts of four different
types. Visualization results can be meaningfully presented in 2D
or 3D in forms such as pie, bar, line, and point charts.
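To make the notion of association rules and user-set parameters
concrete, here is a minimal Python sketch. It is not KDQL and not
the ODBC_KDD model, whose syntax and internals are not given in
this abstract. The transactions, the function names (support,
mine_rules), and the thresholds (min_support, min_confidence) are
all hypothetical, chosen only for illustration. The sketch mines
rules between pairs of single items using the standard support
and confidence measures.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative only, not lecture data).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def mine_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-item rules.

    Only rules of the form {lhs} -> {rhs} are considered, to keep
    the sketch short; real miners also handle larger itemsets.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


if __name__ == "__main__":
    for lhs, rhs, s, c in mine_rules(TRANSACTIONS):
        print(f"{lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

On this toy data, raising min_confidence from 0.7 to 0.8 removes
every rule, which illustrates how strongly the parameter settings
chosen by the user affect the generated patterns.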
Brief Biography of the Speaker:
Ph.D. in Computer Science (Information Technology) (Database
Management Systems), Institute of Mathematics and Informatics,
University of Debrecen, Debrecen, Hungary, 1998-2002. M.Sc. in
Computer Science (Artificial Intelligence), Institute of
Mathematics and Informatics, University of Debrecen, Debrecen,
Hungary, 1996-1998. B.Sc. from the Department of Computer
Science, Faculty of Science, Al-Tahadi University, Sert,
Libya, 1989-1993.
Academic positions include Graduate Study Coordinator,
Computer Science Department, Faculty of Science, Al-Tahadi
University, Sirt, Libya, from 2003 until now; Head of the
Computer Science Department, Faculty of Science, Al-Tahadi
University, Sirt, Libya, from 2003 until 2005; and General
Graduate Study Coordinator at the Faculty of Science, Al-Tahadi
University, Sirt, Libya, from 2004 until 2005. Undergraduate and
postgraduate lecturer in the Computer Science Department.
Scientific activities include serving as an external and internal
member of many postgraduate examination committee boards in
Libyan universities, and as an official reviewer for many local
scientific journals in Libya. Member of the Libyan Quality
Assurance Agency in Higher Education.
Research activities: has published many papers at several
international conferences, some of them with WSEAS.
Research area: knowledge discovery in distributed databases.