Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques

Hamid Reza Khosravani


Research Article


International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 3 - Number 3

Year of Publication: 2012

Authors: Hamid Reza Khosravani

http:/ijais12-450475

Hamid Reza Khosravani. Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques. International Journal of Applied Information Systems. 3, 3 (July 2012), 8-12. DOI=http:/ijais12-450475

@article{http:/ijais12-450475,
  author = {Hamid Reza Khosravani},
  title = {Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques},
  journal = {International Journal of Applied Information Systems},
  issue_date = {July 2012},
  volume = {3},
  number = {3},
  month = {July},
  year = {2012},
  issn = {2249-0868},
  pages = {8-12},
  numpages = {9},
  url = {https://www.ijais.org/archives/volume3/number3/210-0475/},
  doi = {http:/ijais12-450475},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2023-07-05T10:45:32.394935+05:30
%A Hamid Reza Khosravani
%T Proposing an Improved Semantic and Syntactic Data Quality Mining Method using Clustering and Fuzzy Techniques
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 3
%N 3
%P 8-12
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Data quality plays an important role in the knowledge discovery process in databases. Researchers have so far proposed two approaches to data quality evaluation. The first is based on statistical methods; the second uses data mining techniques, which improve evaluation results by relying on extracted knowledge. Our proposed method follows the second approach and focuses on the accuracy dimension of data quality, covering both its syntactic and semantic aspects.
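To make the distinction between the two aspects of accuracy concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that syntactic accuracy means a value lies in its allowed domain, and that semantic accuracy means a record agrees with a rule relating two attributes. The field names `city` and `area_code`, the domain set, and the rule table are hypothetical and chosen only for illustration.

```python
# Illustrative toy example only; NOT the method proposed in the paper.
# The field names, the value domain, and the rule table are hypothetical.

VALID_CITIES = {"Tehran", "Shiraz", "Tabriz"}  # assumed domain of "city"

# Assumed association rule of the form city -> area_code.
AREA_CODE_BY_CITY = {"Tehran": "021", "Shiraz": "071", "Tabriz": "041"}


def is_syntactically_accurate(record):
    """True if the city value belongs to the allowed domain."""
    return record["city"] in VALID_CITIES


def is_semantically_accurate(record):
    """True if the city/area_code pair agrees with the assumed rule."""
    expected = AREA_CODE_BY_CITY.get(record["city"])
    return expected is not None and record["area_code"] == expected


def accuracy_report(records):
    """Return the fraction of records that pass each check."""
    total = len(records)
    if total == 0:
        return {"syntactic": 0.0, "semantic": 0.0}
    return {
        "syntactic": sum(map(is_syntactically_accurate, records)) / total,
        "semantic": sum(map(is_semantically_accurate, records)) / total,
    }


records = [
    {"city": "Tehran", "area_code": "021"},   # passes both checks
    {"city": "Tehrann", "area_code": "021"},  # misspelled: fails both checks
    {"city": "Shiraz", "area_code": "021"},   # valid city, wrong pairing
    {"city": "Tabriz", "area_code": "041"},   # passes both checks
]
report = accuracy_report(records)
print(report)
```

In this sketch a misspelled value fails the syntactic check, while a well-formed value paired with the wrong area code fails only the semantic check. In a data mining approach the rule table would be extracted from the data rather than written by hand.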


Index Terms

Computer Science

Information Sciences

Keywords

Data Quality Mining, Association Rules, Categorical Feature, Numerical Feature