Research Article

WSD Tool for Ontology-based Text Document Classification

Published in June 2013 by Nazia Ilyas Baig, Gresha Bhatia
International Conference and workshop on Advanced Computing 2013
Foundation of Computer Science USA
ICWAC - Number 3
June 2013

Nazia Ilyas Baig, Gresha Bhatia. WSD Tool for Ontology-based Text Document Classification. International Conference and workshop on Advanced Computing 2013. ICWAC, 3 (June 2013), 0-0.

@article{
author = { Nazia Ilyas Baig, Gresha Bhatia },
title = { WSD Tool for Ontology-based Text Document Classification },
journal = { International Conference and workshop on Advanced Computing 2013 },
issue_date = { June 2013 },
volume = { ICWAC },
number = { 3 },
month = { June },
year = { 2013 },
issn = { 2249-0868 },
pages = { 0-0 },
numpages = 1,
url = { /proceedings/icwac/number3/490-1328/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference and workshop on Advanced Computing 2013
%A Nazia Ilyas Baig
%A Gresha Bhatia
%T WSD Tool for Ontology-based Text Document Classification
%J International Conference and workshop on Advanced Computing 2013
%@ 2249-0868
%V ICWAC
%N 3
%P 0-0
%D 2013
%I International Journal of Applied Information Systems
Abstract

Document classification is required to extract relevant information from large document collections. Traditional approaches work satisfactorily in many settings, but they have two limitations. First, they require a training set of pre-classified documents to train the classifier. Second, they depend mainly on the 'bag of words' representation, which is unsatisfactory because it ignores possible relations between terms. When no training set is available, an ontology provides knowledge that can be used for classification without training data. An ontology expresses information in the form of a hierarchical structure. To classify documents using an ontology, we need to define the classes or concepts under which documents are categorized. Here we use WordNet to capture the relations between words. However, WordNet alone is not sufficient to resolve word sense ambiguity, so our approach uses the Lesk algorithm for Word Sense Disambiguation (WSD). In this paper, we implement a tool that disambiguates a keyword in a text file. The tool is a utility that takes a text file as input and processes it to return the best sense for the most frequently occurring keyword in the file; this processing is divided into several modules. The disambiguated keyword is then mapped to concepts to create the ontology, which defines classes/concepts for all the files in the corpus. Our approach leverages the strengths of ontologies, WordNet and the Lesk algorithm to improve text document classification.
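
The pipeline described in the abstract (find the most frequent keyword in a text, then choose its best sense by Lesk-style overlap) can be illustrated with a minimal sketch. This is not the authors' tool: the choice of Python, the function names, the stop-word list and the two-sense `TOY_SENSES` inventory that stands in for WordNet glosses are all assumptions made for illustration. The sketch uses the simplified Lesk variant, which scores each sense by how many words its gloss shares with the context of the keyword.

```python
import re
from collections import Counter

# Tiny stop-word list, assumed for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "at", "on",
              "for", "that", "or", "by", "its", "it", "with"}

# Hypothetical toy sense inventory standing in for WordNet glosses.
TOY_SENSES = {
    "bank": {
        "financial_institution":
            "an institution that accepts deposits and lends money to customers",
        "river_side":
            "sloping land beside a body of water such as a river",
    },
}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]


def most_frequent_keyword(tokens, inventory):
    """Return the most frequent token that has senses in the inventory.

    Restricting candidates to inventory entries is an assumption of this
    sketch; returns None when no token qualifies.
    """
    counts = Counter(t for t in tokens if t in inventory)
    return counts.most_common(1)[0][0] if counts else None


def simplified_lesk(keyword, context_tokens, inventory):
    """Pick the sense whose gloss shares the most words with the context."""
    context = set(context_tokens) - {keyword}
    best_sense, best_overlap = None, -1
    for sense, gloss in inventory[keyword].items():
        overlap = len(context & set(tokenize(gloss)))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


def disambiguate_text(text, inventory=TOY_SENSES):
    """Return (keyword, best sense, overlap score), or None if no keyword."""
    tokens = tokenize(text)
    keyword = most_frequent_keyword(tokens, inventory)
    if keyword is None:
        return None
    sense, overlap = simplified_lesk(keyword, tokens, inventory)
    return keyword, sense, overlap
```

For example, `disambiguate_text("We sat on the bank of the river and watched the water.")` selects the `river_side` sense, because only that gloss shares words ("river", "water") with the context. A real implementation would draw glosses from WordNet rather than a hand-written dictionary.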

Index Terms

Computer Science
Information Sciences

Keywords

Ontology, ontology-based text document classification, concepts, sense, WSD, WordNet, Lesk algorithm