Tokenization and Filtering Process in RapidMiner

Tanu Verma; Renu; Deepti Gaur

Research Article

International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 7 - Number 2

Year of Publication: 2014

Authors: Tanu Verma, Renu, Deepti Gaur

DOI: 10.5120/ijais14-451139

Tanu Verma, Renu, Deepti Gaur. Tokenization and Filtering Process in RapidMiner. International Journal of Applied Information Systems. 7, 2 (April 2014), 16-18. DOI=10.5120/ijais14-451139

@article{10.5120/ijais14-451139,
  author = {Tanu Verma and Renu and Deepti Gaur},
  title = {Tokenization and Filtering Process in RapidMiner},
  journal = {International Journal of Applied Information Systems},
  issue_date = {April 2014},
  volume = {7},
  number = {2},
  month = {April},
  year = {2014},
  issn = {2249-0868},
  pages = {16-18},
  numpages = {9},
  url = {https://www.ijais.org/archives/volume7/number2/620-1139/},
  doi = {10.5120/ijais14-451139},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2023-07-05T18:54:38.635493+05:30
%A Tanu Verma
%A Renu
%A Deepti Gaur
%T Tokenization and Filtering Process in RapidMiner
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 7
%N 2
%P 16-18
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Text mining is a knowledge-intensive process in which a user interacts with a document collection. As in data mining [2,4,9], text mining seeks to extract useful information from data sources through the identification and exploration of interesting patterns. A key element of text mining is its focus on the document collection, which can be any grouping of text-based documents. Most text mining solutions aim to discover patterns across very large document collections, ranging from many thousands to millions of documents. This paper shows how text mining is implemented in RapidMiner.
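The tokenization, stop-word filtering, and stemming steps named in the title and keywords can be illustrated with a minimal Python sketch. This is not RapidMiner's implementation and is not taken from the paper: the function names, the tiny stop-word list, the token-length limits, and the crude suffix-stripping stemmer are all assumptions chosen for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "as", "in", "is", "of", "the", "to", "with"}


def tokenize(text):
    """Split text into lowercase alphabetic tokens; non-letters act as separators."""
    return re.findall(r"[a-z]+", text.lower())


def filter_tokens(tokens, min_len=3, max_len=25):
    """Drop stop words and tokens whose length is outside [min_len, max_len]."""
    return [
        t for t in tokens
        if t not in STOP_WORDS and min_len <= len(t) <= max_len
    ]


def crude_stem(token):
    """Strip a few common English suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters of the stem.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the three steps in order: tokenize, filter, stem."""
    return [crude_stem(t) for t in filter_tokens(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("Text mining seeks to extract useful information from the documents."))
```

The suffix stripper over-stems some words (for example, "mining" becomes "min"), which is one reason real pipelines typically rely on an established stemming algorithm rather than a hand-written rule list like this one.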

References

[1] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules", in Proceedings of the 20th International Conference on Very Large Databases (VLDB-94), Chile, Sept. 1994.
[2] Margaret H. Dunham, "Data Mining: Introductory and Advanced Topics".
[3] R. Baeza-Yates and B. Ribeiro-Neto, "Modern Information Retrieval", ACM Press, New York, 1999.
[4] R. Agrawal, T. Imielinski and A. Swami, "Database mining: A performance perspective", IEEE Transactions on Knowledge and Data Eng., vol. 5, no. 6.
[5] M. E. Califf, editor, Papers from the Sixteenth National Conference on Artificial Intelligence (AAAI-99) Workshop on Machine Learning for Information Extraction, Orlando, FL, 1999. AAAI Press.
[6] M. E. Califf and R. J. Mooney, "Relational learning of pattern-match rules for information extraction", in Proceedings of the 16th National Conference on Artificial Intelligence (AAAI-99), pages 328–334, Orlando, FL, July 1999.
[7] C. Cardie, "Empirical methods in information extraction", AI Magazine, 18(4):65–79, 1997.
[8] C. Cardie and R. J. Mooney, "Machine learning and natural language (Introduction to special issue on natural language learning)", Machine Learning, 34:5–9, 1999.
[9] Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", Morgan Kaufmann Publisher, 722
[10] Y. M. Yang, "An evaluation of statistical approach to text categorization", Technical Report CMU-CS-97-127, Computer Science Department, Carnegie Mellon University, 1997.
[11] C. Choi and Y. Park, "R&D proposal screening system based on text-mining approach", Int. J. Technol. Intell. Plan., vol. 2, no. 1, pp. 61-72, 2006.
[12] H. C. Yang and C. H. Lee, "A text mining approach for automatic construction of hypertexts", Expert Syst. Appl., vol. 29, no. 4, pp. 723-734, 2005.
[13] R. Agrawal, T. Imielinski and A. Swami, "Mining association rules between sets of items in large database", Washington, DC: SIGMOD, 1993, pp. 207-216.

Index Terms

Computer Science

Information Sciences

Keywords

Text mining, Tokenize, Filtering, Stop words, Stemming