Research Article

PrefixSpan Algorithm for Finding Sequential Pattern with Various Constraints

by Pratik Saraf, R. R Sedamkar, Sheetal Rathi
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 9 - Number 3
Year of Publication: 2015
Authors: Pratik Saraf, R. R Sedamkar, Sheetal Rathi
10.5120/ijais15-451380

Pratik Saraf, R. R Sedamkar, Sheetal Rathi. PrefixSpan Algorithm for Finding Sequential Pattern with Various Constraints. International Journal of Applied Information Systems. 9, 3 (June 2015), 37-41. DOI=10.5120/ijais15-451380

@article{ 10.5120/ijais15-451380,
author = { Pratik Saraf and R. R Sedamkar and Sheetal Rathi },
title = { PrefixSpan Algorithm for Finding Sequential Pattern with Various Constraints },
journal = { International Journal of Applied Information Systems },
issue_date = { June 2015 },
volume = { 9 },
number = { 3 },
month = { June },
year = { 2015 },
issn = { 2249-0868 },
pages = { 37-41 },
numpages = {9},
url = { https://www.ijais.org/archives/volume9/number3/763-1380/ },
doi = { 10.5120/ijais15-451380 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T18:59:55.937699+05:30
%A Pratik Saraf
%A R. R Sedamkar
%A Sheetal Rathi
%T PrefixSpan Algorithm for Finding Sequential Pattern with Various Constraints
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 9
%N 3
%P 37-41
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

PrefixSpan (Prefix-projected Sequential pattern mining) is a well-known algorithm for sequential data mining. It extracts sequential patterns through the pattern-growth method. The algorithm performs very well on small datasets, but as the dataset size increases, the overall time needed to find the sequential patterns also increases. In this paper, the PrefixSpan algorithm is run on different datasets, and results are drawn based on the minimum support value. A new parameter, maximum prefix length, is also considered while running the algorithm. This parameter sets the length of the prefix pattern, which is helpful when running the algorithm on large datasets. The paper also shows the variation in time complexity and memory utilization when running the algorithm on input sequential datasets of different sizes.
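
To make the roles of the two parameters concrete, the following is a minimal Python sketch of prefix-projected pattern growth. It is not the authors' implementation. It assumes a simplified setting in which every element of a sequence is a single item (no itemsets), it treats minimum support as an absolute count of sequences, and it interprets maximum prefix length as a cap on how long a prefix may grow before the recursion stops. The names `prefixspan`, `min_support`, `max_prefix_length`, and `example_db` are illustrative choices, not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def prefixspan(
    sequences: Sequence[Sequence[str]],
    min_support: int,
    max_prefix_length: int,
) -> Dict[Pattern, int]:
    """Return patterns with support >= min_support and length <= max_prefix_length.

    Assumption: each sequence element is a single item, not an itemset.
    """
    results: Dict[Pattern, int] = {}

    def grow(prefix: Pattern, projected: List[Tuple[int, int]]) -> None:
        # Assumed meaning of the cap: stop extending once the prefix is this long.
        if len(prefix) >= max_prefix_length:
            return
        # The projected database is stored as (sequence id, suffix start) pairs.
        # For each item, keep only its first occurrence in every suffix.
        occurrences: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
        for seq_id, start in projected:
            seq = sequences[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))
        for item, next_projected in sorted(occurrences.items()):
            support = len(next_projected)  # number of sequences containing the item
            if support >= min_support:
                pattern = prefix + (item,)
                results[pattern] = support
                grow(pattern, next_projected)

    # Start from the empty prefix, whose projected database is every full sequence.
    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy database, used only for illustration.
example_db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(example_db, min_support=2, max_prefix_length=2)
```

Under these assumptions, lowering `max_prefix_length` prunes the recursion early, which is one plausible way such a cap could reduce run time and memory on large inputs.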

Index Terms

Computer Science
Information Sciences

Keywords

PrefixSpan Algorithm, Minimum support, Maximum prefix length, Time complexity, Memory utilization