

Plagiarism Detection using Sequential Pattern Mining

Ali El-matarawy, Mohammad El-ramly, Reem Bahgat
Published in Pattern Recognition

International Journal of Applied Information Systems
Year of Publication: 2013
© 2012 by IJAIS Journal
10.5120/ijais12-450846
  1. Ali El-matarawy, Mohammad El-ramly and Reem Bahgat. Article: Plagiarism Detection using Sequential Pattern Mining. International Journal of Applied Information Systems 5(2):24-29, January 2013. BibTeX

    @article{key:article,
    	author = "Ali El-matarawy and Mohammad El-ramly and Reem Bahgat",
    	title = "Article: Plagiarism Detection using Sequential Pattern Mining",
    	journal = "International Journal of Applied Information Systems",
    	year = 2013,
    	volume = 5,
    	number = 2,
    	pages = "24-29",
    	month = "January",
    	note = "Published by Foundation of Computer Science, New York, USA"
    }
    

Abstract

This paper presents EgyCD, a new plagiarism detection technique based on sequential pattern mining (SPM). Over the last decade, many techniques and tools for software clone detection have been proposed, including textual, lexical, syntactic, and semantic approaches. This paper explores the potential of data mining techniques for plagiarism detection. In particular, it proposes a technique in which words or statements are treated as a sequence of transactions and processed by an SPM algorithm to find frequent itemsets. An experiment on detecting copy/paste in source text gave good results in a reasonable time.
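The general idea of mining frequent word sequences across documents can be illustrated with a small sketch. This is not the EgyCD implementation, which the abstract does not specify; it is a simplified, level-wise (Apriori-style) miner over contiguous word sequences, assuming whitespace tokenization and measuring support as the number of documents that contain a sequence. The names `tokenize`, `frequent_word_sequences`, `min_support`, and `max_length` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def frequent_word_sequences(documents, min_support=2, max_length=5):
    """Mine contiguous word sequences that occur in at least `min_support` documents.

    Returns a dict mapping each frequent sequence (a tuple of words) to the
    set of document indices that contain it.
    """
    token_lists = [tokenize(doc) for doc in documents]
    frequent = {}
    previous_level = None  # frequent sequences of length (length - 1)

    for length in range(1, max_length + 1):
        support = defaultdict(set)
        for doc_id, tokens in enumerate(token_lists):
            for start in range(len(tokens) - length + 1):
                candidate = tuple(tokens[start:start + length])
                # Apriori property: a sequence can only be frequent if both of
                # its (length - 1) contiguous sub-sequences are frequent.
                if previous_level is not None and (
                    candidate[:-1] not in previous_level
                    or candidate[1:] not in previous_level
                ):
                    continue
                support[candidate].add(doc_id)

        current = {
            seq: docs for seq, docs in support.items() if len(docs) >= min_support
        }
        if not current:
            break  # no longer sequence can be frequent
        frequent.update(current)
        previous_level = current

    return frequent


if __name__ == "__main__":
    docs = [
        "the quick brown fox jumps over the lazy dog",
        "a quick brown fox jumps high",
        "nothing in common here",
    ]
    patterns = frequent_word_sequences(docs, min_support=2)
    longest = max(patterns, key=len)
    print(" ".join(longest), sorted(patterns[longest]))
```

Long sequences shared by two or more documents are candidate copy/paste fragments. A practical detector would likely also need normalization (punctuation, stop words) and tolerance for small edits, which this sketch omits.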


Keywords

Plagiarism Detector, Plagiarized Clones, Textual Approach, Lexical Approach, Syntactic Approach, Data Mining, Apriori Property, Sequential Pattern Mining