Plagiarism Detection using Sequential Pattern Mining

Ali El-matarawy, Mohammad El-ramly, Reem Bahgat

Published in Pattern Recognition

Year of Publication: 2013
This research presents EgyCD, a new technique for plagiarism detection based on sequential pattern mining. Over the last decade, many techniques and tools for software clone detection have been proposed, including textual, lexical, syntactic, and semantic approaches, among others. This paper explores the potential of data mining techniques in plagiarism detection. In particular, it proposes a plagiarism detection technique based on sequential pattern mining (SPM): words or statements are treated as a sequence of transactions, which the SPM algorithm processes to find frequent itemsets. An experiment to discover copy/paste in the text source gave good results in a reasonable and acceptable time.
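
To make the general idea concrete, the following is a minimal sketch of level-wise (Apriori-style) mining of word sequences shared between texts. It is not the EgyCD algorithm, whose details are not given here. It assumes whitespace tokenization, contiguous word sequences, and support counted as the number of documents containing a sequence; the names `frequent_sequences` and `longest_shared` are hypothetical.

```python
from collections import defaultdict


def frequent_sequences(documents, min_support=2, max_len=5):
    """Find contiguous word sequences that occur in at least min_support documents.

    Illustrative sketch only: candidates grow one word at a time, and a
    candidate of length k is counted only if both of its length k-1
    sub-sequences were frequent (the Apriori property).
    """
    tokenized = [doc.lower().split() for doc in documents]
    frequent = {}    # sequence (tuple of words) -> set of document indices
    previous = None  # frequent sequences found at the previous length

    for k in range(1, max_len + 1):
        support = defaultdict(set)
        for doc_id, tokens in enumerate(tokenized):
            for i in range(len(tokens) - k + 1):
                cand = tuple(tokens[i:i + k])
                # Apriori pruning: skip candidates with an infrequent sub-sequence.
                if previous is not None and (
                    cand[:-1] not in previous or cand[1:] not in previous
                ):
                    continue
                support[cand].add(doc_id)

        current = {seq: ids for seq, ids in support.items() if len(ids) >= min_support}
        if not current:
            break  # no frequent sequence of length k, so none can be longer
        frequent.update(current)
        previous = current

    return frequent


def longest_shared(frequent):
    """Return the longest frequent sequences as strings (likely copied passages)."""
    if not frequent:
        return []
    top = max(len(seq) for seq in frequent)
    return sorted(" ".join(seq) for seq in frequent if len(seq) == top)
```

Calling `longest_shared(frequent_sequences(texts))` on a list of strings returns the longest word runs that appear in at least two of them, which are the natural candidates for copied passages. Counting support per document rather than per occurrence is a design choice of this sketch: it keeps a phrase repeated inside a single text from being reported as shared.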


Plagiarism Detector, Plagiarized Clones, Textual Approach, Lexical Approach, Syntactic Approach, Data Mining, Apriori Property, Sequential Pattern Mining