Comprehensive Plagiarism Detection in Malayalam Texts Through Web and Database Integration

Meharuniza Nazeem, Parvathy Raj, Rajeev R. R, Anitha R, Navaneeth S


Abstract
Plagiarism detection techniques have become essential, particularly in academia, where the originality of scientific papers and documents is of prime importance. We propose an application that offers a comprehensive solution for detecting plagiarism in scholarly articles written in Malayalam, enabling users to submit texts, analyze them for plagiarism, and review the results interactively. With the increasing accessibility of digital content, maintaining originality in academic writing has become more difficult. Our research addresses this challenge by providing a solution tailored to the Malayalam language. The application aids researchers and academic institutions in detecting potential plagiarism by combining access to web-based content with algorithmic text analysis. The study contributes to the field of plagiarism detection for low-resource languages such as Malayalam and offers a practical way to preserve the originality of Malayalam scholarly work. The performance of four algorithms (SequenceMatcher, N-Grams, Rabin-Karp, and Cosine Similarity) is evaluated. Cosine Similarity, with a 92.45% detection rate, outperformed the others, surpassing Rabin-Karp (65.3%), N-Grams (58.7%), and SequenceMatcher (51.4%). Building on this result, a user-friendly web application was developed that integrates web search and database comparison features with the Cosine Similarity algorithm.
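To make the comparison in the abstract concrete, the following is a minimal sketch of two of the four named measures: cosine similarity and Python's `difflib.SequenceMatcher` ratio. It is not the authors' implementation. The abstract does not say how texts are vectorized, so the use of character trigram counts, the whitespace normalization, and the function names `char_ngrams`, `cosine_similarity`, and `sequence_ratio` are all assumptions made for illustration. Note also that n-grams here are taken over Unicode code points, which may split Malayalam conjuncts and vowel signs; a real system would likely need language-aware tokenization.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumed feature choice, not from the paper)."""
    normalized = " ".join(text.split())  # collapse runs of whitespace
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b, n=3):
    """Cosine of the angle between the n-gram count vectors of two texts."""
    va, vb = char_ngrams(a, n), char_ngrams(b, n)
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text too short to yield any n-gram matches nothing
    return dot / (norm_a * norm_b)


def sequence_ratio(a, b):
    """Similarity ratio from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Hypothetical sample sentences, used only to exercise the functions.
    submitted = "മലയാളം ഒരു ദ്രാവിഡ ഭാഷയാണ്"
    reference = "മലയാളം ഒരു ഭാഷയാണ്"
    print("cosine:", round(cosine_similarity(submitted, reference), 3))
    print("sequence ratio:", round(sequence_ratio(submitted, reference), 3))
```

Both scores fall between 0 and 1. How a score is turned into a plagiarism decision (for example, a threshold) is not stated in the abstract and is left open here.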
Anthology ID:
2024.icon-1.40
Volume:
Proceedings of the 21st International Conference on Natural Language Processing (ICON)
Month:
December
Year:
2024
Address:
AU-KBC Research Centre, Chennai, India
Editors:
Sobha Lalitha Devi, Karunesh Arora
Venue:
ICON
Publisher:
NLP Association of India (NLPAI)
Pages:
349–356
URL:
https://aclanthology.org/2024.icon-1.40/
Cite (ACL):
Meharuniza Nazeem, Parvathy Raj, Rajeev R. R, Anitha R, and Navaneeth S. 2024. Comprehensive Plagiarism Detection in Malayalam Texts Through Web and Database Integration. In Proceedings of the 21st International Conference on Natural Language Processing (ICON), pages 349–356, AU-KBC Research Centre, Chennai, India. NLP Association of India (NLPAI).
Cite (Informal):
Comprehensive Plagiarism Detection in Malayalam Texts Through Web and Database Integration (Nazeem et al., ICON 2024)
PDF:
https://aclanthology.org/2024.icon-1.40.pdf