
Legal Documents Clustering using Latent Dirichlet Allocation

Ravi Kumar V, K. Raghuveer
Published in Information Sciences

International Journal of Applied Information Systems
Year of Publication 2012
© 2010 by IJAIS Journal
  1. Ravi Kumar V and K Raghuveer. Article: Legal Documents Clustering using Latent Dirichlet Allocation. International Journal of Applied Information Systems 2(6):27-33, May 2012. BibTeX

    	author = "Ravi Kumar V and K. Raghuveer",
    	title = "Article: Legal Documents Clustering using Latent Dirichlet Allocation",
    	journal = "International Journal of Applied Information Systems",
    	year = 2012,
    	volume = 2,
    	number = 6,
    	pages = "27-33",
    	month = "May",
    	note = "Published by Foundation of Computer Science, New York, USA"


The availability of a large number of legal judgments in digital form creates both opportunities and challenges for the legal community and for information technology researchers. This development calls for assistance in organizing, analyzing, retrieving and presenting this content in a helpful and distributed manner. We propose an approach to cluster legal judgments based on the topics obtained from Latent Dirichlet Allocation (LDA), using a similarity measure between topics and documents. The resulting topic-based clustering model is capable of grouping legal judgments into distinct clusters effectively. To the best of our knowledge, this is the first approach to cluster Indian legal judgments using the LDA topic model.
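The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that topic-word distributions have already been estimated by some LDA implementation, and that each judgment is assigned to the topic it is most similar to under cosine similarity (the measure named in the keywords). The vocabulary, the topic values, the document counts and the helper names `cosine_similarity` and `assign_clusters` are all hypothetical, chosen only for illustration; the paper's actual procedure may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_clusters(doc_vectors, topic_vectors):
    """Assign each document to the index of its most similar topic."""
    labels = []
    for doc in doc_vectors:
        sims = [cosine_similarity(doc, topic) for topic in topic_vectors]
        labels.append(max(range(len(sims)), key=sims.__getitem__))
    return labels


# Hypothetical vocabulary and toy values, for illustration only.
vocabulary = ["tax", "income", "property", "tenant"]

# Assumed output of an LDA run: one word distribution per topic.
topic_vectors = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0: taxation terms
    [0.05, 0.05, 0.45, 0.45],  # topic 1: tenancy terms
]

# Term counts for three made-up judgments over the same vocabulary.
doc_vectors = [
    [7, 5, 0, 1],
    [0, 1, 6, 8],
    [4, 6, 1, 0],
]

labels = assign_clusters(doc_vectors, topic_vectors)
print(labels)  # [0, 1, 0]
```

Here the first and third judgments fall into the taxation cluster and the second into the tenancy cluster. Cosine similarity ignores vector length, so raw term counts can be compared directly against topic probabilities without normalizing either side first.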




Latent Dirichlet Allocation (LDA), Legal Judgments, Documents Clustering, Cosine Similarity