Research Article

Semantic Similarity Measure for Pairs of Short Biological Texts

by Olivia Sanchez Graillet
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 4 - Number 5
Year of Publication: 2012
Authors: Olivia Sanchez Graillet

Olivia Sanchez Graillet. Semantic Similarity Measure for Pairs of Short Biological Texts. International Journal of Applied Information Systems. 4, 5 (October 2012), 1-5. DOI=10.5120/ijais12-450699

@article{ 10.5120/ijais12-450699,
author = { Olivia Sanchez Graillet },
title = { Semantic Similarity Measure for Pairs of Short Biological Texts },
journal = { International Journal of Applied Information Systems },
issue_date = { October 2012 },
volume = { 4 },
number = { 5 },
month = { October },
year = { 2012 },
issn = { 2249-0868 },
pages = { 1-5 },
numpages = {9},
url = { },
doi = { 10.5120/ijais12-450699 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2023-07-05T10:47:21.800562+05:30
%A Olivia Sanchez Graillet
%T Semantic Similarity Measure for Pairs of Short Biological Texts
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 4
%N 5
%P 1-5
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Finding the semantic similarity between biological texts, especially short texts such as article abstracts and descriptions of microarray experiments, may yield important information for experts in the field. To date, methods for measuring this similarity have not been widely explored. This paper compares different measures for calculating the semantic similarity of pairs of short biological texts. An existing method for semantic similarity between general texts was adapted to the biological context by employing the UMLS ontology. An evaluation of the methods found that the adapted method works well for short biological texts.
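
The abstract does not name the general-text method that was adapted, so the following is only an illustrative sketch of what such a measure can look like. It assumes a text-to-text measure in the style of Mihalcea, Corley, and Strapparava (2006): each term in one text is matched to its most similar term in the other text, the matches are weighted by inverse document frequency, and the two directions are averaged. The tiny is-a hierarchy `PARENT`, the path-based term similarity, and the smoothed idf weighting are all assumptions made for the example; they stand in for UMLS and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy is-a hierarchy (child -> parent).
# A stand-in for an ontology such as UMLS; not real UMLS data.
PARENT = {
    "disease": "entity",
    "gene": "entity",
    "cancer": "disease",
    "tumor": "disease",
    "leukemia": "cancer",
    "oncogene": "gene",
}
KNOWN = set(PARENT) | set(PARENT.values())


def ancestors(term):
    """Return the chain from term up to the root, term first."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_similarity(a, b):
    """1 / (1 + edges on the shortest is-a path between a and b)."""
    if a not in KNOWN or b not in KNOWN:
        # Terms outside the hierarchy only match themselves.
        return 1.0 if a == b else 0.0
    steps_in_b = {t: i for i, t in enumerate(ancestors(b))}
    for i, t in enumerate(ancestors(a)):
        if t in steps_in_b:
            return 1.0 / (1 + i + steps_in_b[t])
    return 0.0


def idf_weights(corpus):
    """Smoothed idf (log(N/df) + 1) so every weight stays positive."""
    n = len(corpus)
    df = Counter(w for doc in corpus for w in set(doc))
    return {w: math.log(n / df[w]) + 1.0 for w in df}


def directed(t1, t2, idf):
    """idf-weighted average of each t1 term's best match in t2."""
    num = sum(max(path_similarity(w, v) for v in t2) * idf.get(w, 1.0) for w in t1)
    den = sum(idf.get(w, 1.0) for w in t1)
    return num / den if den else 0.0


def text_similarity(t1, t2, idf):
    """Symmetric similarity: mean of the two directed scores."""
    if not t1 or not t2:
        return 0.0
    return 0.5 * (directed(t1, t2, idf) + directed(t2, t1, idf))


# Usage: texts are assumed to be already tokenized and mapped to concept names.
corpus = [["leukemia", "oncogene"], ["cancer", "gene"], ["tumor", "disease"]]
idf = idf_weights(corpus)
print(text_similarity(corpus[0], corpus[1], idf))
```

In a real setting the term-level function would be replaced by a measure computed over the ontology itself, and mapping free text to ontology concepts would be a separate step that this sketch leaves out.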

Index Terms

Computer Science
Information Sciences


Keywords

Semantic Similarity, Ontology, Knowledge Discovery, Text Processing