Improving Enterprise Search in the Upstream Oil and Gas Industry by Automatic Query Expansion using a Non-probabilistic Knowledge Representation

Paul H. Cleverley

Published in Information Sciences

International Journal of Applied Information Systems
Year of Publication 2012
© 2012 by IJAIS Journal
10.5120/ijais12-450723
  1. Paul H Cleverley. Article: Improving Enterprise Search in the Upstream Oil and Gas Industry by Automatic Query Expansion using a Non-probabilistic Knowledge Representation. International Journal of Applied Information Systems 1(1):25-32, November 2012. BibTeX

    @article{key:article,
    	author = "Paul H. Cleverley",
    	title = "Article: Improving Enterprise Search in the Upstream Oil and Gas Industry by Automatic Query Expansion using a Non-probabilistic Knowledge Representation",
    	journal = "International Journal of Applied Information Systems",
    	year = 2012,
    	volume = 1,
    	number = 1,
    	pages = "25-32",
    	month = "November",
    	note = "Published by Foundation of Computer Science, New York, USA"
    }
    

Abstract

Organizations face a vocabulary disconnect between the terminology people use in search and the inherent ambiguity of terminology in their information. The mismatch leads to critical information being missed. This paper discusses how Boolean keyword search, the most commonly used approach in Enterprise search, compares with automatic Query Expansion (QE) using a non-probabilistic Knowledge Representation (KR) created independently of the corpus.
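The contrast between the two approaches can be illustrated with a minimal sketch. The corpus, the synonym rings, and the function names below are all hypothetical and are not taken from the paper; the sketch only assumes that the Knowledge Representation can be reduced to sets of equivalent terms built without reference to the corpus.

```python
# Toy corpus: each document is a set of lower-cased tokens (hypothetical data).
DOCS = {
    "d1": {"well", "drilling", "report"},
    "d2": {"borehole", "drilling", "summary"},
    "d3": {"wellbore", "boring", "log"},
    "d4": {"seismic", "survey", "report"},
}

# Hand-built synonym rings, standing in for a knowledge representation that
# was created independently of the corpus (hypothetical content).
SYNONYMS = {
    "well": {"well", "borehole", "wellbore"},
    "drilling": {"drilling", "boring"},
}


def boolean_and(query_terms, docs=DOCS):
    """Return ids of documents that contain every query term literally."""
    return {
        doc_id
        for doc_id, tokens in docs.items()
        if all(term in tokens for term in query_terms)
    }


def expanded_and(query_terms, docs=DOCS, synonyms=SYNONYMS):
    """AND across query terms, OR across each term's synonym ring."""
    return {
        doc_id
        for doc_id, tokens in docs.items()
        if all(tokens & synonyms.get(term, {term}) for term in query_terms)
    }


if __name__ == "__main__":
    query = ["well", "drilling"]
    print(sorted(boolean_and(query)))   # ['d1']
    print(sorted(expanded_and(query)))  # ['d1', 'd2', 'd3']
```

In this toy case the literal query finds one of the three documents about well drilling, while the expanded query finds all three and still excludes the unrelated seismic report.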

The tests focused on the initial search results list. Optional recommendation or ‘what’s related’ options or facets were out of scope. Testing was performed on a globally created document library collection from one of the largest corporations in the world. QE recalled, on average, an additional 43% of relevant precise results in a single search, without a commensurate cost to information precision.

It is well known from set theory that as more words are combined with an AND operator in a keyword search, fewer results are returned. However, it was also observed that as more words are used in a keyword-only search, the relevant results returned, as a proportion of all relevant results in the corpus, decrease. This narrow search paradox means that, in general terms, when more search words are used in a query to help locate relevant information, proportionally more relevant information is actually missed. This is caused by the compounding of the words’ semantic fields and possible linguistic variants. It is believed this is the first time the effect has been modeled in this context, with wider significance in Information Retrieval (IR).
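The compounding effect can be made concrete with a deliberately simplified calculation. This is not the model used in the paper; it is a toy sketch that assumes each concept in a query can be written in a fixed number of equally likely variants, and that authors choose a variant for each concept independently. Under those assumptions, a literal query that uses one variant per concept recalls only the product of the per-concept fractions.

```python
def literal_recall(variants_per_concept):
    """Expected fraction of relevant documents matched by a literal AND query.

    variants_per_concept: one integer per query word, giving the number of
    equally likely ways (synonyms, spellings, abbreviations) that concept
    may be written in relevant documents. Assumes independent choices.
    """
    recall = 1.0
    for variant_count in variants_per_concept:
        if variant_count < 1:
            raise ValueError("each concept needs at least one variant")
        recall *= 1.0 / variant_count
    return recall


if __name__ == "__main__":
    # Hypothetical variant counts for a query that grows from 1 to 3 words.
    for counts in ([3], [3, 2], [3, 2, 4]):
        print(len(counts), "words:", round(literal_recall(counts), 3))
```

Each additional word multiplies the expected recall by a factor of at most one, so the proportion of relevant material missed can only grow as the query gets longer, which is the qualitative behaviour the abstract describes.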

Keywords

Enterprise search, Web Digital Library, Query Expansion, Knowledge Representation, Information Retrieval, Semantic Ambiguity, Taxonomy, Ontology, Combinatorial Linguistic Explosion, Petroleum Exploration and Production (E&P)