Semantic based Text Summarization using Universal Networking Language

S. Mangairkarasi, S. Gunasundari
Published in Information Sciences

International Journal of Applied Information Systems
Year of Publication 2012
© 2010 by IJAIS Journal
Text summarization extracts the important information from a document, leaves out the irrelevant information, and presents the reduced detail in a compressed form. It is normally applied to a single document or to multiple documents, and it offers an advantage in processing time. Universal Networking Language (UNL) is an interlingua into whose expressions English sentences are converted. The given source document is preprocessed by eliminating tables and images. The preprocessed document is fed into a sentence splitter and then into a word separator. Each word is sent to a Morphological Analyzer to find its root word. The root word is looked up in the UNL dictionary to find the corresponding concepts and attributes. Heuristic rules are then used to identify the relations between concepts. UNL represents knowledge as a graph in which nodes represent concepts and links represent relations between concepts. The graph represents the whole document rather than individual sentences. A graph algorithm is used to find the weightage of the links connected to each Universal Word, and the document is summarized according to the highest weightage.
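The pipeline described above can be sketched in Python as a rough illustration. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the heuristic rules or the graph algorithm, so the toy dictionary `TOY_UNL_DICTIONARY`, the suffix-stripping `root_word`, the "consecutive concepts are related" rule, and the scoring in `summarize` are all hypothetical stand-ins.

```python
import re
from collections import defaultdict

# Hypothetical toy dictionary: root word -> (universal word, attributes).
# A real UNL dictionary is far larger; these entries are illustrative only.
TOY_UNL_DICTIONARY = {
    "farmer": ("farmer(icl>person)", ["@def"]),
    "grow": ("grow(icl>do)", []),
    "crop": ("crop(icl>plant)", ["@pl"]),
    "rain": ("rain(icl>weather)", []),
    "help": ("help(icl>do)", []),
    "market": ("market(icl>place)", ["@def"]),
    "open": ("open(icl>do)", []),
}

SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def split_words(sentence):
    """Word separator: lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def root_word(word):
    """Stand-in for a morphological analyzer: strip one common suffix
    when the remainder is a known dictionary root."""
    if word in TOY_UNL_DICTIONARY:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix) and word[: -len(suffix)] in TOY_UNL_DICTIONARY:
            return word[: -len(suffix)]
    return word


def concepts_in(sentence):
    """Look up each root word and return the distinct universal words found."""
    concepts = []
    for word in split_words(sentence):
        entry = TOY_UNL_DICTIONARY.get(root_word(word))
        if entry is not None and entry[0] not in concepts:
            concepts.append(entry[0])
    return concepts


def build_graph(sentences):
    """Build one graph for the whole document.
    Assumed heuristic: consecutive concepts in a sentence are related.
    A link's weight counts how often that relation occurs."""
    link_weight = defaultdict(int)
    for sentence in sentences:
        concepts = concepts_in(sentence)
        for a, b in zip(concepts, concepts[1:]):
            link_weight[frozenset((a, b))] += 1
    return link_weight


def node_weights(link_weight):
    """Weight of a universal word: sum of the weights of its links."""
    weights = defaultdict(int)
    for link, weight in link_weight.items():
        for concept in link:
            weights[concept] += weight
    return weights


def summarize(text, max_sentences=1):
    """Keep the sentences whose concepts carry the highest total weight,
    returned in their original document order."""
    sentences = split_sentences(text)
    weights = node_weights(build_graph(sentences))
    scored = [
        (sum(weights[c] for c in concepts_in(s)), i)
        for i, s in enumerate(sentences)
    ]
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]
```

For example, calling `summarize("The farmers grow crops. Rain helps the crops grow. The market opens.", 1)` keeps the second sentence, because its concepts sit on the most heavily weighted links of the document graph under these assumed rules.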


Interlingua, Document preprocessor, UNL dictionary, Morphological Analyzer