Analysis of Popular Stemming Algorithms Supporting Information Retrieval System

International Journal of Computer & Organization Trends  (IJCOT)          
 
© 2013 by IJCOT Journal
Volume-3 Issue-5                           
Year of Publication : 2013
Authors: Madhurima V, Prof. T. Venkat Narayana Rao, L. Sai Bhargavi

Citation

Madhurima V, Prof. T. Venkat Narayana Rao, L. Sai Bhargavi. "Analysis of Popular Stemming Algorithms Supporting Information Retrieval System". International Journal of Computer & Organization Trends (IJCOT), V3(5):11-17, Sep-Oct 2013, ISSN 2249-2593, www.ijcotjournal.org. Published by Seventh Sense Research Group.

Abstract

Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata. Computerized information retrieval systems are used to reduce what has been called "information overload". Many universities and public libraries use IR systems to provide access to journals, books, and other documents. Web search engines are the most visible IR applications. Among the many algorithms used in IR are stemming algorithms, which process text by reducing inflected (or sometimes derived) words to their stem, base, or root form, generally a written word form. We use the term conflation, meaning the act of fusing, as the general term for the process of matching morphological term variants. Conflation can be (1) manual, using some kind of regular expressions, or (2) automatic, via programs called stemmers. Algorithms for stemming have been studied in computer science since 1968. Stemming programs are commonly referred to as stemming algorithms or stemmers. This paper focuses on some popular stemming algorithms and presents a comparative study and analysis of them.
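
As a rough illustration of automatic conflation, the following Python sketch strips a few common English suffixes and groups words that share the resulting stem. It is not one of the algorithms analysed in the paper: the suffix list, the minimum stem length of three characters, and the names `toy_stem` and `conflate` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical suffix list, for illustration only. Real stemmers such as
# Porter's apply ordered rule sets with conditions on the remaining stem.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def conflate(words):
    """Group words that share the same toy stem (automatic conflation)."""
    groups = defaultdict(list)
    for w in words:
        groups[toy_stem(w)].append(w)
    return dict(groups)


if __name__ == "__main__":
    print(conflate(["connect", "connected", "connecting", "connects"]))
    # prints {'connect': ['connect', 'connected', 'connecting', 'connects']}
```

A query for "connecting" would then match documents indexed under "connect", which is the effect a stemmer is meant to have in an IR system. A rule set this small over-stems and under-stems many words, which is the kind of trade-off a comparative study of stemmers has to weigh.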

Keywords

stem, stemmers, conflation, conflation methods, n-grams.