Research Article | Open Access
Volume 1 | Issue 3 | Year 2011 | Article Id. IJCOT-V1I3P306 | DOI: https://doi.org/10.14445/22492593/IJCOT-V1I3P306
Comparison and Evaluation of scaled data mining algorithms
M Afshar Alam, Sapna Jain, Ranjit Biswas
Citation:
M Afshar Alam, Sapna Jain, Ranjit Biswas, "Comparison and Evaluation of scaled data mining algorithms," International Journal of Computer & Organization Trends (IJCOT), vol. 1, no. 3, pp. 28-34, 2011. Crossref, https://doi.org/10.14445/22492593/IJCOT-V1I3P306
Abstract
Association rule mining is among the most popular techniques in data mining. Mining association rules is a prototypical problem because data are generated and stored every day in corporate database systems. To manage this knowledge, rules have to be pruned and grouped so that only a reasonable number of rules has to be inspected and analyzed. In this paper we compare the standard association rule algorithms with the proposed Scaled Association Rules algorithm and the AIREP algorithm. The algorithms are compared on factors such as type of dataset, support counting, rule generation, candidate generation, and computational complexity. The conclusions drawn are based on the efficiency, performance, accuracy, and scalability of the algorithms.
Keywords
Association rule, Data Mining, Multidimensional dataset, Pruning, Frequent itemset.