Monday, March 30, 2009

FREQUENT ITEMSET MINING USING PROCESSED FP TREES

ABSTRACT



In many businesses, it is valuable to know which products sell the most and which sets of products are sold together. This information supports business analysis and major marketing decisions, such as promotional offers and free gifts. Mining data with association rules is in widespread use today and is highly relevant to market analysts.

Association rules are generated from the frequencies with which products are sold. A ‘support’ measure captures how frequently an item is sold. When more than one item is analysed together, the items form an itemset. An itemset that satisfies a specified support count is said to be a frequent itemset.
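The definitions above can be illustrated with a minimal Python sketch. The toy transactions and the function names `support_count` and `is_frequent` are hypothetical, and "satisfying" the support count is assumed here to mean meeting or exceeding an absolute threshold.

```python
# Hypothetical toy data: each transaction is the set of products in one sale.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def support_count(itemset, transactions):
    """Return the number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def is_frequent(itemset, transactions, min_support):
    """Assumption: an itemset is frequent when its count is >= min_support."""
    return support_count(itemset, transactions) >= min_support

print(support_count({"bread", "milk"}, transactions))
print(is_frequent({"bread", "butter", "milk"}, transactions, 2))
```

With this data, the pair bread and milk appears in two sales, while all three products appear together only once, so the triple is not frequent at a support count of 2.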

Various algorithms exist for mining these frequent itemsets. One of the most popular algorithms is the Apriori algorithm. It relies on candidate generation, which limits its efficiency. Later, to avoid candidate generation, the Frequent Pattern (FP) Tree method was proposed to mine frequent itemsets.
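To make the cost of candidate generation concrete, here is a minimal sketch of a level-wise Apriori search in Python. It is an illustration under assumptions, not the implementation compared in this project: the function name `apriori`, the toy data, and the absolute `min_support` threshold are all hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise search: size-k candidates are built from frequent (k-1)-itemsets."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # One full pass over the database per level to count every candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unite pairs of frequent itemsets that give a size-k set.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

# Hypothetical toy data, one set of products per sale.
toy = [{"bread", "milk"}, {"bread", "butter", "milk"},
       {"butter", "milk"}, {"bread", "butter"}]
for itemset, count in apriori(toy, 2).items():
    print(sorted(itemset), count)
```

Note that the triple of all three products is generated and counted as a candidate even though it turns out to be infrequent; avoiding this kind of wasted work is the motivation for the FP Tree approach.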

In this project, we implement the FP Growth method, which is used to mine FP Trees. Instead of mining the entire database, we mine the FP tree constructed from the database. We then recursively mine conditional FP trees and grow the frequent patterns obtained. The FP Growth method is a depth-first algorithm, unlike Apriori, which is breadth-first.
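The recursive procedure described above can be sketched compactly in Python. This is a simplified illustration and not the project's own code: the names `Node`, `build_tree` and `fp_growth` are hypothetical, the conditional tree is rebuilt from the conditional pattern base at every step, and `min_support` is assumed to be an absolute count.

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(weighted_transactions, min_support):
    """Build an FP tree from (items, count) pairs; return header: item -> nodes."""
    freq = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            freq[item] += count
    keep = {i for i, n in freq.items() if n >= min_support}
    root = Node(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        # Prune infrequent items, then reorder by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in keep),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                header[item].append(child)
            child.count += count
            node = child
    return header

def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) pairs depth first, without candidate generation."""
    header = build_tree(weighted_transactions, min_support)
    for item, nodes in header.items():
        pattern = suffix | {item}
        yield pattern, sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above every node holding `item`.
        base = []
        for n in nodes:
            path, p = [], n.parent
            while p is not None and p.item is not None:
                path.append(p.item)
                p = p.parent
            if path:
                base.append((path, n.count))
        # Recurse on the conditional tree to grow longer patterns ending in `pattern`.
        yield from fp_growth(base, min_support, pattern)

# Hypothetical toy data, one set of products per sale, each with weight 1.
toy = [{"bread", "milk"}, {"bread", "butter", "milk"},
       {"butter", "milk"}, {"bread", "butter"}]
patterns = dict(fp_growth([(t, 1) for t in toy], 2))
for itemset, count in patterns.items():
    print(sorted(itemset), count)
```

Each recursive call follows one item's suffix all the way down before moving to the next item, which is what makes the search depth-first. Only itemsets that are actually frequent are ever produced.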




TABLE OF CONTENTS




CHAPTER NO TITLE

ABSTRACT

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1 INTRODUCTION
1.1 OBJECTIVE
1.2 FREQUENT ITEMSET MINING – AN OVERVIEW
1.3 PROJECT SUMMARY

CHAPTER 2 LITERATURE REVIEW
2.1 METHOD
2.2 THE PROBLEM WITH APRIORI
2.3 AN ALTERNATIVE TO APRIORI

CHAPTER 3 PROBLEM DEFINITION AND METHODOLOGY
3.1 PROBLEM DEFINITION
3.2 METHODOLOGY
3.2.1 FREQUENT ITEMSET
3.2.2 REORDERING AND PRUNING
3.2.3 FP TREE
3.2.4 FP GROWTH
3.2.5 TOTAL SUPPORT TREE
3.3 REQUIREMENTS
3.3.1 SOFTWARE REQUIREMENTS
3.3.2 HARDWARE REQUIREMENTS

CHAPTER 4 DESIGN
4.1 PROJECT DESIGN

CHAPTER 5 IMPLEMENTATION
5.1 OVERVIEW
5.2 CLASSES
5.2.1 REORDER
5.2.2 FP TREE1
5.2.3 TOTALSUPPTREE
5.2.4 FPGAPP

CHAPTER 6 RESULT AND ANALYSIS
6.1 OUTPUT EXECUTION
6.2 OUTPUT COMPARISON
6.2.1 ACCURACY
6.2.2 T – TREE STORAGE

CHAPTER 7 CONCLUSION AND FUTURE WORK
7.1 CONCLUSION
7.2 FUTURE WORK

APPENDIX 1 CODING

APPENDIX 2 SCREENSHOTS

REFERENCES

LIST OF TABLES


TABLE NO. TITLE

Table 1.1 Project Summary
Table 3.1 Software Requirements
Table 3.2 Hardware Requirements

LIST OF FIGURES


FIGURE NO. TITLE

Figure 3.1 FP Tree
Figure 3.2 FP Growth
Figure 3.3 Total Support Tree
Figure 4.1 Project Design
Figure A2.1 FP Growth – Screenshot
Figure A2.2 Apriori – Screenshot
