文章詳目資料

資訊電子學刊

  • 加入收藏
  • 下載文章
篇名 發掘有條件限制的關聯規則演算法
卷期 1:2
並列篇名 Discovering Association Rules with Constraints
作者 顏秀珍李御璽謝百恩英家慶
頁次 039-047
關鍵字 資料探勘關聯規則項目限制交易資料庫頻繁項目集Data miningassociation miningitem constrainttransaction databasefrequent itemset
出刊日期 200610

中文摘要

關聯規則探勘(Mining association rule)是從交易資料庫中找出大部份交易中常常同時出現的購買項目組合,進而從這些項目組合中求出某些有用的關聯規則(Association rule)。以往已有很多論文提出找尋關聯規則的方法,其中又以FP-Growth演算法效能最為優越。但由於其結構複雜及使用大量鍵結建構FP-Tree,往往會使得FP-Tree過於龐大。本論文提出一個改良FP-Tree的演算法FIG(Frequent Itemset Growth),不需建構樹狀結構,只需掃描兩次原始的交易資料庫,就可找出常常一起被購買的項目組合。然而使用者在關聯規則探勘時可能只需要尋找一些包含特定項目的關聯規則。若用以往的方法,找出所有的關聯規則後,再加以篩選出使用者感興趣的部分,將會浪費很多時間。若我們能讓使用者自行設定其所感興趣的項目,則找出來的資訊不但可以完全符合使用者的需求,也可省去很多找尋使用者不感興趣之資訊的時間。若使用者限制關聯規則探勘結果所找出的關聯規則必須包含其所指定的項目,我們將此限制稱為項目限制(Item constraint)。論文中我們也改良演算法FIG,使其可以從交易資料庫中直接找出符合項目限制的所有關聯規則。我們稱改良後的演算法為CFIG演算法(Constraint Frequent Itemset Growth)。實驗結果證明,FIG演算法在效能上確實優於FP-Growth演算法。我們也將CFIG演算法與FIG演算法在效能及找出的關聯規則數量做比較,實驗結果也證明在有項目限制的關聯規則探勘時,探勘效率將會大幅提升。

英文摘要

Mining association rule is to find frequent itemsets which appear in most of the transactions in a transaction database. According to these frequent itemsets, we can further find association rules. There were many researchers which proposed algorithms for mining association rules. The most efficient algorithm for mining association rules is FP-Growth algorithm. However, FP-Growth algorithm need to take a lot of time to construct a complex FP-Tree and conditional FP-Trees, and spend a large amount of memory space to store the (conditional) FP-Trees. In this paper, we propose an algorithm FIG (Frequent Itemset Growth) to improve FP-Growth algorithm. The algorithm FIG does not need to construct the tree structure like FP-Tree, and just need to scan the transaction database twice to find all the frequent itemsets. Besides, users may just need to find the association rules which include some particular items. If the algorithms FP-Growth or FIG are applied to find all the association rules, then many uninteresting association rules will be found, and the association rules which contain user specified items need to be picked up. If we can directly find the association rules according to the user specified items, then we will save a lot of time to find all the association rules. The user specified items is called “item constraint”. In this paper, we also modify the algorithm FIG for finding all the association rules which match the item constraint, which is called CFIG algorithm (Constrained Frequent Itemset Growth). The experimental results show that our algorithm FIG is more efficient than FP-Growth algorithm. In the experiments, we also compare the performance of algorithm CFIG and FIG. The experimental results also show that the mining efficiency will be highly improved when we push the constraints into the association rule mining process.

相關文獻