HyRead Journal Taiwan Full-Text Database

Article Details

資訊電子學刊

Natural Sciences / Information / Technology

Title	遞增式的探勘序列型樣方法之研究 (A Study of an Incremental Method for Mining Sequential Patterns)
Volume/Issue	9:1
Parallel title	Incremental Mining Sequential Patterns
Authors	顏秀珍, 李御璽, 鄭力瑋, 王秋光
Pages	15-23
Keywords	Data Mining, Sequential Patterns, Data Stream, Transaction Database
Publication date	2020-11

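Chinese Abstract

Sequential pattern mining finds, from a transaction database, the sequences in which most customers frequently purchase items. From customers' past sequential purchasing behavior, one can predict which items a customer is likely to buy after purchasing certain items, which provides a reference for marketing strategy. However, customers keep making transactions and their purchasing habits keep changing; an environment in which transaction data is continuously generated is called a data stream. In a data stream environment, efficiently updating the existing sequential patterns in real time is an important research issue. The more efficient earlier algorithms store the transaction data in a tree structure. When transactions are added, the tree can be updated according to the new items, but the sequential patterns must still be mined again from the updated tree, and rescanning the original data cannot be avoided. The previously discovered sequential patterns are not used at all, which wastes time and space. This paper therefore proposes a method for efficiently updating the existing sequential patterns as transaction data is continuously added. The method does not need to rescan the original transaction data; it processes only the newly added transactions to find the latest sequential patterns. Experimental results also show that the method is more efficient than other methods.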

English Abstract

Sequential pattern mining aims to discover, from a transaction database, the sequences in which most customers purchase items. From customers' past sequential purchasing behavior, it is possible to predict which products a customer will buy after purchasing certain products, which can serve as a reference for marketing strategies. However, customers make transactions continuously, and their behavior changes over time. An environment in which transaction data is generated continuously is called a data stream. In a data stream environment, how to update the existing sequential patterns efficiently and in real time is an important research topic. The most efficient earlier algorithms store the transaction data in a tree structure. When transactions are added, the tree can be updated according to the newly added items, but all the sequential patterns must still be regenerated. Rescanning the original transaction database is therefore unavoidable, and the sequential patterns found previously are not used at all, which wastes a large amount of execution time and memory. This paper therefore proposes a method that efficiently updates the existing sequential patterns as transactions are continuously added. The method does not need to rescan the original transaction database; it processes only the newly added transactions to find the latest sequential patterns. The experimental results also show that the method is more efficient than other methods.
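
The incremental idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm, whose details the abstract does not give; it only shows how pattern supports can be kept current by processing each new transaction once, without rescanning earlier transactions. All names (`IncrementalSequenceCounter`, `add_transaction`, `frequent`, `max_len`) are hypothetical. The sketch assumes that a pattern is a sequence of single items, each taken from a strictly later transaction of the same customer, and that support is the number of customers whose history contains the pattern.

```python
from collections import defaultdict


class IncrementalSequenceCounter:
    """Illustrative sketch only (hypothetical names, not the paper's method)."""

    def __init__(self, max_len=3):
        self.max_len = max_len            # cap on pattern length (assumption)
        self.seen = defaultdict(set)      # customer -> patterns already contained
        self.support = defaultdict(int)   # pattern -> number of customers

    def add_transaction(self, customer, items):
        """Update supports using only the new transaction and stored state."""
        seen = self.seen[customer]
        # Prefixes come from earlier transactions only, so two items from
        # the same transaction never form a sequence with each other.
        prefixes = [()] + [p for p in seen if len(p) < self.max_len]
        new_patterns = set()
        for prefix in prefixes:
            for item in items:
                candidate = prefix + (item,)
                if candidate not in seen:
                    new_patterns.add(candidate)
        # Each customer contributes at most 1 to a pattern's support.
        for pattern in new_patterns:
            self.support[pattern] += 1
        seen |= new_patterns

    def frequent(self, min_support):
        """Patterns contained by at least `min_support` (a fraction) of customers."""
        threshold = min_support * len(self.seen)
        return {p: c for p, c in self.support.items() if c >= threshold}


counter = IncrementalSequenceCounter(max_len=2)
counter.add_transaction("c1", {"a"})
counter.add_transaction("c1", {"b"})
counter.add_transaction("c2", {"a", "b"})
counter.add_transaction("c2", {"b"})   # only this new transaction is processed
print(sorted(counter.frequent(1.0).items()))
```

The per-customer state grows quickly with the length of the purchase history, which is why the sketch caps pattern length with `max_len`; a practical method would need a more compact structure.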
