Tuesday, June 12, 2007

Incremental Mining of Sequential Patterns in Large Databases

The fundamentals of this algorithm could be use in large databases.

The problem: "As databases evolve the problem of maintaining sequential patterns over a significantly long period of time becomes essential, since a large number of new records may be added to a database. To reflect the current state of the database where previous sequential patterns would become irrelevant and new sequential patterns might appear, there is a need for efficient algorithms to update, maintain and manage the information discovered [12]. Several efficient algorithms for maintaining association rules have been developed [12–15]. Nevertheless, the problem of maintaining sequential patterns is much more complicated than maintaining association rules, since transaction cutting and sequence permutation have to be taken into account [16]."

The proposed solution: "This method is based on the discovery of frequent
sequences by only considering frequent sequences obtained by an earlier mining
step. By proposing an iterative approach based only on such frequent sequences
we are able to handle large databases without having to maintain negative border
information, which was proved to be very memory consuming [16]. Maintaining
such a border is well adapted to incremental association mining [26,19], where association rules are only intended to discover intra-transaction patterns (itemsets). Nevertheless, in sequence mining, we also have to discover inter-transaction patterns (sequences) and the set of all frequent sequences is an unbounded superset of the set of frequent itemsets (bounded) [16]. The main consequence is that such approaches are very limited by the negative border size."

No comments:

Business Analytics

Business Analytics

Blog Archive

About Me

My photo
See my resume at: https://docs.google.com/document/d/1-IonTpDtAgZyp3Pz5GqTJ5NjY0PhvCfJsYAfL1rX8KU/edit?hl=en_USid=1gr_s5GAMafHRjwGbDG_sTWpsl3zybGrvu12il5lRaEw