by Yuxiong He
In data mining, association rule mining is a popular and well-researched method for discovering interesting relations between variables in large databases. For example, the rule {onions, potatoes} => {beef} found in a supermarket's sales data indicates that a customer who buys onions and potatoes together is also likely to buy beef. Such information can inform decisions about marketing activities. Frequent pattern mining is the basis of association rule mining: given a list of transactions, it returns the complete set of itemsets (combinations of items) that occur in at least a minimum number of transactions, known as the support threshold. One of the fastest and most popular algorithms for frequent pattern mining is the FP-tree [1] algorithm. In this blog post, I describe how to parallelize the FP-tree algorithm using Cilk++.
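To make the terms concrete, here is a minimal sketch in plain C++ (not Cilk++ and not the FP-tree algorithm itself) of the two quantities behind a rule like {onions, potatoes} => {beef}: the support of an itemset and the confidence of a rule. The names `Itemset`, `support`, `confidence`, and `sampleTransactions`, as well as the toy data, are assumptions made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// A transaction (or a pattern) is modeled as a sorted set of item names.
using Itemset = std::set<std::string>;

// Support: the number of transactions that contain every item in `pattern`.
int support(const std::vector<Itemset>& transactions, const Itemset& pattern) {
    int count = 0;
    for (const Itemset& t : transactions) {
        // std::includes needs sorted ranges; std::set iterates in sorted order.
        if (std::includes(t.begin(), t.end(), pattern.begin(), pattern.end())) {
            ++count;
        }
    }
    return count;
}

// Confidence of the rule lhs => rhs: support(lhs union rhs) / support(lhs).
double confidence(const std::vector<Itemset>& transactions,
                  const Itemset& lhs, const Itemset& rhs) {
    Itemset both = lhs;
    both.insert(rhs.begin(), rhs.end());
    const int lhsSupport = support(transactions, lhs);
    assert(lhsSupport > 0 && "the rule's left-hand side must occur at least once");
    return static_cast<double>(support(transactions, both)) / lhsSupport;
}

// Hypothetical toy sales data, invented for this example.
std::vector<Itemset> sampleTransactions() {
    return {
        Itemset{"onions", "potatoes", "beef"},
        Itemset{"onions", "potatoes", "beef", "milk"},
        Itemset{"onions", "potatoes"},
        Itemset{"milk", "bread"},
    };
}
```

On this toy data, {onions, potatoes} appears in three transactions and {onions, potatoes, beef} in two, so the rule holds in two out of three cases where its left-hand side occurs. Scanning every transaction for every candidate itemset, as this sketch does, quickly becomes expensive on real data, which is the cost that tree-based algorithms aim to avoid.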


