Principles of Data Mining
Lab 2
- Type your answers in the answer sheet, NOT in this file.
- For questions 3-5: show all your work/math; no points will be given for a final answer alone.
- What is market basket analysis?
- What is the “Apriori principle”? Why is it useful in association rule mining?
- Consider the data set shown in the table below, then answer the following questions:
Customer ID | Transaction ID | Items Bought |
1 | 0001 | {a, d, e} |
1 | 0024 | {a, b, c, e} |
2 | 0012 | {a, b, d, e} |
2 | 0031 | {a, c, d, e} |
3 | 0015 | {b, c, e} |
3 | 0022 | {b, d, e} |
4 | 0029 | {c, d} |
4 | 0040 | {a, b, c} |
5 | 0033 | {a, d, e} |
5 | 0038 | {a, b, e} |
- Compute the support for itemsets {e}, {b, d}, and {b, d, e} by treating each transaction ID as a market basket.
- Use the results in part (a) to compute the confidence for the association rules {b, d} → {e} and {e} → {b, d}.
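As a reminder of the definitions behind the support and confidence calculations asked for above: the support of an itemset is the fraction of baskets that contain it, and the confidence of a rule X → Y is support(X ∪ Y) / support(X). The following is a minimal Python sketch of those two definitions. The baskets are a small made-up example (not the table above), and the helper names `support` and `confidence` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical toy baskets for illustration only (not the lab table).
baskets = [
    {"x", "y"},
    {"x", "z"},
    {"x", "y", "z"},
    {"y"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for b in baskets if itemset <= b)
    return Fraction(hits, len(baskets))

def confidence(antecedent, consequent, baskets):
    """support(antecedent U consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

print(support({"x", "y"}, baskets))       # 1/2
print(confidence({"x"}, {"y"}, baskets))  # 2/3
```

Exact fractions are used here so that the intermediate arithmetic stays visible, which matches the "show all your work" requirement better than rounded decimals.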
- Study the table below, then answer the following questions (Minimum Support = 40%, Minimum Confidence = 40%):
Transaction ID | Items Bought |
1 | A,B,C |
2 | A,B,C,D,E |
3 | A,C,D |
4 | A,C,D,E |
5 | A,B,C,D |
- Find all the frequent itemsets using the Apriori algorithm. Use tables to represent each candidate set Ck and each frequent set Lk.
- Generate all possible association rules from each of the frequent itemsets you obtained in the previous question, along with their confidence.
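For checking hand-built Ck and Lk tables, the level-wise procedure can be sketched in Python as below. This is only one possible sketch, not a reference implementation: the transactions are a made-up toy example (not the lab table), and the function names `count`, `apriori`, and `rules` are hypothetical. Candidate generation uses a simple join of pairs from Lk followed by pruning of any candidate that has an infrequent k-subset.

```python
from itertools import combinations

# Hypothetical toy transactions for illustration only (not the lab table).
transactions = [
    {"p", "q"},
    {"p", "r"},
    {"p", "q", "r"},
    {"q", "r"},
]

def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(transactions, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    min_count = min_support * len(transactions)
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]  # C1
    frequent = {}
    k = 1
    while candidates:
        # Lk: candidates whose support count meets the threshold.
        counts = {c: count(c, transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # C(k+1): join pairs from Lk, then prune any candidate
        # that has a k-subset missing from Lk (Apriori principle).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

def rules(frequent, min_confidence):
    """Return a list of (antecedent, consequent, confidence) tuples."""
    out = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its count is already stored in `frequent`.
                conf = n / frequent[lhs]
                if conf >= min_confidence:
                    out.append((set(lhs), set(itemset - lhs), conf))
    return out

frequent = apriori(transactions, min_support=0.5)
for itemset, n in frequent.items():
    print(sorted(itemset), n)
for lhs, rhs, conf in rules(frequent, min_confidence=0.6):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Support is compared as a count against `min_support * len(transactions)`; with thresholds that do not multiply out to a whole number, comparing integer counts against a rounded-up threshold may be a safer choice.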