(i) Define the following terms: data warehouse, data discretization, frequent patterns. [3 marks]
(ii) Explain the differences between knowledge discovery and data mining. [4 marks]
What is the relation between data warehousing and data mining? List out different sources of information. [7 marks]
Explain the following clustering algorithms in detail: (i) CLARA (ii) BIRCH [7 marks]
What do you mean by Data Processing? Describe different data cleaning approaches. [7 marks]
Explain the different methods of handling missing values. [7 marks]
(i) Define the terms support and confidence. (ii) How are association rules mined from large databases? [4 marks]
Explain in brief the Apriori algorithm (candidate generation-and-test approach) with an example. [7 marks]
(i) When can we say that association rules are interesting? (ii) Explain whether association rule mining is a supervised or an unsupervised type of learning. [4 marks]
Consider the transaction dataset:

Tid  Items        Tid  Items
1    {a,b}        6    {a,b,c,d}
2    {b,c,d}      7    {a}
3    {a,c,d,e}    8    {a,b,c}
4    {a,d,e}      9    {a,b,d}
5    {a,b,c}      10   {b,c,e}

Construct the FP-tree, showing the tree separately after reading each transaction. [7 marks]
What is a rule-based classifier? Explain how a rule-based classifier works. [7 marks]
What is Bayes' theorem? Show how it is used for classification. [7 marks]
Write the algorithm for k-nearest neighbor classification. [7 marks]
Discuss methods for estimating the predictive accuracy of classification. [7 marks]
Explain the typical requirements of cluster analysis. [7 marks]
Write a short note on: Spatial Mining. [7 marks]
Explain cardinal, ordinal and ratio-scaled variables. [7 marks]
Write a short note on: Web Mining. [7 marks]