Define what a Data Warehouse is and explain its key characteristics. [7 marks]
Describe the multi-tiered architecture of Data Warehousing. [7 marks]
Briefly outline the processes involved in Extraction, Transformation, and Loading (ETL) and the role of the Metadata Repository. [7 marks]
Explain the different models of Data Warehousing, including Enterprise Warehouse, Datamart, and Virtual Warehouse. [7 marks]
Define Data Mining and discuss its importance in today’s data-driven environment. [7 marks]
Discuss the applications targeted by Data Mining, particularly in Business Intelligence and Web Search Engines. [7 marks]
Describe the types of data that can be mined and the various patterns that can be identified through Data Mining. [7 marks]
Define Market-Basket Analysis and explain its significance in understanding consumer behavior. [7 marks]
Explain the Apriori algorithm for Frequent Itemset Mining, detailing its process and importance. [7 marks]
Discuss how association rules are generated from frequent itemsets and the significance of this process. [7 marks]
Define Classification and its importance in data mining and predictive analytics. [7 marks]
Explain Bayesian Classification, focusing on Bayes’ Theorem and the Naïve Bayesian Classification approach. [7 marks]
Discuss Rule-based Classification, detailing how IF-THEN rules are used and the processes of Rule Extraction and Rule Induction using a Sequential Covering Algorithm. [7 marks]
Define Cluster Analysis and explain its significance in data mining and pattern recognition. What are the essential requirements for effective cluster analysis? [7 marks]
Provide an overview of basic clustering methods, including the main characteristics of Partitioning Methods such as K-Means and K-Medoids. [7 marks]
Compare and contrast Hierarchical Methods, specifically Agglomerative versus Divisive clustering, and describe the process of Hierarchical Clustering. [7 marks]
Explain various Distance Measures used in clustering algorithms and discuss advanced methods like BIRCH and Chameleon for hierarchical clustering. [7 marks]