Explain the various features of a data warehouse. [3 marks]
Draw the three-tier architecture of a data warehouse. [4 marks]
Explain the KDD process. [7 marks]
Give the differences between OLAP and OLTP. [3 marks]
Define the term “data mining”. List the major issues in data mining. [4 marks]
With the help of a suitable example, illustrate the OLAP operations ‘drill-down’, ‘roll-up’, ‘slice’ and ‘dice’. [7 marks]
Explain data smoothing methods to divide given data into bins of size by bin means, by bin medians and by bin boundaries. Consider the data: 10, 2, 19, 18, 20, 18, 25, 28, 22 [7 marks]
Differentiate between a fact table and a dimension table. [3 marks]
What is noise? Explain the various noise removal methods. [4 marks]
Explain mean, median, mode, variance and standard deviation with a suitable database example. [7 marks]
What is data cleaning? How are missing values handled in data cleaning? [3 marks]
Why is pre-processing required in the data mining process? List the major steps of pre-processing. [4 marks]
Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72 [7 marks]
i) Use min-max normalization to transform the value 50 for age onto the range [0.0, 1.0]. ii) Use z-score normalization to transform the value 50 for age, where the standard deviation of age is 20.64 years. [ marks]
What are the limitations of the Apriori approach for mining? [3 marks]
What is Market Basket Analysis? Explain association rules with confidence and support. [4 marks]
State the Apriori property. Find the frequent itemsets and association rules using the Apriori algorithm on the following data set, with a minimum support count of 2 and a minimum confidence of 75%. [7 marks]
    Sr. No.   TID    List of item IDs
    1         T100   I1, I2, I5
    2         T200   I2, I4
    3         T300   I2, I3
    4         T400   I1, I2, I4
    5         T500   I1, I3
    6         T600   I2, I3
    7         T700   I1, I3
    8         T800   I1, I2, I3, I5
    9         T900   I1, I2, I3
What is classification? Give the difference between classification and prediction. [3 marks]
What are information gain and gain ratio? [4 marks]
Explain Bayes’ theorem and naïve Bayesian classification. [7 marks]
What is clustering? Why is clustering considered unsupervised learning? [3 marks]
Write a short note on text mining. [4 marks]
Explain the k-means clustering algorithm. [7 marks]
Give the difference between spatial and temporal data mining. [3 marks]
Briefly explain linear and non-linear regression. [4 marks]
Explain the different types of web mining with examples. [7 marks]