State the difference between traditional data and big data. [3 marks]
Write a short note on predictive analytics. [4 marks]
Explain the “V’s” of Big Data in detail with relevant examples. [7 marks]
Write a Map-Reduce code for Word Count. [3 marks]
Explain the steps to set up the Hadoop cluster. [4 marks]
Draw and explain the Map-Reduce framework in detail. [7 marks]
Draw and discuss HDFS architecture in detail. [7 marks]
Compare and contrast NoSQL and relational databases. [3 marks]
Write CQL queries for the following in Cassandra: 1. Create a keyspace named company. 2. Create a table employee with columns: emp_id (PRIMARY KEY), name, dept, salary. [4 marks]
Discuss the architecture and features of Cassandra. How does it manage data distribution and fault tolerance? [7 marks]
Differentiate between master-slave and peer-to-peer distribution models. [3 marks]
Write CQL queries for the following in Cassandra: 1. Create a keyspace named university. 2. Create a table students with columns: student_id (PRIMARY KEY), name, course, marks. [4 marks]
Describe the four ways in which NoSQL systems handle big data problems. Illustrate your answer with suitable examples. [7 marks]
Differentiate between traditional batch processing and stream processing. [3 marks]
Explain the concept of lazy evaluation in Spark with an example. [4 marks]
Describe the Flajolet-Martin algorithm with a suitable example. [7 marks]
List the challenges in mining data streams. [3 marks]
Explain the Spark execution workflow from job submission to task execution. [4 marks]
Explain the concept of counting ones in a window using the DGIM algorithm. Illustrate with a bit stream example. [7 marks]
Compare Apache Pig with Map-Reduce. [3 marks]
Explain the architecture of ZooKeeper. [4 marks]
Explain the working of Hive with the necessary steps and a diagram. [7 marks]
Explain Metastore in Hive. [3 marks]
Explain the data processing operators in Pig. [4 marks]
Discuss the concept of regions in HBase and storing Big Data with HBase. [7 marks]