Define big data. Mention a few applications of big data analytics. [3 marks]
Explain the four 'V' characteristics of big data with suitable examples. [4 marks]
Explain the following important daemons of HDFS and MapReduce: DataNode, NameNode, Secondary NameNode, JobTracker, and TaskTracker. [7 marks]
What is the Hadoop ecosystem? Mention the components of the Hadoop ecosystem. [3 marks]
Differentiate between RDBMS and Hadoop. [4 marks]
Elaborate on the following characteristics of Hadoop: distributed, fault-tolerant, massive storage, faster processing, scalability. [7 marks]
What is NoSQL? Explain the types of NoSQL databases in brief. [7 marks]
Mention the uses of NoSQL databases. [3 marks]
What is Hive? Briefly explain the data units supported by Hive. [4 marks]
Explain the following phases of MapReduce in brief: Map, Combiner, Partitioner, Shuffle and Sort, Reduce. [7 marks]
Explain the use of data serialization in Hadoop. [3 marks]
Differentiate the following: (i) NameNode vs. DataNode (ii) Hive vs. RDBMS [4 marks]
What is Hive? Explain its architecture and its working with suitable diagrams. [7 marks]
What is NewSQL? How does it differ from NoSQL? [3 marks]
What is ZooKeeper? How does it help in monitoring a cluster? [4 marks]
What is the importance of HBase? Explain the data model supported by HBase. Also differentiate between row-oriented and column-oriented databases. [7 marks]
How is Spark faster than MapReduce? Explain in brief. [3 marks]
What is MongoDB? Explain the important features of MongoDB. [4 marks]
Explain the working of MapReduce with a suitable example. [7 marks]
What is Pig? How is it different from MapReduce? [3 marks]
Why do we need Pig? Explain the architecture of Pig. [4 marks]
Explain the Resilient Distributed Dataset (RDD) with reference to Spark. [7 marks]
Draw the architecture of Spark and label its components. [3 marks]
Elaborate on the basic CRUD operations in MongoDB with a suitable example of each. [4 marks]
Explain the workflow of Apache ZooKeeper with a suitable diagram. [7 marks]