Explain five P’s of Big Data in brief. [3 marks]
Explain job scheduling of the fair scheduler in MapReduce. [4 marks]
1) Define the following terms: a) Name Node b) Data Node c) Heartbeat [3 marks]
2) Explain HDFS operations in detail. [4 marks]
Explain Big Data in terms of Volume, Velocity and Veracity. [3 marks]
List various applications of Big Data. How can it be used to improve productivity in agriculture? [4 marks]
Explain Hadoop components with a diagram. [7 marks]
1) Define data serialization. 2) With proper examples, discuss and differentiate structured, unstructured and semi-structured data. 3) Write a note on how the type of data affects data serialization. [2 marks]
Explain the storage mechanism in HBase. [3 marks]
Differentiate: Apache Pig vs. MapReduce. [4 marks]
Justify “Spark is faster than MapReduce”. [7 marks]
Define ZooKeeper. List and discuss its benefits. [3 marks]
Explain the architecture of Hive. List the features of Hive. [4 marks]
What is an RDD? Explain transformations and actions in the context of RDDs. State and explain RDD operations in brief. [7 marks]
Discuss Machine Learning with MLlib in Spark. [3 marks]
Explain the Spark unified stack. [4 marks]
Differentiate SQL and NoSQL. List the industry applications of NoSQL. [7 marks]
Explain the sharding process of MongoDB. [3 marks]
Explain job scheduling of the capacity scheduler in MapReduce. [4 marks]
What is NoSQL? List the features of NoSQL. Explain the types of NoSQL databases in brief. [7 marks]
Define: Term Frequency and Inverse Document Frequency. [3 marks]
Explain the metastore in Hive. [4 marks]
Explain the following for MongoDB: 1) Indexing 2) Aggregation [7 marks]
Explain the scaling feature of MongoDB. [3 marks]
Explain the following in brief with respect to MongoDB: 1) Collections and documents 2) Indexing and retrieval [4 marks]
Explain CRUD operations in MongoDB with a suitable example. [7 marks]