What is Big Data? Explain how big data processing differs from distributed processing. [3 marks]
List various applications of big data. How can it be used to improve business for a superstore? [4 marks]
Explain the core architecture of Hadoop with a suitable block diagram. Discuss the role of each component in detail. [7 marks]
Explain the Avro data serialization technique in MapReduce. [3 marks]
Explain the characteristics of Big Data. [4 marks]
What is the Hadoop Ecosystem? Discuss the various components of the Hadoop Ecosystem. [7 marks]
What is data serialization? With proper examples, discuss and differentiate structured, unstructured and semi-structured data. Write a note on how the type of data affects data serialization. [7 marks]
Explain the following commands with syntax and at least one example of each: (1) copyFromLocal (2) showing the content of an output file. [3 marks]
Explain the “Map Phase” and “Combiner Phase” in MapReduce. [4 marks]
Write the MapReduce steps for counting occurrences of specific numbers in the input text file(s). Also write the commands to compile and run the code. [7 marks]
List the various configuration files used in Hadoop installation. What is the use of mapred-site.xml? [3 marks]
Explain the “Shuffle & Sort” phase and “Reducer Phase” in MapReduce. [4 marks]
Write the MapReduce steps for computing the sum of numbers in the input text file(s). Also write the commands to compile and run the code. [7 marks]
What is ZooKeeper? What are the benefits of ZooKeeper? [3 marks]
Draw the architecture of Apache Pig and explain it briefly. [4 marks]
Define HDFS. Discuss the HDFS architecture and HDFS commands in brief. [7 marks]
What is HBase? Write a query to create a table in HBase. [3 marks]
Discuss the role of the DataNode and the NameNode in HDFS. [4 marks]
Draw and explain the architecture of Apache Hive. Explain various data insertion techniques in Hive with examples. [7 marks]
Explain the following in brief with respect to MongoDB: (1) Collections and documents (2) Indexing and retrieval. [3 marks]
Write the differences between MongoDB and Hadoop. [4 marks]
What is a NoSQL database? List the differences between NoSQL and relational databases. Explain in brief the various types of NoSQL databases in practice. [7 marks]
Explain scaling in MongoDB. [3 marks]
Explain CRUD operations in MongoDB. [4 marks]
What is a Resilient Distributed Dataset (RDD) in Apache Spark? Explain in detail. Write a note on why RDD is better than MapReduce data storage. [7 marks]