Friday, April 19, 2024

MapReduce and YARN Cognitive Class Exam Answers

Module 1: Introduction to MapReduce and YARN

Question 1: Which phase of MapReduce is optional?

  • Shuffle
  • Reduce
  • Combiner
  • Map

Question 2: Which node is responsible for assigning (key, value) pairs to different reducers?

  • Shuffle node
  • Reducer node
  • Combiner node
  • Mapper node
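
As an illustration of Question 2: in Hadoop the partitioning step, which decides which reducer receives each (key, value) pair, runs on the map side, and the default behaviour is to hash the key modulo the number of reducers. The sketch below imitates that idea in plain Python. The helper names `partition_for` and `partition_map_output` are hypothetical, and CRC32 is only a stand-in for the key's hash code, not Hadoop's actual implementation.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_reducers: int) -> int:
    """Pick a reducer index for a key (hypothetical helper).

    CRC32 is an assumption used here as a deterministic stand-in
    for the hash code a real partitioner would compute.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def partition_map_output(pairs, num_reducers):
    """Group map output into one bucket per reducer index."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition_for(key, num_reducers)].append((key, value))
    return dict(buckets)


map_output = [("apple", 1), ("banana", 1), ("apple", 1), ("cherry", 1)]
buckets = partition_map_output(map_output, num_reducers=2)
```

Because the reducer index depends only on the key, every pair with the same key lands in the same bucket, which is what lets a single Reduce task see all values for that key.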

Question 3: Where are the output files of the Reducer task stored?

  • A data warehouse
  • Hadoop FS
  • Within the Reducer node
  • Linux FS

Module 2: Limitations of Hadoop v1 & MapReduce v1

Question 1: What is an issue or limitation of the original MapReduce v1 paradigm?

  • It’s not scalable
  • It only has one TaskTracker
  • It only supports Parquet file types
  • It only has one JobTracker

Question 2: How is YARN an improvement over the MapReduce v1 paradigm?

  • It’s completely open source
  • It splits the JobTracker into two processes: ResourceManager and ApplicationMaster
  • It reduces multi-tenancy to improve performance
  • It splits the TaskTracker into two processes: ResourceManager and ApplicationMaster

Question 3: Existing applications can run on YARN without recompilation. True or False?

  • True
  • False

Module 3: The Architecture of YARN

Question 1: The main change from Hadoop v1 to Hadoop v2 was the consolidation of both resource management and job processing. True or False?

  • True
  • False

Question 2: The NodeManager is a more generic and efficient version of the TaskTracker. True or False?

  • True
  • False

Question 3: A new ApplicationMaster is launched for each job and ends when the job completes. True or False?

  • True
  • False

Final Exam

Question 1: Which of the following is the correct sequence of MapReduce flow?

  • Reduce -> Combine -> Map
  • Combine -> Reduce -> Map
  • Map -> Reduce -> Combine
  • Map -> Combine -> Reduce
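
To make the flow in Question 1 concrete, here is a minimal word-count sketch in plain Python, assuming two input splits. It is a simulation only: the function names `map_phase`, `combine_phase`, and `reduce_phase` are made up for this example and are not Hadoop APIs. Each split is mapped, then combined locally, and only then are the partial results merged by the reduce step.

```python
from collections import Counter, defaultdict


def map_phase(line):
    """Emit (word, 1) for every word in one input split."""
    return [(word, 1) for word in line.split()]


def combine_phase(pairs):
    """Pre-aggregate counts locally, on the output of a single map task."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def reduce_phase(all_pairs):
    """Group by key and sum the partial counts from every combiner."""
    grouped = defaultdict(list)
    for word, count in all_pairs:
        grouped[word].append(count)
    return {word: sum(counts) for word, counts in sorted(grouped.items())}


splits = ["a b a", "b b c"]  # hypothetical input splits
combined = [combine_phase(map_phase(split)) for split in splits]
result = reduce_phase(pair for part in combined for pair in part)
# result == {"a": 2, "b": 3, "c": 1}
```

Dropping `combine_phase` from the pipeline gives the same `result`; the combiner only reduces how many pairs travel from the map side to the reduce side.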

Question 2: Which of the following can be used to control the number of part files in a MapReduce program’s output directory?

  • Shuffle parameters
  • Number of Reducers
  • Counter
  • Number of Mappers

Question 3: Which of the following operations will work improperly when using a Combiner?

  • Average
  • Maximum
  • Count
  • Minimum
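
A small worked example may help with Question 3. The sketch below assumes two map-side splits of unequal size (the values are invented) and compares applying the same function as a combiner and again as the reducer. Maximum survives this, but a plain average does not, because an average of averages ignores how many records each split held. A common workaround, shown at the end, is to have the combiner emit (sum, count) pairs instead.

```python
def mean(values):
    return sum(values) / len(values)


split_a = [1, 2, 3]  # values seen by one map task (made-up data)
split_b = [10]       # values seen by another map task

# Correct answer computed over all records: 16 / 4 = 4.0
true_mean = mean(split_a + split_b)

# Naive combiner: average each split, then average the averages.
# (2.0 + 10.0) / 2 = 6.0, which is wrong.
naive_mean = mean([mean(split_a), mean(split_b)])

# Maximum is safe: the max of per-split maxima equals the global max.
combined_max = max(max(split_a), max(split_b))


def combine_sum_count(values):
    """Combiner that keeps enough information to finish the average later."""
    return (sum(values), len(values))


partials = [combine_sum_count(split_a), combine_sum_count(split_b)]
total, count = map(sum, zip(*partials))
safe_mean = total / count  # 16 / 4 = 4.0
```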

Question 4: Which of the following is true about MapReduce?

  • Compression of input files is optional.
  • Output from the Map phase is replicated.
  • The programmer must write the Map code, the Shuffle code, and the Reduce code.
  • MapReduce programs must be written in Java.

Question 5: Input data to MapReduce is record-oriented and blocks of data contain the same number of full records. True or False?

  • False.
  • True.

Question 6: Which statement is true about the Reduce phase of MapReduce?

  • Output results are sent to the client program.
  • Data arrives from the Shuffle phase already sorted by key.
  • The Reducer phase sums up the values associated with each key.
  • Each Reduce task processes all the data for one key only.
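
The options in Question 6 can be explored with a short simulation. The sketch below, in plain Python with invented data and hypothetical helper names, sorts the pairs by key as the shuffle would, then lets one reduce task walk the sorted stream. It shows that a single task can handle several keys and that the reduce function is chosen by the programmer, so it need not be a sum.

```python
from itertools import groupby
from operator import itemgetter


def shuffle_sort(pairs):
    """Stand-in for the shuffle: deliver pairs to the reducer sorted by key."""
    return sorted(pairs, key=itemgetter(0))


def reduce_task(sorted_pairs, reduce_fn):
    """Apply reduce_fn once per key; relies on the input being sorted by key."""
    return [
        (key, reduce_fn([value for _, value in group]))
        for key, group in groupby(sorted_pairs, key=itemgetter(0))
    ]


pairs = [("b", 2), ("a", 1), ("b", 5), ("c", 7), ("a", 4)]
sorted_pairs = shuffle_sort(pairs)

sums = reduce_task(sorted_pairs, sum)    # [("a", 5), ("b", 7), ("c", 7)]
maxima = reduce_task(sorted_pairs, max)  # [("a", 4), ("b", 5), ("c", 7)]
```

`groupby` only merges adjacent items, so calling `reduce_task` on unsorted input would split a key into several groups; the sort step is what makes one call per key possible.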

Question 7: Which statement is true about MapReduce v2 (MRv2) running on YARN?

  • MRv2 uses containers instead of the slots of MRv1, and a container can run either a Map or a Reduce task.
  • There is one JobTracker in the cluster.
  • MapReduce jobs written in Java for MRv1 never require recompilation.
  • Each job has an ApplicationManager that obtains Container IDs from the NodeManager.

Question 8: With YARN, long-running jobs acquire and retain fixed-size containers before execution starts. True or False?

  • False.
  • True.

Question 9: Which of the following statements is true?

  • The NameNode in Hadoop 2 is fully fault-tolerant, whereas in Hadoop 1 it was a single point of failure.
  • The NodeManager in Hadoop 2 replaces the TaskTracker in Hadoop 1.
  • YARN requires a minimum of two nodes, one master and one slave, to run.
  • Both MapReduce and YARN can scale to any cluster size.

Question 10: The command provides the CLASSPATH needed for compiling Java programs written for MapReduce or YARN. True or False?

  • False.
  • True.

Question 11: Which statement is true about MapReduce’s use of replication in HDFS?

  • Only one copy of each replicated block is processed by MapReduce in normal operation.
  • Speculative execution is normally performed on all copies of each “split.”
  • Each DataNode uses RAID to store its data.
  • Multiple copies of each record are kept on each node.

Question 12: On which file system (FS) is the output of a Mapper task stored?

  • Linux FS, and it is replicated 3 times.
  • HDFS, and it is replicated 3 times.
  • Linux FS, but it is not replicated.
  • HDFS, but it is not replicated.

Question 13: Which of the following statements is true?

  • You can set the number of Reducers.
  • The Shuffle phase is optional.
  • You can set the number of Mappers and the number of Reducers.
  • The number of Combiners is the same as the number of Reducers.
  • You can set the number of Mappers.

Question 14: What will a Hadoop job do if you try to run it with an output directory that is already present?

  • It will create new files, but with a different suffix.
  • It will create another directory to store the output.
  • It will erase all files in that directory before running.
  • It will not run.
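
The behaviour asked about in Question 14 can be mimicked locally. Hadoop's file-based output formats check the output path at submission time and fail the job if it already exists. The sketch below reproduces only that check on the local file system; `submit_job`, `OutputDirExistsError`, and the directory name are hypothetical and stand in for the real job client.

```python
import os
import tempfile


class OutputDirExistsError(Exception):
    """Hypothetical error standing in for Hadoop's 'output directory already exists' failure."""


def submit_job(output_dir):
    """Refuse to run if output_dir exists; otherwise write one part file."""
    if os.path.exists(output_dir):
        raise OutputDirExistsError(f"Output directory {output_dir} already exists")
    os.makedirs(output_dir)
    # One reducer is assumed, so a single part file is produced.
    with open(os.path.join(output_dir, "part-r-00000"), "w") as handle:
        handle.write("a\t2\n")
    return output_dir


with tempfile.TemporaryDirectory() as base:
    out = os.path.join(base, "wordcount_out")
    submit_job(out)                      # first run succeeds
    first_run_files = os.listdir(out)
    try:
        submit_job(out)                  # second run hits the existing directory
        second_run_failed = False
    except OutputDirExistsError:
        second_run_failed = True
```

Failing early protects the results of a previous run from being silently overwritten; to rerun a job, remove the old output directory first or point the job at a new path.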

Question 15: What are the main components of the ResourceManager in YARN? Select two.

  • Scheduler
  • JobTracker
  • DataManager
  • HDFS
  • ApplicationManager
