Free Dumps for Cloudera CCD-410
Question ID 12475 | You wrote a map function that throws a runtime exception when it encounters a control
character in input data. The input supplied to your mapper contains twelve such characters
in total, spread across five file splits. The first four file splits each have two control characters
and the last split has four control characters.
Identify the number of failed task attempts you can expect when you run the job with
mapred.max.map.attempts set to 4:
|
Option A | You will have forty-eight failed task attempts
|
Option B | You will have seventeen failed task attempts
|
Option C | You will have five failed task attempts
|
Option D | You will have twelve failed task attempts
|
Option E | You will have twenty failed task attempts
|
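Correct Answer | E |
Explanation: Each of the five file splits is processed by its own map task. A task attempt fails as soon as the mapper reaches the first control character in its split, so the number of control characters per split does not matter. Every retry reads the same split and fails the same way, so each task uses up all four attempts: 5 splits x 4 attempts = 20 failed task attempts.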
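The arithmetic behind this answer can be sketched in a few lines of Python. This is only an illustration of the exam's reasoning, not Hadoop code: the function name `expected_failed_attempts` and its parameters are hypothetical, and the sketch assumes one map task per split and that every task is allowed to use all of its attempts. On a real cluster the job is normally marked failed once any single task exhausts its attempts, so the observed count may be lower. Note also that in Hadoop 1 the property is spelled `mapred.map.max.attempts` (`mapreduce.map.maxattempts` in Hadoop 2).

```python
def expected_failed_attempts(control_chars_per_split, max_attempts):
    """Count failed map task attempts under the exam's simplifying assumptions.

    Assumptions (not taken from Hadoop itself):
      - each split is processed by exactly one map task;
      - the mapper throws on the first control character it reads;
      - every task runs all of its allowed attempts.
    """
    failed = 0
    for count in control_chars_per_split:
        if count > 0:
            # Each retry re-reads the same split and hits the same character,
            # so all max_attempts attempts for this task fail.
            failed += max_attempts
    return failed


# Four splits with two control characters each, one split with four.
splits = [2, 2, 2, 2, 4]
print(expected_failed_attempts(splits, max_attempts=4))  # 20
```

The count depends only on how many splits contain at least one control character, which is why twelve characters still yield twenty failed attempts rather than forty-eight.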
Question ID 12476 | MapReduce v2 (MRv2/YARN) splits which major functions of the JobTracker into separate
daemons? Select two.
|
Option A | Health status checks (heartbeats)
|
Option B | Resource management
|
Option C | Job scheduling/monitoring
|
Option D | Job coordination between the ResourceManager and NodeManager
|
Option E | Launching tasks
|
Option F | Managing file system metadata
|
Correct Answer | B,C |
Explanation: The fundamental idea of MRv2 is to split the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job in the classical MapReduce sense or a DAG of jobs.
Note: The central goal of YARN is to cleanly separate two responsibilities that are combined in classic Hadoop, mainly in the JobTracker:
- Monitoring the status of the cluster with respect to which nodes have which resources available. Under YARN, this is global.
- Managing the parallel execution of any specific job. Under YARN, this is done separately for each job.
Reference: Apache Hadoop YARN Concepts & Applications