Free Dumps for Cloudera CCD-410
Question ID 12491 | On a cluster running MapReduce v1 (MRv1), a TaskTracker heartbeats into the JobTracker
on your cluster, and alerts the JobTracker it has an open map task slot.
What determines how the JobTracker assigns each map task to a TaskTracker?
|
Option A | The amount of RAM installed on the TaskTracker node.
|
Option B | The amount of free disk space on the TaskTracker node.
|
Option C | The number and speed of CPU cores on the TaskTracker node.
|
Option D | The average system load on the TaskTracker node over the past fifteen (15) minutes.
|
Option E | The location of the InputSplit to be processed in relation to the location of the node.
|
Correct Answer | E |
Explanation: TaskTrackers send heartbeat messages to the JobTracker, usually every few seconds, to confirm that they are still alive. These messages also report the number of available slots, so the JobTracker stays up to date on where in the cluster work can be delegated. When the JobTracker looks for somewhere to schedule a task, it first looks for an empty slot on the same server that hosts the DataNode containing the data; failing that, it looks for an empty slot on a machine in the same rack. Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers, How JobTracker schedules a task?
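The locality preference described in the explanation can be illustrated with a small sketch. This is not JobTracker code: the function `pick_map_task`, the `host_to_rack` mapping, and the `pending` dictionary are hypothetical names invented for illustration, and the final fallback to an off-rack task is an assumption that goes beyond the two steps named in the explanation.

```python
from typing import Dict, List, Optional, Tuple


def pick_map_task(
    tracker_host: str,
    host_to_rack: Dict[str, str],
    pending: Dict[str, List[str]],
) -> Optional[Tuple[str, str]]:
    """Choose a pending map task for a tracker that reported a free slot.

    pending maps a task id to the hosts that store a replica of its split.
    Preference order: node-local, then rack-local, then any remaining task
    (the last fallback is an assumption of this sketch).
    """
    tracker_rack = host_to_rack.get(tracker_host)
    rack_local = None
    fallback = None
    for task_id, split_hosts in pending.items():
        # Best case: the split is stored on the tracker's own host.
        if tracker_host in split_hosts:
            return task_id, "node-local"
        # Remember the first task whose data lives in the same rack.
        if rack_local is None and tracker_rack is not None:
            if any(host_to_rack.get(h) == tracker_rack for h in split_hosts):
                rack_local = task_id
        # Remember the first task of any kind as a last resort.
        if fallback is None:
            fallback = task_id
    if rack_local is not None:
        return rack_local, "rack-local"
    if fallback is not None:
        return fallback, "off-rack"
    return None


if __name__ == "__main__":
    racks = {"n1": "r1", "n2": "r1", "n3": "r2"}
    tasks = {"t1": ["n3"], "t2": ["n2"], "t3": ["n1"]}
    print(pick_map_task("n1", racks, tasks))
```

Here the tracker on `n1` is offered `t3` because that split is stored on `n1` itself, even though `t1` and `t2` appear earlier in the pending list. RAM, disk space, CPU, and load average play no part in the choice, which is why Option E is the correct answer.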
Question ID 12492 | Can you use MapReduce to perform a relational join on two large tables sharing a key?
Assume that the two tables are formatted as comma-separated files in HDFS.
|
Option A | Yes.
|
Option B | Yes, but only if one of the tables fits into memory.
|
Option C | Yes, so long as both tables fit into memory.
|
Option D | No, MapReduce cannot perform relational operations.
|
Option E | No, but it can be done with either Pig or Hive.
|
Correct Answer | A |
Explanation: MapReduce can join tables of any size; only some join strategies require a table to fit in memory. Note:
* Join algorithms in MapReduce: A) reduce-side join; B) map-side join; C) in-memory join (striped variant, Memcached variant).
* Which join to use? In-memory join > map-side join > reduce-side join.
* Limitations of each: in-memory join: memory; map-side join: sort order and partitioning; reduce-side join: general purpose.
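The reduce-side join, the general-purpose option in the list above, can be sketched as follows. This is a single-process simulation of the map, shuffle, and reduce phases, not Hadoop API code. The function names, the "L"/"R" tags, and the assumption that the join key is the first comma-separated field are all illustrative choices.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Iterable, Iterator, List, Tuple

Tagged = Tuple[str, List[str]]  # (source tag, non-key fields)


def map_phase(tag: str, lines: Iterable[str]) -> Iterator[Tuple[str, Tagged]]:
    """Emit (join key, tagged record); the key is assumed to be field 0."""
    for line in lines:
        fields = line.strip().split(",")
        yield fields[0], (tag, fields[1:])


def shuffle(pairs: Iterable[Tuple[str, Tagged]]) -> Dict[str, List[Tagged]]:
    """Group map output by key, as the framework does between phases."""
    groups: Dict[str, List[Tagged]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[Tagged]) -> Iterator[List[str]]:
    """Pair every left record with every right record sharing the key."""
    left = [rest for tag, rest in values if tag == "L"]
    right = [rest for tag, rest in values if tag == "R"]
    for l_rest, r_rest in product(left, right):
        yield [key] + l_rest + r_rest


def reduce_side_join(left_lines: Iterable[str], right_lines: Iterable[str]) -> List[List[str]]:
    pairs = list(map_phase("L", left_lines)) + list(map_phase("R", right_lines))
    joined: List[List[str]] = []
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_phase(key, values))
    return joined


if __name__ == "__main__":
    users = ["1,alice", "2,bob"]
    orders = ["1,book", "1,pen", "3,lamp"]
    for row in reduce_side_join(users, orders):
        print(",".join(row))
```

Neither input table is held in memory as a whole by any mapper or reducer; records are routed by key, so only the records for one key meet in one reduce call. This sketch buffers those per-key records in lists, which a production job would typically avoid for heavily skewed keys.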