READ Free Dumps For Cloudera- CCD-410
Question ID 12489 | Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application
containers and monitoring application resource usage?
|
Option A | ResourceManager
|
Option B | NodeManager
|
Option C | ApplicationMaster
|
Option D | ApplicationMasterService
|
Option E | TaskTracker
|
Option F | JobTracker
|
Correct Answer | B |
Explanation Reference: Apache Hadoop YARN – Concepts & Applications
Question ID 12490 | What types of algorithms are difficult to express in MapReduce v1 (MRv1)?
|
Option A | Algorithms that require applying the same mathematical function to large numbers of individual binary records.
|
Option B | Relational operations on large amounts of structured and semi-structured data.
|
Option C | Algorithms that require global, sharing states.
|
Option D | Large-scale graph algorithms that require one-step link traversal.
|
Option E | Text analysis algorithms on large collections of unstructured text (e.g, Web crawls).
|
Correct Answer | C |
Explanation Explanation: See 3) below. Limitations of Mapreduce where not to use Mapreduce While very powerful and applicable to a wide variety of problems, MapReduce is not the answer to every problem. Here are some problems I found where MapReudce is not suited and some papers that address the limitations of MapReuce. 1. Computation depends on previously computed values If the computation of a value depends on previously computed values, then MapReduce cannot be used. One good example is the Fibonacci series where each value is summation of the previous two values. i.e., f(k+2) = f(k+1) + f(k). Also, if the data set is small enough to be computed on a single machine, then it is better to do it as a single reduce(map(data)) operation rather than going through the entire map reduce process. 2. Full-text indexing or ad hoc searching The index generated in the Map step is one dimensional, and the Reduce step must not generate a large amount of data or there will be a serious performance degradation. For example, CouchDBs MapReduce may not be a good fit for full-text indexing or ad hoc searching. This is a problem better suited for a tool such as Lucene. 3. Algorithms depend on shared global state Solutions to many interesting problems in text processing do not require global synchronization. As a result, they can be expressed naturally in MapReduce, since map and reduce tasks run independently and in isolation. However, there are many examples of algorithms that depend crucially on the existence of shared global state during processing, making them difficult to implement in MapReduce (since the single opportunity for global synchronization in MapReduce is the barrier between the map and reduce phases of processing) Reference: Limitations of Mapreduce where not to use Mapreduce