
Using flexible slots and cross-assignment to increase MapReduce resource utilization

IP.com Disclosure Number: IPCOM000234081D
Publication Date: 2014-Jan-10
Document File: 7 page(s) / 96K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a system, applied to a task scheduler, that improves resource utilization during slow start and the reducer tail, and that enforces map task and reduce task limits per host to prevent over-utilization. The approach is to cross-assign reduce tasks to map slots, or map tasks to reduce slots, when those slots are not in use.




To run jobs as quickly as possible, a task scheduler must concurrently run an appropriate mix of map tasks and reduce tasks when the number of available slots is limited.

If the task scheduler runs too few reduce tasks, then it might have to run multiple waves of reduce tasks for the job. A reduce task in a subsequent wave can start to shuffle, merge, and sort only after a reduce task from the previous wave finishes. Therefore, part of the shuffle, merge, and sort becomes a sequential cost rather than a parallelized one, compared to running a single reducer wave. Parallelizing the shuffle, merge, and sort for reduce tasks that do not start in the first wave is not possible, which can be costly if the job's shuffle, merge, and sort are heavy. The total duration of the reducer's shuffle, copy, and merge phases may also be amplified when multiple waves of reducers are scheduled, because the first wave of reducers spends a lot of time waiting rather than doing productive work.
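As a rough illustration of this sequential cost (not part of the original disclosure; the slot counts and per-phase durations below are invented for the example), the following Java sketch estimates the job time when more reduce tasks are needed than there are reduce slots:

    // Illustrative back-of-the-envelope estimate; all numbers are hypothetical.
    public class ReducerWaveEstimate {
        public static void main(String[] args) {
            int reduceTasks  = 40;    // reduce tasks the job needs
            int reduceSlots  = 10;    // reduce slots available in the cluster
            double mapPhase  = 60.0;  // minutes spent in the map phase
            double shuffle   = 20.0;  // shuffle + merge + sort per reduce task
            double reduceRun = 10.0;  // reduce() computation per reduce task

            int waves = (int) Math.ceil((double) reduceTasks / reduceSlots);

            // The first wave can overlap its shuffle/merge/sort with the map phase.
            double firstWave = Math.max(mapPhase, shuffle) + reduceRun;
            // Later waves start only after a previous-wave reducer finishes, so
            // their shuffle/merge/sort is a purely sequential cost.
            double laterWaves = (waves - 1) * (shuffle + reduceRun);

            System.out.printf("%d wave(s), estimated time: %.0f minutes%n",
                    waves, firstWave + laterWaves);
        }
    }

Under this model, 40 reduce slots (a single wave) give roughly 70 minutes, while 10 slots (four waves) give roughly 160, because three waves of shuffle, merge, and sort run after the map phase instead of alongside it.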

Similarly, if the scheduler starts reduce tasks too late, the shuffle, merge, and sort for reduce tasks might not be sufficiently amortized with the map phase, which can cause the job to run longer than if the system had started the reduce tasks earlier.

If the scheduler runs too many reduce tasks or starts reduce tasks too early, then the reduce tasks might occupy slots that could have been running map tasks, making the map phase (and the overall job) take much longer. As a direct result, if map tasks do not complete at a fast enough rate, the reduce tasks have prolonged idle periods and waste computational resources.

One current solution employs complex parameters; understanding how to tweak these parameters is difficult, and tuning them rarely meets the goal in a foolproof manner. In addition, the user has to reconfigure cluster-level settings to optimize for individual jobs, or else suffer performance degradation when the cluster configuration is not optimal for the job's characteristics. Adding or removing a single host to/from the cluster may completely change the dynamics of a job, as can running concurrent jobs. Reserving dedicated map slots to run map tasks and dedicated reduce slots to run reduce tasks tends to either over- or under-utilize resources.
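To make the kind of tuning referred to above concrete, the snippet below is a hypothetical job driver (not code from the disclosure) that sets the classic Hadoop knobs involved: fixed per-host map and reduce slot counts and the reduce slow-start threshold. Property names follow long-standing Hadoop conventions, but exact names vary by version.

    import org.apache.hadoop.conf.Configuration;

    // Hypothetical illustration of the cluster- and job-level settings a user
    // must tune by hand under the existing approach.
    public class SlotTuningExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // Cluster-level: dedicated map and reduce slots per host
            // (MRv1 TaskTracker settings).
            conf.setInt("mapred.tasktracker.map.tasks.maximum", 8);
            conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 4);

            // Job-level: fraction of map tasks that must complete before
            // reduce tasks are scheduled ("slow start").
            conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.80f);

            // A value tuned for one job's shuffle weight can be wrong for the
            // next job, and adding or removing hosts shifts the balance again.
        }
    }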

Another solution prioritizes resource requests for map tasks above those for reduce tasks. Because map task requests are prioritized, map tasks may run before reduce tasks when there is contention for resources, thereby creating a long reducer tail and extending the overall job runtime.

The novel contribution is a system that cross-assigns reduce tasks to map slots, or map tasks to reduce slots, when these slots are not used. This is a means of improving resource utilization during the slow start and reducer tail.
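The sketch below is a minimal, hypothetical illustration of this idea (class and method names are my own, not from the disclosure): when a host has idle map slots and no runnable map tasks, the scheduler borrows those slots for pending reduce tasks, and vice versa, while still honoring per-host task limits.

    import java.util.ArrayDeque;
    import java.util.Queue;

    // Minimal sketch of cross-assignment between map and reduce slots.
    public class CrossAssignScheduler {

        static final class Host {
            int freeMapSlots, freeReduceSlots;   // idle slots of each kind
            int runningMaps, runningReduces;     // tasks currently on the host
            int maxMaps, maxReduces;             // per-host task limits
        }

        private final Queue<String> pendingMaps = new ArrayDeque<>();
        private final Queue<String> pendingReduces = new ArrayDeque<>();

        /** Assign pending tasks to one host, borrowing idle slots across types. */
        void assign(Host h) {
            // Normal assignment: map tasks into map slots, reduce into reduce slots.
            while (h.freeMapSlots > 0 && h.runningMaps < h.maxMaps
                    && !pendingMaps.isEmpty()) {
                launch(h, pendingMaps.poll());
                h.freeMapSlots--; h.runningMaps++;
            }
            while (h.freeReduceSlots > 0 && h.runningReduces < h.maxReduces
                    && !pendingReduces.isEmpty()) {
                launch(h, pendingReduces.poll());
                h.freeReduceSlots--; h.runningReduces++;
            }

            // Cross-assignment: idle map slots run reduce tasks (reducer tail) ...
            while (h.freeMapSlots > 0 && pendingMaps.isEmpty()
                    && h.runningReduces < h.maxReduces && !pendingReduces.isEmpty()) {
                launch(h, pendingReduces.poll());
                h.freeMapSlots--; h.runningReduces++;
            }
            // ... and idle reduce slots run map tasks (slow start).
            while (h.freeReduceSlots > 0 && pendingReduces.isEmpty()
                    && h.runningMaps < h.maxMaps && !pendingMaps.isEmpty()) {
                launch(h, pendingMaps.poll());
                h.freeReduceSlots--; h.runningMaps++;
            }
        }

        private void launch(Host h, String taskId) {
            System.out.println("launching task " + taskId);
        }
    }

The runningMaps/runningReduces caps correspond to the per-host map and reduce task limits mentioned in the abstract, so a borrowed slot cannot push a host past its intended load.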