
A method to improve resource utilization in cluster scheduling by splitting long-time job into sub-jobs

IP.com Disclosure Number: IPCOM000177739D
Original Publication Date: 2008-Dec-29
Included in the Prior Art Database: 2008-Dec-29

Publishing Venue

IBM

Abstract

A method to improve resource utilization in cluster scheduling by splitting long-time job into sub-jobs

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 46% of the total text.

Background: What is the problem solved by your invention? Describe known solutions to this problem (if any). What are the drawbacks of such known solutions, or why is an additional solution required? Cite any relevant technical documents or references.

Answer:

This invention concerns job scheduling in a busy cluster. Resources in a cluster, especially a busy one, are sometimes underutilized, and no good solution to this problem exists so far. In some clusters, such as a weather forecast center, the timeline is cut into many time slots by large numbers of jobs and reservations (for example, recurring reservations). The cluster's resources become honeycombed with small free fragments. In this situation a long-time job cannot start right away, and it cannot be inserted into the gaps between reservations either. The job is delayed for a long time and starts only at some point in the future, possibly a year later. Once the resources are split into small pieces, a long-time job cannot be scheduled even though the total free resources meet its requirements. The job may stay idle for a very long time, or it may never be scheduled at all if it cannot fit between occurrences of an endless recurring reservation.

For example:

[Figure: Resource (Node 1, Node 2, Node 3) versus Time (0 to t). Blocks marked "Occupied" on each node leave only scattered free slots, so a job requiring TWO nodes can NOT fit in between time 0 ~ t.]
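The situation in the example above can be checked with a small sketch. The data layout (a dict mapping each node to its free intervals), the function names, and the sample numbers are assumptions made for illustration; none of them come from the disclosure itself. The sketch shows that the total free node-time can exceed what the job needs while no single window lets the unsplit job run.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open free interval [start, end)


def total_free_node_time(free: Dict[str, List[Interval]]) -> int:
    """Sum of the lengths of all free intervals over all nodes."""
    return sum(end - start for slots in free.values() for start, end in slots)


def fits_unsplit(free: Dict[str, List[Interval]],
                 nodes_needed: int, duration: int) -> bool:
    """True if `nodes_needed` nodes are free together for `duration` in a row.

    It is enough to try the start of every free interval as a start time:
    any feasible window can be shifted left until it hits such a start.
    """
    starts = sorted({s for slots in free.values() for s, _ in slots})
    for t0 in starts:
        covering = sum(
            any(s <= t0 and t0 + duration <= e for s, e in slots)
            for slots in free.values()
        )
        if covering >= nodes_needed:
            return True
    return False


# Hypothetical cluster over the horizon 0..10: each node has one free gap.
free_slots = {
    "node1": [(6, 10)],
    "node2": [(3, 7)],
    "node3": [(0, 4)],
}

needed_node_time = 2 * 5  # a job requiring two nodes for 5 time units
enough_in_total = total_free_node_time(free_slots) >= needed_node_time
can_run_whole = fits_unsplit(free_slots, nodes_needed=2, duration=5)
```

In this made-up layout the free node-time is sufficient in total, yet the job cannot be placed as a single block, which is the fragmentation problem the disclosure describes.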

Summary of Invention: Briefly describe the core idea of your invention (saving the details for questions #3 below). Describe the advantage(s) of using your invention instead of the known solutions described above.

Answer:
With this invention, the scheduler proactively splits a long-time job into smaller sub-jobs and fits the sub-jobs into the available time slots. In this way the long-time job is not delayed: it can start much earlier instead of waiting for recurring reservations to complete.

[Figure: Resource (Node 1, Node 2, Node 3) versus Time (0 to t). The job requiring TWO nodes fits into the free spaces by being split into pieces (Job Piece A, Job Piece B, Job Piece C) placed around the "Occupied" blocks.]
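One way to picture the splitting idea is a greedy pass over the timeline that emits a job piece wherever enough nodes are free at the same time. This is only a sketch under stated assumptions: the job can be suspended and resumed at any point, its pieces run one after another, and the function name, data layout, and sample numbers are hypothetical. The disclosure's own algorithm is not visible in this excerpt.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]           # half-open free interval [start, end)
Piece = Tuple[int, int, List[str]]   # (start, end, nodes used)


def split_into_pieces(free: Dict[str, List[Interval]],
                      nodes_needed: int,
                      duration: int) -> Optional[List[Piece]]:
    """Greedily cover `duration` time units using windows where at least
    `nodes_needed` nodes are free; return None if the free windows run out."""
    # Every interval boundary is a point where the set of free nodes may change.
    cuts = sorted({t for slots in free.values() for iv in slots for t in iv})
    pieces: List[Piece] = []
    remaining = duration
    for a, b in zip(cuts, cuts[1:]):
        if remaining == 0:
            break
        # Nodes that are free for the whole elementary window [a, b).
        nodes = sorted(name for name, slots in free.items()
                       if any(s <= a and b <= e for s, e in slots))
        if len(nodes) >= nodes_needed:
            run = min(b - a, remaining)
            pieces.append((a, a + run, nodes[:nodes_needed]))
            remaining -= run
    return pieces if remaining == 0 else None


# Hypothetical fragmented cluster over the horizon 0..10.
free_slots = {
    "node1": [(0, 2), (6, 10)],
    "node2": [(3, 7)],
    "node3": [(0, 4), (8, 10)],
}

# A job requiring two nodes for 5 time units in total.
plan = split_into_pieces(free_slots, nodes_needed=2, duration=5)
```

With this sample data no window offers two free nodes for 5 consecutive time units, but the greedy pass places the job as four pieces and finishes it at time 9. A real scheduler would also have to account for the cost of suspending and resuming the job, which this sketch ignores.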

Description: Describe how your invention works, and how it could be implemented, using text, diagrams and flow charts as appropriate.

Answer:
1. Definitions
In the current system, recurring reservations occupy a lot of resources and split the remaining resources into small slots. No single time slot is long enough to start a long-time job, even though the total resources in the cluster could satisfy both the recurring reservations and the job.

a) Define one threshold, time_fragment_value:
This threshold is used to determine whether a time slot is big enough. If the time slot is larger than this value, we split the long-time job and insert a job fragment into this time fragment.
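The threshold test in (a) can be sketched as a simple filter. The function name and the interval representation are assumptions for illustration; the only rule taken from the text is that a slot qualifies when it is larger than the threshold.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open free time slot [start, end)


def usable_fragments(slots: List[Interval],
                     time_fragment_value: int) -> List[Interval]:
    """Keep only the time slots strictly longer than the threshold from (a)."""
    return [(start, end) for start, end in slots
            if end - start > time_fragment_value]


# Hypothetical free slots on one node, with a threshold of 1 time unit.
candidates = usable_fragments([(0, 2), (3, 4), (6, 10)], time_fragment_value=1)
```

Slots that fail the test are left unused, which presumably keeps the job from being cut into pieces too small to be worth running.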

b) Define a second threshold, start_time_delay_value:


T...