Prediction of job resource usage based on large amount of historical data Disclosure Number: IPCOM000248232D
Publication Date: 2016-Nov-10
Document File: 2 page(s) / 16K

Publishing Venue

The Prior Art Database


A method to measure and predict the resource usage of new jobs based on a large amount of historical data in a job scheduling system. In a large job scheduling system with thousands of execution servers and millions of running and queued jobs, it is good practice for users to submit jobs with a resource requirement estimate (run time, CPU time, memory to be used, etc.). Jobs submitted without resource requirements may overload execution servers and run inefficiently. Submitting jobs with resource estimates tells the scheduler how much resource each job is likely to consume, and the scheduler uses this information to schedule jobs efficiently. However, users often do not know how much resource their jobs will take. If the grid administrator forces them to specify resource estimates at submission time, they will typically pad the reservation with a large buffer, which wastes resources and makes resource use inefficient. A large-scale grid environment of this kind accumulates a huge amount of job history. This method makes full use of that data to predict and estimate the resource usage of new jobs.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

Grid users often request far more resources than their jobs actually need, so computing resources are wasted.

A new submitted job may have these features:

1. Job name (command to execute, arguments of the command)

2. Submission user

3. Job submission server

4. Current working directory when the job is submitted

5. Input file

6. Number of execution servers to run on

7. User specified resource requirement estimation

A difference in any of the above features may produce a different job result and resource usage:

1. Run time (wall clock time)

2. Resource usage (CPU time, memory, etc)

This method has the scheduler learn from historical job data which of the above features are the key contributor(s) to the job result and resource usage.
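One simple way to rank feature contributions, sketched here as a one-level decision-stump criterion (the disclosure does not specify the learning algorithm, so the function and field names below are illustrative assumptions): for each feature, group the historical jobs by that feature's value and measure how much the grouping reduces the variance of the observed run time.

```python
from collections import defaultdict

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def key_contributors(history, features, target="run_time"):
    """Rank features by how much grouping the historical jobs on
    each feature reduces the variance of the target metric.
    A larger score means the feature is a stronger contributor."""
    base = variance([job[target] for job in history])
    scores = {}
    for f in features:
        groups = defaultdict(list)
        for job in history:
            groups[job[f]].append(job[target])
        # weighted within-group variance after splitting on feature f
        within = sum(len(g) * variance(g) for g in groups.values()) / len(history)
        scores[f] = base - within
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy history: run time tracks the command, not the submitting user.
history = [
    {"command": "simulate", "user": "alice", "run_time": 100},
    {"command": "simulate", "user": "bob",   "run_time": 102},
    {"command": "compile",  "user": "alice", "run_time": 10},
    {"command": "compile",  "user": "bob",   "run_time": 12},
]
ranking = key_contributors(history, ["command", "user"])
```

On this toy data the ranking puts `command` first, identifying it as the key contributor to run time.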

The method turns the job scheduler into a learning machine. The more historical jobs the grid has run, the more accurate the estimates the scheduler can make of job resource usage. The grid administrator can set a threshold for how many jobs the scheduler must have learned from before it starts making predictions.
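The threshold behavior could be sketched as follows (a minimal illustration, not the disclosed implementation; the class name, the per-job-type grouping, and the 90th-percentile estimate are all assumptions):

```python
from collections import defaultdict

class ResourcePredictor:
    """Accumulate observed run times per job type and only start
    predicting once the administrator-set learning threshold is met."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.history = defaultdict(list)

    def record(self, job_type, run_time):
        """Learn from one finished job."""
        self.history[job_type].append(run_time)

    def predict(self, job_type):
        """Return an estimate, or None while too few jobs have been learned."""
        samples = sorted(self.history[job_type])
        if len(samples) < self.threshold:
            return None
        # conservative estimate: roughly the 90th percentile of history
        idx = min(len(samples) - 1, int(0.9 * len(samples)))
        return samples[idx]

p = ResourcePredictor(threshold=3)
p.record("simulate", 10)
p.record("simulate", 12)
early = p.predict("simulate")   # below threshold: no prediction yet
p.record("simulate", 11)
estimate = p.predict("simulate")  # threshold reached: estimate produced
```

Until three jobs of the type have been seen, `predict` declines to answer; afterwards it returns a conservative estimate drawn from the learned history.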

For some jobs, the owner may already know how much resource the job will take. In such cases, the scheduler can present its estimate as a prompt, and the user decides whether to accept the scheduler's estimate or keep their own.

The first task is to categorize and group the historical jobs. In this method, jobs with the same job execution command (the first token of the command line, without its arguments) are considered the same type. The current working directory (CWD) of the job is also used. For example, jobs A and B are submitted as follows (sub is the command used to submit jobs to the scheduler; the path in square brackets is the current working directory):

Job A: [/filer/app]$ sub ./executePI -i /home/user/

Job B: [/home/user]$ sub /filer/app/executePI -i /home/user/
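The grouping key for this categorization could be computed as below, a minimal sketch (the helper name `job_type_key` is an assumption): take the first token of the command line and resolve it against the submission CWD, so that relative and absolute invocations of the same executable fall into the same group.

```python
import os

def job_type_key(cwd, command_line):
    """Canonical job type: the first token of the command line
    (the executable, without its arguments), resolved against the
    job's current working directory when it is a relative path."""
    executable = command_line.split()[0]
    if not os.path.isabs(executable):
        executable = os.path.join(cwd, executable)
    return os.path.normpath(executable)

# Jobs A and B from the example above map to the same type key.
key_a = job_type_key("/filer/app", "./executePI -i /home/user/")
key_b = job_type_key("/home/user", "/filer/app/executePI -i /home/user/")
```

Both keys resolve to /filer/app/executePI, so jobs A and B are grouped as the same job type despite their different command lines.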

In this case, the enti...