Browse Prior Art Database

Minimum Cut Based Code Reordering Method

IP.com Disclosure Number: IPCOM000237751D
Publication Date: 2014-Jul-09
Document File: 6 page(s) / 52K

Publishing Venue

The IP.com Prior Art Database

Abstract

Optimal temporal and spatial locality is one of the cornerstones in achieving high performance in computer programs. A lot of research has been done towards that goal. Further, optimizing compilers and profile based optimizers all incorporate methods that try to solve this problem. We propose a method, derived from a min-cut algorithm, for improving instruction memory locality, reducing accesses to lower levels of the hierarchical memory and minimizing the amount of page-faults in executable pages.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 33% of the total text.

Page 01 of 6

Minimum Cut Based Code Reordering Method

Related Work:
1.Henis, Ealan A., Gadi Haber, Moshe Klausner, and Alex Warshavsky. "Feedback based post-link optimization for large subsystems." In Second Workshop on Feedback Directed Optimization, pp. 13-20. 1999.

The solution above tries to improve cache misses by global code reordering, but, it doesn't take into consideration the code placement in pages.

Our apparatus minimizes the amount of page-faults and improves the cache miss ratio of the method above by expanding its algorithm.

2.Boehm, Omer, Daniel Citron, Gadi Haber, Moshe Klausner, and Roy Levin. "Aggressive Function Inlining with Global Code Reordering." IBM Technical Paper

(2006).

3.Nashon, I., and D. Berstein. "FDPR: A Post-pass Object-code Optimization Tool." In International Conference on Compiler Construction. 1996.

4.Heisch, Randall R. "Trace-directed program restructuring for AIX executables." IBM Journal of Research and Development 38, no. 5 (1994): 595-603.

5.Atta, Islam, Pınar Tözün, Xin Tong, Anastasia Ailamaki, and Andreas Moshovos. "STREX: Boosting Instruction Cache Reuse in OLTP Workloads Through Stratified Transaction Execution." ISCA40 (2013).

6.Basu, Arkaprava, Jayneel Gandhi, Jichuan Chang, and Mark D. Hill1 Michael M. Swift. "Efficient Virtual Memory for Big Memory Servers." ISCA40 (2013).

7.The Sun Studio Binary Code Optimizer: http://www.oracle.com/technetwork/server-storage/solaris/binopt-136601.html
The core idea in our invention is a new/expanded global code reordering algorithm that minimizes the number of page faults and TLB misses (an important goal as can be seen in related work '6' above) by improving the temporal and spatial locality of code in pages.

This method is a major expansion to the reordering method described in related works 1-4 above by taking into account where the code resides in memory pages.

Also, our apparatus improves the cache locality and reduces the number of cache misses further than the methods in the related work.

    In a nutshell, our code reordering algorithm works in four phases:
Phase 1
Ordering Basic Blocks Into Chains
This is done by known methods in the related works 1-4 for improving the locality and linearity of small sets of basic blocks that are frequently executed together (for example, loops) by building small "chains" or super-blocks of such code
Phase 2
Placing Chains Into Pages
We use our proposed algorithm, described in section 3 below, to minimize the number of virtual memory pages needed by the chains and minimize the flow between pages, bringing to a minimum the number of times we need to switch pages during program execution and maximizing the flow within a page, thus, minimizing the number of page faults and TLB misses.

We also keep a small unused space in each page for use by later optimization passes


Page 02 of 6

like alignment optimization.

Phase 3
Placing the Pages into caches
We use our proposed algorithm, described in section 3 below, to minimize the flow b...