Page-oriented code reordering method
Publication Date: 2014-Jul-09
The IP.com Prior Art Database
Optimal temporal and spatial locality is one of the cornerstones in achieving high performance in computer programs. A lot of research has been done towards that goal. Further, optimizing compilers and profile based optimizers all incorporate methods that try to solve this problem. We propose a method for reordering the program code in order to minimize its working set, and thus reduce the pressure in the code cache, while also maximizing the in-page temporal and spatial locality. This last characteristic is important in reducing the amount of page-faults in during program run-time
Page 01 of 3
-oriented code reordering method
oriented code reordering method
1.Henis, Ealan A., Gadi Haber, Moshe Klausner, and Alex Warshavsky. "Feedback based post-link optimization for large subsystems." In Second Workshop on Feedback Directed Optimization, pp. 13-20. 1999.
The solution above tries to improve cache misses by global code reordering, but, it doesn't take into consideration the code placement in pages.
Our apparatus minimizes the amount of page-faults and improves the cache miss ratio of the method above by expanding its algorithm.
2.Boehm, Omer, Daniel Citron, Gadi Haber, Moshe Klausner, and Roy Levin. "Aggressive Function Inlining with Global Code Reordering." IBM Technical Paper
3.Nashon, I., and D. Berstein. "FDPR: A Post-pass Object-code Optimization Tool." In International Conference on Compiler Construction. 1996.
4.Heisch, Randall R. "Trace-directed program restructuring for AIX executables." IBM Journal of Research and Development 38, no. 5 (1994): 595-603.
5. Atta, Islam, P·nar T·z·n, Xin Tong, Anastasia Ailamaki, and Andreas Moshovos. "STREX: Boosting Instruction Cache Reuse in OLTP Workloads Through Stratified Transaction Execution." ISCA40 (2013).
6.Basu, Arkaprava, Jayneel Gandhi, Jichuan Chang, and Mark D. Hill1 Michael M. Swift. "Efficient Virtual Memory for Big Memory Servers." ISCA40 (2013).
7. The Sun Studio Binary Code Optimizer: http://www.oracle.com/technetwork/server-storage/solaris/binopt-136601.html
The core idea in our invention is a new/expanded global code reordering algorithm that minimizes the number of page faults and TLB misses (an important goal as can be seen in related work '6' above) by improving the temporal and spatial locality of code in pages.
Code chain is a sequence of basic blocks that have high probability of executing sequentially.
Code cluster is a collection of code chains or smaller code clusters, that has high temporal locality, ii.e., high probability of branches within the collection as compared to branches out of the collection.
The FDPR method basically allocates the chain's basic block sequentially to reduce the number of taken branches. Once chains are made code clusters are built bottom-up and are placed sequentially in memory in memory in order to reduce page faults and cache misses.
Ordering Basic Blocks Into Chains
This is done by the FDPR m...