
Disclosure of simulation application design with the use of shared memory and data compression

IP.com Disclosure Number: IPCOM000021259D
Original Publication Date: 2004-Jan-08
Included in the Prior Art Database: 2004-Jan-08
Document File: 3 page(s) / 74K

Publishing Venue

IBM

Abstract

This disclosure describes a design technique for general simulation systems that reference large amounts of data, such as Monte Carlo simulation. The model infrastructure of the simulation system described in this article is based on IBM pSeries and AIX 4.3.3. IBM LoadLeveler is used as middleware to dispatch the simulation jobs, each of which corresponds to one interest rate scenario, to the simulation execution nodes (6 nodes). Each simulation job consists of multiple execution processes on the simulation execution nodes, and each process accesses the simulation database on the DB nodes, executing SELECT statements concurrently to retrieve interest rates, account data, agreement data, etc. The CPU usage of the simulation execution nodes stayed low because the performance bottleneck was the concentration of workload on the DB nodes. Consequently, a Monte Carlo simulation of 1000 interest rate scenarios took more than 24 hours even with all 6 nodes working. In general, DBMS tuning, gigabit networking, etc. could be considered in such a case. However, in some cases performance must be improved without upgrading hardware. Moreover, although processing efficiency could be improved to some extent by increasing the number of simulation processes, resource restrictions on the DB nodes (the number of simultaneous SQL transactions, large memory usage, etc.) limited how far this could go.




JP820031002 Yusuke Kanehira/Japan/IBM, Kiyonori Komiya, Keiichi Fujita


To reduce the execution time of the simulation, distributing the workload of the DB nodes and optimizing DB access are indispensable. From this point of view, we tried to load all the data necessary for the simulation from the DB into a shared memory segment on each node before starting the first simulation job. However, the data in the DB exceeds 600 megabytes, and only one segment (256 megabytes) of shared memory was available to the simulation application, since most segments were reserved as work areas for other applications, middleware, and the DBMS. We therefore compress the data before storing it into the shared memory so that it fits in one segment, and decompress it on use. The data structure in the shared memory is a hash table, to optimize data access time. The data blocks (structures) in the shared memory contain no absolute memory addresses, because on AIX, and probably on most other UNIX implementations, the shared memory segment can be mapped to a different segment number by each process. Instead, each data block consists of the value and an offset address within the shared memory segment.
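The offset-based layout can be illustrated with a short C sketch. This is a minimal illustration only, not the disclosed code: the struct layouts, the bucket count, and the names (entry_t, header_t, lookup) are all hypothetical.

/* Minimal sketch (hypothetical layout) of a shared-memory hash table
 * that stores offsets instead of pointers. Each process resolves the
 * offsets against its own attach address returned by shmat(), because
 * the segment may be mapped at a different address in each process. */
#include <stddef.h>

#define NBUCKETS 4096                  /* hypothetical bucket count */

typedef struct {
    long   key;                        /* e.g., an account identifier */
    size_t value_off;                  /* offset of the compressed value */
    size_t next_off;                   /* offset of next entry; 0 = end */
} entry_t;

typedef struct {
    size_t bucket_off[NBUCKETS];       /* offset of first entry per bucket */
} header_t;

/* Turn a segment-relative offset into an address valid in this process. */
static void *at(void *base, size_t off)
{
    return (char *)base + off;
}

/* Find the value for 'key'; 'base' is this process's shmat() address.
 * Offset 0 points at the header itself, so it can safely mean "none". */
static void *lookup(void *base, long key)
{
    header_t *h = (header_t *)base;
    size_t off = h->bucket_off[(unsigned long)key % NBUCKETS];
    while (off != 0) {
        entry_t *e = (entry_t *)at(base, off);
        if (e->key == key)
            return at(base, e->value_off);
        off = e->next_off;
    }
    return NULL;                       /* key not present */
}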

We also took care over the performance of data decompression during simulation. We created our own compression/decompression code (about 20 lines of C) based on a zero-compression algorithm, without using an existing compression library such as zlib.
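As an illustration, a zero-compression routine of roughly that size might look as follows. This is a sketch under the assumption that "zero-compression" means replacing each run of zero bytes with the pair (0x00, run length); the disclosure's actual implementation is not reproduced here.

/* Sketch of zero-run compression: each run of up to 255 zero bytes is
 * encoded as the pair (0x00, run_length); non-zero bytes pass through
 * unchanged. The encoding is an assumption, not the disclosed code. */
#include <stddef.h>

size_t zero_compress(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        if (in[i] == 0) {
            unsigned char run = 0;
            while (i < n && in[i] == 0 && run < 255) { i++; run++; }
            out[o++] = 0;
            out[o++] = run;
        } else {
            out[o++] = in[i++];
        }
    }
    return o;                          /* compressed size in bytes */
}

/* Inverse of the above; 'out' must be sized for the original data. */
size_t zero_decompress(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        if (in[i] == 0) {
            unsigned char run = in[i + 1];
            for (unsigned char k = 0; k < run; k++) out[o++] = 0;
            i += 2;
        } else {
            out[o++] = in[i++];
        }
    }
    return o;                          /* original size in bytes */
}

Note that a scheme like this only helps to the extent that the data contains long runs of zero bytes, and that in the worst case (isolated zero bytes) the output can be up to twice the input, so the caller must size its buffers accordingly.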

As a result of this tuning, the simulation now takes about 4 hours; before implementing this idea, it took over 24 hours.

(1) Overview of the new approach

As shown on the left-hand side of Fig. 1, each simulation process had been accessing the database to calculate cash flow on demand. The technique we disclose (right-hand side of the figure) pre-loads the account data before the simulation starts, compresses it, and saves it into shared memory; a sketch of this control flow follows below. The compressed data is decompressed by each simulation process when it calculates cash flow.
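A minimal sketch of that "first process loads, others attach" flow, assuming System V shared memory (shmget/shmat). The key, segment size, and stub comments are hypothetical, and a real implementation would also need synchronization so that later processes wait until the first has finished populating the table.

/* Sketch of the pre-load step: only the first simulation job process
 * creates the segment and loads the data; all others just attach. */
#include <stdio.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SIM_SHM_KEY  0x53494D01           /* hypothetical well-known key */
#define SEGMENT_SIZE (256 * 1024 * 1024)  /* one 256 MB segment */

int main(void)
{
    /* Try to create the segment exclusively: only the first process
     * succeeds and becomes responsible for loading the data. */
    int id = shmget(SIM_SHM_KEY, SEGMENT_SIZE, IPC_CREAT | IPC_EXCL | 0600);
    int first = (id >= 0);
    if (!first) {
        if (errno != EEXIST) { perror("shmget"); return 1; }
        id = shmget(SIM_SHM_KEY, SEGMENT_SIZE, 0600);  /* already created */
        if (id < 0) { perror("shmget"); return 1; }
    }

    /* Each process may get a different attach address, which is why
     * the hash table stores offsets rather than absolute pointers. */
    void *base = shmat(id, NULL, 0);
    if (base == (void *)-1) { perror("shmat"); return 1; }

    if (first) {
        /* SELECT the account, agreement, and interest rate data from
         * the DB, zero-compress each record, and insert it into the
         * shared-memory hash table at 'base' (see sketches above). */
    } else {
        /* Look up records in the table and decompress them on use. */
    }

    shmdt(base);
    return 0;
}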

[Fig. 1: Overview of simulation time optimization. Before: each simulation process on the simulation node calls many SQL statements against the account data on the DB node while calculating cash flow. Now: the first simulation job process loads the account data, compresses it, and saves it into shared memory on each simulation node; each simulation process then retrieves the compressed simulation data from shared memory, decompresses it, and uses it for calculation.]

(2) System structure and control flow

