eGPS : Exploratory Graph Processing System
Publication Date: 2014-Nov-11
The IP.com Prior Art Database
Disclosed are a new architecture and programming methodology that maximize parallelism and eliminate latencies through memory hierarchies when generating and using large amounts of data.
Page 01 of 5
:: Exploratory Graph Processing System Exploratory Graph Processing System
The generation and use of large amounts of unstructured data (e.g., social networks, business analytics, and extraction of information from big data ), has catapulted to such high levels, that a demand for new ways of processing the data has risen . Often, such data is stored in some graph form where nodes carry some information and nodes are connected by some semantic or other matching characteristics . Because of the distributed nature of data and distributed nature of processing , neither of the above two paradigms (shared and distribute systems) is well suited to efficiently process in such a setting. The resources invested for capitalizing locality of data tend to become wasteful most of the time.
The novel contribution is a new architecture and programming methodology that is well suited for such computations. Notably, graphs are efficiently represented and graph algorithms can be concisely encoded and efficiently executed over the proposed architecture. The architecture is a novel organization of processor and memory components, as well as a novel method of utilizing this organization. That method maximizes parallelism and eliminates latencies through memory hierarchies that are underutilized in a system with the earlier stated characteristics.
The global schema of the proposed architecture is shown in Figure 1. The system is a collection of nodes (shown as the columns in the figure) connected by a network . Each node consists of a memory bank , memory controller and a processor . Information in the memory is organized as a collection of blocks (the size of a block depends on the application, typically 16-32 bytes). The mode of communication in the network is via comm packets . Each comm packet has a block address , an opcode , and one block of data (D1). When such a packet is entered in the network , the method routes it to the node in which the addressed block resides and then delivers it to the corresponding controller. Based on the opcode, the controller typically reads the block of data (D2) from the bank at that address and appends it to the comm packet , to form a work
packet, and delivers it to the processor. The processor processes the information in the work packet and generates a set of comm packets , which are sent out (either back to the controller if it is local addr or via network to a remote address ).
The controller performs a variety of functions based on the opcode :
1. Write a block of data into the memory block and discard the packet
2. Read a block of data and form a work packet and send it to processor
3. Read a block of data, lock the block, form, and send a work packet to processor
4. Delay the packet if the block in memory is locked
5. Write a block of data into the memory block , unlock it and discard the packet
6. Synchronize two comm packets that read and write from same block and supply
the written data to the re...