
P9NEST, - FB06 - Method of Managing Coherency in a 3-Hop Topology

IP.com Disclosure Number: IPCOM000250005D
Publication Date: 2017-May-15
Document File: 6 page(s) / 162K

Publishing Venue

The IP.com Prior Art Database

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 50% of the total text.

Method of Managing Coherency in a 3-Hop Topology

ABSTRACT

Disclosed is a method of managing coherency in a 3-hop topology.

The POWER* processors implement a non-blocking snooping protocol. This enables scaling to ever-larger n-way SMP systems, which are typically limited by message queuing depth and coherency bandwidth in message-passing, snooping-based coherency protocols. In non-blocking snooping protocols, caching agents’ requests are temporally bounded: when a request is broadcast, it has a guaranteed fixed time in which all snoopers respond. Once a request is placed on the coherency network, there is essentially no queuing. This facilitates running the coherency network at very high utilization. Therefore, increasing the overall network bandwidth has a direct effect on the system's capacity to do work.

Very large 3-hop n-way SMP systems are especially difficult to manage for non-blocking snooping protocols. Since queuing facilities are minimal, the coherency network must divide the available coherency bandwidth evenly amongst the requesters. Furthermore, not all chips in the system are the target of coherency broadcasts, either intentionally due to selective broadcast or unintentionally due to overcommits. Once requests are placed on the coherency network, each request must keep track of the chips in the system to which it was broadcast. The snooper partial responses must be returned in the exact reverse order. Finally, the combined responses must be broadcast to the same chips in the system as the original requests, and in the same order. The problem addressed by this disclosure is how to manage the coherency broadcast and the necessary tracking structures.
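The tracking requirement above can be illustrated with a minimal Python sketch. It is not the POWER9 implementation: the class name `BroadcastTracker`, its methods, and the request tags are all hypothetical, and the sketch models only the rule that a combined response goes to the same chips as the original request, in request order.

```python
from collections import deque


class BroadcastTracker:
    """Hypothetical FIFO remembering the target chips of each in-flight request."""

    def __init__(self):
        self._in_flight = deque()  # oldest request sits at the left end

    def issue(self, tag, targets):
        """Record a request and the set of chips it was broadcast to."""
        self._in_flight.append((tag, frozenset(targets)))

    def combined_response_targets(self, tag):
        """Return the chips that must receive the combined response for `tag`.

        Combined responses are released only in request order, so `tag`
        must match the oldest tracked request.
        """
        if not self._in_flight:
            raise LookupError("no request in flight")
        oldest_tag, targets = self._in_flight[0]
        if oldest_tag != tag:
            raise ValueError(f"{tag!r} is out of order; expected {oldest_tag!r}")
        self._in_flight.popleft()
        return targets


tracker = BroadcastTracker()
tracker.issue("req0", {1, 2, 3})  # broadcast to three other chips
tracker.issue("req1", {2})        # selective broadcast to one chip
print(sorted(tracker.combined_response_targets("req0")))  # [1, 2, 3]
print(sorted(tracker.combined_response_targets("req1")))  # [2]
```

Because there is essentially no queuing on the network, a simple FIFO is enough for this sketch: entries leave in the order they entered, so no lookup by tag is needed beyond the order check.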

The POWER9* 3-hop topology includes fully connected chips in a group via external SMP X-buses (intra-group). The groups are fully interconnected with other groups via external SMP A-buses (inter-group). However, not every chip in a group is directly connected to every chip in a remote group. Only one chip within each group connects to a remote group. The POWER9 processor designates each stage in the coherency broadcast based on its position within the 3-hop topology. The requesting caching agent is designated to be on the Local Master (LM) chip. The chip that connects the local group to a remote group is designated the Local Hub (LH). The receiving chip in the remote group is designated the Remote Hub (RH). Finally, the remaining chip(s) in the remote group are designated Remote Leaf (RL).
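The stage designations can be made concrete with a small Python sketch that labels the chips a request passes through on its way to a chip in a remote group. The wiring rule in `hub_for` (which chip of a group owns the A-bus link to a given remote group) is purely an assumption for illustration, as are the names `Chip`, `path_roles`, and `bus_hops`; the disclosure states only that one chip per group makes the connection.

```python
from typing import NamedTuple


class Chip(NamedTuple):
    group: int
    index: int


def hub_for(local_group, remote_group, chips_per_group):
    """Chip in local_group holding the A-bus link to remote_group.

    Assumed wiring rule, for illustration only.
    """
    return Chip(local_group, remote_group % chips_per_group)


def path_roles(master, target, chips_per_group=4):
    """Label each stage of a broadcast from `master` to a remote-group chip."""
    if master.group == target.group:
        raise ValueError("target is in the local group; no A-bus hop is needed")
    local_hub = hub_for(master.group, target.group, chips_per_group)
    remote_hub = hub_for(target.group, master.group, chips_per_group)
    path = [("LM", master), ("LH", local_hub), ("RH", remote_hub)]
    if target != remote_hub:
        path.append(("RL", target))  # reached over an X-bus from the RH
    return path


def bus_hops(path):
    """Count bus traversals; a chip filling two roles adds no hop."""
    chips = [chip for _, chip in path]
    return sum(1 for a, b in zip(chips, chips[1:]) if a != b)


route = path_roles(Chip(0, 1), Chip(2, 3))
for role, chip in route:
    print(role, chip)
print("hops:", bus_hops(route))  # hops: 3
```

In the longest case the request crosses an X-bus (LM to LH), an A-bus (LH to RH), and another X-bus (RH to RL), which is where the three hops come from. When the LM happens to be the hub itself, the first hop disappears.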

The LH is equally allocated amongst all the chips in the local group. In the case of a 4-chip group, each LM would have 25% of the available coherency bandwidth to issue requests beyond the local group. The LH includes a tracking structure that keeps track of the originating LM, as well as the X-bus to which it is connected. The LH tracking structure (SPLH presp FIFO) also keeps track of the A-buses on which the command is broadcast. This could be all A-buses or none of the A-buses and depends on the scope and...
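As a rough illustration of the equal allocation and of the fields the LH tracks, the following Python sketch computes each LM's share of the LH and defines a record with the three items named above. The names `lm_share` and `LhFifoEntry`, and the use of integer bus and chip identifiers, are assumptions; the actual layout of the SPLH presp FIFO is not given in this text.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import FrozenSet


def lm_share(chips_in_group: int) -> Fraction:
    """Fraction of LH bandwidth each LM receives under an equal split."""
    if chips_in_group < 1:
        raise ValueError("a group needs at least one chip")
    return Fraction(1, chips_in_group)


@dataclass(frozen=True)
class LhFifoEntry:
    """Hypothetical record for one command tracked at the Local Hub."""
    origin_lm: int            # chip that issued the request
    x_bus: int                # X-bus linking that LM to the LH
    a_buses: FrozenSet[int]   # A-buses the command went out on; may be empty


print(float(lm_share(4)))  # 0.25

# A command that stayed within the local group uses none of the A-buses.
local_only = LhFifoEntry(origin_lm=1, x_bus=0, a_buses=frozenset())
# A command broadcast on every A-bus of a hub that has three of them.
all_groups = LhFifoEntry(origin_lm=2, x_bus=1, a_buses=frozenset({0, 1, 2}))
print(len(local_only.a_buses), len(all_groups.a_buses))  # 0 3
```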