Group MMU Virtualization
Publication Date: 2014-Jan-14
The IP.com Prior Art Database
This invention resolves lock contention in page-table emulation. It divides vCPUs into groups based on their virtual NUMA information, then emulates the page table per group instead of per vCPU.
The hypervisor is responsible for maintaining the translations of addresses generated by the guest kernel running in the virtual machine. These translations are maintained in a hardware-defined format called the page table.
Whenever the guest faults on a page access, the hypervisor must allocate a new page frame and map it into the corresponding page-table entry. This can happen concurrently if multiple threads of the guest, also called vCPUs, fault simultaneously, so access to the page table must be serialized to avoid page-table corruption. However, on a large virtual machine with many vCPUs and a large amount of memory, contention on that serialization grows and degrades the performance of the virtual machine. This can be a serious bottleneck.
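The serialization described above can be sketched as a toy model. This is an illustration only, not the hypervisor's actual code: the names `GuestPageTable`, `handle_fault` and `run_vcpu` are hypothetical, the page table is modeled as a dictionary from guest frame number to host frame number, and each vCPU is modeled as a thread.

```python
import threading


class GuestPageTable:
    """Toy page table: guest frame number -> host frame number, one lock."""

    def __init__(self):
        self.lock = threading.Lock()   # the single lock every vCPU contends on
        self.entries = {}
        self.next_host_frame = 0

    def handle_fault(self, gfn):
        # Every faulting vCPU must take the same lock before touching the table.
        with self.lock:
            if gfn not in self.entries:
                # Allocate a new frame and map it (hypothetical allocator).
                self.entries[gfn] = self.next_host_frame
                self.next_host_frame += 1
            return self.entries[gfn]


def run_vcpu(table, gfns):
    """Simulate one vCPU faulting on a sequence of guest frames."""
    for gfn in gfns:
        table.handle_fault(gfn)


# Four simulated vCPUs fault on the same 1000 guest frames concurrently.
table = GuestPageTable()
vcpus = [threading.Thread(target=run_vcpu, args=(table, range(1000)))
         for _ in range(4)]
for t in vcpus:
    t.start()
for t in vcpus:
    t.join()
```

The lock keeps the table consistent, but every fault from every vCPU passes through it, which is the contention the text describes.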
Duplicating the page table, one copy per vCPU, can alleviate the bottleneck. However, this solution has some disadvantages:
(1) It trades space for time. Multiple page tables per virtual machine can consume a significant amount of memory.
(2) It can generate more page faults, since a mapping fixed up on one vCPU remains invalid on the other vCPUs.
(3) On systems without hardware extended page table support, where guest MMU management is done using shadow page tables, any page modification made to one shadow page table must be synchronized to all other shadow page tables. This can be a significant overhead.
These disadvantages can be alleviated by grouping multiple vCPUs and guest memory ranges into individual logical units.
This invention explains the idea.
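To make the trade-off concrete, the short sketch below counts page-table copies and the number of other copies that must be synchronized after one shadow-page-table change, for a per-vCPU scheme and for a grouped scheme. The figures (64 vCPUs, 16 vCPUs per group) and the function names are assumptions chosen for illustration; they do not come from the disclosure.

```python
def page_table_copies(num_vcpus, vcpus_per_group=None):
    """Number of page-table copies: one per vCPU, or one per group of vCPUs."""
    if vcpus_per_group is None:
        return num_vcpus
    return -(-num_vcpus // vcpus_per_group)  # ceiling division


def sync_targets_per_update(copies):
    """Other copies to update when one shadow page table is modified."""
    return copies - 1


# Assumed example: a 64-vCPU guest, grouped 16 vCPUs per group.
per_vcpu_copies = page_table_copies(64)
per_group_copies = page_table_copies(64, vcpus_per_group=16)

print(per_vcpu_copies, sync_targets_per_update(per_vcpu_copies))
print(per_group_copies, sync_targets_per_update(per_group_copies))
```

Under these assumed numbers, both the memory spent on page tables and the synchronization work scale with the number of groups rather than the number of vCPUs.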
Logically partition the total guest memory and the guest vCPUs into groups, each of which is called a guest-NUMA-node. Expose the guest-NUMA-node information to the guest so that the guest can manage its resources intelligently. The hypervisor maintains multiple guest page tables, one per guest-NUMA-node, each with a corresponding lock. Concurrent access to a guest page table by the vCPUs of its guest-NUMA-node can then be serialized using that lock.
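A minimal sketch of this per-group scheme follows, under stated assumptions. The class and method names (`GuestNumaNode`, `GroupMmu`, `handle_fault`) are hypothetical, page tables are modeled as dictionaries, and vCPUs as threads. The shared `_backing` map, which keeps one host frame per guest frame so that two groups mapping the same guest frame agree, is an assumption of this sketch; this section does not say how the invention handles accesses to another group's memory.

```python
import threading


class GuestNumaNode:
    """One group: a set of vCPUs, a guest memory range, a page table, a lock."""

    def __init__(self, node_id, vcpu_ids, gfn_range):
        self.node_id = node_id
        self.vcpu_ids = set(vcpu_ids)
        self.gfn_range = gfn_range
        self.lock = threading.Lock()   # serializes only this group's vCPUs
        self.page_table = {}           # guest frame number -> host frame number


class GroupMmu:
    """Toy hypervisor MMU that keeps one page table per guest-NUMA-node."""

    def __init__(self, nodes):
        self.node_of_vcpu = {v: node for node in nodes for v in node.vcpu_ids}
        self._backing = {}             # assumed: one host frame per guest frame
        self._backing_lock = threading.Lock()

    def _host_frame_for(self, gfn):
        # Short critical section; hands out a fresh frame number on first use.
        with self._backing_lock:
            return self._backing.setdefault(gfn, len(self._backing))

    def handle_fault(self, vcpu_id, gfn):
        node = self.node_of_vcpu[vcpu_id]
        with node.lock:                # vCPUs of other groups are not blocked
            if gfn not in node.page_table:
                node.page_table[gfn] = self._host_frame_for(gfn)
            return node.page_table[gfn]


# Two groups of two vCPUs, each owning half of a 1024-frame guest.
nodes = [GuestNumaNode(0, [0, 1], range(0, 512)),
         GuestNumaNode(1, [2, 3], range(512, 1024))]
mmu = GroupMmu(nodes)


def run_vcpu(vcpu_id):
    """Simulate one vCPU faulting on every frame of its own group's memory."""
    for gfn in mmu.node_of_vcpu[vcpu_id].gfn_range:
        mmu.handle_fault(vcpu_id, gfn)


threads = [threading.Thread(target=run_vcpu, args=(v,)) for v in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this model, vCPUs 0 and 1 contend only with each other on group 0's lock, and likewise vCPUs 2 and 3 on group 1's lock, so a fault in one group does not wait on page-table updates in the other.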
The invention introduces a NUMA-emulated memory layout. The hypervisor partitions the physical memory into logical partitions called guest-NUMA-nodes. It then exposes these logical partitions to the guest, through any available mec...