OpenPOWER-based CAPI acceleration on FPGA attached board

June 20, 2016 // By Graham Prophet
Alpha Data (Edinburgh, Scotland) has added a CAPI acceleration development kit for its ADM-PCIE-8K5 board. The kit enables designers to utilize Xilinx All Programmable FPGA devices attached to the coherent accelerator processor interfaces on IBM POWER8 systems.

The development kit includes the PSL (Power Service Layer), which provides the infrastructure connection to the POWER8 chip, examples of user-defined AFUs (Accelerator Function Units), and OS kernel extensions and library functions specific to CAPI. The solution removes the software overhead of processor communication with the I/O subsystem, allowing an accelerator to operate as part of an application. This significantly reduces the development time required to offload data processing applications to FPGAs.

The Alpha Data ADM-PCIE-8K5 PCIe form-factor add-in card uses Xilinx FPGAs to deliver application-specific acceleration for Big Data workloads. The ADM-PCIE-8K5 is IBM POWER8 CAPI-capable, featuring a Xilinx Kintex UltraScale KU115 FPGA with 32 GB of DDR4-2400 ECC memory, dual SFP+ networking I/O ports, dual 4x16G FireFly expansion I/O ports, and a built-in, USB-accessible system monitoring and JTAG debug port. The hardware is supported by the Xilinx SDAccel tool for OpenCL, and by Xilinx Vivado for HDL and HLS flows. Alpha Data offers Board Support Packages (BSPs) that include high-performance PCIe/DMA, OpenPOWER Architecture CAPI, FPGA example designs, plug-and-play OS drivers, and a mature Application Programming Interface (API).

Describing CAPI, IBM says: “The Coherent Accelerator Processor Interface (CAPI) on POWER8 systems provides a high-performance solution for the implementation of client-specific, computation-heavy algorithms on an FPGA. This innovation can replace either application programs running on a core or custom acceleration implementations attached via I/O. CAPI removes the overhead and complexity of the I/O subsystem, allowing an accelerator to operate as part of an application. IBM’s solution enables higher system performance with a much smaller programming investment, allowing hybrid computing to be successful across a much broader range of applications.

In the CAPI paradigm, the specific algorithm for acceleration is contained in a unit on the FPGA called the accelerator function unit (AFU or accelerator). The purpose of an AFU is to provide applications with a higher computational unit density for customized