Deep learning with FPGAs and neural nets, at China’s ZTE

January 26, 2017 // By Graham Prophet
Chinese telecoms company ZTE reports work carried out in collaboration with Intel that achieves what the two companies are calling a new benchmark in deep learning and convolutional neural networks (CNNs), using OpenCL programming. These capabilities will underpin Internet search and AI techniques for tasks such as picture search and matching.

“Perception, such as recognizing a face in an image, is one of the essential goals of the ZTE 5G System,” said Duan Xiangyang, vice president of the ZTE Wireless Institute. “Deep learning technology is very important as it can enable such perception in mobile edge computing systems, thus making ZTE’s 5G System smarter.”


The test took place in Nanjing City, China, where ZTE’s engineers used Intel’s midrange Arria 10 FPGA for a cloud inferencing application using a CNN algorithm. ZTE reports a new record – more than a thousand images per second in facial recognition – with what is known as “theoretical high accuracy” achieved for its custom topology. Intel’s (ex-Altera) Arria 10 FPGA accelerated the raw design performance more than 10 times while maintaining the accuracy. The Arria 10 FPGA provides up to 1.5 teraflops (TFLOPS) of single-precision floating-point processing performance, 1.15 million logic elements, and more than a terabit per second of high-speed connectivity.


Such deep learning designs can be migrated from the Arria 10 FPGA family to the high-end Stratix 10 FPGA family, and Intel estimates that users can expect up to a ninefold performance boost. The team at the ZTE Wireless Institute shortened design time by using the OpenCL programming language, via the Intel SDK for OpenCL.


The benchmark was achieved on a server holding 4S Intel Xeon E5-2670v3 processors running at 2.30 GHz with 128 GB of DDR4, paired with an Intel PSG Arria 10 FPGA Development Kit with one 10AGX115 FPGA and a 4 GB DDR4 SODIMM, using Intel Quartus Prime and OpenCL SDK v16.1.