Licensable IP DSP cores step up performance

September 14, 2015 // By Graham Prophet
Cores in Synopsys’ ARC EM DSP Processor Family combine high-efficiency control and signal processing for ultra-low-power, always-on applications; XY memory support in the EM9D and EM11D delivers up to three times the DSP performance of existing ARC EM processor cores, with lower energy consumption.

The EM9D and EM11D cores implement an enhanced version of the ARCv2DSP instruction set architecture (ISA), combining RISC and DSP processing with support for an XY memory system to boost digital signal processing performance while minimising power consumption. The cores maximise processing throughput by retrieving instructions and data from memories that are tightly coupled to the processor pipeline, reducing the number of accesses to system memory along with the associated latency and power consumption penalties. The ARC MetaWare Development Toolkit has been enhanced to offer full C/C++ programming support for the cores' DSP instructions and XY memory as well as a library of DSP functions. The cores are optimised for DSP-intensive functions such as sensor fusion, voice detection, speech recognition and audio processing.

"Synopsys' ARC EM9D and EM11D processors are ideally suited for the increasing number of IoT devices using speech comprehension capabilities to enhance hands free operation," said Dean Neumann, CEO at Malaspina Labs. "The combination of these latest ARC EM cores and highly efficient speech processing software such as Malaspina Labs' VoiceBoost suite delivers an ultra low power solution for voice activation, biometric verification and speech recognition in 'always listening' devices."

All EM DSP cores implement a three-stage pipeline for applications with a mixture of control and DSP workloads. The EM9D and EM11D take advantage of the regular data access patterns common in signal processing code by integrating separate X and Y memories, with hardware support for address generation and DMA to move data in and out of the memories. This enables a sustained throughput of one 32x32 MAC operation or two 16x16 MAC operations per clock cycle with minimal energy and area overhead. The new processors have also been enhanced to support full integer and fractional divide and square root operations, unaligned loads/stores and bitstream parsing, for complex sensor algorithms. They also improve processing efficiency for a range of audio formats including MP3, SBC, Opus and AAC-LC.
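To make the MAC figures concrete, the kind of loop that benefits from separate X and Y memories is a dot product or FIR tap sum: two operand streams read at regular strides and one multiply-accumulate per element. The sketch below is plain portable C, not MetaWare code; it uses no ARC intrinsics or XY-memory qualifiers, and the names `dot_q15` and `ideal_mac_cycles` are hypothetical, chosen only for illustration. The cycle estimate simply assumes the quoted sustained rates (one 32x32 or two 16x16 MACs per clock) and ignores loop setup and DMA.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two Q15 (16-bit fractional) vectors.
 * Conceptually, x[] would sit in X memory and y[] in Y memory so that both
 * operands can be fetched in the same cycle (an assumption for illustration;
 * nothing here controls memory placement).
 * The result is a Q30 value held in a 64-bit accumulator. */
int64_t dot_q15(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        /* One 16x16 multiply-accumulate per element. */
        acc += (int32_t)x[i] * (int32_t)y[i];
    }
    return acc;
}

/* Idealised cycle count for n_macs multiply-accumulates at a sustained
 * rate of macs_per_cycle (1 for 32x32, 2 for 16x16 per the figures above).
 * Rounds up; ignores loop overhead and data movement. */
unsigned ideal_mac_cycles(unsigned n_macs, unsigned macs_per_cycle)
{
    return (n_macs + macs_per_cycle - 1u) / macs_per_cycle;
}
```

Under these assumptions, a 64-tap filter on 16-bit data would need on the order of 32 MAC cycles per output sample, against 64 for 32-bit data; real figures depend on the compiler, memory layout and surrounding code.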