
TAKING A BITE OUT OF POWER: TECHNIQUES FOR LOW-POWER-ASIC DESIGN

EVEN IF YOU ARE DESIGNING AN ASIC OR SOC THAT DOESN’T TARGET A LOW-POWER APPLICATION, YOU NEED TO BECOME FAMILIAR WITH LOW-POWER-DESIGN TECHNIQUES, BECAUSE THE NEWEST GENERATION OF SILICON-PROCESS TECHNOLOGIES INHERENTLY LEAKS POWER.

BY MICHAEL SANTARINI • SENIOR EDITOR -- EDN Europe, 01 Jul 2007

Until recently, low-power-digital-IC design has been an area for specialist or guru IC designers. However, most IC-design engineers will have to learn a variety of low-power-design techniques as ASICs and SoCs (Systems on Chips) increasingly target processes of 130 nm and below. At 130-nm processes, foundries started to employ new techniques and materials, such as low-k dielectrics and copper, in silicon processes to increase design performance. However, smaller geometries, scaled thresholds, and unscaled voltages produced smaller, speedier ICs but produced a nasty side effect: leakage, or static power. By the 90-nm node, power management started to become a huge concern, and, at the 65-nm node, low-power-design techniques are a must.

“As we scale technology nodes, clearly we have to lower VDD [supply voltage], because there is a quadratic relationship: Power dissipation is proportional to VDD²,” says Mike Keating, a fellow at Synopsys. “If we just scaled the devices and did not scale VDD, we’d be doubling the power density every generation. We can’t do that, so we’ve been lowering VDD.”
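Keating's quadratic relationship can be illustrated with the standard first-order CMOS switching-power approximation, P = a·C·VDD²·f. The article does not state this formula; the sketch below assumes it, and the activity factor, capacitance, and clock frequency are made-up illustrative values.

```python
def dynamic_power(activity, capacitance_f, vdd, freq_hz):
    """First-order CMOS switching power in watts: a * C * VDD^2 * f (assumed model)."""
    return activity * capacitance_f * vdd ** 2 * freq_hz

# Illustrative (made-up) values: 10% activity, 1 nF switched capacitance, 500 MHz.
p_high = dynamic_power(0.1, 1e-9, 1.2, 500e6)
p_low = dynamic_power(0.1, 1e-9, 0.9, 500e6)

# Dropping VDD from 1.2 V to 0.9 V scales power by (0.9 / 1.2)^2.
ratio = p_low / p_high
print(f"power ratio after scaling VDD: {ratio:.4f}")
```

Under this model, a 25% cut in supply voltage trims dynamic power by roughly 44% with everything else held constant, which is why supply voltage is the first knob designers reach for.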

When the semiconductor industry lowered supply voltage over the last few nodes, each reduction also lowered the transistor threshold voltage, which keeps drain-to-source current at a level that allows ICs to charge their output capacitors and thus increase the performance of ICs in those nodes. However, as the industry further decreased threshold voltage at each node, it forced the subthreshold leakage to also increase at each node. “As we’ve been shrinking processes, the gate-oxide thickness is so skinny now that gate leakage is increasing exponentially,” says Keating. “Somewhere around 65 and 45 nm, you end up with dynamic power equal to subthreshold current and equal to the gate-leakage current. We have a train wreck; only, in this case, we have three trains (dynamic power, subthreshold leakage, and gate leakage) headed to exactly the same spot.”

In the past, overall power density has essentially stayed the same for every process reduction. But, in 2005, the ITRS (International Technology Roadmap for Semiconductors) released a study that indicated that at the 65-nm node, dynamic-power density and leakage power would increase by 1.43 and 2.5 times, respectively. At the 45-nm node, the ITRS predicts, dynamic-power and leakage-power density will increase to two and 6.5 times, respectively. In reality, designs in high-speed 65-nm processes lose as much as half their power to leakage. Many in the industry believe that, by the 45-nm node, ICs will lose as much as 60% of their power to leakage (Figure 1). “Until recently, we’ve been dealing with power by simply making different trade-offs in silicon,” says Keating. “That option is sort of disappearing. Using these design techniques is no longer an option; it is a requirement.”

To deal with power management, the electronics community is employing new low-power techniques and materials on several fronts (Figure 2). Fabs have introduced multithreshold, multivoltage transistors; SOI (silicon-on-insulator) and low-k materials; body, or “back”, biasing; and copper-metal and SiGe (silicon-germanium) substrates. Meanwhile, chip architects and software designers deal with low power by performing smart-hardware-versus-software trade-offs; by implementing power-aware operating systems, introducing more hibernation modes into system design; and by more selectively granting memory access. IC designers are also employing several techniques to lower the power of their designs. The most popular techniques for low-power design include multithreshold design, multivoltage design, clock gating, power-aware memories, and power gating.

AT A GLANCE
  • At the 45-nm node, leakage power may consume as much as 60% of an IC’s total power.
  • Foundries now offer several libraries, each with multiple threshold voltages to manage power.
  • The EDA industry has split into two factions supporting two similar power standards: UPF (Unified Power Format) and CPF (Common Power Format).
  • Clock gating is one of the oldest tricks in the book, but power gating is quickly becoming the hottest technique for low-power design.

Jerry Frenkil, chief technology officer, vice president, and general manager of Sequence Design’s Silicon Business Unit, notes that low-power design is all about reducing one or several parts of the power equation: Dynamic power plus leakage power equals the device’s overall power consumption. Dynamic power is the power a device consumes when a user is employing it for its intended purpose, and leakage power is the power that leaking transistors waste (Figure 3).
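Frenkil's power equation is simple enough to write down directly. The following minimal sketch computes total power and the share lost to leakage; the wattages are illustrative placeholders, chosen only so that the resulting shares match the "half" and "60%" leakage figures the article quotes for 65-nm and 45-nm designs.

```python
def power_breakdown(dynamic_w, leakage_w):
    """Return (total power, fraction of total lost to leakage)."""
    total = dynamic_w + leakage_w
    leak_fraction = leakage_w / total if total > 0 else 0.0
    return total, leak_fraction

# Placeholder wattages: equal dynamic and leakage power means half is lost to leakage.
total_65, frac_65 = power_breakdown(dynamic_w=1.0, leakage_w=1.0)

# Placeholder wattages giving a 60% leakage share.
total_45, frac_45 = power_breakdown(dynamic_w=1.0, leakage_w=1.5)

print(f"65-nm example: {frac_65:.0%} leakage; 45-nm example: {frac_45:.0%} leakage")
```

Each technique in the rest of the article attacks one of the two terms: multivoltage design and clock gating target the dynamic term, whereas multithreshold design and power gating target the leakage term.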

Custom and circuit designers over the years have employed several techniques to lower the power of their designs, according to Kurt Keutzer, a professor at the University of California, Berkeley, who is a co-author and editor of Closing the Power Gap Between ASIC & Custom: Tools and Techniques for Low Power Design (Reference 1), which was due to be published by the time this article appears. However, he says, the power consumption of today’s typical ASICs may be three to seven times that of custom ICs fabricated in process technology of the same generation. He and one of the book’s co-authors, David Chinnery, estimate that, by employing low-power-design techniques, users can improve the energy efficiency of their ASIC designs by a factor of two to three. “The main finding is that ASIC designers are failing to implement a lot of possible power savings,” says Keutzer.

But there’s no single easy solution in low-power design. “There are a lot of techniques … and different methods attack different portions of the power equation. They usually have some overhead of some sort,” says Frenkil, also a contributor to the book. “Some may have no overhead, others may affect you in area, and others may affect you in speed. One of the critical things about low-power design is understanding the impact of what you are facing and how you are going to deal with it.” Indeed, users will have to mix and match many of these techniques to come up with a low-power methodology that works for them.

MULTITHRESHOLD DESIGN

About five years ago, when excessive power consumption became a problem, foundries started to offer libraries for low-power and high-speed design. For example, TSMC (Taiwan Semiconductor Manufacturing Co, www.tsmc.com) offers a standard, or nominal, library; a high-speed library; and a low-power library, each having several types of cells. For instance, each of TSMC’s libraries includes low-threshold-voltage, high-threshold-voltage, and threshold-voltage-with-MTCMOS (multithreshold-CMOS) cells. Multiple-cell libraries help designers deal with both leakage and dynamic power. To deal with leakage power using multiple types of cells, designers today employ multithreshold design. “Because we’ve played so many games with VDD and VTH [threshold voltage], we can’t create one library that is going to work for an entire design, because you have designs that are speed-critical, and, for the areas that are not speed-critical, you want to reduce the leakage,” says Keating.

A multicell library typically comprises at least two sets of identical cells that have different threshold voltages. Those with higher threshold voltage are slower but have less leakage; conversely, the cells with lower threshold voltage are faster but leak. “It is a nonlinear relationship,” says Keating. “Conceding a little bit of speed, you get a very dramatic reduction in leakage.” Frenkil says that a high-threshold-voltage cell typically has 50% less leakage than a low-threshold-voltage cell with no bad side effects, such as area gain.

For most applications, designers typically use a low-threshold-voltage library for a first pass through synthesis to get maximum performance and meet timing goals. They then determine the critical paths in their design, that is, the path or paths in the design that require the highest performance. They then try to locate areas that don’t require low-threshold-voltage cells and swap out low-threshold-voltage cells for high-threshold-voltage cells to reduce overall power and leakage of the design. Frenkil notes that this approach represents the most common use of the multithreshold-design technique because most applications have timing as a first requirement, low-threshold-voltage libraries run faster through synthesis, and synthesis tools ultimately produce smaller design areas from these libraries. Synthesis tools tend to run longer and produce larger design areas when running heavy doses of high-threshold-voltage cells.

However, in some wireless-system applications, power is the main goal, and area increases are less of an issue. In those cases, some designers first run synthesis with high-threshold-voltage cells, find the critical path, and then swap out the high-threshold-voltage cells with low-threshold-voltage cells until they reach their performance goal.
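The timing-first flow described above can be sketched as a greedy pass over cells that have timing slack to spare. This is a deliberately simplified illustration, not how any particular synthesis tool works: the cell names, slack values, and delay penalty are hypothetical, and a real tool would re-time whole paths after every swap rather than treat each cell's slack independently.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    name: str
    slack_ns: float           # timing slack on the worst path through this cell
    delay_penalty_ns: float   # extra delay if swapped to high Vt (hypothetical)
    vt: str = "low"           # first synthesis pass uses low-Vt cells everywhere

def swap_to_high_vt(cells):
    """Swap a cell to high Vt only if its slack can absorb the delay penalty."""
    swapped = []
    # Visit the cells with the most slack first.
    for cell in sorted(cells, key=lambda c: c.slack_ns, reverse=True):
        if cell.vt == "low" and cell.slack_ns >= cell.delay_penalty_ns:
            cell.vt = "high"
            cell.slack_ns -= cell.delay_penalty_ns
            swapped.append(cell.name)
    return swapped

cells = [
    Cell("u_alu_add", slack_ns=0.00, delay_penalty_ns=0.05),  # on the critical path
    Cell("u_usb_ctl", slack_ns=0.80, delay_penalty_ns=0.05),
    Cell("u_dbg_mux", slack_ns=0.30, delay_penalty_ns=0.05),
]
swapped = swap_to_high_vt(cells)
print(swapped)
```

The critical-path cell keeps its fast, leaky low-Vt implementation, and the cells with slack move to the slower, low-leakage variant. The power-first flow is the mirror image: start with every cell at high Vt and swap to low Vt only where timing fails.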

EDA INDUSTRY QUIBBLES OVER POWER STANDARDS

The EDA industry is responding to the challenges that designers face in the field of power consumption. However, overcoming the low-power hurdle in a timely manner will likely require EDA vendors to collaborate on a common power format. Unfortunately, on the power front, the industry splits into two camps: A few small EDA companies back Cadence’s CPF (Common Power Format), under the auspices of Si2 (Silicon Integration Initiative, www.si2.org), whereas Synopsys, Mentor, and Magma back Accellera’s (www.accellera.org) UPF (Unified Power Format). Recently, it looked as if the two formats would merge under the IEEE, but politics in the industry have, for now at least, dashed the hope of that development. Ironically, those who have had access to both formats say that UPF and CPF share roughly 85% of the same functions. However, the EDA companies are guarding their formats in the hope that they will become de facto standards and thus be able to capture market share in a new tool area.

For now, it looks as though users and EDA vendors will have to support two formats. The industry and the designers have done it before with Verilog and VHDL—both viable HDLs (hardware-description languages). However, working with and supporting two formats creates confusion, mistakes, delays, and more work for designers and vendors alike. One vendor notes that supporting two formats means that his company must budget engineering time and resources to ensure that its tools support both formats. That requirement gives engineers less time to create tools to address tomorrow’s challenges.

MULTIVOLTAGE DESIGN

Although multithreshold design helps engineers minimize leakage of their designs through the use of multiple libraries, another technique, multivoltage design, helps designers control dynamic power. Similar to multithreshold design, multivoltage design enables designers to give the critical paths and blocks in their designs access to maximum voltage for the process and specification, but the designers then reduce the voltage for less power-hungry blocks. For example, Keating says, a processor block may require a clock speed of 500 MHz, but a USB core may require only 30 MHz to comply with the USB protocol and thus require less voltage to run. So, if designers give the USB core only the power it needs, they can drastically reduce the overall power the design consumes. To implement the method, designers traditionally put level shifters between blocks that are running at different voltages. “If you have a 0.9V region on your IC design that is sending a signal to a 1.2V region, you have to put a level shifter between the two regions so you can boost it to the swing in voltage and control timing,” Keating says.
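The level-shifter rule in Keating's example amounts to a check on every net that crosses between voltage regions. The sketch below is a minimal illustration that uses the 0.9V and 1.2V figures from the quote; the block names and the list of nets are hypothetical, and commercial tools perform this insertion on the real netlist rather than on a block-level list like this one.

```python
def nets_needing_level_shifters(domain_voltage, nets):
    """Return the (driver, receiver) nets whose endpoints run at different voltages."""
    crossings = []
    for driver, receiver in nets:
        if domain_voltage[driver] != domain_voltage[receiver]:
            crossings.append((driver, receiver))
    return crossings

# Hypothetical blocks: a fast processor and SRAM at 1.2 V, a slow USB core at 0.9 V.
domain_voltage = {"cpu": 1.2, "usb": 0.9, "sram": 1.2}

# Each net is a (driving block, receiving block) pair.
nets = [("usb", "cpu"), ("cpu", "sram"), ("cpu", "usb")]

crossings = nets_needing_level_shifters(domain_voltage, nets)
print(crossings)
```

Only the nets between the USB core and the processor cross a voltage boundary; the processor-to-SRAM net stays within one supply and needs no shifter.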

Although the concept is fairly simple, its implementation is more complex. First, designers must get used to dealing with multiple voltages on a die. “We are really trained as engineers that a chip has just one power supply, and now you have to deal with some complications,” says Keating. There are also some fairly significant challenges on the tools front. Most commercial synthesis and physical-design tools can insert level shifters and handle multivoltage structures, but creating RTL is a problem. “HDLs don’t yet have a mechanism for describing power connectivity,” says Keating. This lack is one area that EDA vendors are addressing by trying to implement a low-power standard. Unfortunately, the industry players have diverged between two similar standards (see sidebar “EDA industry quibbles over power standards”).

Another method that started in custom design but is making its way into ASIC design is the use of parallelism along with voltage scaling. In their book, Chinnery and Keutzer describe this technique. Keutzer says that people at first dismissed it as impractical but that it is now getting serious attention. “You parallelize to get the performance up and then scale voltage down to reduce the power and energy,” says Keutzer. “If you look at dynamic power, voltage is clearly where the biggest gains will be. So, how do you get the voltage down? Given a timing constraint (2 nsec, for example) you first over-achieve your timing objective. In particular, you add parallelism to get the critical path down to 1.2 nsec. Then, you can scale down the voltage to relax back to the 2-nsec cycle time you need to achieve. The decrease in voltage more than compensates for the increase in area.”
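Keutzer's 2-nsec example can be worked through numerically, although the article gives no delay model or voltages. The sketch below therefore rests on several labeled assumptions: gate delay follows the alpha-power law, delay proportional to VDD / (VDD − VTH)^α; the nominal supply is 1.2V; VTH is 0.3V; α is 1.3; and the added parallel hardware raises switched capacitance by a factor of 1.5. Only the 1.2-nsec and 2-nsec figures come from the article.

```python
def gate_delay(vdd, vth=0.3, alpha=1.3):
    """Relative gate delay under the alpha-power law (assumed model, arbitrary units)."""
    return vdd / (vdd - vth) ** alpha

def voltage_for_slowdown(v_nom, slowdown, vth=0.3, alpha=1.3, tol=1e-9):
    """Bisect for the supply at which delay is `slowdown` times the delay at v_nom."""
    target = gate_delay(v_nom, vth, alpha) * slowdown
    lo, hi = vth + 1e-6, v_nom  # delay falls monotonically as vdd rises above vth
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gate_delay(mid, vth, alpha) > target:
            lo = mid  # too slow at this voltage: search higher
        else:
            hi = mid  # fast enough: try a lower voltage
    return (lo + hi) / 2

V_NOM = 1.2              # assumed nominal supply, volts
AREA_OVERHEAD = 1.5      # assumed capacitance growth from the added parallelism
slowdown = 2.0 / 1.2     # relax the 1.2-nsec critical path back to the 2-nsec cycle

v_scaled = voltage_for_slowdown(V_NOM, slowdown)
# Same 2-nsec cycle, so dynamic power scales with capacitance times VDD^2.
power_ratio = AREA_OVERHEAD * (v_scaled / V_NOM) ** 2

print(f"scaled supply: {v_scaled:.3f} V, dynamic power vs. original: {power_ratio:.2f}x")
```

With these assumed parameters, the supply settles near 0.7V, and the quadratic saving outweighs the 1.5x capacitance penalty, which is the trade-off Keutzer describes. Different assumptions about VTH, α, or area overhead would shift the numbers, so the result is indicative only.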

CLOCK GATING

Probably the oldest and most tried-and-true technique for reducing power is clock gating. One-third to one-half of an IC design’s dynamic power is in the chip’s clock-distribution system. “It’s a pretty simple concept: If you don’t need a clock running, shut it down,” says Keating. Today, the two popular methods of clock gating are local and global (Figure 4). If you feed old data from the output of a flip-flop back into its input through a multiplexer, you typically need not clock the flip-flop again. Therefore, you can replace each feedback multiplexer with a clock-gating cell that turns the clock off. You would then use the enable signal that controls the multiplexer to control the clock-gating cell instead.

In the old days of digital design, designers had to manually perform this task, but any commercial synthesis tool worth its salt can now automatically do it. “The tools are all set up for that now, so they will go in, automatically look for multiplexers, and, if there is a feedback multiplexer, they’ll replace it with a clock-gating cell,” says Keating. “When you start talking about 32-bit registers, you can get significant savings using this technique.” He notes that Intel (www.intel.com) engineers this year presented a paper at SNUG (Synopsys Users Group) that reported a 43% savings in dynamic power using the technique (Reference 2).
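The equivalence that lets a tool replace a feedback multiplexer with a clock-gating cell can be shown with a small behavioral model. This sketch is a cycle-level illustration only, not synthesizable hardware: it compares a register that is clocked every cycle and recirculates its own output against one whose clock is gated by the same enable signal. The data and enable sequences are arbitrary.

```python
def run_mux_register(data, enable, init=0):
    """Feedback-mux register: clocked every cycle, recirculates q when enable is 0."""
    q, clock_events, trace = init, 0, []
    for d, en in zip(data, enable):
        q = d if en else q   # the multiplexer picks new data or the old output
        clock_events += 1    # the flip-flop sees a clock edge every cycle
        trace.append(q)
    return trace, clock_events

def run_gated_register(data, enable, init=0):
    """Clock-gated register: the enable gates the clock, so idle cycles see no edge."""
    q, clock_events, trace = init, 0, []
    for d, en in zip(data, enable):
        if en:               # the clock edge reaches the flip-flop only when enabled
            q = d
            clock_events += 1
        trace.append(q)
    return trace, clock_events

data = [5, 7, 7, 9, 1, 2]
enable = [1, 0, 0, 1, 0, 0]

mux_trace, mux_clocks = run_mux_register(data, enable)
gated_trace, gated_clocks = run_gated_register(data, enable)
print(mux_trace == gated_trace, mux_clocks, gated_clocks)
```

Both registers hold identical values on every cycle, but the gated version receives two clock edges instead of six. On a 32-bit register, one gating cell removes those idle edges from all 32 flip-flops at once.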

The other popular approach of clock gating, global clock gating, is to simply turn off the clock to the whole block, typically from a central-clock-generator module. This method functionally shuts down the block, unlike local clock gating; it even further reduces dynamic power because it shuts down the entire clock tree.

CUT MEMORY POWER

Another popular technique for lowering both dynamic power and leakage is to use power-aware memories.

In its simplest form, this approach involves shutting down segments of a memory array when they are not in use. Another technique in this category is body-biasing memories. In this method, designers reverse-bias a memory when it is not in use, which essentially raises the threshold voltage and in turn slows leakage. Another method gaining popularity is to use multimode power for memories. In this technique, designers employ memory with several power modes. Many designs employ dual-function memories so that, when the CPU accesses a memory to read or write data to run a main application, the memory receives full access to power in order to perform the operation. However, when the memory is not required to read or write, designers can program the memory to power down to a level at which the memory gets only enough power to retain its memory content.

POWER GATING/MTCMOS

Perhaps the most prominent new methods for low-power design are power gating and MTCMOS (Figure 5). Like voltage gating, power gating involves temporarily shutting down blocks in a design when the blocks are not in use. And, like voltage gating, the technique is complex. “The neat thing about the other techniques is that they are pretty much all transparent to the design engineer,” says Keating. “When I’m writing my RTL, I don’t have to think about multithreshold, multivoltage, clock gating, or power-aware memories because someone else down the line has to worry about it. But with power gating, I have to worry about it at the RTL. I have to design a power controller that is going to control what blocks I need to shut down and when, and I have to think about what voltage I’m going to [need to] run different blocks.”

Traditionally, two variants of power gating are fine-grained and coarse-grained. In fine-grained power gating, designers place a switch transistor between ground and each gate. This approach allows designers to shut off the connection to ground whenever a series of functions is not in use. “You employ that [technique] with every cell in the library,” says Keating. “At first, people really liked fine-grained power gating because it is fairly easy to do power characterization of each cell, but the problem is that the impact on the area of your design is very significant: two to four times larger.” Designers can also mix and match cells, having some power-gated and others not. Cells with high threshold voltage need not use power gating. For the most part, the power penalty is just too large, and many design groups are instead using coarse-grained power gating, in which designers create a power-switch network: essentially, a group of switch transistors that in parallel turn entire blocks on and off. The technique does not have the area impact of the fine-grained technique but is harder to characterize on a cell-by-cell basis.

Sequence Design’s Frenkil says that a compromise, medium-grained power gating, is also starting to emerge in the design community. In this method, he says, “Power-gating cells will power small blocks individually. … If you look at a high-performance, 65-nm process, the leakage can easily be 40 to 50% of your total power design. If you are designing a high-performance chip, you have to deal with an enormous amount of leakage, so people have several separate power domains controlled individually. I’ve seen one modestly sized chip that has 20 power domains; if you scale that up to a leading-edge chip, it will have over 100 power domains.” That number would be too hard to control with either a true fine-grained or a true coarse-grained technique. Of all the techniques, power gating has the most promise, says Frenkil. “It reduces leakage more, and it will scale well into the future, where things like back-biasing will not,” he says.

EDA vendors are feverishly attempting to automate the power-gating technique. The competing low-power standards, UPF (Unified Power Format) and CPF (Common Power Format), both aim to help design teams more effectively implement power-gating methods. For example, Keating notes that, in UPF design, engineers still must design the power controller in RTL, but several tools help with the insertion of the power mesh, isolation cells, and retention registers into a design. “Instead of doing it in RTL, you can do it in a UPF command language and specify a certain number of blocks to be isolated,” says Keating. “In one line, you can do what it would take many lines of RTL to do. The tools are smart enough to take those commands and insert them at the appropriate levels. Some get inserted during synthesis; others get inserted during place and route.”

The method requires either manual or tool-automated insertion of isolation-retention flip-flops. “When you shut down a block, and its outputs go to a block that is still powered up, you have to worry about those power-down nodes floating, and they can float to the threshold voltage and create unwanted currents,” says Keating. “You have to put isolation cells on those outputs and clamp that output to a one or a zero, so nothing gets hit by a floating current down the line.”

MORE AT EDN.COM

For more power-related articles, see:

  • “Cadence-led initiative seeks low-power standard” at www.edn.com/article/CA6336525
  • “Si2 forms low-power coalition” at www.edn.com/article/CA6377975
  • “Synopsys donates low-power technology” at www.edn.com/article/CA6373357
  • “Magma donates low-power technology to Accellera” at www.edn.com/article/CA6374012

Go to www.edn.com/070524cs and click on Feedback Loop to post a comment on this article.

The method also requires the use of retention flip-flops. Keating notes that one of the problems with shutting down a block is that the block needs to restore or maintain all its states. To achieve this goal, designers can use retention flip-flops, in which the main part of the flip-flop has a low threshold voltage (that is, fast but leaky) and it sits beside a balloon register of high-threshold-voltage, low-leakage cells. “Just before you shut down a block, you put the output of the flip-flop into a balloon register,” says Keating. “Then, everything but the balloon register gets powered down to maintain the states. When the block powers back on, the balloon register dumps everything back on the main flip-flop, which helps quickly power the block up.”
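The isolation and retention steps Keating describes can be pictured as a power-down and power-up sequence. The following toy model is a behavioral sketch only: the class and method names are hypothetical, the clamp value of zero is arbitrary, and the exact ordering of the isolate and save steps is an assumption, because the article says only that outputs must be clamped and that state is saved just before shutdown.

```python
class PowerGatedBlock:
    """Toy model of a block with isolation cells and a retention (balloon) register."""

    def __init__(self, state, clamp_value=0):
        self.state = state            # contents of the main, fast but leaky flip-flops
        self.balloon = None           # high-Vt, low-leakage shadow storage
        self.powered = True
        self.isolated = False
        self.clamp_value = clamp_value

    def power_down(self):
        self.isolated = True          # 1. clamp outputs so downstream logic sees no floating node
        self.balloon = self.state     # 2. copy the state into the balloon register
        self.state = None             # 3. the main flip-flops lose their contents
        self.powered = False

    def power_up(self):
        self.powered = True
        self.state = self.balloon     # the balloon register restores the main flip-flops
        self.isolated = False         # release the clamps only after the state is valid

    def output(self):
        """Value seen by always-on logic downstream of the isolation cells."""
        return self.clamp_value if self.isolated else self.state

block = PowerGatedBlock(state=0x2A)
block.power_down()
output_while_off = block.output()
block.power_up()
output_after_wake = block.output()
print(output_while_off, output_after_wake)
```

While the block is off, downstream logic sees the clamped zero rather than a floating value; after wake-up, the restored state reappears. In a real design, the power controller that Keating says must be written in RTL is what drives this sequence.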

EDA TO THE RESCUE?

Frenkil notes that, although EDA vendors offer a wide range of tools to help designers implement low-power-design techniques, the EDA industry also offers power-integrity tools to help designers consider the effects of design decisions on power. Power-integrity tools perform voltage-drop analysis, voltage-derated timing analysis, noise-margin analysis, and power-bus sizing. Many vendors offer low-power tools to attack the problem from every angle (Table 1 on the web version of this article at www.edn-europe.com). According to Keutzer, the EDA industry has yet to address some problems adequately. For example, the industry could provide tools that ease ASIC designers’ ability to implement microarchitecture techniques, such as pipelining; to more efficiently lay out clock networks; and to more effectively use transparent latches. However, he notes, no EDA tool can solve everyone’s power problem: there is no single tool or strategy that will yield a solution; rather, you must apply a range of individual measures.

Designers must become familiar with a mix of low-power-design techniques and should also investigate which tools will help them achieve their power goals. The EDA industry is trying to market a healthy field of tools to help designers control power. Eventually, vendors hope to provide design flows to allow designers to make trade-offs among timing, power, signal integrity, and, eventually, even thermal analysis (Reference 3). Top semiconductor companies, design houses, and EDA players are trying to establish a common power format. Even with the current field of EDA tools and the rough beginnings of integrated low-power flows, however, the EDA industry still has much work to do before it can solve the power problem.

REFERENCES
  1. Chinnery, David, and Kurt Keutzer, Closing the Power Gap between ASIC & Custom: Tools and Techniques for Low Power Design, ISBN: 978-0-387-25763-1, Springer, June 2007, www.springer.com/west/home/generic/search/results?SGWID=4-40109-22-77145813-0.
  2. Pokhrel, Khem C, “Physical and Silicon Measures of Low Power Clock Gating Success: An Apple to Apple Case Study,” proceedings of Synopsys Users Group, San Jose, CA, 2007, www.snug-universal.org/cgi-bin/search.cgi?San+Jose,+2007.
  3. Santarini, Michael, “Thermal integrity: a must for low-power-IC digital design,” EDN, Sept 15, 2005, pg 37, www.edn.com/article/CA6255052.


 
