Altering the SOC-Design Flow

Powerful forces are at work, resculpting SOC-design methods into new shapes.

By Ron Wilson • Executive Editor -- EDN Europe, 01 Aug 2010

SOC (system-on-chip)-design teams at the leading edge of their markets say that “business as usual” is no longer the case. Powerful technical and business forces, seemingly independent of the EDA vendors’ road maps, are resculpting SOC-design methods into new shapes, profoundly different from the best practices of only a few years ago. The change is painful for many architects, designers, and managers. To cling to the past is to embrace failure, however.

Several forces are driving this change. Financial and geographic realities have mandated a growing dependence upon third-party IP (intellectual property) and have attenuated the feedback loop from downstream issues to RTL (register-transfer-level) corrections. Complexity, especially in the power and clock networks of aggressively power-managed designs, has forced tasks that formerly resided downstream to cluster early in the design flow. Also, the challenges of advanced processes have influenced both front- and back-end practices.

AT A GLANCE
  • IP (intellectual-property) reuse, power management, and advanced processes are altering the SOC (system-on-chip)-design flow.
  • More work is shifting to the planning stage of the design.
  • Power management can have a disproportionate impact on verification and physical design.
  • Until tools can catch up, plan on the need for many iterations to get through design closure.

The Driving Forces

IP was supposed to be part of the solution, not part of the problem. IP reuse, of everything from I/O controllers to CPUs, has made it possible to both disperse and shrink design teams. The pervasive use of IP has changed the nature of the design flow, however. The flow formerly comprised building behavioral requirements, reducing them to RTL, synthesizing a netlist, and implementing it in cells. It has now become a process of assembling and imposing closure on an ad hoc set of complex, increasingly unalterable, and opaque functional blocks. When designers encounter problems with integration or closure, the original IP developer is often the only one who can help.

Although IP reuse has helped with the sheer size of designs, it has not helped with other facets of complexity. This situation is especially true for power management. Clock gating is a mandatory design step for reducing dynamic power. But it has made SOCs’ clock networks so complex that the clock trees are, in essence, other signal networks, requiring extraction, timing, power, and signal-integrity closure. And the use of voltage islands, power gating, and DVFS (dynamic voltage/frequency scaling)—just coming into use in most design teams—promises to similarly complicate power grids.
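As a rough illustration of why these techniques are attractive despite the complexity they add, the usual first-order model of switching power is P = a × C × V² × f. The sketch below applies that model; the function name and every number in it are made-up placeholders, not figures from any of the designs in this article.

```python
def dynamic_power(activity, capacitance_f, voltage_v, freq_hz):
    """First-order switching power in watts: P = activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


# Hypothetical block: 1 nF of switched capacitance at 1.0 V and 500 MHz.
baseline = dynamic_power(0.20, 1e-9, 1.0, 500e6)

# Clock gating modeled as a lower effective activity factor (assumed 0.20 -> 0.10).
gated = dynamic_power(0.10, 1e-9, 1.0, 500e6)

# DVFS modeled as lowering voltage and frequency together (assumed 0.8 V, 250 MHz).
scaled = dynamic_power(0.20, 1e-9, 0.8, 250e6)

print(f"baseline:    {baseline * 1e3:.1f} mW")
print(f"clock-gated: {gated * 1e3:.1f} mW")
print(f"DVFS point:  {scaled * 1e3:.1f} mW")
```

Under these assumed numbers, halving the activity halves the power, while the DVFS operating point saves more than half because voltage enters the model squared.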

Finally, the processes themselves are forcing change. Despite the heroic efforts of process engineers and cell-library developers, the complexities of advanced processes have, by the 65-nm node, begun to tunnel through the custom/cell-based barrier to present themselves to chip designers. “Our memory-compiler designers have had to deal with process variations, with the weak drive strengths of cells, and with increasingly complex DFM [design-for-manufacturing] rules,” says Lisa Minwell, director of technical marketing at Virage Logic. All these issues are now facing chip designers in cell-based flows. These forces have combined not merely to make design more difficult, but also to change the underlying approach.

The Crucial Beginning

Open-Silicon recently executed a 100 million-gate wireless-networking SOC in TSMC’s (Taiwan Semiconductor Manufacturing Co’s) 65-nm CMOS process. “The key to this design was upfront planning,” says Taher Madraswala, Open-Silicon’s vice president of engineering. Open-Silicon partnered on the physical design of the chip with ASIC-design-services company Brite Semiconductor, using requirements and RTL from HiSilicon, a fabless semiconductor vendor. “This was pretty much a top-down design,” Madraswala says, noting that clock layout significantly drove the upfront work.

Open-Silicon’s task began with understanding the design and performing risk assessment. “This was going to be a very large die with some very long tracks,” he says. “So we spent three days in meetings just understanding the clock structure, for instance.” Understanding the sources, consumers, and gating structures of the various clocks was a necessary preliminary to block placement. If the team got it wrong, there would be little chance of closing clock timing.

Open-Silicon had to work with multiple instances of IP cores that had essentially fixed pin placement, placing another constraint on block placement. “The problem was repeatability,” Madraswala explains. “If you change the orientation of the core, critical routing lengths change, and you get different timing.” The team thus performed a preliminary routing of top-level signals, clocks, and I/Os and then used that routing as the basis of the design’s partitioning and the placement of the resulting blocks.

“It is very difficult now to partition a design at the system level,” says Venkat Mattela, Redpine Signals’ chairman and chief executive officer, noting that engineers must do power planning early. In Redpine’s design, an extremely low-power 802.11n transceiver for embedded-system applications, the definitions of the modules at RTL made them self-contained entities with respect to the chip’s power strategy. The partitioning into modules followed not only functional boundaries but also the boundaries between voltage islands and clock domains. Consequently, the design team could capture the power intent for each block of RTL in a UPF (Universal Power Format) file at the outset of the design.

Other issues can also demand attention early in the design. For example, Vitesse Semiconductor recently developed a 24-port switch SOC with integrated copper PHY (physical-layer) blocks (Figure 1). According to Mandeep Chadra, director of design at the company, the power consumption of the PHY blocks dominated the task of estimating how much the designers could integrate because these blocks consume much of the total power. “Throughout the planning process, power remained a major issue, especially since we were targeting a wirebond package,” he says. Far from being an afterthought, that package intruded into the early decisions in chip planning. Without a flip-chip signal-redistribution layer, the arrangement of I/Os on the die had to reflect the pinout of the chip. At these frequencies, the chip had to reflect the layout of the boards that would use it. As a result, the physical layout of a switch box directly influenced the company’s floorplan, Chadra adds.

Two issues—power-management strategy and top-level signal, clock, and power routing—emerge in the early stages of chip planning, and EDA vendors have reacted to these changes. Power-aware flows from all the major companies now encourage designers to capture power intent early in standard CPF (Common Power Format) or UPF files, which then guide implementation of power management through synthesis, placement and routing, and verification.

Vendors are paying increasing attention to the need for design teams to have preliminary routing information as early as the partitioning and floorplanning stages of the design. “In the early stages of design, the biggest surprise is congestion,” says Pravin Madhani, division manager of Mentor Graphics’ place-and-route division. “So people are running their place-and-route tools very early to check for potential congestion problems.” This trend, in turn, is leading place-and-route-tool vendors to extend their tools for use in the preliminary stages of the design.
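The kind of early congestion check described here can be sketched in a few lines. The example below is a deliberately crude stand-in for a real global router: it charges each net's bounding box against a grid of bins and flags bins whose demand exceeds an assumed track capacity. The function names, the bin size, and the capacity figure are all hypothetical.

```python
from collections import defaultdict


def congestion_map(nets, bin_size, tracks_per_bin):
    """Estimate routing utilization per bin from net bounding boxes.

    nets: list of nets, each a list of integer (x, y) pin coordinates.
    Every net adds one unit of demand to each bin its bounding box touches.
    Returns {(bin_x, bin_y): demand / tracks_per_bin}.
    """
    demand = defaultdict(int)
    for pins in nets:
        xs = [x for x, _ in pins]
        ys = [y for _, y in pins]
        for bx in range(min(xs) // bin_size, max(xs) // bin_size + 1):
            for by in range(min(ys) // bin_size, max(ys) // bin_size + 1):
                demand[(bx, by)] += 1
    return {b: d / tracks_per_bin for b, d in demand.items()}


def hotspots(utilization, threshold=1.0):
    """Bins whose estimated demand exceeds the assumed capacity."""
    return sorted(b for b, u in utilization.items() if u > threshold)


# Hypothetical placement: three nets that all cross the same region.
nets = [
    [(5, 5), (25, 5)],
    [(15, 2), (15, 8)],
    [(12, 1), (18, 9)],
]
util = congestion_map(nets, bin_size=10, tracks_per_bin=2)
print(hotspots(util))
```

A bounding-box model ignores blockages, layer assignment, and detours, so it can only point at regions worth a closer look with the actual place-and-route tool.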

Unexpected congestion issues yield costly consequences. “We encountered a couple of blocks with congestion issues,” says Open-Silicon’s Madraswala. “We had to go back and get a rewrite on the RTL to deal with them.” That step involved another pass through verification, setup, and synthesis for those blocks. Open-Silicon had from the outset, however, cultivated a fast feedback path to the RTL designers at HiSilicon by putting a six-person design team in HiSilicon’s location in China.

Congestion surprises with third-party IP can be worse. For instance, what if the IP vendor lacks the resources to make RTL changes on your schedule or if the congestion is at the pins of a hard-IP block? In the worst-case scenario, the SOC team may have to replace the IP vendor. Having the design partitioned and placed consistently with the power strategy and having an early view of top-level routing have thus become mission-critical issues.

Synthesis and Verification

The design teams at Open-Silicon, Vitesse, and Redpine don’t find synthesis a major problem. Rather, they focus on how to avoid repeated trips through synthesis. “We treated each block of RTL as if it were an independent die,” Madraswala says. “Then we focused on exiting each step in the flow for each block at a high-enough level of quality of results. Maybe as a consequence, after clock insertion, we made only one pass through synthesis.” Open-Silicon used its synthesis tool to automatically insert clock gating. Otherwise, configuration at the architectural level handles the power management in the chip, according to Madraswala. “There are power islands, but, because the power management comes explicitly through the RTL, we didn’t need anything like CPF,” he says. Similarly, the Vitesse design used extensive clock gating, but had only one power-gated block, and Chadra reports no issues with the normal synthesis flow.

Redpine, however, used a more aggressive power-management strategy and pushed the tools harder. That approach had an impact on the design flow (Figure 2). Mattela says that, in principle, if you have correctly organized the RTL and accurately captured your power intent, you should be able to feed the RTL, the UPF, and a power-aware library into synthesis and receive a netlist with all the isolators, level shifters, and controls in place. In reality, though, “you push the button, and it doesn’t happen,” he laments. Everything may look right structurally, but a detailed manual verification with voltage-aware tools may tell a different story.
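One way to picture the structural part of that check is a script that compares what each domain crossing needs against what the netlist actually contains. The sketch below is an illustration only, not Redpine's flow or any tool's API: the class, the function names, and the simplified rules (a level shifter whenever supply voltages differ, isolation whenever the driving domain can be switched off) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    voltage: float    # nominal supply in volts
    switchable: bool  # True if the domain can be power-gated


def required_cells(src, dst):
    """Special cells a signal from src to dst needs under the simplified rules."""
    cells = []
    if src.voltage != dst.voltage:
        cells.append("level_shifter")
    if src.switchable and src != dst:
        # Outputs of a powered-down domain are undefined at the receiver.
        cells.append("isolation")
    return cells


def audit(crossings, inserted):
    """Return (signal, missing cell types) for every under-protected crossing.

    crossings: list of (signal, src_domain, dst_domain)
    inserted:  {signal: set of cell types found in the netlist}
    """
    problems = []
    for signal, src, dst in crossings:
        missing = set(required_cells(src, dst)) - inserted.get(signal, set())
        if missing:
            problems.append((signal, sorted(missing)))
    return problems


# Hypothetical domains and signals.
always_on = Domain("aon", 1.2, False)
radio = Domain("radio", 0.9, True)
crossings = [("rx_valid", radio, always_on), ("cfg_en", always_on, radio)]
inserted = {"rx_valid": {"level_shifter"}, "cfg_en": {"level_shifter"}}
print(audit(crossings, inserted))
```

A check like this only confirms that cells are present. It says nothing about whether they behave correctly at the actual voltages, which is the harder question the designers describe.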

Verification appears to have adapted to the new order more than has synthesis. With growing complexity, functional verification is starting earlier and at a more abstract level. “We followed a coverage-based OVM [open-verification-methodology] approach,” says Vitesse’s Chadra. The process started early in the design with behavioral models of the 24-port switch core and the MIPS CPU core to understand the dynamics of the chip under traffic flows. Verification then continued in increasing detail until the test bench was driving gate-level models with the clock-gating circuitry and the isolators in place. “We had specific targets in our verification plan based on our requirements document,” Chadra says. “We augmented those targets with code-coverage metrics to guide the verification effort.”
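The idea of tracking verification against explicit targets can be illustrated outside any particular methodology. The sketch below is not OVM and is not taken from the Vitesse test bench; it is a hypothetical coverage counter in which each requirement becomes a set of bins, and the bins that no test has reached show up as holes.

```python
class CoverGroup:
    """Counts how often each bin of one coverage target is exercised."""

    def __init__(self, name, bins):
        self.name = name
        self.hits = {b: 0 for b in bins}

    def sample(self, value):
        # Values outside the declared bins are ignored, not counted as hits.
        if value in self.hits:
            self.hits[value] += 1

    def coverage(self):
        """Fraction of bins hit at least once."""
        return sum(1 for n in self.hits.values() if n > 0) / len(self.hits)

    def holes(self):
        """Bins that no test has reached yet."""
        return [b for b, n in self.hits.items() if n == 0]


# Hypothetical target: every one of 24 switch ports must see traffic.
ports = CoverGroup("ingress_port", range(24))

# Stand-in stimulus: a test that drives only the even-numbered ports.
for port in range(0, 24, 2):
    ports.sample(port)

print(f"{ports.name}: {ports.coverage():.0%} covered, holes at {ports.holes()}")
```

The list of holes is what steers the next round of tests, in the same spirit as the targets and code-coverage metrics Chadra describes.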

Redpine’s Mattela says that the company’s DVFS design required special care. Part of the problem was that logic simulators can’t tell you whether a mismatch in signal levels would cause chaos in a path between voltage islands. So Redpine verification engineers resorted to manual techniques, such as forcing nodes to tristate to see what would happen downstream. And part of the problem, Mattela warns, is that you never know the source of the models you are using. “Don’t trust the models in a multivoltage situation,” he states. “You don’t know if they were written by an electrical engineer or by a software person who thinks a one is a one and a zero is a zero.”

The Back-end Flow

You now need to consider the physical-design phase: placement, routing, and design closure. During this phase, the impacts of IP reuse and design complexity begin to wane but do not by any means disappear. And the challenges of advanced processes cast a lengthening shadow over every step. First, the good news: Design managers seem to agree that tools have picked up and automated many of the new tasks that until recently had been manual. Madraswala says that Open-Silicon was able to take advantage of the DFM awareness of IC Compiler to help prepare for the complex design rules the process imposed. “A few years ago, everything about taking a power-managed design to tape-out was manual,” says Mattela. “Now, there have been huge improvements, especially for post-routing validation.”

The forces of change are still imposing problems, however. One is simply that new tasks breed new tools, and new tools are often problematic. “Let’s say that some of the point tools are not mature,” Chadra says. The tools’ capacity is a more widespread issue. “We had to partition the design and run it through the tools in pieces,” he explains. “Fortunately, most of the chip breaks into very natural segments. The biggest challenge was getting the switch through place and route.”

Madraswala also cites place-and-route capacity. “You have a very limited design size when you switch on DFM awareness in IC Compiler,” he says. “We were limited to about 400,000 placeable instances,” yielding a small needle’s eye through which to drive a 100 million-gate design.

Capacity was not the only issue with the place-and-route tools. Modern routers are timing-aware; that is, rather than trying to find the best possible route for every wire, the tools read the design’s timing constraints and try to route to meet the timing on all nets. This process requires that the tool be able to estimate the delay on a proposed route, which in turn requires an estimate of the route’s capacitance. So modern routing tools either call the sign-off extraction tool, which can be disablingly slow, or have built-in “quick-and-dirty” extraction estimators. Unfortunately, even at the 65-nm process node, parasitic extraction is a complex job for which there are no known quick approximations. “There are differences between IC Compiler and reality,” Madraswala says.
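To see how a capacitance error turns into a timing error, consider the textbook Elmore approximation for a driver, a wire, and a load: t = R_drv × (C_wire + C_load) + R_wire × (C_wire / 2 + C_load). The sketch below uses that formula with invented values. The 30% shortfall assigned to the quick estimator is an assumption chosen for illustration, not a measurement of any tool.

```python
def elmore_delay(r_drv, r_wire, c_wire, c_load):
    """Elmore delay in seconds for a driver, a distributed RC wire, and a load."""
    return r_drv * (c_wire + c_load) + r_wire * (c_wire / 2 + c_load)


# Hypothetical net: 1-kilohm driver, 200-ohm wire, 5-fF receiver pin.
R_DRV, R_WIRE, C_LOAD = 1000.0, 200.0, 5e-15

c_signoff = 100e-15  # wire capacitance from full extraction (assumed)
c_quick = 70e-15     # quick estimate that misses 30%, e.g. coupling (assumed)

t_signoff = elmore_delay(R_DRV, R_WIRE, c_signoff, C_LOAD)
t_quick = elmore_delay(R_DRV, R_WIRE, c_quick, C_LOAD)

print(f"sign-off delay: {t_signoff * 1e12:.0f} ps")
print(f"router's view:  {t_quick * 1e12:.0f} ps")
print(f"optimism:       {(t_signoff - t_quick) * 1e12:.0f} ps")
```

With these numbers the router would believe the net is tens of picoseconds faster than sign-off later reports, which is the kind of disagreement that sends a design back through extraction, timing, and rerouting.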

Chadra is no more flattering. “The router capacitance estimates are not very accurate,” he says, without specifying which place-and-route tool he is discussing. “We had the tool make some huge detours, which we had to go back and reroute.”

The problem of timing estimation puts the EDA vendors in a dilemma. If the router’s quick capacitance guesses are bad, physical-system designers will experience iterations between extraction, timing, and rerouting. Runtime and capacity will suffer if the router calls the sign-off extraction and timing tools, which have become complex because they must deal with all the effects of fine geometries.

After these chip designs were complete, both Cadence and Synopsys announced a third potential approach: moving preliminary placement and timing into the synthesis tools—even earlier in the design flow. It’s not that the estimates will get any better but that the tool designers apparently hope to steer the synthesis tools away from creating netlists that the router will misestimate and misroute.

A similar issue exists with routers and design rules. If the router doesn’t keep track of design rules as it works, many violations will emerge in the finished file. Routers thus pick up design rules from the LEF (library-exchange-format) files and check the routes as they go. This process appears to have worked satisfactorily for digital circuits at the 65-nm node. Mentor Graphics’ Madhani warns, though, that LEF can’t express some of the rules, such as pinch rules, in advanced processes. Mentor thus now has its Olympus router dynamically call the Calibre sign-off tool for DRC on the fly. Again, this approach involves performance costs, but slow is better than wrong.

Perhaps surprisingly, after all the front-end work that has gone into them, power domains and third-party IP present issues for back-end design, as well. “Multiple power domains can lead to a complex closure,” says Keh-Ching Huang, director of marketing at ASIC vendor Global Unichip. “We have to use a lot of manual procedures and scripts with them.” Huang says that even IP choices influence the closure flow. “For instance, if a customer uses a low-speed DDR [double-data-rate] interface, the IP block usually comes in soft form and we have to synthesize it. There will be timing-closure issues within the block. But if the customer licenses a high-speed DDR interface, it comes as hard IP, so the whole closure process is different. If there are issues, they will usually be with the package.” Altogether, the impact of a design comprising mainly IP from outside sources on the struggle for design closure remains an underexplored question.

One final point is the impact of the new environment on analog design. Vitesse redesigned its copper PHY for this project, modifying the previous design to reduce power. In the process, the analog designers ran headlong into a number of layout-driven effects that were new in the 65-nm process. “We learned about well-proximity and drain-placement impacts on device performance,” Chadra says. “The device models do an adequate job of modeling the effects, but we still had to do layout-extraction iterations to get the circuits performing as we desired.”

So what is the big picture? Certainly SOC design today requires more upfront planning, especially for dealing with long routes, clocks, and power-management strategies. Verification planning upfront is also vital. Teams should understand that a lot is going on in synthesis tools. This step is no longer a straightforward replacement of Verilog statements by standard cells. Teams should thus plan to minimize iterations through the synthesis tools, especially once delicate structures, such as gated clock trees and test-scan chains, are in place. Likewise, teams should understand that aggressive power management can vastly complicate verification, and that this concern might justify choosing a more organic power-management strategy over a complex one.

Finally, physical design and closure are becoming more difficult. Choose front-end tools or develop scripts to head off congestion problems early. And plan for iterations between routing and the sign-off tools because they probably won’t agree with each other. In basic structure, it’s the same old flow. But the emphasis is shifting. “Probably 60% of the steps in this design were the same old steps,” Madraswala says. “About 30 or 40% were specific to 65 nm, but those were the steps that caused the majority of the issues.”

For more information
Brite Semiconductor
www.britesemi.com
Open-Silicon
www.open-silicon.com
Cadence
www.cadence.com
Redpine Signals
www.redpinesignals.com
Global Unichip
www.globalunichip.com
Synopsys
www.synopsys.com
HiSilicon
www.hisilicon.com
Taiwan Semiconductor Manufacturing Co
www.tsmc.com
Mentor Graphics
www.mentor.com
Virage Logic
www.viragelogic.com/index_en.asp
MIPS
www.mips.com
Vitesse Semiconductor
www.vitesse.com


 
