CONVENED WEDNESDAY July 29, 1:30pm - 3:00pm | Concourse Level

TOPIC AREA: GENERAL INTEREST


SESSION 5U
USER TRACK: Poster Session and Ice Cream Social



The poster session includes approximately 40 posters on topics that span both front-end and back-end design.  The posters will offer an opportunity for personal interaction with EDA tool users from many leading companies. 


Front-End

3-D Visualization of Integrated Circuits in the Electric™ VLSI Design System
Steven Rubin, Gilda Garreton - Sun Microsystems, Inc., Menlo Park, CA

The Electric VLSI Design System, an open source VLSI CAD package, was enhanced with the ability to view integrated circuits in three dimensions, offering designers advanced visualization capabilities not available in commercial tools. The 3-D viewer has been used in the design of specialized structures whose 3-D properties are important, for example, transmitter arrays for proximity-communication chips. The 3-D viewer is also valuable when designing parts of a chip that are intended to be accessed externally, such as pads and fuses for post-fabrication circuit modifications.

The 3-D viewer has been used in the design of dense circuitry which, although it can be analyzed by standard inductance tools, can more quickly be debugged simply by examining it from all angles. Finally, the 3-D viewer is being brought to classrooms to facilitate the understanding of layout principles. This poster displays Electric’s 3-D visualization facility and some chips that have benefited from it.

Automatic Generation, Execution and Performance Monitoring of a Family of Multiprocessors on Large Scale Emulator
Xinyu Li, Omar Hammami - ENSTA, Paris, France
Ludovic Larzul - EVE, Palaiseau, France

Multiprocessor system-on-chip design requires high design productivity. To achieve this goal, we propose an automatic flow for multiprocessor generation, execution on a large-scale emulator, and performance monitoring. We report results based on OCP benchmarks for a 672-processor NOC-based multiprocessor.

C-Based Hardware Design Using AutoPilot™ Synthesizing MPEG-4 Decoder onto Xilinx FPGA
Jason Cong - Univ. of California, Los Angeles, CA
Zhiru Zhang - AutoESL Design Technologies, Inc., Los Angeles, CA
Yi Zou - Univ. of California, Los Angeles, CA

We share our experience in using a state-of-the-art Electronic System-Level (ESL) tool, AutoPilot, to synthesize the algorithmic description of the MPEG-4 decoder, specified in C, into a hardware implementation on a Xilinx FPGA. We evaluate the tool's capability for both performance optimization and area optimization.

Our experience shows that most of the optimizations can be automated by various features inside the tool, while some optimizations still require code refinements and insight into hardware design. We are able to obtain a design that meets the desired performance target, while the area of most functional blocks is smaller than that of the corresponding modules in the manual design.

C-Based High-Level Synthesis of a Signal Processing Unit Using Mentor Graphics Catapult C
Axel Braun, Tobias Oppold, Joachim Gerlach - Univ. of Tübingen, Tuebingen, Germany
Holger Janssen - Robert Bosch GmbH, Hildesheim, Germany
Wolfgang Rosenstiel - Univ. of Tübingen, Tuebingen, Germany

Today's embedded systems have to fulfill a variety of different requirements. Starting with real-time constraints, these systems have to meet requirements for safety, performance, energy dissipation, and chip area. Additionally, economic constraints such as time-to-market, product flexibility, and cost have to be met. In modern IP-based design flows, high-level synthesis plays an important role in the design of application-specific functional IP blocks. High-level synthesis enables a fast path from an algorithmic specification down to a hardware implementation of an IP block. High-level synthesis also allows efficient optimization of the design.

This poster describes a C/C++-based high-level synthesis flow based on Mentor Graphics' high-level synthesis tool Catapult C. The example application for this project comes from the video signal processing domain. The final product has very strict timing requirements in terms of throughput and latency. The target market is a mass market with very strong cost constraints and a high demand for flexibility. The poster shows an evaluation of the entire C/C++-based hardware synthesis design flow, including the RTL synthesis, logic synthesis, and place and route stages, for this application-specific IP block. It presents the example application and its constraints, which come from a real-life industrial product development. The poster gives insights into the prerequisites for the C/C++ code, the application-specific challenges, and the modifications necessary for hardware synthesis. Additionally, the optimization process used to meet the design constraints is presented in detail.

Design and Verification Challenges of ODC-Based Clock Gating
Chaiyasit Manovit, Sridhar Narayanan, Sridhar Subramanian - PwrLite, Inc., Santa Clara, CA

Power consumption has clearly become a major constraint in chip designs. Among several techniques for optimizing power of digital designs, clock gating has been proven successful in reducing the active power while maintaining the functionality and performance level. In particular, clock gating based on observability don't care (ODC) conditions is an effective way to reduce energy dissipation because it can reduce switching activity on both clock and data paths of flops whose value changes are not observed. However, some issues inherently associated with the ODC-based clock gating can affect the correctness of both design and verification. We will present solutions and guidelines for handling these issues.
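
As a rough illustration of the ODC idea described above (not the authors' method or tool), the following Python sketch models a flop whose value is observed at the output only when a registered select signal is 1. The circuit, the function `simulate`, and all signal names are hypothetical assumptions made for this example. The gated version clocks the flop only in cycles where its captured value will be observed, and the check confirms that the observable output is unchanged.

```python
import random

def simulate(a_vals, b_vals, sel_vals, gated):
    """Cycle-based model of a hypothetical circuit: out = reg_a if sel_d else reg_b.

    reg_a, reg_b and sel_d are flops. With gated=True, reg_a is clocked only
    in cycles where sel is 1, i.e. only when the value it captures will be
    observed at the output in the following cycle (an ODC-style enable).
    """
    reg_a = reg_b = sel_d = 0
    outputs, reg_a_clocks = [], 0
    for a, b, sel in zip(a_vals, b_vals, sel_vals):
        # Combinational output computed from the current flop state.
        outputs.append(reg_a if sel_d else reg_b)
        # Clock edge: reg_a captures only if its clock is enabled.
        if not gated or sel:
            reg_a = a
            reg_a_clocks += 1
        reg_b = b
        sel_d = sel
    return outputs, reg_a_clocks

random.seed(0)
n = 1000
a = [random.randrange(256) for _ in range(n)]
b = [random.randrange(256) for _ in range(n)]
sel = [1 if random.random() < 0.25 else 0 for _ in range(n)]

ref_out, ref_clocks = simulate(a, b, sel, gated=False)
odc_out, odc_clocks = simulate(a, b, sel, gated=True)

# The gated design must be indistinguishable at the observable output.
assert ref_out == odc_out
print("reg_a clock edges without gating:", ref_clocks)
print("reg_a clock edges with ODC gating:", odc_clocks)
```

Note that the enable in this toy model depends on `sel` being available one cycle before the value is observed; real designs need to establish that kind of relationship explicitly, which is one reason verification of such gating is not trivial.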

Effective Debugging Chip-Multiprocessor Design in Acceleration and Emulation
Yunji Chen - Chinese Academy, Beijing, China

More transistors on a single die make multicores possible, but they also bring great challenges to functional verification. For example, the 16-core Godson-3 Chip-Multiprocessor (CMP) contains more than 1 billion transistors. Debugging such a huge design is time-consuming and therefore becomes a potential time-to-market bottleneck. Acceleration and emulation can facilitate debugging through relatively high simulation speed. However, debugging requires not only tools but also human insight, and it is still difficult to catch and localize bugs even in acceleration and emulation. In this paper we share some experiences of debugging a CMP design in acceleration and emulation, including well-chosen assertions, dynamic recording of specific debug messages, and multi-level dumping. These techniques help to debug a CMP design effectively while incurring a low cost in performance and capacity on the accelerator or emulator. With these techniques, many bugs in the Godson-3 CMP were found or solved on the Xtreme III accelerator/emulator.

Enabling IP Quality Closure at STMicroelectronics with VIP Lane
Olivier Florent - STMicroelectronics, Grenoble, France
Stéphane Bonniol - Satin IP Technologies, Montpellier, France

With SOCs comprising dozens of IP blocks, the key to smooth integration and predictable design schedules is the quality of IP deliveries. To help improve IP quality, design standards and quality metrics have usually been defined within companies. Many different CAD tools can now be used or even customized to implement those standards and to produce various metrics at different stages of the IP development cycle. However, with the growing complexity of current IPs added to the high complexity of sub-micron design flows, the quantity of data to be monitored is becoming immense, causing analysis to consume more of the design schedule than design itself. There is therefore a clear need for an infrastructure able to automatically extract the right information at the right time. This helps IP providers to efficiently monitor the quality of their deliveries, and should also give SOC integrators a better overview of the quality of what they receive.

This poster describes how the Home Entertainment & Display (HED) group of STMicroelectronics has been using Satin IP’s VIP Lane to improve IP quality. VIP Lane is used to extract key data from the HED IP design flow, to monitor the IP quality during its development cycle, to generate IP integration documents at delivery time, and to consolidate quality metrics at SOC level from the constituent IP blocks. Thanks to this approach, IP quality in HED is improving, and time to integration is being reduced.

Formal Verification Based Automated Approaches to System-On-Chip DFT Logic Verification
Subir Roy, Rubin Parekhji - Texas Instruments, Inc., Bangalore, India

An exceedingly important phase that rarely features as a prominent front-end task in the design of a system-on-chip (SOC) is the integration of DFT logic and the verification of its integration with other sub-systems and IPs in the SOC. This constitutes a significant portion of the overall design and verification effort. Any savings in this component help to reduce the overall chip design cost. Automation of these tasks is the key to realizing this cost reduction. A key enabling factor for this automation is the predominantly canonical and regular nature of the structures and behavior of most DFT IPs, leading to the kind of convergence presently seen towards standardized configurable DFT logic architectures, which are amenable to being tool generated. The key motivation of our approach has been to automate integration verification of IPs and DFT logic towards: 1) cycle time reduction by a factor of two in the DFT logic verification task by minimizing the use of simulation-based chip-level verification, 2) improvement in silicon quality by eliminating all bugs related to DFT logic and its SOC integration, and 3) deployment of DFT logic generation, its integration in the SOC, and its verification through a common infrastructure to facilitate re-use of these tasks across different SOC designs. One of the key contributions to the automation of the DFT logic verification task has been the deployment of formal verification techniques. To leverage the capabilities of formal verification in the context of auto-generated configurable modules, it is essential that the formal properties themselves be configurable and auto-generated, along with the formal verification environment. This enables high re-usability of the properties developed during the tactical formal verification of each module present in the DFT logic subsystem.

In this paper we give the justification for taking this approach, show how it has been achieved at Texas Instruments, and present the benefits that have been observed on several SOCs.

Interactive Code Optimization for Dynamically Reconfigurable Architecture
Kenji Funaoka, Mayuko Koezuka, Akira Kuroda, Hidenori Matsuzaki, Takashi Yoshikawa, Shigehiro Asano - Toshiba Corp., Kanagawa, Japan

Dynamically reconfigurable architecture is attracting increasing attention thanks to the balance it strikes among performance, power consumption, and flexibility. Toshiba Corporation has developed a dynamically reconfigurable architecture named FlexSword, which employs a heterogeneous design to fully draw out the advantages of dynamically reconfigurable architecture. Despite its high performance, the heterogeneous design requires a great deal of trial and error on the part of software developers who try to write assembly code directly. Although a compiler has been developed with a view to achieving completely automated compilation, there are still many obstacles to achieving perfect optimization.

This poster presents the collaboration between a compiler and an interactive development tool to help optimize program code in a short period. Developers can intuitively improve the intermediate compilation code with the interactive development tool. Case studies show the optimization flows of an image filter and a part of an H.264 decoder. With the interactive tool, their performance improves by 15.8% and 50.0%, respectively.

Power Gated Design Optimization and Analysis with Silicon Correlation Results
Lee Kee Yong, Fern Nee Tan, Sze Geat Pang - Intel Corp., Penang, Malaysia

High gate and sub-threshold leakage current, and its impact on standby performance for sub-90nm designs, has been one of the challenges confronting low power IC designers. Increased device current drive strength and higher device placement densities require IC designers to be aware of the performance-to-power tradeoff they need to make at every stage of the design cycle, from the architectural level to physical implementation. With the focus on mobile computing and other low-power applications, various creative techniques are used to extend the useful life of a battery.

Conventional techniques of reducing a design’s dynamic/switching power by clock gating unused clock trees do not yield the necessary control over standby or leakage power [1]. Hence, to extend battery performance and to offset higher power consumption, power gating techniques have to be used. In this technique, power supplies for selected areas of the circuit are powered off by switches or sleep transistors to reduce leakage during the off-state or standby mode. This technique is also known as “MTCMOS” in synthesized logic design.

A new methodology for simulating a power gated design is presented. Various aspects of analyzing and verifying a power gated design, from the initial planning and cell definition stage to the sign-off stage, are outlined. Finally, silicon correlation results qualifying the simulation results are presented.

SystemC: A Complete Digital System Modeling Language: A Case Study
Reni John - Rambus, Inc., Los Altos, CA

Transforming Simulators into Implementations
Nikhil Patil, Derek Chiou - Univ. of Texas, Austin, TX

Designing a next-generation computer system is a complex endeavor involving a huge investment of human and monetary resources. The development begins with architectural exploration, in which computer architects write performance models for the system they envision. Design decisions are made after extensive simulation, and the architecture is passed along to the RTL and Verification teams in the form of an extremely detailed document. Although the architectural simulator contains most of the information needed by the teams downstream, they _cannot_ use the simulator to aid their development effort.

FAST simulators are a novel simulation technology that can produce full-system simulators that are simultaneously fast and accurate; the timing and power models of such simulators are implemented on an FPGA. It is possible to make a FAST simulator accurate enough that it contains all the information needed to create an implementation. This information is encoded in synthesizable HDL and is therefore a good candidate for an automatic transform into an implementation. We mention the potential applications of such technology and outline the changes required to the existing FAST simulation infrastructure to make such a transformation feasible.

Using Algorithmic Test Generation in a Constrained Random Test Environment
Håkan Askdal - EAB (Ericsson AB), Kista, Sweden

Getting sufficient Functional Coverage in a timely manner is a challenging task in ASIC/SOC verification. For the last decade, logic simulation using Constrained Random Stimuli Generation (CRSG) has been considered the state-of-the-art methodology. However, caught in the squeeze between ever-increasing complexity and shrinking timelines, verification engineers are finding it more and more difficult to reach their verification goals within the given timeframes. Algorithmic Test Stimuli Generation (ATSG) is a recent technology that promises to significantly shorten the time to closure. Using a rule-based approach to completely describe a protocol, a tool can then generate an exhaustive set of tests using a minimum number of patterns. ATSG can potentially be orders of magnitude more efficient than CRSG in terms of the number of simulation cycles required to reach a full coverage goal.

But introducing new technology into an existing verification environment is not trivial. In our case, major investment has been made over the years in developing complex CRSG verification environments based on the hardware verification language ‘e’. The investment in ‘e’ test bench functions such as ‘e’-Verification Components (eVCs), sequence drivers, scoreboards, checkers and functional coverage represents many man-years of effort. It was therefore a critical requirement that as much as possible of the existing verification infrastructure could be reused while introducing ATSG techniques to make the verification more efficient. In this poster presentation we describe how we proceeded to develop a simulation methodology based around ATSG in a CRSG test bench environment.

Visualizing Debugging Using Transaction Explorer in SOC System Verification
Alicia Strang, Robert Carden IV - Marvell Semiconductor, Inc., Aliso Viejo, CA

The traditional debug process requires viewing large swaths of waveforms at one time. As semiconductor designs continuously grow in complexity, using a waveform viewer to debug an SOC system is like examining a forest from an ant’s view: it becomes impractical. To move us from an ant’s view to an eagle’s view of the forest, data flow and bus transactions must be presented in a way that conveys the big picture of a system without unnecessarily cluttering the debug environment. Cadence Transaction Explorer (TXE) enables advanced viewing, navigation and organization of waveforms. This lets us view a complex SOC system from a higher vantage point. The debug process therefore takes place at a more abstract level, enabling us to debug complex semiconductor systems more effectively.

This paper highlights Transaction Explorer usage and benefits by citing examples from transaction recording technologies, and provides innovative solutions. Examples include design visualization environments that employ a hierarchy of transactions, process tracing, advanced packet bundling and exploding packet views to foster visual debugging in SOC systems. The paper gives a complete example showing step by step how Transaction Explorer is used in our SystemVerilog environment. We also point out some engineering challenges and issues that require special attention. The paper provides a simple, incremental, and achievable methodology for using Transaction Explorer, a tool that we used to perform system-level debugging and system performance analysis.

Back-End

A Generic Clock Domain Crossing Verification Flow
Tayeb Bouguerba - Advanced Micro Devices, Inc., Markham, ON, Canada
Roger Sabbagh - Mentor Graphics Corp., Ottawa, Ontario, Canada

The implementation of clock domain crossing (CDC) verification on the current generation of complex, deep sub-micron (DSM) designs has become a critical step in the design process, and, in general, design teams have adopted automated RTL CDC verification tools to address this requirement.

In this poster, we present the results of our experience developing and using a generic CDC verification flow. The flow provides a common interface for all members of AMD design teams to run CDC verification on their design units, thus providing a standardized approach to CDC verification across the corporation. The CDC verification flow supports a wide variety of requirements that stem from varying CDC design styles across large project teams.

A Simple Design Rule Check for DP Decomposition
Chih-Hsien Tang, Kuen-Yu Tsai - National Taiwan Univ., Taipei, Taiwan

Double patterning (DP) is a promising lithography solution for the ITRS 45-nm half-pitch node and beyond. A major design issue in implementing DP is the lack of conflict-free decomposition algorithms for existing layout patterns. To make full-chip DP design achievable, a DP-aware design rule check to guide the physical designer is required.

The current design flow is followed, except that the design rule check (DRC) is modified to detect DP decomposition conflict patterns before tape out. We define two additional parameters from the DP process to build a DP conflict design rule. DP conflict errors can be fixed by pattern shifting, which is determined by the DP parameters, similar to the traditional way of fixing DRC errors.

For demonstration, we use Calibre DRC with PSMgate to implement a check-split-check algorithm with a preliminary conflict check, a preliminary DP decomposition, and a final spacing check on the two decomposed mask patterns, to ensure a DP-conflict-free layout. We define a set of parameters for a trial check on a 65nm metal layer. All conflicts can be easily fixed by pattern shifting.
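
To make the check-split-check idea concrete, here is a minimal Python sketch under stated assumptions. It is not the Calibre/PSMgate implementation and it does not use the authors' two DP parameters, which the abstract does not specify. Instead it assumes a single hypothetical threshold, `min_same_mask_space`, treats shapes as axis-aligned rectangles, and uses a standard technique: build a conflict graph, two-color it, then re-check spacing on each mask.

```python
from collections import deque

def spacing(r1, r2):
    """Edge-to-edge distance between two rectangles given as (x1, y1, x2, y2)."""
    dx = max(r1[0] - r2[2], r2[0] - r1[2], 0)
    dy = max(r1[1] - r2[3], r2[1] - r1[3], 0)
    return (dx * dx + dy * dy) ** 0.5

def check_split_check(rects, min_same_mask_space):
    """Return (mask assignment per rectangle, list of same-mask conflicts)."""
    n = len(rects)
    # Check: find every pair too close to be printed on the same mask.
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if spacing(rects[i], rects[j]) < min_same_mask_space:
                adj[i].append(j)
                adj[j].append(i)
    # Split: breadth-first two-coloring of the conflict graph.
    mask = [None] * n
    for start in range(n):
        if mask[start] is not None:
            continue
        mask[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if mask[v] is None:
                    mask[v] = 1 - mask[u]
                    queue.append(v)
    # Check: any close pair that ended up on the same mask is a DP conflict.
    conflicts = [(i, j) for i in range(n) for j in adj[i]
                 if i < j and mask[i] == mask[j]]
    return mask, conflicts

# Three mutually close shapes form an odd cycle, which cannot be two-colored.
layout = [(0, 0, 10, 10), (15, 0, 25, 10), (7, 15, 17, 25)]
masks, conflicts = check_split_check(layout, min_same_mask_space=10)

# Shifting the third shape away by 5 units removes the conflict.
shifted = [(0, 0, 10, 10), (15, 0, 25, 10), (7, 20, 17, 30)]
masks_fixed, conflicts_fixed = check_split_check(shifted, min_same_mask_space=10)
print(conflicts, conflicts_fixed)
```

The second call mirrors the pattern-shifting fix mentioned above: moving one shape far enough breaks the odd cycle, so the layout decomposes cleanly onto two masks.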

Algorithm for Analyzing Timing Hot-Spots
Bhargav Joshi - Einfochips, Inc., Ahmedabad, India

Resolving timing violations one by one requires huge effort and time to converge on an optimum solution. The algorithm discussed here can save this effort and time by reporting the starting points of multiple timing violations. The cells in a design that cannot meet the stage delay budget assigned to them are the timing hot-spots of the design. The algorithm separates out all such cells, together with their respective cost of failure. A prerequisite for the algorithm is a design whose timing has been updated by the timer.

Fixing one such reported node in the worst corner will resolve multiple violations. The same sort of reporting can be done with the algorithm in the best corner. Area and power can be saved by adding a buffer on the reported failing best-corner node instead of adding data path buffers on each failing best-corner node.
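
The abstract does not define how stage budgets or the cost of failure are computed, so the following Python sketch is only one possible reading. It assumes that each cell has a known stage delay and budget, that a path violates when its total delay exceeds the clock period, and that a cell's cost of failure is the number of violating paths passing through it. The function `timing_hot_spots` and all cell names are hypothetical.

```python
from collections import Counter

def timing_hot_spots(paths, stage_delay, stage_budget, clock_period):
    """Rank over-budget cells by how many violating paths they sit on.

    paths: list of paths, each a list of cell names.
    stage_delay / stage_budget: dicts mapping cell name to delay / budget.
    Returns a list of (cell, cost) sorted by descending cost, then name.
    """
    cost = Counter()
    for path in paths:
        if sum(stage_delay[cell] for cell in path) <= clock_period:
            continue  # this path meets timing
        for cell in set(path):
            if stage_delay[cell] > stage_budget[cell]:
                cost[cell] += 1  # assumed cost: one unit per violating path
    return sorted(cost.items(), key=lambda item: (-item[1], item[0]))

delays = {"u1": 0.9, "u2": 0.3, "u3": 0.3, "u4": 0.4}
budgets = {"u1": 0.5, "u2": 0.5, "u3": 0.5, "u4": 0.5}
paths = [["u1", "u2"], ["u1", "u3"], ["u1", "u4"], ["u2", "u3"]]

hot_spots = timing_hot_spots(paths, delays, budgets, clock_period=1.0)
print(hot_spots)
```

In this toy data set, three paths violate timing and all three share the over-budget cell `u1`, so a single fix at that cell addresses all of them, which is the effect described above.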

An On-Chip Variation Monitor Methodology Using Cell-Based P&R Flow
Yu-Ting Hung, Yu-Wen Tsai - Faraday Technology Corp., Hsinchu, Taiwan

Process variation accounts for deviations in the semiconductor fabrication process. This variation has become unavoidable and serious in the deep submicron era due to limited process resolution. In addition, supply power and operating temperature are also possible sources of on-chip variation. Usually these on-chip variations are modeled as a percentage variation in the circuit performance estimation. The ring oscillator is the most widely used performance meter in on-chip variation monitors. In the past, it would take one or two weeks to implement the process monitor layout through a full-custom layout service for each process technology. After that, the monitor cells had to be placed into the chip manually, which always increased the implementation turnaround time.

In this paper, a practical integrated on-chip variation monitor methodology is proposed, including variation monitor implementation in a cell-based P&R flow and a flexible cell-based integration flow at the logic and physical levels to reduce the implementation effort. The applications of the variation monitor are as follows. By spreading variation monitors over the chip, intra-die variation can be measured and fed back as a reference in STA (static timing analysis) for later designs in the same process. The monitor can also be used as an inter-die (lot-to-lot) variation monitor to check the consistency of the process if any abnormal yield occurs. The measured frequency can also be used as an index for setting a proper IDDQ testing threshold per die/chip (die-to-die).

Application and Extraction of IC Package Electrical Models for Support of Multi-Domain Power and Signal Integrity Analysis
Om Mandhana, Jon Burnett - Freescale Semiconductor, Inc., Austin, TX
Sam Chitwood, Brad Brim - Sigrity, Inc., Santa Clara, CA

The focus of this paper is to describe how the attributes of IC package electrical models are dictated by high-level concepts such as bandwidth, the switching speed of the core logic circuits and high-speed I/Os, the types of PI-SI analyses to be performed, and compatibility with the resolution and fidelity of chip and board models. In particular, the paper discusses how the various package electrical models depend on the extraction frequency or range of extraction frequencies, which in turn is related to the edge rate of the switching signals. The strengths and weaknesses of various types of extracted package electrical models are described relative to common system-level PI-SI analysis requirements. A method to estimate the upper bound of the extraction frequency, and the effect of applying an extraction frequency above or below this bound on the accuracy of the extracted R, L and C parameters of the package model, is discussed using commercial IC device packages as examples. Considering different types of package electrical models (single-section lumped RLC model, multi-section distributed RLC model, SPICE-compatible behavioral models and S-parameter model) of these commercial device packages, the bandwidth validity of the extracted electrical models will be shown in relation to the resonance frequency of the package planes and the return current of the signals. The final focus of the paper will be the application of these package electrical models to design-oriented noise performance evaluation of high-speed packages, in terms of the computational efficiency and accuracy of the PI-SI simulation and analyses. The target audience for this paper includes both IC designers and IC package designers.

Applications of Platform Explorer, Integrator and Verifier in SOC Designs
Byeong Min, Kwang-Hyun Cho, Jaebeom Kim, Chi-Ho Cha, Junhyung Um, Euibong Jung, Sik Kim, Kyu-Myung Choi - Samsung, Yongin City, Republic of Korea

This poster introduces a platform-based SOC design flow and methodology for SOC platform exploration, integration and verification. These three components make up a Reusable Platform Design Methodology (RPDM), which increases efficiency in designing SOCs. Platform Explorer makes it easy to start architecture exploration, in which designers analyze various types of designs to find the best implementation of a specification. The features implemented in Platform Integrator have been shown to fully exploit IP reuse and the integration of platforms based on the SPIRIT standard. In particular, the design flow automation for IP packaging and platform integration has proven beneficial in the design of SOC platforms. Lastly, Platform Verifier automatically generates a verification environment based on the IP-XACT information of the integrated platform. Application results show that, compared to the traditional methodology, the RPDM reduced platform design time by 30% for the mother version and by more than 50% for derivative platforms.

Assertion Based Formal Verification in SOC Level
Jentil Jose, Varun Nair - Wipro Technologies, Cochin, India

Verification teams have identified the potential of Assertion Based Formal techniques in IP-level verification. Tools like the Incisive Formal Verifier from Cadence are very useful for identifying even corner-case bugs very early in the design cycle without complex test bench development effort. However, when it comes to SOC-level verification, verification teams still rely entirely on simulation-based verification techniques.

We have identified certain verification hot spots in SOCs that are ideal for Assertion Based Formal Verification. Functionalities such as programmable IO pull conditions, reset conditions, multiplexer logic, and inter-block hookups are well suited to Assertion Based Formal Verification at the SOC level as well.

Most SOC development teams have experienced talent in simulation-based verification, whereas experience in formal tool based verification is limited. This paper attempts to help SOC verification teams understand the real power of Assertion Based Formal Verification at the SOC level.

It also explains practical techniques that can be applied to integrate Assertion Based Formal Verification into the SOC verification flow. These techniques help teams in speeding up assertions, avoiding the explore depth problem, managing huge numbers of assertions, and debugging assertions.

It is evident from our experience that adopting Formal ABV at the SOC level gives a good return on investment. The methodology has been used in six designs in the same SOC family. Verification teams were able to capture several design issues early in the design cycle.

Attacking Constraint Complexity in Verification IP Reuse
Ben Chen - Cisco Systems, Inc., San Jose, CA
Harish Krishnamoorthy - Cisco Systems, Inc., San Jose, CA
Srinath Atluri - Cisco Systems, Inc., San Jose, CA
Alex Wakefield, Balamurugan Veluchamy - Synopsys, Inc., Mountain View, CA

As chip designs become larger and more complex, verification engineers are expanding constrained-random testing to meet the validation demand. The size and complexity of constraint problems are growing, resulting in performance and capacity issues. This paper discusses the key challenges verification engineers face when writing constraints: how to achieve test goals, how to optimize constraints for performance, and how to manage the interaction and code complexity. We use two case studies from the network domain to illustrate these issues and discuss solutions.

Automated Assertion Checking in Static Timing with IBM ASICs
Nathan Buck - IBM Corp., Underhill, VT
William Rose - IBM Corp., Research Triangle Park, NC

Timing closure is the most time consuming portion of the design schedule that the ASIC vendor must achieve. The customer provides timing assertions, also called timing constraints, and the ASIC vendor drives their Physical Design and Static Timing tools with those assertions. Determining the correctness of those assertions is a challenging and time consuming task. If an assertion is missing or not correct then there is risk that the hardware will not work. Multiple approaches are used by most ASIC design teams to ensure assertion correctness, and allocating the time necessary to do the work sufficiently is critical to successfully meeting schedule with working hardware. These approaches include:

- Tools to evaluate clock domain crossings
- Back-annotated (SDF) simulations with parasitic data
- Manual assertion reviews
- Automated assertion checking

This presentation will briefly describe the first three approaches and will detail the automated assertion checking that was created by IBM ASICs, using the EinsTimer static timing analyzer, to identify common assertion errors.

Case Study of Diagnosing Compound Hold-Time Violations
Shuo-Fen Kuo, Rei-Lung Chen, Jih-Nung Lee, Chi-Feng Wu - Realtek Semiconductor Corp., Hsinchu, Taiwan
Dragon Hsu, Ting-Pu Tai - Mentor Graphics Corp., Hsinchu, Taiwan
Yu Huang - Mentor Graphics Corp., Marlboro, MA
Wu-Tung Cheng - Mentor Graphics Corp., Wilsonville, OR

For scan-based tests, hold-time violations can happen during scan chain shift operations, during the capture cycles of some patterns, or both. These three scenarios are called scan chain hold-time violations, system logic hold-time violations and compound hold-time violations, respectively.

Previously published hold-time fault diagnosis models focus on either the scan chain or the system logic. In reality, however, compound hold-time defects may exist. To diagnose a compound defect, the current chain diagnosis technology used in YieldAssist can be extended with a two-pass simulation. In the first pass, we mask cell N when injecting a hold fault at cell N. All failing bits caused by system logic hold-time violations at cell N therefore become Xs and do not impact diagnosis resolution. In the second pass, we do not mask cell N, and we use a scoring system to rank the candidate cells based on the number of unexplained failing bits that are possibly caused by system hold-time violations. In this paper we share our experience in diagnosing compound hold-time violations with a case study.

Design Profiling – Modeling the ASIC Design Process
Tom Guzowski - IBM Corp., Essex Jct., VT
Authors:

Designing complex chips at or below 65nm inflicts a new class of infrastructure challenges on today’s engineers. Juggling IT resources, crafting eclectic workflows, and reconciling design environments vie for a disturbing proportion of a designer’s attention along with the more traditional tasks like synthesis, verification, and layout.

This session describes the methods and tools used by the IBM ASIC design centers and internal design groups to cope with these infrastructure hurdles. It focuses on the creation of an empirical model of the design process, deriving representative metrics from that model, and driving technical and business decisions with those metrics to affect quality, turnaround time, schedule, and efficiency.

Some of the metrics discussed include CPU/wall time, memory usage, workflow iteration, and tool/methodology usage.

Enhanced SDC Support for Relative Timing Designs
Eric Quist, Peter Beerel - Univ. of Southern California, Los Angeles, CA
Kenneth Stevens - Univ. of Utah, Salt Lake City, UT
Authors:

Despite their many advantages, non-standard circuit families and templates are often limited to full-custom design houses because commercially supported CAD tools for their design are lacking. One of the biggest problems is the lack of support for the relative-timing-based assumptions and complex performance constraints that often exist in non-standard circuit templates. The industry-standard Synopsys Design Constraints (SDC) language supports several powerful mechanisms for specifying timing constraints (e.g., set_max_delay, set_data_check), but generating these constraints for non-standard templates is often manual, cumbersome, and error-prone. This poster presents an enhanced SDC language that makes specifying complex timing constraints for template-based designs significantly easier and more reliable. In addition, to support this enhanced SDC language, we present a translation tool that converts the enhanced SDC constraints to industry-standard SDC constraints, thus enabling the continued use of existing tool flows.
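As an illustration only, the Python sketch below shows the general idea of translating a higher-level relative-timing statement into a standard SDC command. The `RelativeTimingConstraint` record, the pin names, and the choice of `set_data_check` with `-from`, `-to`, and `-setup` as the target are assumptions made for this example; the enhanced SDC syntax and the translation tool described in the poster are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelativeTimingConstraint:
    """Hypothetical record: 'early_pin' must switch at least
    'margin_ns' before 'late_pin' does."""
    early_pin: str
    late_pin: str
    margin_ns: float


def to_sdc(constraint: RelativeTimingConstraint) -> str:
    """Emit one standard SDC data-to-data check for the constraint.

    Assumption: the late signal is treated as the related (reference) pin
    and the early signal as the constrained pin of a setup-style check.
    """
    return (
        f"set_data_check -from [get_pins {constraint.late_pin}] "
        f"-to [get_pins {constraint.early_pin}] "
        f"-setup {constraint.margin_ns:g}"
    )


if __name__ == "__main__":
    # Hypothetical pin names, for illustration only.
    rt = RelativeTimingConstraint("ctl/en", "ctl/ack", 0.05)
    print(to_sdc(rt))
```

Generating the command text from a small record keeps the relative ordering (which signal must win the race) in one place, which is where manual constraint writing tends to go wrong.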

Hold Time ECO for Hierarchical Design
Albert Li - Global Unichip Corp., Hsinchu, Taiwan
J.J. Hsiao - Dorado Design Automation, Inc., Hsinchu, Taiwan
Authors:

Three major sources of panic in today’s hold-time fixing are the number of corners, the number of modes, and the number of sub-design blocks. The increasing number of corners and modes raises the possibility of cross-mode setup-hold conflicts, while the increasing number of sub-design blocks results in more cross-partition timing issues. Either kind of issue will require a large number of ECOs (Engineering Change Orders) if it is not addressed elsewhere in the design flow.

The proposed “Hierarchical Hold Time ECO Flow” was developed to address hold-time issues in a large design that is partitioned into blocks and analyzed under MMMC (Multi-Mode Multi-Corner) conditions. The key benefits of the flow are:

  1. Large capacity
     - Handles more than 40 scenarios of data for multi-scenario hold fixing.
  2. Straightforward and fast
     - No STA, no complicated clock; multi-mode data are processed as a single mode.
     - Processes whole-design data but can output an ECO result for each partition in a hierarchical design.
  3. Existing timing is well maintained
     - Timing fixes are applied only to the ECO path domain.
     - No impact on setup time during hold fixing, because Timing Window Files are adopted.
  4. Correlation issues are minimized
     - Fully utilizes sign-off data (SPEF, SDF).
     - Everything is physical-aware.

On a real tape-out (65nm, 5.2M instances) with eight sub-design blocks, the flow processed 38 STA scenarios at once and completed hold-time fixing within four hours.
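The idea of treating multi-scenario data as a single mode while protecting setup timing can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the flow described in the poster: the data layout, the function names, and the rule of capping inserted delay at the worst setup slack are all hypothetical.

```python
def merge_scenarios(scenarios):
    """Collapse per-scenario slacks into one worst-case view per endpoint.

    scenarios: {scenario_name: {endpoint: (setup_slack_ns, hold_slack_ns)}}
    Returns:   {endpoint: (worst_setup_slack_ns, worst_hold_slack_ns)}
    """
    merged = {}
    for per_endpoint in scenarios.values():
        for endpoint, (setup, hold) in per_endpoint.items():
            if endpoint in merged:
                worst_setup, worst_hold = merged[endpoint]
                merged[endpoint] = (min(worst_setup, setup), min(worst_hold, hold))
            else:
                merged[endpoint] = (setup, hold)
    return merged


def plan_hold_padding(merged):
    """Return the delay (ns) to insert at each endpoint with a hold violation.

    Assumption: the inserted delay is capped at the worst setup slack so that
    the hold fix does not create a setup violation in any scenario.
    """
    plan = {}
    for endpoint, (setup, hold) in merged.items():
        if hold >= 0:
            continue  # no hold violation in any scenario
        needed = -hold
        budget = max(setup, 0.0)
        plan[endpoint] = min(needed, budget)
    return plan


if __name__ == "__main__":
    # Hypothetical scenario names, endpoints, and slack values.
    scenarios = {
        "func_fast": {"u1/D": (0.5, -0.25), "u2/D": (0.125, 0.1)},
        "scan_slow": {"u1/D": (0.75, -0.125), "u2/D": (0.25, -0.5)},
    }
    print(plan_hold_padding(merge_scenarios(scenarios)))
```

An endpoint whose planned padding is smaller than the hold violation (here `u2/D`) signals a cross-mode setup-hold conflict that padding alone cannot resolve.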

Improving the Automation of the System in Package (SIP) Design Environment via a Standard and Open Data Format
Alain Caron - IBM Corp., Burlington, VT
Thomas Brandtner - Infineon Technologies AG, Villach, Austria
Nebojša Nenadović - NXP Semiconductors, Nijmegen, The Netherlands
Authors:

This poster was developed by members of the Silicon Integration Initiative’s (Si2) Systems-in-Package Working Group from IBM, NXP, Infineon, Intel, and LSI. It proposes and details the development of a standardized data format and interfaces, based on an XML schema, to improve the efficiency of system-level design entry (netlist + constraints), to define interfaces for die and package abstract views (footprints), and to define libraries of packaging technology platforms.

One of the main objectives in presenting this poster is to get feedback from EDA industry leaders on their interest in participating in the development of this data format standard as part of an Si2 initiative. Feel free to contact us if you would like to discuss it in more detail or if you need additional information: http://www.si2.org/?page=3

Interconnect Explorer: A High-Level Power Estimation Tool for On-Chip Interconnects
Antoine Courtay, Johann Laurent, Olivier Sentieys, Nathalie Julien - Univ. de Bretagne, Lorient, France
Authors:

Today’s Systems on Chip (SOCs) are increasingly complex and require many computational resources, which implies a large volume of data to be stored or transmitted. To transfer this data from memories to processors, or from one processor to another, on-chip interconnect buses or networks have to be used. In state-of-the-art SOCs, interconnect can represent up to 50% of the total power consumption. Moreover, transistor and wire dimension scaling has a strong impact on propagation time, since the propagation delay of a wire now exceeds that of a gate. Estimating and optimizing interconnect power and delay has therefore become a major issue in SOC design, and it is essential to take interconnect power consumption and delay into account during the first design stages of a system.

To do this, we developed a high-level estimation tool, called Interconnect Explorer, which lets users obtain very fast power and delay estimates for their interconnect network based on both platform parameters and the application data handled by the bus. The tool’s efficiency has been validated on three important aspects: precision, execution time, and the amount of memory required to store the experiment files. All of these aspects are compared against a classical power consumption estimate obtained with a physical-level estimation tool such as SPICE. Experimental results show that Interconnect Explorer is approximately 3000 times faster than SPICE, with a maximum error of 3%, and requires almost 25 times less memory space.
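To make the notion of a data-dependent, high-level estimate concrete, the Python sketch below counts bit toggles in a stream of bus words and applies the textbook switching-energy approximation of 0.5·C·V² per wire transition. It is a minimal illustration under simplifying assumptions (identical wires, no crosstalk or coupling capacitance, no repeaters) and is not the model used by Interconnect Explorer; the function names and parameter values are hypothetical.

```python
def count_toggles(words):
    """Count bit transitions between successive words sent over a bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


def switching_energy_joules(words, wire_capacitance_f, vdd_v):
    """Estimate switching energy as 0.5 * C * Vdd^2 per wire transition.

    Assumption: every wire has the same capacitance and wires switch
    independently (coupling between neighboring wires is ignored).
    """
    return 0.5 * wire_capacitance_f * vdd_v ** 2 * count_toggles(words)


if __name__ == "__main__":
    # Hypothetical 4-bit data trace, 1 pF per wire, 1.0 V supply.
    trace = [0b0000, 0b1111, 0b1010]
    print(count_toggles(trace))
    print(switching_energy_joules(trace, 1e-12, 1.0))
```

Because the estimate depends only on the data trace and a few platform parameters, it can be evaluated far earlier in the design flow than a circuit-level simulation.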

Managing Information Silos: Reducing Project Risk through Multi-Metric Tracking
Christopher Kappler, Gregory Goss - Achilles Test Systems, Inc., Waltham, MA
Authors:

Tracking progress accurately and reliably is vital to proper decision making and optimal resource allocation. This paper describes a system of real-time web dashboards that track progress and are visible to all team members via their web browsers. Using the latest software technology and Web 2.0 techniques, our process can be customized and automated for any team.

Tracking metrics from all the different aspects of design and development is a time-consuming, manual task. By automating this process, it is possible to track progress in a thorough, diligent, and rigorous manner. This increases accuracy and reliability, and brings the focus back to the whole team and away from any one specific area.

Modern projects consist of a large number of phases, each of which produces large amounts of data, status, and metrics. This paper addresses the philosophy of automatically gathering, efficiently organizing, and concisely displaying real-time status in any web browser. These metrics are independent of any specific tool chain. The goal of this work is to avoid tunnel vision, unneeded panic, and undue optimism by reliably tracking the progress of a large project in a holistic fashion.

High quality information can help keep a large team in step, if it is universally accessible via a web browser.

Net-list Level Test Logic Insertion: Flow Automation for MBIST & Scan
Che-Jen Jerry Chang, Nikhil Herlekar, Artak Mosikyan
Authors:

An automatic DFT flow is presented in this paper. The flow inserts memory BIST and scan compression logic at the netlist level. It provides flexibility and portability for the design team, and it can also reduce the workload and greatly improve the efficiency of the DFT engineer. The flow is not tied to a particular tool vendor and has been successfully verified on several designs.

A new method is presented for inserting memory BIST and scan compression logic at the netlist level for maximum DFT design flexibility and portability. An automatic design flow was developed to complete both memory BIST and scan compression logic insertion in the same run. Results show that all tasks can be finished within 24 hours of run time for a design block of 700K to 750K instances. The automatic design flow can also reduce the possibility of human error during RTL insertion. The flow has been verified on several designs and brought to volume production. It is not tied to a particular EDA vendor and has been proven to work with different EDA tools.

Sequential Clock Gating Optimization in GPU Designs with PowerPro CG
Tayeb Bouguerba - Advanced Micro Devices, Inc., Markham, ON, Canada
Authors:

Physical Implementation of Retention Cell Based Design
Alpesh Kothari - Atoptech, Inc., Santa Clara, CA
Authors:

With the emphasis on “greener” applications, power is becoming a major factor in many of the chips designed today. Today’s mobile applications have to deliver more complex features while becoming more power-conscious because of their shrinking size. These requirements pull in opposite directions, so a power-conscious mobile device may not be able to do many complex things when in power-saving mode. To strike a balance, the best approach is to shut down portions of the chip when they are not in use but wake them up at the touch of a button; that is, the circuit wake-up time should be a few milliseconds. To achieve this, one needs a bank of registers that can hold the last known state. These registers can be part of the block that is going to shut down, or they can reside in some other area of the chip that is always on. Since they need to retain the state of the design, they need an extra power supply that can feed in power when the primary power is turned off.

This presentation discusses how to handle the place-and-route aspects of such a chip, which needs special handling during placement and extra routing resources for the secondary power pins. AtopTech’s Aprisa is used for the place-and-route implementation of this design. Various approaches to placing and routing retention cells are discussed. In addition, the challenges associated with daisy chaining and defining tie-share connections, and their impact on IR drop, are covered.

Soft-Error-Rate Estimation in Sequential Circuits Utilizing a Scan ATPG Tool
Masaki Shimada, Michio Komoda, Yoshiaki Fukui - Renesas Technology Corp., Tokyo, Japan
Minoru Ito - Hitachi, Ltd., Tokyo, Japan
Kan Takeuchi - Renesas Technology Corp., Tokyo, Japan
Authors:

A methodology is proposed for simple and accurate SER estimation in sequential circuits using a scan ATPG tool; it requires no pre-defined functional input vectors. The logic derating was estimated to be 13% for an embedded processor.

Solving FPGA Clock-Domain Crossing Problems: A Real-World Success Story
Timothy Paige - North Pole Engineering, Inc., Minneapolis, MN
Gordon Braun - Honeywell International Inc., Minneapolis, MN
Chris Rockwood - Mentor Graphics Corp., Milwaukee, WI
Authors:

Today’s high-capacity FPGAs enable a level of integration that was possible only in large ASICs until recently. With many types of processors, complex functions, and interfaces (especially serial interfaces) being incorporated into FPGA designs, the number of asynchronous clocks has increased rapidly. Most bugs associated with clock-domain crossing (CDC) paths cannot be found using traditional simulation and static timing analysis, and the use of third-party IP further complicates the problem. This poster describes a real-world example of CDC verification success on an FPGA design containing third-party IP at Honeywell.

Although the FPGA design had been verified in simulation, the device malfunctioned unpredictably in the lab; using the FPGA vendor’s debug tool to probe the part seemed to only move the problem around. After two frustrating weeks of unsuccessful debugging, we began to suspect that the problem might be related to clock-domain crossing signals and decided to evaluate a CDC analysis tool. Less than one day of work with Mentor’s 0-In® CDC yielded an initial set of results that identified several problem areas with missing or incorrect synchronization, and within one week we had reviewed the results with the IP vendor and obtained a set of design changes that significantly improved the design’s functionality in the lab. The poster describes in detail the process of running static CDC analysis and verifying CDC protocols using a combination of simulation and formal verification technology. Our experience has convinced us that today’s FPGA designs require a dedicated CDC verification solution.
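The kind of structural check that flags missing synchronization can be illustrated with a toy Python sketch. It is not how any commercial CDC tool works; the `Flop` record, the netlist representation, and the rule that a crossing is acceptable only when it lands on the first stage of a two-flop synchronizer are assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flop:
    """Hypothetical flip-flop record for a toy netlist."""
    name: str
    clock: str
    data_from: tuple = ()   # names of flops whose outputs reach this D input
    direct: bool = True     # True if no combinational logic sits on the D input


def find_unsynchronized_crossings(flops):
    """Return (source, destination) pairs that cross clock domains without
    landing on the first stage of a simple two-flop synchronizer."""
    by_name = {f.name: f for f in flops}
    fanout = {f.name: [] for f in flops}
    for f in flops:
        for src in f.data_from:
            fanout[src].append(f)

    issues = []
    for dst in flops:
        for src_name in dst.data_from:
            if by_name[src_name].clock == dst.clock:
                continue  # same domain, not a crossing
            loads = fanout[dst.name]
            synchronized = (
                dst.direct
                and len(dst.data_from) == 1
                and len(loads) == 1
                and loads[0].clock == dst.clock
                and loads[0].direct
                and len(loads[0].data_from) == 1
            )
            if not synchronized:
                issues.append((src_name, dst.name))
    return issues


if __name__ == "__main__":
    # Hypothetical design: 'a' crosses into clkB twice, once properly.
    design = [
        Flop("a", "clkA"),
        Flop("s1", "clkB", ("a",)),
        Flop("s2", "clkB", ("s1",)),
        Flop("b", "clkB", ("a", "s2"), direct=False),
    ]
    print(find_unsynchronized_crossings(design))
```

A structural pass like this only finds missing synchronizers; checking that the handshake protocol around a crossing is respected needs simulation or formal analysis, as the abstract notes.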

Static Timing Analysis of Single Track Circuits
Prasad Joshi, Peter Beerel - Univ. of Southern California, Los Angeles, CA
Jonathan Gainsley, Ivan Sutherland - Sun Microsystems, Inc., Menlo Park, CA
Marly Roncken - Intel Corp., Hillsboro, OR
Authors:

Today’s semiconductor design industry is driven by the goal of achieving low power and high performance, both of which are becoming increasingly difficult in a single-clock system given increasing on-chip process variations. Asynchronous systems that rely on local coordination of request/acknowledge handshaking signals to transfer data offer an attractive alternative but have generally suffered from a lack of supporting CAD tools and flows.

The 6-4 GasP family of asynchronous circuits uses two-phase request/acknowledge handshaking over a single bi-directional wire to provide ultra high-performance and low power in both processor and network on chip (NOC) applications. Because these circuits do not conform to the standard synchronous templates, their use is currently limited to custom design where extensive SPICE simulations are required to verify timing correctness and performance. In order to incorporate these circuits in larger ASIC designs, it is essential to establish an efficient timing verification flow.

This poster presents a two-phase verification flow for GasP circuits. In the first phase, we verify all relative-timing constraints in the control by cutting all loops and modeling the bi-directional handshaking wires using a split-pin architecture. In addition, we characterize the relative min and max skew of the local latch enable signals that the GasP controllers generate. In the second phase, we use these skew values to verify setup and hold times of the latch-based data path, considering both time borrowing and on-chip variations. This verification flow enables timing-violation-driven ECO flows for large GasP designs and is a precursor to timing-driven place and route.

Timing Closure in 65-Nanometer ASICs Using Statistical Static Timing Analysis Design Methodology
Llewellyn Marshall, Eric Foreman - IBM Corp., Essex Jct., VT
Authors:

As semiconductor technology continues to shrink, relative process variation grows. To create a robust design, one solution is for chip timing analysis to add margin for variation; however, this sacrifices achievable performance. With the advent of Statistical Static Timing Analysis (SSTA), new methods have been developed to maximize product yield and improve timing accuracy. In this presentation, we will describe an SSTA sign-off timing methodology for 65nm ASICs using EinsTimer EinsStat, and we will discuss timing closure techniques and our experience with this methodology.
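The performance cost of adding margin per stage, compared with a statistical treatment, can be seen in a small Python sketch. It assumes independent, normally distributed stage delays and a 3-sigma target; these are textbook simplifications chosen for illustration and do not represent the variation model used in EinsTimer EinsStat. The function names and numbers are hypothetical.

```python
import math


def corner_delay(stages, k=3.0):
    """Worst-case path delay: every stage is pushed to mean + k*sigma."""
    return sum(mean + k * sigma for mean, sigma in stages)


def statistical_delay(stages, k=3.0):
    """Statistical path delay at k sigma, assuming independent Gaussian
    stage delays, so variances (not sigmas) add along the path."""
    total_mean = sum(mean for mean, _ in stages)
    total_sigma = math.sqrt(sum(sigma * sigma for _, sigma in stages))
    return total_mean + k * total_sigma


if __name__ == "__main__":
    # Hypothetical two-stage path: (mean_ps, sigma_ps) per stage.
    path = [(100.0, 3.0), (100.0, 4.0)]
    print(corner_delay(path))       # every stage at its own 3-sigma corner
    print(statistical_delay(path))  # 3 sigma of the combined distribution
```

For this two-stage path the per-stage margins sum to 21 ps while the statistical margin is 15 ps; the gap widens as more independent stages are added, which is the pessimism a statistical sign-off aims to recover.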

Using STA Information for Enhanced At-Speed ATPG
Colin Renfrew, Ashu Razdan - Freescale Semiconductor, Inc., Austin, TX
Bruce Swanson - Mentor Graphics Corp., Wilsonville, OR
Authors:

Several types of physical defects introduced by the IC manufacturing process are detected only when the device is operating at its functional speed, particularly on technologies of 90nm and smaller. Transition delay, a gross-delay fault model, is the most popular scan test method using Automatic Test Pattern Generation (ATPG) for the structural testing of devices at functional speed and is the fault model used for this work. One of the challenges currently faced on many designs is how to partition the design into logical and manageable parts for accurate at-speed ATPG, especially when multiple clock frequencies are involved. The absence of specific and programmable controls for these clock domains during test requires the creation of an alternative solution. The solution described in this poster presentation allows full controllability of the various clock domains during ATPG and the ability to easily target a specific portion of logic with test patterns running at the correct frequency. Specifically, the poster describes the methodology that was established to enable this solution, how false and multi-cycle path information from Static Timing Analysis (STA) was used for clock domain partitioning and logic separation, and how this technique was applied through FastScan/TestKompress for at-speed ATPG on a high speed, 90nm SOC platform. Results from simulation as well as silicon are included as proof of concept. The intention of sharing this work is to demonstrate an effective method that can be reused on other designs where a similar challenge exists.