
Parallel Simulation Boosts Verification Productivity
by Bradley Geden
Product Marketing Manager, FastSpice, Synopsys
Sanjay Sawant
Senior Product Marketing Manager, Synopsys
www.synopsys.com
Today, the electronics industry is driven by advances in manufacturing processes. These advances increase density which, in turn, increases
design complexity and performance. As a result, simulation is becoming a key bottleneck in the design cycle. Engineers are required to run
billions of cycles of simulation to find all of the design bugs. In addition to using high-performance verification solutions to keep up with
the fast pace, engineering teams regularly upgrade their server farms to modern, high-performance machines. This has proven to be
cost-effective and offers a competitive advantage. As compute infrastructure transitions to multicore, multi-threaded architectures,
verification solutions must evolve to make the most of the new hardware.
Today, designers use farm-based parallelism to address the bandwidth bottleneck for regression suites. This helps considerably to improve
regression throughput, especially during peak usage, by running multiple tests in parallel on several compute resources. However, while
designs are now becoming multicore, verification environments are still composed of checkers, monitors, constraints, coverage and debug
with long, serial tests.
To further increase verification performance, parallelism must be enabled at a finer granularity. For example, a design with multiple cores
should be simulated such that each core can be verified in parallel on an independent processor to achieve maximum throughput. Distributing
simulation tasks across several processors can improve simulation turnaround time. When a design is partitioned into several smaller pieces,
each piece takes less memory and fits more easily on current machines. A parallel simulation technique partitions the design under test (DUT)
into a set of smaller partitions that are executed on different processors. This is defined as Design Level Parallelism (DLP). In addition to
the design, other verification components such as assertions, coverage, debug/dumping and constraints can also be made concurrent. This is
defined as Application Level Parallelism (ALP).
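The partitioning idea behind DLP can be illustrated with a minimal sketch. It assumes each design block has an estimated simulation cost, assigns blocks to partitions with a simple greedy rule (largest block first, onto the lightest partition), and hands each partition to its own worker. The block names, the costs and the functions partition_blocks, simulate_partition and run_dlp are hypothetical; this is not how any particular simulator partitions a design.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical design blocks with estimated simulation cost (arbitrary units).
EXAMPLE_COSTS = {"cpu0": 40, "cpu1": 40, "dsp": 25, "ddr_ctrl": 15}


def partition_blocks(costs, n_partitions):
    """Greedy load balancing: place each block, largest first, on the lightest partition."""
    partitions = [[] for _ in range(n_partitions)]
    loads = [0] * n_partitions
    # Sort by descending cost; break ties by name so the result is deterministic.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        target = loads.index(min(loads))
        partitions[target].append(name)
        loads[target] += cost
    return partitions, loads


def simulate_partition(block_names):
    """Stand-in for simulating one partition; it only reports which blocks it handled."""
    return sorted(block_names)


def run_dlp(costs, n_partitions):
    """Partition the design, then run every partition on its own worker."""
    partitions, loads = partition_blocks(costs, n_partitions)
    # Threads keep the sketch portable; a real CPU-bound workload in CPython
    # would need processes or native code to use several cores.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        results = list(pool.map(simulate_partition, partitions))
    return results, loads
```

Calling run_dlp(EXAMPLE_COSTS, 2) places cpu0 and dsp on one partition and cpu1 and ddr_ctrl on the other, with estimated loads of 65 and 55. The greedy rule is only one of many possible balancing heuristics.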
In addition to enabling DLP and ALP, a multicore solution must ensure consistent results between single-core and multicore simulations.
This means the multicore solution must synchronize communication across multiple cores and manage memory efficiently.
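One way to picture the consistency requirement is a lockstep scheme, sketched below under stated assumptions: every partition computes its next value from a shared snapshot of the previous time step, and a barrier commits all new values together before any partition moves on. The update rule step and the ring-shaped neighbor relationship are hypothetical, chosen only so the parallel run can be compared against a serial one. Production simulators use far more elaborate event synchronization.

```python
import threading


def step(own, neighbor):
    """Hypothetical update rule for one partition at one time step."""
    return (3 * own + neighbor + 1) % 1000


def simulate_serial(initial, n_steps):
    """Reference run: all partitions are updated on a single thread."""
    cur = list(initial)
    n = len(cur)
    for _ in range(n_steps):
        cur = [step(cur[i], cur[(i + 1) % n]) for i in range(n)]
    return cur


def simulate_parallel(initial, n_steps):
    """One thread per partition, synchronized with a barrier at every time step."""
    n = len(initial)
    cur = list(initial)   # values from the previous step (read-only during a step)
    nxt = [0] * n         # values being computed for the current step

    def commit():
        # Runs once per step, after every thread has arrived and before any is released.
        cur[:] = nxt

    barrier = threading.Barrier(n, action=commit)

    def worker(i):
        for _ in range(n_steps):
            nxt[i] = step(cur[i], cur[(i + 1) % n])
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cur
```

Because no partition reads a neighbor's value until the barrier has committed the whole step, simulate_parallel returns the same values as simulate_serial for the same inputs, regardless of thread scheduling.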
As an example, VCS Multicore technology addresses these challenges to take advantage of advances in the compute infrastructure. It cuts
verification time by harnessing the power of multicore CPUs. It allows designers to identify performance bottlenecks and distribute
time-consuming activities across multiple cores for faster functional verification and debug. Automatic partitioning and load balancing, event
synchronization and memory optimization are techniques well suited to increase functional verification performance.
As the prevalence of multicore hardware has grown, there has been strong demand for circuit simulators to take advantage of the extra
processing power in addition to the mixed-language event simulators. Running many simulation jobs over multiple cores has been possible for
quite some time; the challenge, instead, has been multi-threading a single simulation job. HSPICE, Synopsys’ SPICE-level circuit simulator,
introduced advanced multi-threading algorithms in 2008, marking the early results of Synopsys’ investment in parallel circuit simulation
technologies. Progressive developments in HSPICE have improved the performance scalability over multiple cores in addition to broadening the
scope of circuits that demonstrate accelerated simulation performance due to multi-threading.
In the FastSPICE domain, Synopsys has invested heavily in delivering superior performance on a single core while continuing to investigate
the feasibility of multi-threading technologies for high-performance, high-capacity circuit simulation. Techniques similar to those in HSPICE
can be used, such as parallel device model evaluation, which improves performance over multiple threads when a large number of individual
model equations need to be evaluated. Additionally, the matrix solving algorithms can be multi-threaded to deliver optimum speed-up when
solving very large, dense matrices that typically occur when simulating sizable amounts of post-layout RC data. With the recent announcement
of the CustomSim Circuit Simulation Solution, Synopsys has introduced multi-threading capabilities into the high-performance, high-capacity
circuit simulation arena.
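Parallel device model evaluation works because each device's equation can be evaluated independently of the others. The sketch below uses the textbook Shockley diode equation as a stand-in for a device model and splits a list of device voltages into chunks, one per worker. The default parameter values and the functions diode_current, evaluate_chunk and evaluate_devices are illustrative assumptions and do not reflect the internals of HSPICE or CustomSim.

```python
import math
from concurrent.futures import ThreadPoolExecutor

THERMAL_VOLTAGE = 0.02585  # volts, approximately kT/q near 300 K


def diode_current(voltage, saturation_current=1e-14, ideality=1.0):
    """Shockley diode equation: I = Is * (exp(V / (n * Vt)) - 1)."""
    return saturation_current * (math.exp(voltage / (ideality * THERMAL_VOLTAGE)) - 1.0)


def evaluate_chunk(voltages):
    """Evaluate the model for one contiguous chunk of devices."""
    return [diode_current(v) for v in voltages]


def evaluate_devices(voltages, n_workers=4):
    """Split the device list into chunks and evaluate the chunks concurrently."""
    chunk = max(1, math.ceil(len(voltages) / n_workers))
    chunks = [voltages[i:i + chunk] for i in range(0, len(voltages), chunk)]
    # Threads illustrate the structure only; real speed-up for CPU-bound model
    # code in CPython would require processes or a native implementation.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(evaluate_chunk, chunks)
    # pool.map preserves chunk order, so device order is preserved as well.
    return [current for part in results for current in part]
```

The benefit of this pattern grows with the number of devices, which matches the observation above that it pays off when a large number of model equations must be evaluated.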
Designers using multicore hardware can take advantage of the additional processing power it provides. However, “mileage may vary,” if you
will: the performance improvements depend on the type of circuit being simulated, the quantity of post-layout data (more parasitic data
will reap more benefits from multi-threading) and the targeted accuracy level (higher accuracy modes will see greater acceleration from
multi-threading).
Efforts are underway at Synopsys, as well as at all other EDA companies, to further utilize multicore hardware. Synopsys plans
to release incremental multi-threading technologies in HSPICE and CustomSim. Parallel circuit simulation, like any parallel software
application, is limited by Amdahl’s Law (i.e., the serial portion of the work caps the achievable speed-up, and inter-thread communication
overhead grows with the thread count, so adding more threads eventually no longer improves performance). Hence, these enhancements will
supplement research into single-core performance and advanced analysis techniques to address the verification challenges facing IC design teams.
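Amdahl's Law is commonly written as speed-up = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of threads. The sketch below evaluates that formula and, as a separate and purely illustrative assumption, adds a communication cost that grows linearly with the thread count to show how overhead can make additional threads counterproductive. The function names and the linear overhead model are hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Ideal speed-up predicted by Amdahl's Law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


def speedup_with_overhead(parallel_fraction, n_threads, overhead_per_thread):
    """Amdahl's Law plus an assumed linear inter-thread communication cost."""
    serial_fraction = 1.0 - parallel_fraction
    overhead = overhead_per_thread * (n_threads - 1)  # illustrative model only
    return 1.0 / (serial_fraction + parallel_fraction / n_threads + overhead)


if __name__ == "__main__":
    # With 90% of the work parallel, the ideal speed-up can never reach 10x.
    for n in (1, 2, 4, 8, 16, 64):
        print(n, round(amdahl_speedup(0.9, n), 2),
              round(speedup_with_overhead(0.9, n, 0.005), 2))
```

In the ideal formula the speed-up approaches 1 / (1 - p) as threads are added; with the assumed overhead term it peaks and then declines, which is why single-core performance remains worth improving.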