Improving Simulation Performance Using Multithreading

Star-Hspice simulations involve both model evaluations and matrix solutions. Running the model evaluations concurrently on multiple CPUs using multithreading can significantly improve simulation performance. In most cases, model evaluation dominates the run time. To determine how much time is spent in model evaluation and how much in matrix solution, specify .option acct = 2 in the netlist. Multithreading produces faster simulations with no loss of accuracy.

Multithreaded (MT) Star-Hspice is supported on Sun Solaris 2.5.1 (SunOS 5.5.1) and on Windows NT as a prerelease version using win32 threads.

Running Star-Hspice-MT

You can run Star-Hspice-MT using the syntax described below.

Sun Solaris Platform

Enter on the command line:

hspice -mt #num -i input_filename -o output_filename

Windows NT Platform

At the Windows NT DOS prompt, type:

hsp_mt -mt #num -i input_filename -o output_filename


NOTE: If you omit #num, the number of threads is set to the number of online CPUs.
If you omit the -o output_filename option, the results are printed to standard output.
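The command-line syntax above can be sketched as a small helper that assembles the argument list for either platform. This is only an illustration: the function name build_mt_command, the platform keys, and the file names ring_osc.sp and ring_osc.lis are hypothetical, and the sketch assumes that #num stands for a plain integer thread count. It builds the command but does not run it.

```python
def build_mt_command(input_filename, output_filename=None,
                     num_threads=None, platform="solaris"):
    """Assemble a Star-Hspice-MT argument list (hypothetical helper).

    Assumes #num in the documented syntax is a plain integer.
    """
    # Executable names as given in the syntax lines for each platform.
    executables = {"solaris": "hspice", "winnt": "hsp_mt"}
    if platform not in executables:
        raise ValueError(f"unknown platform: {platform!r}")

    command = [executables[platform], "-mt"]

    # Omitting the thread count leaves the choice to the simulator,
    # which then uses the number of online CPUs.
    if num_threads is not None:
        if num_threads < 1:
            raise ValueError("num_threads must be at least 1")
        command.append(str(num_threads))

    command += ["-i", input_filename]

    # Omitting -o sends the results to standard output.
    if output_filename is not None:
        command += ["-o", output_filename]

    return command


if __name__ == "__main__":
    # Hypothetical file names, two threads, Solaris executable.
    print(" ".join(build_mt_command("ring_osc.sp", "ring_osc.lis", 2)))
```

Returning a list rather than a single string keeps each argument separate, so the result could be passed to a process launcher without extra quoting.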

From Windows NT Explorer:

1. Double-click the hsp_mt application icon.

2. Select the File/Simulate button to select the input netlist file.

In Windows, the program automatically detects the number of online CPUs and uses that many threads.

Under the Avant! HSPUI interface:

1. Select the correct version of hsp_mt.exe in the Version Combo Box.

2. Select the correct number of processors in the MT Option Box.

3. Click the Open button to select the input netlist file.

4. Click the Simulate button to start the simulation.

Performance Improvement Estimations

For multithreaded Star-Hspice, the CPU time is:

Tmt = Tserial + Tparallel/Ncpu + Toverhead

where:

Tserial Represents the Star-Hspice calculations that are not threaded

Tparallel Represents the threaded Star-Hspice calculations

Ncpu The number of CPUs used

Toverhead The overhead from multithreading. Typically, this represents a small fraction of the total run time.

For example, for a 151-stage NAND ring oscillator using LEVEL 49, Tparallel is about 80% of T1cpu (the CPU time for a run on a single CPU). If you run with two threads on a multi-CPU machine, then in the ideal case (Toverhead = 0) you can achieve a speedup of:

T1cpu/(0.2T1cpu + 0.8T1cpu/2cpus) = 1.67

For six CPUs the speedup is:

T1cpu/(0.2T1cpu + 0.8T1cpu/6cpus) = 3.0

For moderate to large circuits, Tparallel is typically 0.6 to 0.7 of T1cpu (60% to 70% of the single-CPU time).
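The estimate above can be reproduced with a short calculation. The sketch below is illustrative only: the function names mt_cpu_time and ideal_speedup are hypothetical, times are normalized so that T1cpu = 1, and the typical Tparallel values are read as fractions of T1cpu, which is an assumption.

```python
def mt_cpu_time(t_serial, t_parallel, n_cpu, t_overhead=0.0):
    """Return Tmt = Tserial + Tparallel/Ncpu + Toverhead."""
    if n_cpu < 1:
        raise ValueError("n_cpu must be at least 1")
    return t_serial + t_parallel / n_cpu + t_overhead


def ideal_speedup(parallel_fraction, n_cpu):
    """Return T1cpu/Tmt for Toverhead = 0, with T1cpu normalized to 1.

    parallel_fraction is Tparallel expressed as a fraction of T1cpu.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    # The serial part is whatever is not threaded.
    t_mt = mt_cpu_time(1.0 - parallel_fraction, parallel_fraction, n_cpu)
    return 1.0 / t_mt


if __name__ == "__main__":
    # 0.8 is the ring oscillator example; 0.6 and 0.7 are the typical range.
    for fraction in (0.6, 0.7, 0.8):
        for n_cpu in (2, 6):
            speedup = ideal_speedup(fraction, n_cpu)
            print(f"Tparallel = {fraction:.1f} T1cpu, {n_cpu} CPUs: "
                  f"speedup = {speedup:.2f}")
```

Because the serial term does not shrink as CPUs are added, the ideal speedup can never exceed 1/(1 - parallel_fraction), which is 5 for the ring oscillator example. A nonzero Toverhead lowers the result further.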

 

 

Star-Hspice Manual - Release 2001.2 - June 2001