1. Introduction
Accurate device modeling is essential for circuit simulation and design. In 1996, BSIM3 Version 3 (commonly abbreviated as BSIM3v3) was established by SEMATECH as the first industry-wide standard of its kind[1]. It has since been widely used by semiconductor and IC design companies worldwide for device modeling and CMOS IC design. Although the BSIM model is accurate, adjusting it to account for non-ideal effects takes a long time. Meanwhile, Moore’s law is approaching its end and many new devices are emerging that need to be studied[2–4]. It is unwise to invest a large amount of resources in modeling a new device when the aim is only to evaluate its circuit characteristics rather than to put it into commercial use. Efficient modeling methods are needed for this purpose.
Many methods have been proposed for similar purposes. Gustavsen proposed a number of black-box macro-modeling methods for large, complex systems such as high-voltage systems and electromagnetic systems[5]. The whole system is viewed as a black box with inputs and outputs, and statistical regression models are built to describe it. This approach greatly reduces the complexity of the system[6–8]. The same idea can be applied to semiconductor device modeling, since a semiconductor device can also be viewed as a black box. If a suitable statistical regression model is found, a great deal of time and money can be saved. For example, the carbon nanotube field-effect transistor (CNT-FET) is a promising alternative to the MOS-FET: it generates much less heat and runs as fast as the MOS-FET[9]. However, the I–V characteristics of a CNT-FET change with the manufacturing process, which makes it hard to build a universal physical model for all of them. Black-box statistical modeling is an efficient option for modeling different CNT-FETs[10, 11].
The key questions in statistical modeling are which model to choose and how well the model fits the true data. Numerous statistical models have been proposed for this task. Ordinary least squares (OLS) and its regularized variants (Ridge and LASSO) are the most commonly used regression approaches[12]. However, they are mainly used for linear regression and may not be suitable for semiconductor device modeling, since devices are highly nonlinear. In this paper, we propose two numerical methods for semiconductor device modeling: single-pole denominator numerator fitting (single-pole MRR) and double-pole MRR. Both have strong nonlinear curve fitting ability and good numerical stability, which are critical in semiconductor device modeling.
2. Background
Many methods have been proposed for the curve fitting task. Suppose we observe data points from the curve, each with a p-dimensional input and a one-dimensional output. Ordinary least squares (OLS) finds the coefficient vector w that minimizes the squared residual,
$$w^{*} = \arg\min_{w} \left\| y - X^{T}w \right\|^{2}. \tag{1}$$
If $X^{T}X$ is invertible, the optimal coefficient vector is $w^{*} = (X^{T}X)^{-1}X^{T}y$[1]. The Lasso is a shrinkage method for OLS that adds an L1-norm penalty on the coefficients to the objective function,
$$w^{*} = \arg\min_{w} \left\| y - X^{T}w \right\|^{2} + \lambda \sum_{j=1}^{p} \left| w_{j} \right|. \tag{2}$$
It can shrink some of the parameters to exactly zero if the positive penalty λ is sufficiently large.
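As a rough illustration of Eqs. (1) and (2), the sketch below solves OLS through the normal equations and the Lasso through proximal gradient descent (ISTA). ISTA is chosen here only for brevity; the text does not say which Lasso solver was used. The matrix `A` (one sample per row), the synthetic data and all function names are assumptions made for this example.

```python
import numpy as np

def ols_fit(A, y):
    """Closed-form OLS via the normal equations (A holds one sample per row)."""
    return np.linalg.solve(A.T @ A, A.T @ y)

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink toward zero, clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Minimize ||y - A w||^2 + lam * ||w||_1 by proximal gradient (ISTA)."""
    # Step size = 1 / Lipschitz constant of the gradient of the squared loss.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - y)
        w = soft_threshold(w - step * grad, lam * step)
    return w

# Synthetic, noise-free data with a sparse coefficient vector (illustrative only).
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5))
w_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
y = A @ w_true

w_ols = ols_fit(A, y)

# For this objective, the all-zero vector is optimal once lam >= 2 * max|A^T y|.
lam_max = 2.0 * np.max(np.abs(A.T @ y))
w_big = lasso_ista(A, y, 1.01 * lam_max)
```

With a penalty just above `lam_max` every coefficient is driven exactly to zero, which is the extreme case of the shrinkage behavior described above; smaller penalties zero out only some coefficients.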
Compared with the linear models, the rational model expresses the output as the ratio of a numerator N(X) to a denominator D(X),
$$w^{*} = \arg\min_{w} \left\| y - \frac{N(X)}{D(X)} \right\|^{2}. \tag{3}$$
However, Eq. (3) cannot be solved directly in closed form as there are unknown parameters in the denominator.
Since the 1950s, considerable effort has been devoted to developing methods for extracting the parameters of rational functions. Levy, Sanathanan and Koerner, Lawrence and Rogers, and Stahl presented various techniques that pose linear least squares problems. Pintelon and Guillaume analyzed these and several other techniques[13–16]. Vector fitting (VF), introduced by Gustavsen and Semlyen and based on a partial fraction basis, has been widely accepted as a robust method for approximating frequency-domain responses[17–19].
In this work, we propose single-pole MRR and double-pole MRR methods based on vector fitting, which are well suited to nonlinear curve fitting tasks. The power of MRR is demonstrated by numerical examples involving an artificially created binary function, the DC characteristics of an SMIC 40 nm NMOS, a CNT-FET, and an LNA performance model.
3. Algorithm
3.1 Single-pole MRR
Consider the rational function approximation,
$$y = \frac{N(X)}{D(X)} = \frac{w_{0} + X^{T}w_{n}}{1 - X^{T}w_{d}}, \tag{4}$$
where
We solve Eq. (4) by transforming it into an iterative OLS problem. We know that equation
$$w_{n}^{*}, w_{d}^{*} = \arg\min_{w_{n}, w_{d}} \left\| y - \frac{w_{0} + X^{T}w_{n}}{1 - X^{T}w_{d}} \right\|^{2}. \tag{5}$$
Suppose in our iteration process, we have a set of
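$$\begin{aligned}
&\left\| \frac{1 - X^{T}w_{d}^{t}}{1 - X^{T}w_{d}^{t-1}} \right\|^{2} \ast \arg\min_{w_{n}, w_{d}} \left\| y - \frac{w_{0} + X^{T}w_{n}^{t}}{1 - X^{T}w_{d}^{t}} \right\|^{2} \\
&= \arg\min_{w_{n}, w_{d}} \left\| \left( \frac{1 - X^{T}w_{d}^{t}}{1 - X^{T}w_{d}^{t-1}} \right) \ast \left( y - \frac{w_{0} + X^{T}w_{n}^{t}}{1 - X^{T}w_{d}^{t}} \right) \right\|^{2} \\
&= \arg\min_{w_{n}, w_{d}} \left\| \frac{y - X^{T}w_{d}^{t}\,y}{1 - X^{T}w_{d}^{t-1}} - \frac{w_{0} + X^{T}w_{n}^{t}}{1 - X^{T}w_{d}^{t-1}} \right\|^{2} \\
&= \arg\min_{w_{n}, w_{d}} \left\| \frac{y - \left( w_{0} + X^{T}w_{n}^{t} + X^{T}y\,w_{d}^{t} \right)}{1 - X^{T}w_{d}^{t-1}} \right\|^{2}.
\end{aligned} \tag{6}$$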
Here in the last equation, a new OLS problem is made where the feature vector is
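The iteration behind Eq. (6) can be sketched as a sequence of weighted linear least-squares solves. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the feature matrix `Phi` (one sample per row, no constant column), the zero initialization of the denominator weights, the fixed iteration count and all function names are choices made for this example.

```python
import numpy as np

def fit_single_pole(Phi, y, n_iter=10):
    """Iteratively fit y ~ (w0 + Phi @ w_n) / (1 - Phi @ w_d).

    Each pass solves the linear problem in the last line of Eq. (6):
    the residual y - (w0 + Phi w_n + (Phi * y) w_d) is weighted by the
    denominator obtained in the previous pass.
    """
    n, p = Phi.shape
    w_d = np.zeros(p)                      # assumed start: denominator equal to 1
    for _ in range(n_iter):
        d_prev = 1.0 - Phi @ w_d           # previous denominator; must stay away from 0
        M = np.hstack([np.ones((n, 1)), Phi, Phi * y[:, None]]) / d_prev[:, None]
        sol, *_ = np.linalg.lstsq(M, y / d_prev, rcond=None)
        w0, w_n, w_d = sol[0], sol[1:1 + p], sol[1 + p:]
    return w0, w_n, w_d

def predict_single_pole(Phi, w0, w_n, w_d):
    """Evaluate the fitted rational model of Eq. (4)."""
    return (w0 + Phi @ w_n) / (1.0 - Phi @ w_d)

# Illustrative target: an exact rational function of one variable.
x = np.linspace(0.0, 1.0, 50)
Phi = np.column_stack([x, x**2])
y = (1.0 + 2.0 * x - x**2) / (1.0 - 0.5 * x + 0.3 * x**2)

w0, w_n, w_d = fit_single_pole(Phi, y)
y_fit = predict_single_pole(Phi, w0, w_n, w_d)
```

Because the synthetic target is itself a single-pole rational function, the linearized problem is satisfied exactly and the parameters are recovered; on measured data the iteration only approximates the minimizer of Eq. (5).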
3.2 Double-pole MRR
Consider the rational function approximation,
$$y = \frac{N(X)}{D(X)} = \frac{w_{01} + X^{T}w_{n1}}{1 - X^{T}w_{d1}} + \frac{w_{02} + X^{T}w_{n2}}{1 - X^{T}w_{d2}}. \tag{7}$$
Though we can still multiply a factor
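$$\frac{w_{01}^{t} + X^{T}w_{n1}^{t}}{1 - X^{T}w_{d1}^{t-1}} + \frac{w_{02}^{t} + X^{T}w_{n2}^{t}}{1 - X^{T}w_{d2}^{t-1}} = \left( 1 - \frac{X^{T}w_{d1}^{t}}{1 - X^{T}w_{d1}^{t-1}} - \frac{X^{T}w_{d2}^{t}}{1 - X^{T}w_{d2}^{t-1}} \right) y, \tag{8}$$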
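$$\frac{w_{01}^{t} + X^{T}w_{n1}^{t}}{1 - X^{T}w_{d1}^{t-1}} + \frac{w_{02}^{t} + X^{T}w_{n2}^{t}}{1 - X^{T}w_{d2}^{t-1}} + \frac{X^{T}y\,w_{d1}^{t}}{1 - X^{T}w_{d1}^{t-1}} + \frac{X^{T}y\,w_{d2}^{t}}{1 - X^{T}w_{d2}^{t-1}} = y. \tag{9}$$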
Here
3.3 Data preprocessing method
3.3.1 Normalization
The key step of MRR is accurately solving the over-determined Eq. (7). However, because such a problem is poorly conditioned, solving the normal equations is numerically unstable and may result in large errors in the solution. In addition, if y and x differ by orders of magnitude, the matrix A of the least-squares system may become rank deficient, which also leads to a poor solution. To avoid these cases, it usually helps greatly to normalize the input attributes x and the target output y before the parameter extraction procedure.
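The effect of normalization on conditioning can be seen on a small synthetic example. The sketch below assumes max-abs scaling (the text does not specify which normalization is used) and a hypothetical three-column design matrix with columns 1, x and x·y, which mimics the mixed input/output columns that arise in the MRR least-squares step; the data are invented for illustration.

```python
import numpy as np

def max_abs_scale(v):
    """Scale a vector so that its largest absolute value is 1; return the scale too."""
    scale = np.max(np.abs(v))
    return v / scale, scale

def design(x, y):
    """Hypothetical design matrix mixing input and output columns: [1, x, x*y]."""
    return np.column_stack([np.ones_like(x), x, x * y])

# Illustrative data: input of order 1 (e.g. volts), output of order 1e-6 (e.g. amperes).
x = np.linspace(0.1, 1.0, 50)
y = 1e-6 * (1.0 + 2.0 * x) / (1.0 + 0.5 * x)

cond_raw = np.linalg.cond(design(x, y))

x_n, x_scale = max_abs_scale(x)
y_n, y_scale = max_abs_scale(y)
cond_norm = np.linalg.cond(design(x_n, y_n))

print(f"condition number before: {cond_raw:.3e}, after: {cond_norm:.3e}")
```

The stored scales `x_scale` and `y_scale` are needed afterwards to map predictions back to physical units.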
3.3.2 Logarithm transformation
Semiconductor device I–V characteristic values may be small in magnitude and vary over a wide range, e.g. the Id of an NMOS-FET may vary from
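A logarithm transformation of this kind can be sketched as follows. The base-10 logarithm, the clipping floor and the sample current values are assumptions made for illustration; the fit would then be carried out on the transformed values and mapped back afterwards.

```python
import numpy as np

def to_log(i_d, floor=1e-15):
    """Map positive currents to log10 scale; the floor (an assumed value)
    guards against zero or negative samples."""
    return np.log10(np.maximum(i_d, floor))

def from_log(log_i_d):
    """Inverse transform: recover currents from their log10 values."""
    return 10.0 ** log_i_d

# Hypothetical drain currents spanning nine orders of magnitude.
i_d = np.array([1e-12, 1e-9, 1e-6, 1e-3])
z = to_log(i_d)          # values compressed to the range -12 .. -3
i_d_back = from_log(z)
```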
4. Experiment
4.1 Fitting an artificially created function
To illustrate the validity of the proposed method, we first consider an artificially created function defined
4.2 Fitting I–V characteristics curve of SMIC 40 nm NMOS-FET
The DC characteristics of an NMOS-FET are used to compare the performance of MRR with that of OLS and LASSO. Because BSIM has been widely used in industry for years and, to some extent, represents the true physical properties of the NMOS-FET, we use Cadence and SPICE to obtain DC simulation data for an SMIC 40 nm NMOS-FET with a channel width of 1 μm and a channel length of 40 nm. In this case, we choose two independent variables,
4.2.1 Analysis of fitting result
Fig. 2 shows the fitting results of the different algorithms on this dataset. We plot the Id–Vd and Id–Vg curves to show the fitting performance. Table 1 lists the number of parameters and the NMSE of each fit. Single-pole MRR with a sextic polynomial, double-pole MRR with a quartic polynomial and OLS with a nonic polynomial are compared; the number of parameters in each of these models is similar. Because LASSO regularizes the coefficients, we choose the nonic polynomial for it so as to include as many meaningful features as possible. After 10-fold cross validation, the optimal regularization parameter
Algorithm | Single-pole MRR (sextic polynomial) | Double-pole MRR (quartic polynomial) | OLS (nonic polynomial) | LASSO (nonic polynomial) |
Number of parameters | 55 | 58 | 55 | 36 |
NMSE | 3.0157 × 10^−8 | 2.631 × 10^−6 | 4.357 × 10^−4 | 5.082 × 10^−4 |


Table 1 lists the NMSE of the four algorithms. Because the data values are very small, the normalized mean square error (NMSE) is used, as defined in Eq. (10). Single-pole MRR with sextic polynomial
$$\mathrm{NMSE} = \frac{1}{m} \sum_{i=1}^{K} \left[ \frac{t_{i}}{\max(t)} - \frac{y_{i}}{\max(t)} \right]^{2}. \tag{10}$$
NMSE is
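A direct reading of Eq. (10) can be written in a few lines. The sketch below assumes that t holds the target values, y holds the model predictions, and that m and K both equal the number of samples; the symbol definitions are not available in the text, so these are assumptions.

```python
import numpy as np

def nmse(t, y):
    """Normalized mean square error in the sense of Eq. (10).

    Both target t and prediction y are divided by max(t) before the
    squared differences are averaged over all samples (assumes m = K).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    scale = np.max(t)
    return np.mean((t / scale - y / scale) ** 2)

# Small worked case: targets [1, 2], predictions [1, 1].
# Scaled by max(t) = 2 the differences are [0, 0.5], so NMSE = 0.25 / 2 = 0.125.
example = nmse([1.0, 2.0], [1.0, 1.0])
```

Dividing by max(t) makes the metric independent of the absolute scale of the current, so fits of microampere-level data can be compared on the same footing as larger signals.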
Although single-pole MRR achieves the lowest NMSE, double-pole MRR reaches the second-best fit with a lower-degree polynomial (cubic) and the fewest parameters. In fact, it is the best model among polynomials of low degree (1, 2, 3): the NMSE of double-pole MRR with a cubic polynomial is 1.593 × 10^−5. This is very helpful when the number of input attributes is large and a high-degree polynomial is out of reach because of memory limits, e.g. a sextic polynomial of 12 input attributes requires 1237 GB of memory.
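The growth in model size with polynomial degree and input count can be checked with a short counting sketch. It assumes full multivariate polynomials, a numerator with a constant term and a denominator without one (as in Eq. (4)); under these assumptions the counts match the parameter numbers in Table 1. The function names are illustrative.

```python
from math import comb

def n_monomials(n_inputs, degree):
    """Number of monomials of total degree <= degree in n_inputs variables."""
    return comb(n_inputs + degree, degree)

def single_pole_params(n_inputs, degree):
    """Numerator (with constant term) plus denominator (constant fixed to 1)."""
    terms = n_monomials(n_inputs, degree)
    return terms + (terms - 1)

def double_pole_params(n_inputs, degree):
    """Two rational terms of the single-pole form."""
    return 2 * single_pole_params(n_inputs, degree)

print(single_pole_params(2, 6))   # sextic, 2 inputs
print(double_pole_params(2, 4))   # quartic, 2 inputs
print(n_monomials(2, 9))          # OLS with a nonic polynomial, 2 inputs
print(n_monomials(12, 6))         # monomials in a sextic polynomial of 12 inputs
```

The last count shows how quickly the number of feature columns grows with the number of inputs, which is why a lower-degree double-pole model can be attractive for high-dimensional problems.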
4.2.2 Analysis of convergence
Convergence is critical because our parameter extraction method is iterative. As analyzed above, data normalization improves the stability of every least-squares problem. Single-pole MRR shows very good numerical stability for polynomials of all degrees, while double-pole MRR shows good numerical stability for polynomials of low degree. When the polynomial degree is high, double-pole MRR may suffer from rank deficiency and its numerical stability decreases. We record the NMSE at each iteration of the two MRR methods and plot the results in Fig. 4 on a log scale. Fig. 4(a) shows the per-iteration NMSE of the globally optimal fits for the 40 nm MOSFET, where single-pole MRR adopts a sextic polynomial and double-pole MRR adopts a quartic polynomial. Fig. 4(b) shows the per-iteration NMSE of the two MRR methods when both adopt a quadratic polynomial. The best NMSE of single-pole MRR and double-pole MRR with the quadratic polynomial are
4.3 Fitting CNT-FET
In this section, we apply single-pole MRR to CNT-FET DC behavior modeling. Fig. 4 visualizes the measured
Normalized mean square error (NMSE) of Id is

4.4 Fitting LNA performance model
In this section, we apply the MRR methods to model LNA performance characteristics and demonstrate our algorithm on regression with more than two variables. Generally, a given performance figure of a circuit is obtained by solving the KCL and KVL equations. The solution is accurate but very time-consuming to compute. For behavioral simulation and optimization of an RF circuit, a direct mathematical mapping between design parameters and performance is highly useful.
We choose an LNA operating at 4 GHz and use Cadence to obtain simulation data for modeling. Figs. 7(a)–7(c) show the single-pole MRR fitting results for three LNA indicators: NF (noise factor), gain and power. Figs. 7(d)–7(f) show the per-iteration NMSE of single-pole MRR and Figs. 7(g)–7(i) show that of double-pole MRR. The performance of the two MRR methods is close. We sort the sample points in increasing order and record the index sequence, then plot the data points, predictions and errors in one figure according to the sorted index sequence. The training result is good. Normalized mean square error (NMSE) of NF, gain and power are 0.0031, 0.00710, and
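The sorting step used to prepare such a plot can be sketched with NumPy. The code assumes the samples are ordered by their target value (the text does not state the sort key) and uses invented numbers; plotting itself is left out.

```python
import numpy as np

def sort_for_plot(target, prediction):
    """Order samples by increasing target value and return the index
    sequence together with the reordered targets, predictions and errors."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    order = np.argsort(target)             # index sequence of the sorted samples
    t_sorted = target[order]
    p_sorted = prediction[order]
    return order, t_sorted, p_sorted, p_sorted - t_sorted

# Hypothetical targets and predictions for three samples.
order, t_sorted, p_sorted, err = sort_for_plot([3.0, 1.0, 2.0], [2.5, 1.5, 2.0])
```

Plotting `t_sorted`, `p_sorted` and `err` against the position in `order` gives a monotone target curve, which makes deviations of the prediction easy to see even when the inputs are multi-dimensional.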

5. Conclusion
This paper proposes a family of numerical methods, MRR, to approximate an unknown system and extract model parameters. We first use single-pole MRR to fit an artificial function, and the result is extremely good. We then compare the performance of single-pole MRR, double-pole MRR, OLS and LASSO on the SMIC 40 nm DC characteristics dataset. The results show that single-pole MRR has the highest fitting precision and that double-pole MRR performs better than single-pole MRR when the polynomial degree is low. The MRR methods have a more powerful nonlinear curve fitting ability than OLS and LASSO and are shown to be numerically stable. CNT-FET and LNA performance indicators are also modeled, and the fitting results are good as well. There are, however, two key points to bear in mind when using MRR methods. First, users have to pay close attention to the numerical stability of the MRR methods. We have carried out one artificial function fitting task and three device model fitting tasks for the convergence analysis of the two MRR methods, and the results show that single-pole MRR has better numerical stability than double-pole MRR. Second, the datasets used for curve fitting should be well distributed and not sparse. MRR methods are powerful in fitting a highly nonlinear function but can also lead to overfitting if the dataset is ill-distributed. Our paper shows that the MRR methods are good choices for statistical modeling of semiconductor devices as well as for other highly nonlinear curve fitting tasks.