# Accurate Interconnection Length and Routing Channel Width Estimates for FPGAs<sup>\*</sup>

Gao Haixia<sup>†</sup>, Ma Xiaohua, and Yang Yintang

(Microelectronics Institute, Xidian University, Xi'an 710071, China)

**Abstract :** We study the problem of the prediction of interconnection dimensions for FPGAs, including estimating interconnection length and channel width. Experimental results show that our estimates are more accurate than those of existing methods.

Key words:FPGA; interconnection length estimation; channel width estimationEEACC:1265A;1265ZCLC number:TN791Document code:AArticle ID:0253-4177 (2006) 07-1196-05

## 1 Introduction

For field programmable gate array (FPGA) design flow, the routing of nets is difficult and time-consuming. In commercial CAD tools, the majority of design time is spent in routing nets. With the increasing size and complexity of FPGA devices, CAD tools face the challenge of achieving good convergence of mapped designs in an acceptable time frame<sup>[1]</sup>. The prediction of wiring requirements is the problem that demands the most attention. For FPGA design flow, where the routing resources are fixed, the most useful parameter derived from the estimation process is the peak routing demand, i. e. routing channel width. Interconnect prediction is a difficult problem with applications spanning the entire FPGA design flow. Reliable interconnection estimates will help during placement, routing, and also during FPGA architecture studies<sup>[2]</sup>.

In this paper, we study the problem of interconnection size prediction for FPGAs. A technique for estimating interconnection length considering external interconnections is presented. A method for estimating FPGA channel width based on the total interconnection length of the design is proposed as well.

## 2 Interconnection length estimation

There have been many attempts to predict interconnection lengths<sup>[3-8]</sup>. However, none of them have considered interconnections connecting a gate to the outside of the chip (external interconnections). In this section, we will present an improved algorithm that includes the estimation of external interconnections.

### 2.1 Design and partitioning models

A design can be represented by a set of interconnected blocks as in Fig. 1 (a). A net that connects the blocks within the design is called an internal net, and a net that is connected to the outside of the design is called an external net. Every external net is connected with exactly one pin. Thus the number of pins equals the number of external nets. The design is placed in a physical architecture, which can be modeled as a square Manhattan grid as in Fig. 1 (b). In this grid, each grid-point (cell) corresponds to a location where one logic block of the design can be placed. Each pad corresponds to a pin of the design. The gridlines correspond to the routing channels. All lengths are measured using a Manhattan metric.

<sup>\*</sup> Project supported by the National Defense Advanced Foundation of China (No. 41308010205)

<sup>†</sup> Corresponding author. Email :hxgao @mail. xidian. edu. cn

Received 26 December 2005, revised manuscript received 18 March 2006



Fig. 1 Recursive partitioning scheme of the design (a) and the physical architecture (b)

The design and physical architecture are partitioned hierarchically into sub-designs and sub-architectures. Due to the symmetry of the architecture ,each step in the partitioning hierarchy divides the design and architecture into four sub-parts. Each sub-design (sub-architecture) at a hierarchical level consists itself of four sub-designs (sub-architectures) of equal size at the next (lower) level of the hierarchy, as shown in Fig. 1. Each sub-design is paired with a sub-architecture. The partitioning and placement process continues until each logic gate is paired with one grid location. Let us number the recursion levels from K (whole design), K - 1 (four sub-designs that constitute the whole design) down to 0 (four sub-designs consisting of only one logic gate).

### 2.2 Estimation of external interconnection length

The length of an interconnection between a gate and a pin should be computed as the length between the cell where the gate is placed and the closest available I/O pad location. Assuming the square grids have side length 2 ,the length of each external interconnection is bounded by 1 and . Accurate estimates should take placement optimization into account. An optimal placement strategy will make most external interconnections close to the chip border. In order to include this message in the estimates ,we first analyze the interconnections.

The interconnection length distribution shows the number of interconnections for each length. By enumerating all possible external interconnections, we obtain the interconnection length distribution of the external interconnections (called architectural distribution), which only depends on the physical architecture. The distribution P(l) of a given length l is given by

$$P(1) = \frac{2 + 1 - 21}{2}$$
(1)

Next we will compute the probability of an interconnection of a given length l to be assigned an actual placement. It depends on the interconnection complexity ,and thus on the Rent exponent r. This exponent is a measure of the interconnection complexity of the design. Its value is bounded by 0 and 1. Since every cell in the architecture should have the same occupancy distribution ,the occupancy distribution f(l) of external interconnections should be equal to the one for internal interconnections<sup>[3]</sup>

$$f(1) \sim 1^{2r-3}$$
 (2)

The interconnection length distribution of external interconnections is the product of its architectural distribution and its occupancy distribution. Combining Eq. (1) and Eq. (2) , the expression of average external interconnection length  $L_{\text{ext}}$ is

$$L_{ext} = \frac{1P(1) f(1)}{P(1) f(1)} = \frac{[1^{2r \cdot 2} (2 + 1 - 21)]}{[1^{2r \cdot 3} (2 + 1 - 21)]}$$
(3)

The sums in Eq. (3) cannot be computed analytically as a function of . To obtain an analytical form of the average external interconnection length, we approximate Eq. (3) by Eq. (4), multiplied by a scaling factor.

$$L_{ext} = \frac{1^{2r-2}(2 + 1 - 1) d1}{1^{2r-3}(2 + 1 - 1) d1}$$
$$= \frac{2 + 1}{2r - 1} \times (2r - 1) - \frac{1}{r} \times (2r - 1 - 1) - \frac{1}{r} \times (2r - 1 - 1) - \frac{1}{r} \times (2r - 1 - 1) - \frac{1}{2r - 1} \times (2r - 1 - 1) - \frac{2}{2r - 1} \times (2r - 1 - 1) - \frac{2}{2r - 1} \times (2r - 1 - 1) - \frac{2}{r} + \frac{2}{r$$

In order to determine the scaling factor, we calculate the external distributions using Eqs. (3) and (4) for designs with different numbers of gates and a Rent exponent r equal to 0. 6. The results can

be seen in Fig. 2. This figure shows that with the introduction of a scaling factor equal to 0. 74, the estimate coincides with the theoretical data for a wide range of numbers of gates. We carry out many experiments for different  $r(0.4 \ r \ 0.8)$ , and find that the scaling factor depends on r and equals 0.  $15r^2 - 0.06r + 0.72$ . The average value of the scaling factor r is 0. 74. Therefore, the average external interconnection length is approximately equal to



Fig. 2 Evaluation of the approximation of the average external interconnection length

### 2.3 Estimation of average interconnection length

Assuming the total gate number G equals  $4^{\kappa}$ , in which K is the number of partitioning layers, the average interconnection length L of the whole design is estimated by computing the average interconnection number  $N_k$  and the average interconnection length  $l_k$  of every layer.

$$\overline{L} = \sum_{k=0}^{K} (N_k l_k) / \sum_{k=0}^{K} N_k$$
(6)

Thus combining internal interconnection and external interconnection, we can compute the average interconnection length of the whole design as  $\frac{K-1}{K-1}$ 

$$\overline{L} = \frac{(N_k l_k) + N_{ext} L_{ext}}{K l_k}$$
(7)

According to Ref. [3],

$$N_{k} = 0.5 A G (1 - 4^{r-1}) 4^{k(r-1)}$$
(8)  

$$I_{k} = 2^{k} \left[ \frac{2}{3} \times \frac{r-1}{r+1} \times \frac{3^{2r+2} - (r+4)2^{2r+2} + (4r+7)}{3^{2r+1} - (2r+7)2^{2r} + (4r+5)} + \frac{1}{3} \times \frac{r-1}{r+1} \times \frac{4^{2r+1} - 3^{2r+2} + 3 \times 2^{2r+1} - 1}{4^{2r} - 3^{2r+1} + 3 \times 2^{2r} - 1} \right]$$
(9)

The number of external interconnections  $N_{\text{ext}}$  can be obtained using the Rent rule,

$$N_{ext} = AG^{r}$$
(10)

where A is the average number of point-to-point interconnections for a logic block and G is the number of gates in the design. Combining equations 4, 5, 8, 9 and 10, the average interconnection length of the whole design is obtained.

### 2.4 Validation

In order to validate our new external interconnection length estimates, we set up some experiments using several benchmark circuits<sup>[3]</sup>. For each of the benchmark circuits, the total number of gates G and the Rent exponent r are presented in Table 1. Table 1 also shows the experimentally measured interconnection lengths  $L_{exp}$ , theoretical interconnection length estimate  $L_D$ ,  $L_D$  and L using Donath 's technique, the improved Donath 's technique, and our own technique, respectively. It is clear from the table that our estimates are closest to the experimental results for all benchmark circuits, especially for the design with low interconnection complexity (low r). Compared to  $L_D$  and  $L_{\mathbb{D}}$ , the improvements of our estimates are 29 % and 5 % respectively.

| No      | G    | r    | Lexp | Do   | nath <sup>[6]</sup> |                | np ro ved<br>nat h <sup>[6]</sup> | Our analysis |          |  |
|---------|------|------|------|------|---------------------|----------------|-----------------------------------|--------------|----------|--|
|         |      |      |      | LD   | Error/ %            | $L \mathbb{D}$ | Error/ %                          | L            | Error/ % |  |
| 1       | 528  | 0.59 | 2.15 | 4.02 | 47                  | 2.88           | 25                                | 2.72         | 21       |  |
| 2       | 576  | 0.75 | 2.85 | 5.26 | 46                  | 4.13           | 31                                | 3.42         | 17       |  |
| 3       | 671  | 0.57 | 2.63 | 4.07 | 35                  | 2.89           | 9                                 | 2.76         | 5        |  |
| 4       | 1239 | 0.47 | 2.14 | 3.76 | 43                  | 2.45           | 13                                | 2.42         | 12       |  |
| 5       | 2148 | 0.75 | 3.50 | 7.37 | 53                  | 5.74           | 39                                | 4.96         | 29       |  |
| 6       | 1024 | 0.40 | 1.96 | 3.28 | 40                  | 2.02           | 3                                 | 2.00         | 2        |  |
| 7       | 1024 | 0.50 | 2.21 | 3.79 | 42                  | 2.60           | 15                                | 2.55         | 13       |  |
| 8       | 1024 | 0.60 | 2.58 | 4.61 | 44                  | 3.32           | 22                                | 3.16         | 18       |  |
| Average |      |      |      |      | 44                  |                | 20                                |              | 15       |  |

 Table 1
 Comparison of average interconnection length

## 3 Channel width estimation for FP-GAs

Recently, some efforts have been made to use Rent 's Rule to address the problem for FPGAs. RISA<sup>[9]</sup> is an empirical interconnection estimation method based on the wiring distribution map, derived from a large number of randomly generated optimal Steiner trees. Lou 's Method<sup>[10]</sup> estimates routing demand using the ratio of the number of paths that use a specific routing region to the total number of possible paths. The fGREP<sup>[1]</sup> and fGREP2<sup>[2]</sup> methods are new interconnection estimation methods based on the concept of routing flexibility over the routing elements. In this section we propose our new estimation method, which is very simple to understand and easy to compute.

#### 3.1 Proposed model

Recently, it has been found that routability is the best predicted by estimating total wire length in a circuit rather than the average wire length<sup>[11]</sup>. Assuming the number of logic blocks N needed to realize a design in FPGA equals  $4^{K}$ , we can obtain the total wire length

$$L_{\text{total}} = (N_k l_k) + N_{\text{ext}} L_{\text{ext}}$$
(11)

For 2D island-style FPGAs with one-unit-long wire segments<sup>[12]</sup>, channel width W is given by

$$W = \frac{L_{\text{total}}}{2N} = \frac{(N_k l_k) + N_{\text{ext}} L_{\text{ext}}}{2 \times 4^K} \quad (12)$$

where  $N_k$ ,  $I_k$ ,  $N_{ext}$  and  $L_{ext}$  can be obtained using equations 8,9,10 and 5.

### 3.2 Validation

In order to validate our new model, we set up some experiments using 20 benchmark circuits from MCNC. For each of the benchmark circuits, the total number of CLBs and the Rent exponent rare presented in Table 2. The results are also tabulated in Table 2.  $W_{\text{VPR}}$ ,  $W_{\text{GI}}$ ,  $W_{\text{GE}}$ ,  $W_{\text{F}}$ ,  $W_{\text{F2}}$  and Ware, respectively, the channel width predicted by VPR 's detailed router, Gamal 's model in which the average interconnection estimation only considers internal interconnections, Gamal 's model in which the average interconnection estimation considers internal and external interconnections, the f GREP method, the f GREP2 method, and our new model. The  $W_{VPR}$  in the Table 2 is different from the literature<sup>[1]</sup>. The number of I/O pads per row (column) in Ref. [1] is 1 (io  $\_$  rat = 1) in the FP-GA architecture description file. We set io  $\_$  rat = 2, in line with current commercial FPGAs<sup>[13]</sup>.

Table 2 Comparison of channel width

| Benchmark<br>circuit # CLBs | " (T D |      |              | Gamal    |              |          | f GR EP |          | f GR EP2     |          | New model |          |       |
|-----------------------------|--------|------|--------------|----------|--------------|----------|---------|----------|--------------|----------|-----------|----------|-------|
|                             | r      | Wvpr | $W_{\rm GI}$ | Error/ % | $W_{\rm GE}$ | Error/ % | WF      | Error/ % | $W_{\rm F2}$ | Error/ % | W         | Error/ % |       |
| Alu4                        | 1522   | 0.67 | 11           | 13.07    | 18.8         | 12.56    | 14.2    | 9.97     | 9.4          | 9.80     | 10.9      | 9.95     | 9.5   |
| Apex2                       | 1878   | 0.72 | 12           | 15.33    | 27.8         | 14.35    | 19.6    | 10.54    | 12.2         | 10.92    | 9         | 11.47    | 4.4   |
| Apex4                       | 1262   | 0.80 | 14           | 19.71    | 40.8         | 17.06    | 21.9    | 11.96    | 14.6         | 12.12    | 13.4      | 13.98    | 0.1   |
| Bigkey                      | 1707   | 0.51 | 7            | 7.91     | 13           | 7.86     | 12.3    | 7.47     | 6.7          | 7.76     | 10.9      | 6.14     | 12.3  |
| Clma                        | 8383   | 0.63 | 13           | 14.67    | 12.8         | 14.44    | 11.1    | 11.06    | 14.9         | 11.92    | 8.3       | 11.31    | 13    |
| Des                         | 1591   | 0.65 | 8            | 12.27    | 53.4         | 11.88    | 48.5    | 8.07     | 0.9          | 8.31     | 3.9       | 9.38     | 17.3  |
| Diffeq                      | 1497   | 0.60 | 9            | 10.47    | 16.3         | 10.27    | 14.1    | 8.09     | 10.1         | 8.88     | 1.3       | 8.07     | 10.3  |
| Dsip                        | 1370   | 0.52 | 8            | 8.16     | 2            | 8.09     | 1.1     | 7.56     | 5.5          | 7.61     | 4.9       | 6.33     | 20.9  |
| Elliptic                    | 3604   | 0.71 | 13           | 14.85    | 14.2         | 13.99    | 7.6     | 10.46    | 19.5         | 10.53    | 19        | 11.15    | 14.2  |
| Ex1010                      | 4598   | 0.68 | 12           | 17.94    | 49.5         | 17.44    | 45.3    | 10.73    | 10.6         | 11.39    | 5.1       | 13.73    | 14.4  |
| Ex5p                        | 1064   | 0.80 | 16           | 19.71    | 23.2         | 17.06    | 6.6     | 12.97    | 18.9         | 13.12    | 19.5      | 13.98    | 12.6  |
| Frisc                       | 3556   | 0.74 | 16           | 16.33    | 2.1          | 15.07    | 5.8     | 11.76    | 26.5         | 12.37    | 22.7      | 12.10    | 24.4  |
| Misex3                      | 1397   | 0.75 | 12           | 16.86    | 40.5         | 15.43    | 28.6    | 10.02    | 16.5         | 10.72    | 10.7      | 12.42    | 3.5   |
| Pdc                         | 4575   | 0.77 | 19           | 25.85    | 36.1         | 23.86    | 25.6    | 16.16    | 14.9         | 16.24    | 14.5      | 19.11    | 0.6   |
| S298                        | 1931   | 0.49 | 9            | 7.44     | 17.3         | 7.40     | 17.8    | 7.59     | 15.7         | 8.34     | 7.3       | 5.78     | 35.8  |
| S38417                      | 6406   | 0.57 | 9            | 11.58    | 28.7         | 11.49    | 27.7    | 8.90     | 1.1          | 8.76     | 2.7       | 8.97     | 0.3   |
| S38584.1                    | 6447   | 0.55 | 9            | 10.72    | 19.1         | 10.66    | 18.4    | 8.94     | 0.7          | 8.68     | 3.6       | 8.31     | 7.7   |
| Seq                         | 1750   | 0.75 | 13           | 16.86    | 29.7         | 15.43    | 18.7    | 10.38    | 20.2         | 10.99    | 15.5      | 12.42    | 4.5   |
| Spla                        | 3690   | 0.78 | 16           | 18.52    | 15.8         | 16.45    | 2.8     | 11.87    | 25.8         | 12.87    | 19.6      | 13.37    | 16.4  |
| Tseng                       | 1047   | 0.57 | 8            | 9.53     | 19.1         | 9.40     | 17.5    | 7.38     | 7.8          | 7.43     | 7.1       | 7.36     | 8     |
| Average                     |        |      |              |          | 24.01        |          | 18.26   |          | 12.63        |          | 10.51     |          | 11.51 |

 $W_{\text{GI}}$ ,  $W_{\text{GE}}$ ,  $W_{\text{F}}$ ,  $W_{\text{F2}}$ , and W are compared with the experimental results  $W_{\text{VPR}}$ . Compared to  $W_{\text{GI}}$ ,  $W_{\text{GE}}$  and  $W_{\text{F}}$ , the improvements of our estimates are 12.50 %, 6.75 %, and 1.12 %, respectively. In addition, the improvement of  $W_{\text{GE}}$  is 5.75 % compared to  $W_{\text{GI}}$ . This result further validates the accuracy of the average inter-

connection length prediction described in section 2. Although the error of our estimates is 1 % larger than that of  $W_{F2}$ , our estimates can save much time by making the computations more easily.

### 4 Conclusion

We have presented the problem of interconnection prediction for FPGAs. In the interconnection length estimation, we introduce the external interconnection estimation. The results show that our estimates are closest to the experimental results. Compared to that of Donath 's technique and the improved Donath 's technique, the improvements of our estimates are 29 % and 5 % respectively. Also ,we propose an FPGA channel width estimation method based on the total interconnection length of the design. Compared to  $W_{GI}$ ,  $W_{GE}$  and  $W_{\rm F}$ , the improvements of our estimates are 12. 50 %, 6. 75 %, and 1. 12 %, respectively. Although the error of our estimates is 1 % larger than that of  $W_{F2}$ , our estimates can save much time by making the computations more easily.

In conclusion, using our new methods, the estimations of the interconnection length and the channel width for FPGAs are very easy and accurate.

### References

- [1] Kannan P,Balachandran S,Bhatia D. f GREP-fast generic routing demand estimation for placed FPGA circuits. Proceedings of 11th International Workshop on Field Programmable Logic and Applications ,Berlin , Germany ,2001:37
- [2] Kannan P, Balachandran S, Bhatia D. On metrics for comparing interconnect estimation methods for FPGAs. IEEE Trans Very Large Scale Integration Systems, 2004, 12(4):381
- [3] Stroobandt D, Van Marck H, Van Campenhout J. An accurate

interconnection length estimation for computer logic. Proceedings of the 6th Great Lakes Symposium on VLSI, Ames, Iowa, USA, 1996:50

- [4] Sutherland I E, Oestreicher D. How big should a printed circuit board be. IEEE Trans Computers, 1972, C-22(5):537
- [5] Donath W E. Placement and average interconnection lengths of computer logic. IEEE Trans Circuits and Systems, 1979, CAS 26(4):272
- [6] Van Marck H, Stroobandt D, Van Campenhout J. Interconnection length distributions in 3-dimensional anisotropic systems. Proceedings of the 13th Iasted International Conference on Applied Informatics, Igls, Austria, 1995:98
- [7] Stroobandt D, Van Campenhout J. Estimating interconnection lengths in three-dimensional computer systems. Proceedings of the 6th Workshop on Synthesis And System Integration of Mixed technologies, Fukuoka, Kyushu, Japan, 1996:236
- [8] Cotter J E, Christie P. The analytical form of the length distribution function of computer interconnections. IEEE Trans Circuits and Systems, 1991, 38 (3):317
- [9] Cheng C E. RISA :accurate and efficient placement routability modeling. Proceedings of the International Conference on Computer-Aided Design ,San Jose ,CA ,2001
- [10] Lou J , Krishnamoorthy S ,Sheng H S. Estimating routing congestion using probabilistic analysis. Proceedings of International Symposium on Physical Design ,2001
- [11] Rahman A, Das S, Chandrakasan A, et al. Wiring requirement and three-dimensional integration of field-programmable gate arrays. Proceedings of 3rd International Workshop on System Level Interconnect Prediction, Sibina, California, USA, 2001
- [12] Gao Haixia, Yang Yintang, Dong Gang. Theoretical analysis of the effect of LUT size on the area and delay of FPGA. Chinese Journal of Semiconductors ,2005 ,26(5) :893
- [13] Betz V, Rose J. VPR: a new packing, placement and routing tool for FPGA research. Proceedings of International Workshop on Field Programmable Logic and Applications, London, UK, 1997

# 准确的 FPGA 互连长度和通道宽度估计\*

### 高海霞† 马晓华 杨银堂

(西安电子科技大学微电子研究所,西安 710071)

摘要:提出了一种新的 FPGA 互连预测算法,包括互连长度估计算法和通道宽度估计算法.实验结果表明,与现有 算法相比,该估计算法能获得更准确的估计结果.

<sup>\*</sup>国防科技预先研究基金资助项目(批准号:41308010205)

<sup>†</sup>通信作者. Email : hxgao @mail. xidian. edu. cn 2005-12-26 收到,2006-03-18 定稿