# A Run-to-Run Film Thickness Control of Chemical-Mechanical Planarization Processes

Jingang Yi Department of Mechanical Engineering Texas A&M University College Station, TX 77843, USA jgyi@tamu.edu Wei-Shu Sang and Eugene Zhao CSBG Group Lam Research Corporation Fremont, CA 94538, USA {wei-shu.sang,eugene.zhao}@lamrc.com

Abstract: With the continuing shrink of device geometries, tight control of semiconductor manufacturing processes becomes a critical factor in improving process performance, throughput and yield. In this paper, we present the design, analysis and implementation of a run-to-run film thickness control scheme for chemical-mechanical planarization (CMP) processes. A predictor-corrector type of control law is used to regulate the CMP process time. The control algorithm uses the monitor wafer removal rate and the consumable lifetime to compensate for process drifts and shifts. We also discuss a compensation method for CMP polisher head-to-head variations. The process results in a production fab show a significant improvement of CMP performance under the proposed control scheme.

## I. INTRODUCTION

Chemical-mechanical planarization (CMP) is widely used in semiconductor manufacturing as a planarization strategy. Determining and controlling when to stop a CMP process is a challenging task. Control of CMP processes can be classified into two categories [1]: polishing an inlaid film using an end-point detection mechanism, and polishing a dielectric film (such as an inter-level dielectric (ILD) or intermetal dielectric (IMD)) to a target film thickness using run-to-run controllers. In this study, we discuss a film thickness run-to-run control scheme for the IMD CMP process.

Run-to-run thickness control of CMP processes has been widely developed and used in recent years [2]-[6]. In [2], a linear process model was built for CMP processes and an exponentially weighted moving average (EWMA) scheme was used to estimate the model parameters. In [7], [8], a neural network and an adaptive optimization scheme were studied to tune the weighting parameter $\lambda$ in the EWMA scheme. In [5], the concept of sheet film equivalent (SFE) was proposed for designing a run-to-run film thickness controller for CMP processes. The authors also compared the proposed run-to-run CMP process control law with an EWMA controller [5]. The EWMA control scheme can compensate for slow process drifts. However, an EWMA controller cannot compensate for severe process shifts such as a change of CMP process consumables. In [9], a predictor-corrector control (PCC) scheme, or so-called double EWMA, was discussed to compensate for both process shifts and drifts. In [4], an extension of the double EWMA control scheme was proposed to compensate for process drifts and shifts due to consumable aging. Recently, [10] extended the EWMA estimation scheme to a recursive least squares estimation method and applied it to etch rate estimation.

Most of the work above uses some estimation scheme (e.g., EWMA) to predict the material removal rate (RR) of the CMP process. In this paper, we design a direct run-to-run polishing time controller. The proposed run-to-run controller is based on a feedforward estimation (predictor) before polishing a wafer and feedback metrology measurements (corrector) after the polishing process. We use the removal rate information from the daily monitoring blanket wafers to compensate for the process drifts due to pad and conditioner disk wear. The blanket wafer removal rate information is also used to compensate for the process shifts due to consumable changes. The proposed film thickness control scheme has several advantages. First, the control scheme can compensate for both process drifts and shifts and therefore improve the robustness of the CMP performance. Second, the blanket monitor wafer information can be used to compensate for head-to-head variations and the process shifts. Third, the control scheme is easy to implement in a fab production environment without a high capital investment. It is therefore an effective method to improve productivity.

The paper is organized as follows. In Section II, we describe the Lam linear planarization technology (LPT). We present the run-to-run control scheme for the IMD CMP processes in Section III. Implementation issues and production wafer results are discussed in Section IV. Finally, we present concluding remarks in Section V.

## II. LINEAR CHEMICAL-MECHANICAL PLANARIZATION

CMP processes use the chemical and mechanical interactions among the wafer, the polishing pad, and the slurry to planarize the wafer surface. Fig. 1 shows the schematic of the Lam LPT. The polishing pad moves linearly while the wafer carrier rotates against the pad. An air bearing supports the polishing pad from the air platen underneath. During a CMP process, the surface of the polishing pad must be maintained at a certain roughness level to keep the process performance stable. Conditioning the pad is an effective method to maintain the roughness level. In practice, a moving conditioning disk is pushed against the moving polishing pad (Fig. 1).

An integrated CMP polisher consists of two polisher modules (as shown in Fig. 2) and other auxiliary modules such as post-CMP cleaning modules and transferring robots etc. A set of four heads (wafer carriers) move wafers from one module to the next module by a head indexer.



Fig. 1. Schematic of an LPT CMP module.



Fig. 2. A layout schematic of Lam CMP systems.

One of the most important CMP process specifications is the material removal rate (RR). Maintaining a stable RR is challenging because of the varying environment, the wear and changes of the polishing pad and conditioner disk, and the lack of *in-situ* sensors. Fig. 3 shows the removal rate variations of blanket oxide monitor wafers over the polishing pad lifetime in a production fab. The RR drifts down slowly during a pad lifetime, and it shifts significantly when the polishing pad is changed. The amount of RR drift and shift also varies from pad to pad and from run to run. For patterned wafers, the RR also depends on the wafer topography, which makes it more difficult to predict accurately. Therefore we need an RR adjustment mechanism to compensate for the RR drifts and shifts.



Fig. 3. RRox drifts and shifts as functions of the polishing pad lifetime.

## III. POST-CMP FILM THICKNESS CONTROLLER DESIGN

Fig. 4 shows a schematic diagram of the IMD film thickness change during a CMP process. For IMD devices, an inter-connection metal layer (Al/Cu) is embedded into the oxide layer (SiO<sub>2</sub>). The oxide layer deposition process produces a non-planarized topography. A CMP process is then used to remove and planarize the wafer surface down to a target oxide film thickness ( $L_1$ ).



Fig. 4. A schematic of the film thickness change for an IMD CMP process.

Due to the throughput requirement, we cannot measure each wafer before and after the CMP process to obtain the incoming and post-CMP film thicknesses. The metrology could be either in-line (integrated with the CMP polisher) or off-line (stand-alone). In this study, we use off-line metrology to measure the incoming and post-CMP thicknesses of one wafer per lot (25 wafers). The measurement delay could be one or two lots.

There are mainly three types of variations in the process flow that affect the CMP performance: (1) incoming thickness variations, (2) CMP process variations, and (3) metrology measurement noise. The incoming thickness variation comes from the upstream process flows, such as deposition. The CMP process variation results from the process drifts and shifts due to the wear and change of consumables (pad and conditioner disk). In this paper, we neglect the metrology measurement noise and only consider the first two variations. We design a run-to-run film thickness control scheme in two steps. In the first step, we propose a prediction-correction scheme to compensate for the incoming thickness variations, assuming the RR is stable; in the second step, we modify the control scheme to compensate for the RR variations.

### A. Design of run-to-run thickness control scheme with a constant RR

We denote the nominal incoming film thickness as $L_0$, the target post-CMP film thickness as $L_1$, and the metrology delay as $N_d$. At the *n*th run (lot), we measure the pre-thickness $L_{\rm pre}(n)$ (in Å) and calculate the polishing time $T_p(n)$ (in seconds) according to the following equations.

1. **Prediction**: before we polish the wafers of the *n*th run (lot),

$$T_0(n) = \sum_{i=1}^{N} \alpha_i T(n-i), \tag{1}$$

$$T_p(n) = T_0(n) + \frac{L_{\text{pre}}(n) - L_0}{\overline{RR}_p}. \tag{2}$$

2. **Correction**: after the post-thickness measurement $L_{\text{pst}}$ becomes available at the *n*th run (lot),

$$T(n) = T_0(n) + \frac{L_{\text{pst}}(n - N_d) - L_1}{\overline{RR}_p}, \tag{3}$$

where $\alpha_i$ are the moving average (MA) coefficients, $\overline{RR}_p$ is the estimated production wafer RR (constant), and N is the moving average horizon length. T(n) is a state variable, and $T_0(n)$ is a moving average of the state variable T(n). The MA model coefficients $\alpha_i$ satisfy

$$\alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i = 1. \tag{4}$$
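As an illustration, one prediction and correction step of Eqs. (1)-(3) can be sketched in Python as follows. The function names and all numerical values are hypothetical and chosen only for this example; they are not taken from the production implementation. The removal rate is converted to Å/s so that times are in seconds.

```python
from collections import deque


def predict_time(t_hist, alphas, l_pre, l0, rr_bar):
    """Eqs. (1)-(2): return (T0, Tp) for the coming run.

    t_hist[0] is T(n-1), t_hist[1] is T(n-2), and so on.
    """
    t0 = sum(a * t for a, t in zip(alphas, t_hist))
    tp = t0 + (l_pre - l0) / rr_bar
    return t0, tp


def correct_state(t0, l_pst_delayed, l1, rr_bar):
    """Eq. (3): new state T(n) from the delayed measurement L_pst(n - N_d)."""
    return t0 + (l_pst_delayed - l1) / rr_bar


# Illustrative numbers only (not production data).
alphas = [0.5, 0.3, 0.2]                      # non-negative, sums to one, Eq. (4)
t_hist = deque([94.0, 93.0, 95.0], maxlen=len(alphas))
rr_bar = 6400.0 / 60.0                        # 6400 angstrom/min in angstrom/s

t0, tp = predict_time(t_hist, alphas, l_pre=15700.0, l0=15500.0, rr_bar=rr_bar)
t_new = correct_state(t0, l_pst_delayed=5600.0, l1=5500.0, rr_bar=rr_bar)
t_hist.appendleft(t_new)                      # newest state first, oldest dropped
```

In this sketch an incoming wafer that is 200 Å thicker than nominal lengthens the polishing time, and a delayed post-CMP measurement that is 100 Å above target raises the state used by later moving averages.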

The post-thickness $L_{pst}(n)$ for the *n*th run can be written as a function of the pre-thickness $L_{pre}(n)$ and the polishing time $T_p(n)$:

$$L_{\text{pst}}(n) = L_{\text{pre}}(n) - RR_p(n)T_p(n), \qquad (5)$$

where  $RR_p(n)$  is the average RR for the *n*th run. Combining Eqs. (1)-(3) and (5), we obtain

$$\begin{split} T_{0}(n) &= \sum_{i=1}^{N} \alpha_{i} \left[ T_{0}(n-i) - \beta_{n-i-N_{d}} T_{0}(n-i-N_{d}) \right] \\ &\quad + \frac{1}{\overline{RR}_{p}} \sum_{i=1}^{N} \alpha_{i} (1 - \beta_{n-i-N_{d}}) L_{\text{pre}}(n-i-N_{d}) \\ &\quad + \frac{1}{\overline{RR}_{p}} \sum_{i=1}^{N} \alpha_{i} \left( \beta_{n-i-N_{d}} L_{0} - L_{1} \right), \end{split}$$

where

$$\beta_i = \frac{RR_p(i)}{\overline{RR}_p}$$

is the ratio of the actual production wafer removal rate $RR_p(i)$ at the *i*th run to the estimated removal rate $\overline{RR}_p$.

It is appropriate to assume that the pre-thicknesses $L_{\rm pre}(n)$ are independent and identically distributed (i.i.d.) normal random variables, i.e., $L_{\rm pre} \sim \mathcal{N}(L_0, \sigma_{\rm pre}^2)$, where $\sigma_{\rm pre}$ is the standard deviation. Then we can write $L_{\rm pre}(n)$ as

$$L_{\rm pre}(n) = L_0 + w_{\rm pre}(n), \qquad (6)$$

where  $w_{\text{pre}}(n)$  is i.i.d. and  $w_{\text{pre}}(n) \sim \mathcal{N}(0, \sigma_{\text{pre}}^2)$ . Plugging the above equation into  $T_0$  dynamics and using Eq. (4), we obtain

$$T_0(n) = \sum_{i=1}^N \alpha_i \left[ T_0(n-i) - \beta_{n-i-N_d} T_0(n-i-N_d) \right] + \frac{1}{\overline{RR}_p} \sum_{i=1}^N \alpha_i (1 - \beta_{n-i-N_d}) w_{\text{pre}}(n-i-N_d) + \frac{\Delta L}{\overline{RR}_p} , \tag{7}$$

where  $\Delta L = L_0 - L_1$  is the target material removal.

**Remark 1** The stability of $T_0(n)$ depends on the choice of the model coefficients $\alpha_i$, the moving average horizon length N, and the metrology delay $N_d$. In this study, the metrology delay is only one or two lots ($N_d = 1$ or 2). If we neglect such a delay, we can prove that $T_0$ is stable when $\alpha_i$ is chosen according to Eq. (4). A similar discussion can be found in [11].
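The behavior described in Remark 1 can also be checked numerically for a particular parameter set. The sketch below simulates the noise-free closed loop of Eqs. (1)-(3) and (5) with a one-lot delay. It is only an illustration under stated assumptions: the weights are exponentially decaying and renormalized so that Eq. (4) holds exactly, the ratio $\beta = 0.9$ is hypothetical, no correction is applied until the first delayed measurement is available, and `simulate_t0` is not a function from the production system.

```python
import numpy as np


def simulate_t0(beta=0.9, lam=0.33, N=20, n_d=1, runs=300,
                l0=15500.0, l1=5500.0, rr_bar=6400.0 / 60.0):
    """Noise-free closed loop of Eqs. (1)-(3) and (5) with a constant RR."""
    alphas = lam * (1.0 - lam) ** np.arange(N)
    alphas /= alphas.sum()                  # assumption: enforce Eq. (4) exactly
    rr_true = beta * rr_bar                 # actual production removal rate
    t_state = [(l0 - l1) / rr_bar] * N      # initial T(n-1), ..., T(n-N), newest first
    l_pst, t0_trace = [], []
    for n in range(runs):
        t0 = float(np.dot(alphas, t_state[:N]))           # Eq. (1)
        l_pre = l0                                        # no incoming variation
        tp = t0 + (l_pre - l0) / rr_bar                   # Eq. (2)
        l_pst.append(l_pre - rr_true * tp)                # Eq. (5)
        if n >= n_d:                                      # delayed measurement exists
            t_new = t0 + (l_pst[n - n_d] - l1) / rr_bar   # Eq. (3)
        else:
            t_new = t0                                    # assumption: no correction yet
        t_state.insert(0, t_new)
        t0_trace.append(t0)
    return np.array(t0_trace), np.array(l_pst)


t0_trace, l_pst = simulate_t0()
t_star = (15500.0 - 5500.0) / (0.9 * 6400.0 / 60.0)   # time that removes Delta L at the true RR
```

For this parameter set the simulated $T_0(n)$ settles at `t_star` and the post-CMP thickness settles at the target, even though the controller's removal rate estimate is 10% too high. Other choices of $\lambda$, N and $N_d$ would need to be checked separately.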

Using Eqs. (2), (5) and (7), we can obtain the post-thickness $L_{pst}(n)$ as a function of $T_0$.

$$L_{\rm pst}(n) = L_0 - \beta_n \overline{RR}_p T_0(n) + (1 - \beta_n) w_{\rm pre}(n) \,. \tag{8}$$

We are ready to calculate the post-thickness variation due to the pre-thickness variation $w_{pre}(n)$. In this case, $RR_p(n) = RR_p$ is constant and therefore

$$\beta_i = \frac{RR_p(i)}{\overline{RR}_p} = \frac{RR_p}{\overline{RR}_p} = \bar{\beta} = \text{constant}.$$

We can calculate the expectation and variance of the post-thickness  $L_{pst}(n)$  as

$$\mathbf{E}(L_{\rm pst}) = L_1 \tag{9}$$

$$\mathbf{Var}(L_{\text{pst}}) = \sigma_{\text{pst}}^2 = \frac{2(1-\bar{\beta})^2}{2-\bar{\beta}}\sigma_{\text{pre}}^2. \tag{10}$$

From the analysis above, we can see that the expectation of $T_0(n)$ is the polishing time that removes a thickness of $\Delta L$ at the constant removal rate $RR_p$. Under such a control scheme, the expected post-CMP film thickness equals the target thickness $L_1$. To obtain the smallest variation, Eq. (10) requires $\bar{\beta} = 1$, which means that the estimated production wafer removal rate is equal to the real RR.
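The expectation in Eq. (9) can be recovered from Eqs. (7) and (8) in two lines. The step below assumes that $T_0(n)$ has reached a stationary mean (see Remark 1) and uses $\mathbf{E}(w_{\rm pre}) = 0$, $\sum_i \alpha_i = 1$ and $\beta_i = \bar{\beta}$:

$$\begin{aligned}
\mathbf{E}(T_0) &= (1-\bar{\beta})\,\mathbf{E}(T_0) + \frac{\Delta L}{\overline{RR}_p}
\;\Longrightarrow\;
\mathbf{E}(T_0) = \frac{\Delta L}{\bar{\beta}\,\overline{RR}_p} = \frac{\Delta L}{RR_p}, \\
\mathbf{E}(L_{\rm pst}) &= L_0 - \bar{\beta}\,\overline{RR}_p\,\mathbf{E}(T_0) = L_0 - \Delta L = L_1 .
\end{aligned}$$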

### B. Design enhancement with blanket monitor wafer information

In this section, we investigate how to use the daily monitor wafer removal rate $RR_{ox}(n)$ to compensate for the slow removal rate drift and the large RR shifts. In production runs, a few blanket oxide test wafers are regularly sent to the CMP polisher to check and monitor the process stability and machine conditions. We can use the RR information of these blanket oxide wafers to estimate the production wafer removal rate $RR_p(n)$.

Assume that the oxide monitor wafer removal rate $RR_{ox}(n)$ and the production wafer removal rate $RR_p(n)$ follow the same trend as functions of consumable lifetime:

$$RR_p(n) = RR_{p0}(1 + \epsilon(n)) + w_p(n), \tag{11}$$

$$RR_{ox}(n) = RR_{ox0}(1 + \epsilon(n)) + w_{ox}(n), \quad (12)$$

where $\epsilon(n)$ is a removal rate drifting function, $RR_{p0}$ and $RR_{ox0}$ are the production and blanket wafer removal rates with a new consumable set (n = 0), and $w_p$ and $w_{ox}$ are the run-to-run removal rate noises for production and oxide blanket wafers, respectively.

If we neglect the removal rate noises $w_p$ and $w_{ox}$, we can rewrite $\beta_n$ as

$$\beta_n = \frac{RR_p(n)}{\overline{RR}_p} = \frac{RR_{p0}}{\overline{RR}_p} \frac{RR_{ox}(n)}{RR_{ox0}} = \overline{\beta} \frac{RR_{ox}(n)}{RR_{ox0}} . \tag{13}$$

With the oxide removal rate $RR_{ox}(n)$ at the *n*th run, we can enhance the run-to-run thickness control scheme discussed in the previous section using the above equation. In particular, we can cancel the impact of the removal rate drift due to consumable changes by adding the compensation factor $\gamma(n)$ to the control algorithm (2) as

$$T_p(n) = T_0(n) + \gamma(n) \frac{L_{\text{pre}}(n) - L_0}{\overline{RR}_p}, \qquad (14)$$

where  $\gamma(n) = \frac{RR_{ox0}}{RR_{ox}(n)}$ . To see how the modified estimation (14) can cancel the process shifts, one can re-derive  $T_0$  dynamics with Eqs. (13) and (14).
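As a sketch of the first step of that re-derivation, and still neglecting the removal rate noises as in Eq. (13), substituting Eq. (14) into Eq. (5) gives

$$\begin{aligned}
L_{\rm pst}(n) &= L_{\rm pre}(n) - \beta_n \overline{RR}_p T_0(n) - \beta_n \gamma(n) \left( L_{\rm pre}(n) - L_0 \right) \\
&= (1 - \bar{\beta}) L_{\rm pre}(n) + \bar{\beta} L_0 - \beta_n \overline{RR}_p T_0(n),
\end{aligned}$$

because $\beta_n \gamma(n) = \bar{\beta}$ by Eq. (13). The coefficient of the pre-thickness term is therefore the constant $(1 - \bar{\beta})$, independent of the consumable age, whereas with Eq. (2) it is $(1 - \beta_n)$.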

A simple way to implement the design enhancement in production is to estimate the curve $\gamma(n)$ as a function of the consumable lifetime *n*. The estimation can be done off-line and stored in the computer; the starting point of the estimated function is reset when the consumable set is changed at each preventive maintenance (PM)<sup>1</sup>.
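One possible way to store and reset such a curve is sketched below. This is a minimal illustration, not the production code: the class `GammaCurve`, the linear interpolation between off-line points, and the sample removal rates are all assumptions made for this example. The first stored point is assumed to correspond to a new consumable set, so that it plays the role of $RR_{ox0}$.

```python
import numpy as np


class GammaCurve:
    """Off-line estimate of gamma(n) = RR_ox0 / RR_ox(n) versus consumable life."""

    def __init__(self, life_points, rr_ox_points):
        # life_points: increasing run counts since PM, starting at 0
        self.life = np.asarray(life_points, dtype=float)
        rr = np.asarray(rr_ox_points, dtype=float)
        self.gamma = rr[0] / rr             # rr[0] plays the role of RR_ox0
        self.n = 0                          # runs since the last PM

    def reset(self):
        """Restart the curve after a consumable change (PM)."""
        self.n = 0

    def next_gamma(self):
        """Interpolated gamma for the current run, then advance the run counter."""
        g = float(np.interp(self.n, self.life, self.gamma))
        self.n += 1
        return g


def polish_time(t0, l_pre, l0, rr_bar, gamma):
    """Eq. (14): polishing time with the compensation factor gamma(n)."""
    return t0 + gamma * (l_pre - l0) / rr_bar


# Illustrative numbers only (not production data).
curve = GammaCurve(life_points=[0, 100, 200],
                   rr_ox_points=[3400.0, 3230.0, 3060.0])
gamma_now = curve.next_gamma()              # first run after a PM
tp = polish_time(t0=94.0, l_pre=15700.0, l0=15500.0,
                 rr_bar=6400.0 / 60.0, gamma=gamma_now)
```

Calling `reset()` after each PM reuses the same stored shape, which matches the assumption stated in the footnote.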

## IV. IMPLEMENTATION AND RESULTS

### A. Head-to-head (H2H) variations

For the integrated CMP polisher shown in Fig. 2, wafers flow into each polishing module on four different heads (wafer carriers). Ideally, process performance on each head should be identical. However, due to head manufacturing variations, each head sometimes shows a particular RR characteristic. In production it is not practical to implement the proposed run-to-run thickness control scheme for each individual head. Instead, we propose to track the blanket oxide monitor wafer removal rates on each head and use this information to compensate for head-to-head variations.

<sup>1</sup>We assume that the shape of $\gamma(n)$ does not change after each PM.

A compensation ratio  $(r_H)$  is calculated based on the daily blanket monitor wafers. Denote the *n*th monitor wafer RR for Head #*i* as  $RR_{ox_i}(n)$ , i = 1, 2, 3, 4. We can calculate the offset for Head #*i* at *n*th run as

$$\delta RR_{ox_i}(n) = RR_{ox_i}(n) - \overline{RR}_{ox}(n), \qquad (15)$$

where  $\overline{RR}_{ox}(n) = \frac{\sum_{i=1}^{4} RR_{ox_i}(n)}{4}$ . The compensation ratio  $r_H$  is defined as the fraction of offset  $\delta RR_{ox_i}(n)$  at *n*th run that is used to adjust for (n+1)th run. Denote the compensated monitor wafer RR as  $\widehat{RR}_{ox_i}(n+1)$  for Head #*i* at (n+1)th run. Then we have

$$\widehat{RR}_{ox_i}(n+1) = RR_{ox_i}(n+1) + r_H \times \delta RR_{ox_i}(n) . \tag{16}$$

In order to obtain an optimal compensation ratio  $r_H^*$ , we minimize the average head-to-head RR range of the compensated monitor wafer removal rate,  $\widehat{RR}_{ox_i}(n)$ , over a set of m monitor wafer runs.

$$r_H^* = \arg\min\left(\frac{\sum_{n=1}^m \Delta \widehat{RR}_{H2H}(n)}{m}\right), \quad (17)$$

where

$$\Delta \widehat{RR}_{H2H}(n) = \max_{1 \le i \le 4} \widehat{RR}_{ox_i}(n) - \min_{1 \le i \le 4} \widehat{RR}_{ox_i}(n) \,.$$

Fig. 5 shows an example of the estimated average head-to-head RR range as a function of the compensation ratio $r_H$ from a production fab over a three-month period. An optimal value $r_H^* = 0.66$ is found to minimize the average RR range.
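The search in Eq. (17) can be carried out with a simple grid search over candidate ratios. The sketch below applies Eqs. (15) and (16) exactly as printed, including the sign of the offset term. The function names are hypothetical, and the monitor data are synthetic random numbers that only exercise the code; they do not reproduce the fab data or the value $r_H^* = 0.66$.

```python
import numpy as np


def mean_h2h_range(rr_ox, r_h):
    """Average compensated head-to-head range, Eqs. (15)-(17).

    rr_ox has shape (m + 1, 4): one row per monitor run, one column per head.
    """
    rr_ox = np.asarray(rr_ox, dtype=float)
    delta = rr_ox - rr_ox.mean(axis=1, keepdims=True)   # Eq. (15)
    rr_hat = rr_ox[1:] + r_h * delta[:-1]               # Eq. (16)
    ranges = rr_hat.max(axis=1) - rr_hat.min(axis=1)    # range over the four heads
    return float(ranges.mean())


def best_ratio(rr_ox, grid):
    """Grid value that minimizes the average range, Eq. (17)."""
    costs = [mean_h2h_range(rr_ox, r) for r in grid]
    return float(grid[int(np.argmin(costs))])


# Synthetic monitor data for illustration only.
rng = np.random.default_rng(0)
rr = rng.normal(3400.0, 50.0, size=(30, 4))
grid = np.linspace(0.0, 1.0, 11)
r_best = best_ratio(rr, grid)
```

With real monitor data, plotting `mean_h2h_range` over the grid would give a curve of the kind shown in Fig. 5.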



Fig. 5. Average estimate range of  $\Delta RR_{ox}$  vs. compensation ratio.

### B. Algorithms

We implement the run-to-run thickness control scheme at a production fab for IMD CMP processes. The algorithms are executed on the fab central computer systems. Based on measurements and process information fed back from the metrology and the CMP polisher at the current run, the polishing time for the next run is computed and then fed back to the polisher.

Fig. 6 illustrates the flow chart of the run-to-run advanced process control (APC) algorithm. In production runs, we set a threshold $T_{\rm th}$ for the polishing time difference between two consecutive runs. If the calculated polishing time $T_p$ differs from that of the previous run by more than the threshold, we consider the calculation invalid and use the previous run's polishing time. We also discard this set of data in future calculations. We update the polishing time calculation once the metrology measurement is available. The update of the head-to-head RR compensation depends on the availability of the blanket oxide monitor wafer RR information.
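The threshold check described above can be written as a small guard function. This is a minimal sketch; the function name and the returned flag are hypothetical, and the numbers in the usage lines are illustrative only.

```python
def next_polish_time(tp_calc, tp_prev, t_th):
    """Return (time_to_use, accepted) for the coming run.

    accepted is False when the change from the previous run exceeds the
    threshold; the caller should then also exclude this run's data from
    later moving-average calculations.
    """
    if abs(tp_calc - tp_prev) > t_th:
        return tp_prev, False
    return tp_calc, True


# Illustrative values in seconds.
ok_case = next_polish_time(tp_calc=95.0, tp_prev=94.0, t_th=5.0)
bad_case = next_polish_time(tp_calc=120.0, tp_prev=94.0, t_th=5.0)
```

Returning the flag alongside the time keeps the decision to discard the data set explicit at the call site.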



Fig. 6. A flow chart of the advanced process control (APC) algorithm.

### C. Controller parameters

In control algorithms (1)-(3), we need to choose the parameters such as moving average coefficients  $\alpha_i$  and horizon length N, etc.

In production, we choose the coefficients  $\alpha_i$  as

$$\alpha_i = \lambda (1 - \lambda)^{i-1}, \quad 0 \le \lambda \le 1, \quad i = 1, 2, \cdots, N. \tag{18}$$

It can be shown that by choosing $\alpha_i$ as in Eq. (18) the control scheme (1)-(3) is equivalent to the double-EWMA controller [11]. If we choose $\alpha_i$ by Eq. (18), the weights sum to $\sum_{i=1}^{N} \alpha_i = 1 - (1-\lambda)^N$, so the normalization condition in Eq. (4) is satisfied, to a good approximation, only when the moving average horizon N is large. Therefore, we use N = 20 in the implementation.
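The effect of the horizon length on the weight sum is easy to check numerically. The short sketch below (the helper name is hypothetical) evaluates Eq. (18) for $\lambda = 0.33$ and compares the sum of the weights with the closed form $1 - (1-\lambda)^N$.

```python
def ewma_weights(lam, n):
    """alpha_i = lam * (1 - lam)**(i - 1) for i = 1..n, Eq. (18)."""
    return [lam * (1.0 - lam) ** (i - 1) for i in range(1, n + 1)]


for n in (5, 10, 20):
    total = sum(ewma_weights(0.33, n))
    closed_form = 1.0 - (1.0 - 0.33) ** n
    print(n, total, closed_form)
```

For N = 20 the sum is roughly 0.9997, whereas a short horizon such as N = 5 leaves more than a tenth of the total weight unassigned.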

To determine how to choose parameter  $\lambda$  in Eq. (18), we use simulation to find the optimal value  $\lambda^*$ . We use the  $C_{\rm pk}$  value to quantitatively define the CMP production performance. Denote the CMP target film thickness as  $L_1$  and an associated thickness tolerance as  $S_t$ . For a set of K wafers with post-CMP thicknesses  $L_{\rm pst}(i)$ , i = $1, 2, \dots, K$ , the  $C_{\rm pk}$  value for such a CMP process is defined as

$$C_{\rm pk} = \frac{\min\{(L_1 + S_t) - \bar{L}_{\rm pst}, \bar{L}_{\rm pst} - (L_1 - S_t)\}}{3\sigma_{\rm pst}}, \quad (19)$$

where  $\bar{L}_{pst} = \frac{\sum_{i=1}^{K} L_{pst}(i)}{K}$  and  $\sigma_{pst}^2 = Var(L_{pst}(i))$ . It is easy to observe that a higher  $C_{pk}$  value implies a better film thickness control. The optimization target is to achieve a maximum  $C_{pk}$  value for the production wafers.

$$\lambda^* = \arg\max C_{\rm pk}(\sigma_{RR}). \tag{20}$$
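For reference, the index in Eq. (19) can be computed from a set of post-CMP thickness measurements as sketched below. The function name is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which variance estimator is used.

```python
import numpy as np


def cpk(l_pst, l1, s_t):
    """Process capability index of Eq. (19)."""
    l_pst = np.asarray(l_pst, dtype=float)
    mean = l_pst.mean()
    sigma = l_pst.std(ddof=1)     # assumption: sample standard deviation
    return min((l1 + s_t) - mean, mean - (l1 - s_t)) / (3.0 * sigma)


# Illustrative thicknesses in angstrom (not production data).
centered = cpk([5400.0, 5600.0], l1=5500.0, s_t=1000.0)
shifted = cpk([5500.0, 5700.0], l1=5500.0, s_t=1000.0)
```

The two calls use samples with the same spread; the second sample is off target by 100 Å, so its index is lower.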

Fig. 7 shows the $C_{\rm pk}$ values as a function of both the parameter $\lambda$ and the RR standard deviation $\sigma_{RR}$. The simulation study is based on production data. From Fig. 7 we can see that, at a given RR standard deviation $\sigma_{RR}$, the $C_{\rm pk}$ value first increases quickly from $\lambda = 0$ to 0.2 and then decreases slowly until $\lambda = 1$. We can then calculate the $\lambda$ values that give the maximum $C_{\rm pk}$ at each level of $\sigma_{RR}$. In this study, $\lambda^* = 0.33$ is found to be the average value of $\lambda$ that achieves the maximum $C_{\rm pk}$.



Fig. 7. Simulated $C_{\rm pk}$ value as a function of $\lambda$ and the RR standard deviation (std).

### D. Production results

Fig. 8 shows the post-CMP thickness for one type of IMD device on one polisher. The average incoming oxide film thickness is $L_0 = 15500$ Å, the post-CMP target film thickness is $L_1 = 5500$ Å, and the process thickness tolerance is $S_t = 1000$ Å. We use $\overline{RR}_p = 6400$ Å/min, $\overline{RR}_{ox} = 3400$ Å/min, and N = 20. We obtain $\lambda = 0.33$ off-line from the blanket oxide monitor wafer data. We can see that the run-to-run post-CMP thickness control (Fig. 8(b)) maintains a tighter thickness distribution with smaller variations than the process without control. Calculations of $C_{\rm pk}$ values show a significant improvement from 1.22 (without the control) to 1.58 (with the control).

Table I shows the $C_{pk}$ value improvements for another type of IMD device on a different polisher under the thickness control scheme with and without head-to-head compensation. The $C_{pk}$ values show an improvement from 1.32 (without the compensation) to 2.11 (with the compensation). The average incoming oxide film thickness is $L_0 = 16500$ Å, the post-CMP target film thickness is $L_1 = 10500$ Å, and the process thickness tolerance is $S_t = 900$ Å. We use the same type of post-CMP thickness control scheme with head-to-head compensation. The compensation ratio $r_H^* = 0.66$ is chosen from the blanket oxide monitor wafer data. The run-to-run post-CMP thickness control with head-to-head compensation maintains a tighter thickness distribution with smaller variations than the control without compensation. The significant improvement of $C_{pk}$ is due to the fact that the four wafer carriers on this polisher show a large variance compared with the run-to-run variance.

The improvements of production $C_{pk}$ values clearly demonstrate that a properly designed run-to-run control scheme can improve the CMP process performance significantly.

TABLE I. $C_{\rm pk}$ value improvement

|              | Without H2H compen. | With H2H compen. |
|--------------|---------------------|------------------|
| $C_{\rm pk}$ | 1.32                | 2.11             |

## V. CONCLUSION

In this paper, we discussed the design and implementation of a run-to-run film thickness control system for chemical-mechanical planarization (CMP) processes. A run-to-run predictor-corrector type of controller was used to regulate the processing time based on the metrology measurement of each lot before and after CMP processes. Information on the monitor wafer removal rate and the consumable lifetime was used to compensate for the process drifts and shifts. We also considered head-to-head compensation in the proposed run-to-run controller. The proposed control algorithms have been used in a production fab and the results showed a significant improvement of CMP performance.

## REFERENCES

- [1] N. Patel, G. Miller, and S. Jenkins, "In-Situ Estimation of Blanket Polish Rates and Wafer-to-Wafer Variations," *IEEE Trans. Semiconduct. Manufact.*, vol. 15, no. 4, pp. 513–522, 2002.
- [2] D. Boning, W. Moyne, T. Smith, J. Moyne, R. Telfeyan, A. Hurwitz, S. Shellman, and J. Taylor, "Run by Run Control of Chemical-Mechanical Polishing," *IEEE Trans. Comp., Packag., Manufact. Technol. C*, vol. 19, no. 4, pp. 307–314, 1996.
- [3] R. Telfeyan, J. Moyne, N. Chaudhry, J. Pugmire, S. Shellman, D. Boning, W. Moyne, A. Hurwitz, and J. Taylor, "A Multilevel Approach to the Control of a Chemical-Mechanical Planarization Process," *Journal of Vacuum Science and Technology*, A, vol. 14, no. 3, pp. 1907–1913, 1996.
- [4] A. Chen and R. Guo, "Age-Based Double EWMA Controller and Its Application to CMP Processes," *IEEE Trans. Semiconduct. Manufact.*, vol. 14, no. 1, pp. 11–19, 2001.




Fig. 8. Post-CMP film thickness for one type of IMD device: (a) without APC ( $C_{\rm pk} = 1.22$ ), (b) with APC and no head-to-head compensation ( $C_{\rm pk} = 1.58$ ).

- [5] N. Patel, G. Miller, C. Guinn, A. Sanchez, and S. Jenkins, "Device Dependent Control of Chemical-Mechanical Polishing of Dielectric Films," *IEEE Trans. Semiconduct. Manufact.*, vol. 13, no. 3, pp. 331–343, 2000.
- [6] J. Yi, Y. Sheng, and C. Xu, "Neural Network Based Uniformity Profile Control of Linear Chemical-Mechanical Planarization," *IEEE Trans. Semiconduct. Manufact.*, vol. 16, no. 4, pp. 609–620, 2003.
- [7] T. Smith and D. Boning, "A Self-Tuning EWMA Controller Utilizing Artificial Neural Network Function Approximation Techniques," *IEEE Trans. Comp., Packag., Manufact. Technol. C*, vol. 20, no. 2, pp. 121–132, 1997.
- [8] N. Patel and S. Jenkins, "Adaptive Optimization of Run-to-Run Controllers: The EWMA Example," *IEEE Trans. Semiconduct. Manufact.*, vol. 13, no. 1, pp. 97–107, 2000.
- [9] S. Butler and J. Stefani, "Supervisory Run-to-Run Control of Polysilicon Gate Etch Using *In Situ* Ellipsometry," *IEEE Trans. Semiconduct. Manufact.*, vol. 7, no. 2, pp. 193–201, 1994.
- [10] J. Wang, S. Qin, C. Bode, and M. Purdy, "Recursive Least Squares Estimation and Its Application to Shallow Trench Isolation," in *Proceedings of SPIE: Advanced Process Control and Automation*, vol. 5044, Santa Clara, CA, 2003, pp. 109–120.
- [11] J. Mullins and J. Zou, "Frequency Domain Stability and Performance Analysis of Moving Average Run-to-Run Controllers," in *Proceedings of the 2003 AEC/APC Symposium*, Denver, CO, 2003.