Robust Power System State Estimation Using t-Distribution Noise Model

In this paper, we propose an optimal robust state estimator using maximum likelihood optimization with the <inline-formula><tex-math notation="LaTeX">$t$</tex-math></inline-formula>-distribution noise model. In robust statistics literature, the <inline-formula><tex-math notation="LaTeX">$t$</tex-math></inline-formula>-distribution is used to model Gaussian and non-Gaussian statistics. The influence function, an analytical tool in robust statistics, is employed to obtain the solution to the resulting maximum likelihood estimation optimization problem, so that the proposed estimator can be implemented within the framework of traditional robust estimators. Numerical results obtained from simulations of the IEEE 14-bus system, IEEE 118-bus system, and experiment on a microgrid demonstrated the effectiveness and robustness of the proposed estimator. The proposed estimator could suppress the influence of outliers with smaller average mean-squared errors (AMSE) than the traditional robust estimators, such as quadratic–linear, square-root, Schweppe–Huber generalized-M, multiple-segment, and least absolute value estimators. A new approximate AMSE formula is also derived for the proposed estimator to predict and evaluate its precision.

Estimator third threshold parameter for e i . Ω, Λ Diagonal matrix. v i , ξ i t-distribution parameters.

I. INTRODUCTION
P OWER system state estimation (PSSE) is one of the key functions in an energy management system. The Gaussian noise assumption is commonly made in PSSE but this assumption is only an approximation to reality [1]. For example, there are studies in power systems where the Gaussian noises are corrupted by outliers with uniform [2] or Gaussian distribution [3]. The recent introduction of phasor measurement unit (PMU) [4]- [6] makes possible the measurements of voltages and currents in a synchronized manner with respect to the global positioning system and the measurement model becomes linear. However, the PMU measurements are known to experience sudden changes or may become unavailable due to the communication system malfunction or noise [7]. This would lead to occurrence of outliers and then generate non-Gaussian measurements errors [8]. Outliers that are far away from the expected Gaussian distribution function can give rise to misleading estimation results [9].
The weighted least squares (WLS) estimator is widely used but it is not robust. Therefore, the largest normalized residuals (LNR) test-based method [10] is usually used in conjunction with WLS. The negative impact of the outliers on the state estimates and several ways to reduce them have been discussed in [11], where Merrill and Schweppe propose a new approach to screen and suppress the gross error in measurements by iterations. Following this approach, many traditional M-estimators with nonquadratic cost functions such as quadratic-linear (QL), square-root (SR), Schweppe-Huber generalized-M (SHGM), multiple-segment (MS), and least absolute value (LAV) have been discussed in standard textbooks [12], [13] and papers [5], [14]- [16] where Gaussian noise assumption is commonly made.
In this paper, we propose an optimal robust state estimator using maximum likelihood optimization with the t-distribution noise model. It is well known that the t-distribution reduces to the Gaussian distribution when the shape parameter tends to infinity [8]. Therefore, the t-distribution can be used to model both the Gaussian and non-Gaussian noise distributions. Moreover, the t-distribution is commonly used in robust state estimation literature [17]- [19] to model noise with outliers. In Example 1, we use the t-distribution to model the Gaussian noise corrupted by outliers with the uniform [2] or Gaussian distribution [3].
The main contributions of this paper are as follows. 1) Instead of the usual Gaussian noise distribution, the use of t-distribution in the proposed estimator could suppress the influence of outliers and provides smaller average meansquared errors (AMSE) than the traditional robust estimators mentioned earlier.
2) The influence function (IF) is employed to give the solution to the optimization problem arising from maximum likelihood estimation (MLE) so that the proposed estimator with t-distribution noise can be efficiently implemented within the framework of traditional robust estimators. 3) A new approximate AMSE formula is also derived for the proposed estimator. Recently, many new robust estimators are proposed [17]- [23]. However, to obtain the statistics such as the AMSE or variances of their state estimates, hundreds of simulations have to be run [24]. In contrast, for the proposed estimator, because of the IF approximation, AMSE or variances can be obtained from an analytical formula. The formula as a mathematical function is more insightful than just a numerical answer from simulation. For example, in estimator design, instead of trial and error through simulations, the formula allows us to predict and evaluate its precision in terms of AMSE or variances for different choices of estimator parameter values leading to a more efficient and effective estimator design process.
The rest of the paper is organized as follows. The robust state estimation problem is formulated in Section II and the proposed estimator using IF approximation is discussed in Section III. Numerical results obtained from the simulations of the IEEE 14-bus system, IEEE 118-bus system, and experiment on a microgrid are presented in Section IV to verify the effectiveness of the proposed estimator in terms of AMSE and computational time. Conclusions are given in Section V.

II. ROBUST STATE ESTIMATION
Robust state estimation has been discussed elsewhere in [2], [5], [12], and [16]. This section focuses on the equations necessary for the derivation of the results in this paper.

A. Measurement Model
Consider the following measurement model in the power system: where The nonlinear case is applicable when the measurements are collected from Supervisory Control and Data Acquisition (SCADA). Measurement z i (k), i = 1, . . . , m, is taken at each time instance k = 1, . . . , N. i (k) is the measurement noise and the rectangular coordinates of the bus voltage phasor x(k) are the states of the system. A traditional power system may be considered as a quasi-static system [25] because load demands change slowly and, hence, the state changes slowly. The typical measurement sampling interval is 2-4 s while the estimates are usually updated only once every few minutes [26]. Hence, the estimated states during the time instants k = 1, . . . , N can be taken asx. The first-order Taylor series expansion about the operating pointx is used to approximate the nonlinear function h(x(k)) in (1) to give where the Jacobian matrix The ith measurement residual is defined as where When the entire measurements are collected from enough PMUs and we assume that the system is observable, model (1) reduces to [5] where H is the measurement matrix. In this case, the ith measurement residual is

B. Traditional Robust State Estimation
Given N sets of m measurements, the state estimatex is obtained by minimizing the following cost function: where ρ-function is a nonlinear chosen function of the residual e i (k). Differentiating (7) wrtx gives where (·) T represents the transpose operation and Using (6), (8) can be written as wherē When the measurement model is nonlinear, linearization is proceeded using (3) and (4). Moreover, z(k) in (11) is replaced bỹ To minimize the cost function in (7), set ∂J ∂x = 0 in (10) and the solution forx is given aŝ Notice that the matrixH T WH is an invertible matrix when the system is assumed to be observable. For the WLS estimator, the element of diagonal matrix W is constant, i.e., the reciprocal of the error variance for measurements 1/σ 2 i . Note that σ i is the standard deviation of noise i . The LNR method [10] is usually used in WLS to deal with bad data. The normalized residuals are calculated as The normalized residuals e norm i (k) are calculated according to the residual covariance matrixΩ and measurement residual e i (k). If the normalized residuals e norm i (k) are larger than a pre-determined threshold, the largest one will correspond to the bad measurement Z bad i . Once the LNR is found, corresponding measurement is updated The states will then be re-calculated based on the updated measurements Z new i . Several iterations may be needed in order to make sure that all normalized residuals are less than predetermined threshold, for example, 3.0 [5].
The matrix W in (13) for the MS, QL, SR, and SHGM estimators is a re-weighted matrix and is updated according to Table I [ 12], [24].
The parameters a i , b i , and r i are thresholds, and κ i is the penalty factor chosen specifically to cancel the effect of any existing leverage points in the measurement set. Note that a i < b i < r i . The QL, WLS, and LAV estimators are special cases of an MS estimator. The MS estimator reduces to the QL estimator when b i → ∞ and the WLS estimator when a i → ∞. The QL estimator reduces to the LAV estimator when the threshold a i is small [12].

III. PROPOSED ESTIMATOR
In this section, a new robust estimator based on MLE is proposed where the t-distribution is used to model noise with outliers. Moreover, the IF is used to solve the optimization problem arising from MLE so that the proposed estimator can be easily implemented within the framework of the traditional robust estimators.

A. t-Distribution
In the proposed state estimation algorithm, the following t-distribution probability density function (pdf) [17] is used to where Γ(·) is the gamma function, ξ i is the scale parameter, and v i is the shape parameter. It is well known that the t-distribution reduces to the Gaussian distribution when the shape parameter tends to infinity [8], [17]. Therefore, the t-distribution can be used to model both the Gaussian and non-Gaussian noise distribution. Moreover, the t-distribution is commonly used in robust state estimation literature [17]- [19] to model and study outliers. The t-distribution with ξ i = 0.01 and different v i are shown in Fig. 1. When the shape parameter v i tends to ∞, the t-distribution reduce to a Gaussian distribution (36) with σ i = 0.01.

B. Maximum Likelihood Estimation
In MLE, we minimize the following performance index: To minimize J o , take the derivative of the log likelihood J o and

C. IF Illustration
A textbook example [12] is used to illustrate the IF. The measurement model is given as where the noise k follows the t-distribution with v i = 3, ξ i = 2, and i = 1, . . . , 4. The measurements are z = −3.01 3.52 −5.49 4.03 5.01 T .
According to (18) Letting Ψ = 0 gives the MLE solutionx MLE = [3.005, −4.010] T . The red surfaces at the top of Fig. 2 shows the plots of Ψ 1 and Ψ 2 as functions ofx 1 andx 2 , whereas the black surfaces represent Ψ 1 = 0 and Ψ 2 = 0. The two curves from the intersections of the red surfaces and the black surfaces are obtained by setting (20) and (21) equals zero. The intersection between the two curves, as shown at the bottom of Fig. 2, is the solution to the MLE.
An alternative approach to solve Ψ = 0 is given as follows. Take the first-order Taylor series approximation of Ψ about the  operating pointx giving Setting Ψ = 0 in (22) giveŝ   The term ∂ Ψ ∂x x=x in (22) is sensitive to the change from 5.01 to 15.01. Therefore, replacing the term with its expectation, i.e., Setting (24) equals zero giveŝ where the second term on the right-hand side is the IF [28]. If the value of the fifth measurement is still 5.01 and the other measurements are unchanged, an approximation ofx MLE using (25)

D. IF Approximation
The definition of the IF is given in [1], [28], and [29] as where since the measurement noise i is assumed to be independent and identically distributed. Using (26),x is given aŝ In practice, since x is unknown, the noise vector ε = Z −Hx is replaced by E = Z −Hx and (28) is used to findx iteratively, given an initial guessx aŝ where IF(E) is given by The derivation of (29) and the definition of Ω and Λ are given in Appendix.
The noise pdf can be obtained by analyzing the historical data. The robust estimators, such as MS, QL, SR, and SHGM estimators, need to calculate the inverse of matrix (H T WH) as given in (15) because the matrix W is updated during the iteration process. When the size of power system becomes large, the inverse computation will take a long time. However, the matrix (H T ΩH) −1 of the IF in (29) can be pre-computed because Ω remains unchanged, thus saving computational load in real-time applications.
Unlike [17]- [19], the contribution of this paper is to employ the IF approximation so that the proposed estimator with tdistribution noise can be efficiently implemented within the framework (see Fig. 6) of traditional robust estimators. This flowchart shows that the only difference between the algorithms of the traditional robust estimators and the proposed estimator is in the calculation of Δ. For the traditional robust estimator, Δ = (H T WH) −1H T W E according to (15). For the proposed estimator, Δ = (H T ΩH) −1H T ΛE according to (29).

E. Estimation Precision
Precision of estimation is traditionally given by its variances. As given in [28], the variance of the state estimates can be approximated as Let the residue E be given by the noise ε in (10), (30) gives where Ω =   Substituting (19) into (32) gives

The sum of the variances (SV) is the trace of the matrix
and taking the average gives which can be approximated from the simulation as the AMSE where N r is the number of simulation runs.

F. WLS Connection
The proposed estimator will reduce to WLS when the noise i is Gaussian It is well known that (16) reduces to (36) when v i → ∞ and ξ i = σ i [17]. According to (39) and (40), we would get Then, the estimated resultsx is given aŝ which is the same as (13).

IV. SIMULATION AND EXPERIMENT RESULTS
In this section, simulation results obtained from the IEEE 14-bus and 118-bus systems are given in Example 1 and 2, respectively. The measurement noises are generated according to [2] and [3]. Experimental results obtained from the 13-bus microgrid at the Nanyang Technological University (NTU) is given in Example 3 where the measurements are collected from the real system and the noise distributions are unknown. For easy reference, the pdfs of the Gaussian distribution and outlier, the t-distribution (to model noise with outliers), the Gaussian distribution (to model noise with outliers), and the parameters of the robust state estimators are summarized in Table II.

A. Example 1: IEEE 14-Bus System With PMU Measurements
The IEEE 14-bus system is modeled in RT-LAB, a real-time simulation platform for power systems [30], [31]. Meanwhile, the proposed estimator is programmed on a remote computer simulating the control center. The computer is connected to RT-LAB via cable.
The PMUs are placed according to [6]. Fifty-eight measurements z i , i = 1, . . . , 58, comprising of 12 voltages (i = 1, . . . , 12) and 46 currents (i = 13, . . . , 58) are taken at each time instance and N = 3 sets of measurements are used to give one set of estimates. The measurement matrix H is calculated according to the parameters in [31]. There are n = 28 states in the vector The real and imaginary part of the voltage phasor are defined as V r i and V im i , respectively. The parameters for the QL and SR estimators are chosen as a i = 3, whereas the parameters for the MS estimator are a i = 3, b i = 4, and r i = 5 according to [5]. The penalty κ i , i = 1, . . . , 58, of the SHGM estimator is calculated according to [12] and [15]. Three different types of measurement noise pdfs are considered in the simulations. Note that the actual noises need not to be given by the t-distribution. There are studies in power systems where the Gaussian noises are corrupted by outliers with the uniform or Gaussian distribution [2], [3]. In Example 1, we model the Gaussian noise corrupted by outliers with the uniform or Gaussian distribution by the t-distribution.

1) 1 Gaussian + 1 Uniform (1G+1U):
First, the noise is associated with the pdf where σ i = 0.005. The pdf in the form of a mixture distribution in (37) is also used in [2]. The 5% of the uniform distribution 0.05 4σ i in (37) is useful for modeling initial conditions, disturbances, and measurement errors that are equally likely to occur anywhere within a given interval [33]. Fig. 7 shows how the Gaussian (σ i = 0.007) and t-distribution (v i = 4 and ξ i = 0.0046) can be fitted to the histogram using the maximum likelihood criterion according to [34]. The histogram represents the measurement noise generated by (37). In practice, the fitting process can be done offline using historical measurements.
The threshold parameters δ and q max are chosen as 10 −8 and 200, respectively. The AMSE results obtained by N r = 10000 simulations run using (35) and by the formula in (34) are given in Table III. The AMSEs of the proposed estimator obtained from simulation are verified by the formula in (34) as given in the last two columns. In addition, it is clear that the AMSE of the proposed estimator is the smallest. Hence, we have demonstrated that the t-distribution can be used as the noise model in the proposed estimator. Compared to the WLS estimator, the proposed estimator improves on the AMSE by 2.95−2.08 2.95 × 100% = 29.5%. The improvement of the proposed estimator over other robust estimators can be calculated similarly and the results are given in Table III.
2) 1 Gaussian + 1 Gaussian (1G+1G): In this section, the noise i is associated with the pdf [3], [28] f i ( i ) = 0.95 where σ i = 0.005. The first term in the above mentioned pdf represents the 95% of the Gaussian noise, whereas the second term represents outliers by the 5% Gaussian distribution with standard deviation 10σ i . Following the above-mentioned section, the t-distribution with v i = 2 and ξ i = 0.0042 and the Gaussian distribution with σ i = 0.012 are used to model the noise generated by (38). The AMSE results are given in Table III. It is clear that the AMSE = 2.20 × 10 −6 of the proposed estimator is the smallest. More estimated results such as the mean-squared error (MSE k = 1 n x k − x k 2 ),V r 1 , andV im 14 of the MS estimator and the proposed estimator are shown in Figs. 8 and 9(a) and (b), respectively. The ones obtained from the proposed estimator are closer to the true states.
3) t 3 Noise: In this section, let the noise i be associated with the following t-distribution, which is commonly used to model noise with outliers [28]. The parameters in (16) are chosen as v i = 3 and ξ i = 0.005 for measurement noises and Table II  TABLE IV  AMSES AND AVERAGE COMPUTATIONAL TIME FOR ONE SIMULATION RUN OF THE ESTIMATORS FOR THE IEEE 118-BUS SYSTEM AMSE unit: ×10 −6 .
Proposed estimator's improvement over itself is 0%. shows the Gaussian parameters obtained by the maximum likelihood criterion. The AMSE results of the different estimators are given in Table III. It is clear that the AMSE = 2.55 × 10 −6 of the proposed estimator is the smallest. Table III also shows the improvements of the proposed estimator over other robust estimators. Finally, it is clear from Table III that the AMSEs of the proposed estimator from simulation using (35) are verified by the formula in (34) as shown in the last two columns and they are smaller than the AMSEs of all other estimators studied in this paper.

B. Example 2: IEEE 118-Bus System With PMU Measurements
To further verify the performance of the proposed estimator, the IEEE 118-bus system is used in the simulation. The PMUs are placed according to [6] where a total number of 108 voltage measurements and 366 current measurements are considered. The pdf of measurement noise is given in (16) and the t-distribution parameters are v i = 3 and ξ i = 0.005. The AMSE results are given in Table IV and it is clear that the AMSE = 0.58 × 10 −6 of the proposed estimator is the smallest.
The computational efficiency of the proposed estimator is analyzed and compared to that of other robust estimators. The simulation runs using MATLAB version R2012b on a Windows 10 computer configured with Intel Core, CPU i7-4500U, 1.80 GHz, and 8 GB RAM, while the LAV optimization problem is expressed as an equivalent linear programming problem and is solved using GUROBI [5]. The computational time of the different estimators in the IEEE 118-bus system is given in Table IV. Except for WLS, WLS with LNR, and LAV estimators, the proposed estimator is faster. This is because the matrix (H T ΩH) −1 in the proposed estimator can be computed offline.

C. Example 3: Experiment in Microgrid
Experiments on a lab microgrid at NTU have been performed for isolated operations of the microgrid. The lab microgrid has  a total of 13 buses. There are several generation sources and loads currently installed for the microgrid. For simplicity, we use the programmable source and one load in the microgrid, see Fig. 10. The programmable source and the load are connected to Bus 7 and Bus 4, respectively. Measurement data were sampled at intervals of 5 s with very low noise levels. We fit the measurement noises with the t-distribution and Gaussian distribution using the maximum likelihood criterion. Fig. 11 shows the raw measurement (the real power flow from Bus 3 to Bus 2). The proposed estimator estimates the bus voltage phasor every 30 s. The system provides a total of 31 measurements consisting of 1 voltage magnitude measurement, 2 pairs of power injection, and 13 pairs of power flow.
According to [12]: "Treatment of bad data can be viewed as a robustness issue for an estimator. If the estimated state remains insensitive to major deviations in a limited number of measurements, then the corresponding estimator is considered statistically robust." In Fig. 12, a measurement (the real power flow from Bus 3 to Bus 2) outlier occurs at k = 300. Compared to the WLS and MS estimators, the proposed estimator is more robust since smaller deviations of the estimation results are obtained. Note that the QL, WLS, and LAV estimators are special cases of the MS estimator.

V. CONCLUSION
In this paper, a robust state estimator using the t-distribution as the noise model is proposed. The IF is employed to obtain the solution to the resulting MLE optimization problem so that the proposed estimator can be implemented within the framework of traditional robust estimators. Numerical results obtained from simulations of the IEEE 14-bus system, IEEE 118-bus system, and experiment on the microgrid demonstrated the effectiveness and robustness of the proposed estimator. The proposed estimator could suppress the influence of outliers with smaller AMSE than the traditional robust estimators. APPENDIX DERIVATION OF (29) According to (10), (12), (18), and (26) Λ i (k) = w i (k) = v i + 1 (ξ i ) 2 v i + (e i (k)) 2 . (40)