Training algorithm design and weight convergence analysis for discrete-time recurrent neural networks
Date of Issue: 2013
School of Electrical and Electronic Engineering
Recurrent neural networks (RNNs) have become an important subject of study in the field of neural networks owing to remarkable developments in both theoretical research and practical applications. RNNs contain feedback loops in their structures, which makes them far more powerful for the dynamical modeling of complex systems than other neural network architectures. This thesis focuses on the design of robust training algorithms for RNNs based on the popular real-time recurrent learning (RTRL) concept. As a starting point, an efficient robust gradient-descent training algorithm for multi-input multi-output (MIMO) discrete-time RNNs is proposed, which provides an optimal or suboptimal tradeoff between RNN training accuracy and weight convergence speed. We design a multivariate robust adaptive gradient-descent (MRAGD) training algorithm for MIMO RNNs, and prove the weight convergence of MRAGD during training in the sense of a Lyapunov function. To test the efficiency of the proposed algorithm, RNN-based system identifications are developed for both open-loop and closed-loop conditions; the RNNs are trained by the MRAGD method and the weight convergence conditions are proven. Secondly, we propose a robust recurrent simultaneous perturbation stochastic approximation (RRSPSA) algorithm, formulated in a deterministic-system framework with guaranteed weight convergence. RRSPSA is inspired by the excellent properties of the simultaneous perturbation stochastic approximation (SPSA) algorithm, a well-known recursive procedure for finding roots of equations in the presence of noisy measurements. SPSA can be significantly more computationally efficient than the usual algorithms of Kiefer-Wolfowitz/Blum type, which rely on standard finite-difference gradient approximations. We show that RRSPSA retains the form of SPSA: only two objective function measurements are used at each iteration, which preserves the efficiency of SPSA.
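The two-measurement property of standard SPSA can be illustrated with a minimal sketch (not the thesis's RRSPSA algorithm itself): all components of the parameter vector are perturbed simultaneously by a random sign vector, so a full gradient estimate costs only two loss evaluations regardless of dimension. The gain names `a` and `c` follow the usual SPSA notation; the specific values are illustrative.

```python
import numpy as np

def spsa_step(theta, loss, a=0.05, c=0.1, rng=None):
    """One SPSA iteration: estimate the full gradient of `loss` at `theta`
    from only two function evaluations, regardless of len(theta)."""
    rng = rng or np.random.default_rng()
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher perturbation
    y_plus = loss(theta + c * delta)                   # measurement 1
    y_minus = loss(theta - c * delta)                  # measurement 2
    g_hat = (y_plus - y_minus) / (2.0 * c * delta)     # simultaneous-perturbation gradient estimate
    return theta - a * g_hat                           # gradient-descent update
```

For a standard finite-difference (Kiefer-Wolfowitz) scheme the same step would need 2d evaluations for a d-dimensional `theta`; here it is always two.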
Next, we propose a recurrent kernel online learning (RKOL) algorithm that integrates kernel methods with the RTRL learning algorithm. The novel RKOL algorithm achieves guaranteed weight convergence with a sparsification procedure, explained from a system-stability point of view, that reduces the computational complexity: kernels can be eliminated automatically according to the weight convergence and stability conditions. Finally, to further reduce the computational time of RKOL, we propose an improved recurrent kernel online learning (IRKOL) algorithm with a coherence-based sparsification rule that lowers the computational complexity. Furthermore, we present closed formulas for the sparsification scheme, derived as integral expressions from the weight convergence analysis of the RBF-like recurrent network. By focusing on the Gram matrix embedded in the weight convergence proof, we provide explicit formulas suited to the design of the robust recurrent training algorithm.
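The general idea of a coherence-based sparsification rule (a sketch of the standard criterion, not the thesis's exact closed formulas) is to admit a new kernel centre into the dictionary only if its maximal kernel correlation with the existing centres stays below a threshold. The threshold `mu0` and kernel width `sigma` below are illustrative parameters.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two sample vectors."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def update_dictionary(dictionary, x_new, mu0=0.5, sigma=1.0):
    """Coherence-based sparsification: add x_new as a new kernel centre
    only if its coherence with every existing centre is at most mu0."""
    coherence = max((gaussian_kernel(x_new, d, sigma) for d in dictionary),
                    default=0.0)          # empty dictionary: always admit
    if coherence <= mu0:
        dictionary.append(x_new)          # sufficiently novel sample
    return dictionary
```

Near-duplicate samples are rejected (their coherence with an existing centre is close to 1), so the dictionary, and hence the Gram matrix used in the convergence analysis, stays small.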
DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems