Local Model Toolbox User Guide
Local Model Networks Toolbox for Nonlinear System Identification
Oliver Nelles, Benjamin Hartmann, Tobias Ebert, Torsten Fischer, Julian Belz, Geritt Kampmann
March 2012
Abstract
A new Matlab toolbox for nonlinear system identification with local model networks is introduced. The toolbox is available as freeware and can be downloaded from www.uni-siegen.de/fb11/mrt. Its goal is to provide even unskilled users with an easy way to generate nonlinear models from data. It trains a number of models of different complexity with different architectures, plus a polynomial model for comparison. At the end, it gives an overview of the achieved performances and makes a recommendation for the best model. The toolbox can be used by the trivial function call LMNTrain(data), where the training data data = [u1 u2 ... up y] contains the inputs in the first columns and the output in the last column. The model with the best AIC or GCV value, respectively, is recommended [1, 2].
1 Introduction
This toolbox limits itself to nonlinear models with p inputs and 1 output. For several outputs, the user has to generate a separate model for each. In the static case, the generated models are of the type

ŷ = f(u1, u2, ..., up)   (1)

Particularly for high-dimensional input spaces, i.e., high values of p, this can be a very demanding task. Dynamic models are not considered so far. They will be included in future toolbox releases.
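As a minimal usage sketch, the following lines generate synthetic training data in the required layout and train a static model. The test function, noise level, and input ranges are purely illustrative and not part of the toolbox.

% Minimal usage sketch (the process below is purely illustrative)
N  = 500;                                 % number of data points
u1 = rand(N,1);                           % input 1 in [0,1]
u2 = rand(N,1);                           % input 2 in [0,1]
y  = sin(2*pi*u1).*u2 + 0.01*randn(N,1);  % noisy static process
data    = [u1 u2 y];                      % inputs first, output last
LMNBest = LMNTrain(data);                 % train and get the recommendation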
2 Toolbox Methods
The toolbox carries out several incremental learning algorithms for local model networks. It is built to run automatically and recommends the "best" model achieved. Here "best" stands for the least expected squared error on new data. Therefore, the toolbox tries to find a good bias/variance tradeoff and thus to avoid overfitting. Of course, each method can be called on its own, and many parameters and options are available to fine-tune the results. The primary goal of this toolbox, however, is to address the non-expert and deliver a very robust performance. It is highly encouraged and recommended to look into the details of the proposed methods; this can improve the performance even further. The investigated methods are:
1. LOLIMOT with local linear models
2. LOLIMOT with local quadratic models
3. HILOMOT with local linear models
4. HILOMOT with local quadratic models
These methods are briefly discussed in the following. They will be extended and improved in the future, and more methods will be added. Some methods can easily be commented out in order to save computation time.
2.1 LOLIMOT with Local Linear Models
LOLIMOT (LOcal LInear MOdel Tree) is extensively covered in [5]. It is an incremental tree-structured algorithm that starts with a simple (typically linear) global model and refines its performance in each iteration by splitting the input space in an axes-orthogonal manner. In each iteration the worst local model is split into two equal halves. The associated local linear models are estimated by a local least squares approach. The algorithm is very fast and robust. Its performance can deteriorate with increasing input dimensions due to the sub-optimality of the axes-orthogonal splits.
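To make the incremental procedure concrete, the following heavily simplified one-dimensional sketch mirrors the LOLIMOT loop: fit local linear models by weighted least squares, find the worst one, and split its region into two equal halves. It is not the toolbox implementation; the Gaussian validity functions, the width-to-sigma ratio kSigma, and all variable names are illustrative choices (Matlab R2016b+ implicit expansion is assumed).

% Simplified 1-D LOLIMOT sketch (illustrative only)
u = linspace(0,1,200)';  y = sin(2*pi*u) + 0.05*randn(200,1);
intervals = [0 1];                  % each row: [lower upper] of one LM
maxModels = 8;  kSigma = 3;         % assumed width-to-sigma ratio
X = [ones(size(u)) u];              % regressors of a local linear model
for it = 1:maxModels-1
    % normalized Gaussian validity functions from the current partition
    c = mean(intervals,2);  s = diff(intervals,1,2)/kSigma;
    Phi = exp(-0.5*((u-c')./s').^2);  Phi = Phi./sum(Phi,2);
    % local weighted least squares for each local model
    locErr = zeros(size(intervals,1),1);
    for m = 1:size(intervals,1)
        w = Phi(:,m);
        theta = (X'*(w.*X)) \ (X'*(w.*y));    % weighted LS estimate
        locErr(m) = sum(w.*(y - X*theta).^2); % local error measure
    end
    % split the worst local model into two equal halves
    [~,worst] = max(locErr);
    lo = intervals(worst,1);  hi = intervals(worst,2);  mid = (lo+hi)/2;
    intervals = [intervals([1:worst-1 worst+1:end],:); lo mid; mid hi];
end   % (final refit on the last partition omitted for brevity)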
2.2 LOLIMOT with Local Quadratic Models
In this case, the LOLIMOT algorithm trains local models of full polynomial type. All quadratic local model terms, including all cross-terms of type u1u2 and similar, are considered. This variant is also recommended for optimization purposes but becomes inefficient for high-dimensional input spaces due to the huge number of cross-terms. In that case, sparse quadratic local models might be a superior choice.
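The growth of the parameter count is easy to quantify: a full quadratic local model in p inputs has (p+1)(p+2)/2 regressors (one constant, p linear terms, p squared terms, and p(p-1)/2 cross-terms). A one-line check:

% Parameters per full quadratic local model as a function of p
p = 1:10;
nTerms = (p+1).*(p+2)/2;    % e.g. p = 10 already yields 66 terms per LM
disp([p; nTerms])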
2.3 HILOMOT with Local Linear Models
HILOMOT and its original ideas are discussed in, e.g., [3, 4]. It realizes two major improvements over the LOLIMOT algorithm: a) the strong limitation to axes-orthogonal splits is dropped, and b) the flat (parallel) model structure is transferred into a hierarchical model structure. Advantage a) means that axis-oblique splits are carried out, which overcomes the key weakness of LOLIMOT. The price to be paid is that the split has to be optimized, which necessarily is a nonlinear optimization problem. This makes HILOMOT one or two orders of magnitude slower than LOLIMOT. However, it results in much better and more compact models. Advantage b) means that re-activation effects [6] due to the normalization can be completely avoided; the hierarchical structure only involves drawbacks on truly parallel hardware.
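To illustrate what an axis-oblique split looks like, the following sketch evaluates a sigmoidal splitting function over a linear combination of two inputs; the direction vector v, the offset v0, and the sigmoid form are hypothetical choices, not the toolbox's internal parameterization.

% Illustrative axis-oblique split: a sigmoid over a tilted hyperplane
v  = [4; -4];  v0 = 0.5;              % hypothetical split direction/offset
psi = @(U) 1./(1 + exp(-(U*v + v0))); % validity of the first child model
[U1,U2] = meshgrid(linspace(0,1,50));
Psi = reshape(psi([U1(:) U2(:)]), size(U1));
surf(U1, U2, Psi)                     % 1-Psi belongs to the sibling model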
2.4 HILOMOT with Local Quadratic Models
This is the axes-oblique counterpart of the method from Section 2.2; the underlying HILOMOT algorithm is described in Section 2.3.
3 Toolbox Inputs and Outputs
The toolbox can be called with the input traindata and, optionally, valdata and/or testdata, and delivers two outputs, LMNBest and AllLMN:

[LMNBest, AllLMN] = LMNTrain(traindata, valdata, testdata)

Each method builds its models incrementally. The choice of a good model complexity is of paramount importance. Commonly, no separate validation data set is available; then the function is simply called as LMNTrain(data). In this case, the complexity is determined for each method individually with the help of the AIC and GCV criteria, respectively [1, 2]. The toolbox does not aim purely for the optimal bias/variance trade-off; other benefits of parsimonious models are also taken into account, such as computational demand, interpretability, and ease of handling. Furthermore, a too complex model with significant overfitting is considered more dangerous than a too simple model, because it causes the illusion of a good model fit to the inexperienced user.

If validation data is available, the toolbox should be called by LMNTrain(traindata, valdata). Then the models are assessed and the recommendation is made according to this separate validation data set, which is more reliable. Additionally, the user can provide a separate test data set by calling LMNTrain(traindata, [], testdata), which is then used only for testing the model. This ensures comparability to other models.
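For quick reference, the three calling conventions described above read as follows; all data matrices are N-by-(p+1) with the output in the last column.

% Complexity selection via AIC/GCV on the training data only
[LMNBest, AllLMN] = LMNTrain(traindata);
% Recommendation based on a separate validation data set
[LMNBest, AllLMN] = LMNTrain(traindata, valdata);
% Additional test data set, used only for testing the final models
[LMNBest, AllLMN] = LMNTrain(traindata, [], testdata);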
4 Model Use
The model considered best is characterized by the structure LMNBest. The model output can be evaluated for a single data point or a whole data set by

output = LMNBest.calculateModelOutput(input)

with input as a 1×p vector (p = number of inputs) or an N×p matrix (N = number of data points), respectively. For a one- or two-dimensional visualization of the model, the user can type LMNBest.plotModel into the Matlab command window. For further information, type help plotModel. Furthermore, the partitioning can be plotted by typing LMNBest.plotPartition.
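The following snippet summarizes the evaluation and plotting calls from above; the query points and p = 2 are assumed purely for illustration.

% Evaluating and inspecting the recommended model (p = 2 assumed)
uNew = [0.3 0.7];                            % single query point, 1-by-p
yNew = LMNBest.calculateModelOutput(uNew);   % scalar model output
Unew = rand(100,2);                          % batch of query points, N-by-p
Ynew = LMNBest.calculateModelOutput(Unew);   % 100-by-1 model outputs
LMNBest.plotModel                            % 1-D/2-D model visualization
LMNBest.plotPartition                        % plot the input space partition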
5 Toolbox Examples
The toolbox functionality is demonstrated on four static example data sets. The user just has to run the function LMNTrainDemo. After choosing an example, the training procedure starts; after completion, the user is informed about the modeling results via the LMNTool output screen (see Figs. 1 and 2).
References
[1] H. Akaike. Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory, volume 1, pages 267–281. Springer Verlag, 1973.

[2] K.P. Burnham and D.R. Anderson. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer Verlag, 2002.

[3] S. Ernst. Hinging hyperplane trees for approximation and identification. In IEEE Conference on Decision and Control (CDC), pages 1261–1277, Tampa, USA, 1998.

[4] B. Hartmann and O. Nelles. Automatic adjustment of the transition between local models in a hierarchical structure identification algorithm. In European Control Conference (ECC), Budapest, Hungary, August 2009.

[5] O. Nelles. Nonlinear System Identification. Springer, Berlin, Germany, 2001.

[6] R. Shorten and R. Murray-Smith. Side-effects of normalising basis functions in local model networks. In Multiple Model Approaches to Modelling and Control, chapter 8, pages 211–229. Taylor & Francis, London, 1997.