Local Model Networks Toolbox User Guide

Local Model Networks Toolbox for Nonlinear System Identification

Oliver Nelles, Benjamin Hartmann, Tobias Ebert, Torsten Fischer, Julian Belz, Geritt Kampmann

University of Siegen, Germany

January 2012

www.uni-siegen.de/fb11/mrt

Abstract

A new Matlab toolbox for nonlinear system identification with local model networks is introduced. Its goal is to enable even inexperienced users to generate nonlinear models from data with little effort. It trains a number of models of different complexity and different architecture, plus a polynomial model for comparison. Finally, it gives an overview of the achieved performances and recommends the best model. A more detailed description of the validation methods used in this toolbox is given in [4].

The toolbox can be used via the simple function call LMNTrain(data), where the training data data = [u1 u2 ... up y] contains the inputs in the first columns and the output in the last column. The model with the best penalized loss function value (corrected Akaike information criterion, AICc) is recommended [1, 2].

1 Introduction

The toolbox is limited to nonlinear models with p inputs and one output. For multiple outputs, the user has to generate a separate model for each output. In the static case, the generated models are of the type

  y_hat = f(u_1, u_2, ..., u_p)

Particularly for high-dimensional input spaces, i.e., high values of p, this can be a very demanding task.
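A minimal sketch of this static case for p = 2 (the synthetic data and all variable names are illustrative):

  N  = 200;
  u1 = rand(N, 1);  u2 = rand(N, 1);                % inputs
  y  = sin(2*pi*u1) + u2.^2 + 0.05*randn(N, 1);     % noisy output
  data = [u1 u2 y];                                 % inputs first, output last
  LMNBest = LMNTrain(data);                         % AICc-based recommendation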

Dynamic models can be constructed by using tapped delay lines, i.e.:

  y_hat(k+1) = f(u_1(k), ..., u_1(k-n_u1), u_2(k), ..., u_2(k-n_u2), ...,
                      u_p(k), ..., u_p(k-n_up), y(k), ..., y(k-n_y))

However, such models perform only one-step-ahead predictions from y(k) to y_hat(k+1). For a simulation, the model output has to be fed back.
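A minimal sketch of this construction for one input u and one output y (the process, the delay orders nu and ny, and all variable names are illustrative):

  N = 500;  u = randn(N, 1);  y = zeros(N, 1);      % toy nonlinear process
  for t = 2:N, y(t) = 0.9*y(t-1) + tanh(u(t-1)); end
  nu = 2;  ny = 1;                                  % assumed maximum delays
  k = ((max(nu, ny) + 1) : (N - 1))';               % valid time indices
  data = [u(k) u(k-1) u(k-2) y(k) y(k-1) y(k+1)];   % regressors and target y(k+1)
  LMNBest = LMNTrain(data);                         % train as a static mapping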

This documentation covers static models only. Dynamic models are not documented yet; they will be included in the documentation with future toolbox releases.

2 Toolbox Methods

The toolbox runs several incremental learning algorithms for local model networks. It is built to run automatically and recommends the best model achieved, where best means the lowest expected squared error on new data. The toolbox therefore tries to find a good bias/variance tradeoff and thus to avoid overfitting. Of course, each method can also be called on its own, and many parameters and options are available to fine-tune the results. The primary goal of this toolbox, however, is to address the non-expert and to deliver very robust performance. Users are nevertheless encouraged to look into the details of the proposed methods; this can improve the performance even further.

The investigated methods are:

1. LOLIMOT with local linear models

2. LOLIMOT with local quadratic models

3. HILOMOT with local linear models

4. HILOMOT with local quadratic models

These methods are briefly discussed in the following. They will be extended and improved in the future, and more methods will be added. Individual methods can easily be commented out in order to save computation time.

2.1 LOLIMOT with Local Linear Models

LOLIMOT (LOcal LInear MOdel Tree) is extensively covered in [6]. It is an incremental tree-construction algorithm that starts with a simple (typically linear) global model and improves it in each iteration by splitting the input space in an axis-orthogonal manner: in each iteration the worst local model is split into two equal halves. The associated local linear models are estimated by local least squares. The algorithm is very fast and robust, but its performance can deteriorate with increasing input dimension due to the sub-optimality of the axis-orthogonal splits.
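To make the core computations concrete, the following one-dimensional sketch combines normalized Gaussian validity functions with local weighted least squares; the tree construction, the worst-model selection, and the splitting itself are omitted, and all parameter values are illustrative:

  u = linspace(0, 1, 200)';  y = sin(2*pi*u);   % toy data
  c = [0.25 0.75];  sigma = [0.15 0.15];        % centers and widths after one split
  Phi = exp(-0.5*(u - c).^2 ./ sigma.^2);       % Gaussian membership functions
  Phi = Phi ./ sum(Phi, 2);                     % normalization -> validity functions
  X = [ones(size(u)) u];                        % regressors of a local linear model
  yhat = zeros(size(y));
  for m = 1:numel(c)
      W = diag(Phi(:, m));                      % data weights of local model m
      theta = (X'*W*X) \ (X'*W*y);              % local weighted least squares
      yhat = yhat + Phi(:, m) .* (X*theta);     % weighted superposition
  end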

2.2 LOLIMOT with Local Quadratic Models

In this case, the LOLIMOT algorithm trains local models of full polynomial type: all quadratic local model terms, including all cross-terms of type u1*u2, are considered. This variant is also recommended for optimization purposes, but it becomes inefficient for high-dimensional input spaces due to the huge number of cross-terms. In such cases, sparse quadratic local models might be a superior choice.
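For illustration, the full quadratic regressor for p = 3 inputs contains one constant, p linear, p pure quadratic, and p(p-1)/2 cross-terms, i.e., (p+1)(p+2)/2 = 10 terms in total (all names below are illustrative):

  u1 = rand(50, 1);  u2 = rand(50, 1);  u3 = rand(50, 1);
  X = [ones(50, 1) u1 u2 u3 ...                 % constant and linear terms
       u1.^2 u2.^2 u3.^2 ...                    % pure quadratic terms
       u1.*u2 u1.*u3 u2.*u3];                   % all cross-terms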

2.3 HILOMOT with Local Linear Models

HILOMOT and its original ideas are discussed in, e.g., [3, 5]. It realizes two major improvements over the LOLIMOT algorithm: a) the strong limitation to axis-orthogonal splits is dropped, and b) the flat (parallel) model structure is transferred into a hierarchical model structure. Advantage a) means that axis-oblique splits are carried out, which overcomes the key weakness of LOLIMOT. The price to be paid is that each split has to be optimized, which necessarily is a nonlinear optimization problem. This makes HILOMOT one or two orders of magnitude slower than LOLIMOT, but it results in much better and more compact models. Advantage b) means that the re-activation effects [7] caused by the normalization are completely avoided; its only drawback arises on truly parallel hardware.
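One common way to realize such an axis-oblique split is a sigmoidal splitting function over a linear combination of the inputs. The following sketch only illustrates the idea; the parametrization is not the toolbox's internal one:

  u = rand(100, 2);                             % N x 2 input data
  w = [1; -1];  w0 = 0;  kappa = 20;            % oblique split along u1 - u2 = 0
  psi1 = 1 ./ (1 + exp(-kappa*(u*w + w0)));     % smooth validity of child model 1
  psi2 = 1 - psi1;                              % complementary validity of child 2

Because the two validities sum to one by construction, no explicit normalization is required, which is why the re-activation side-effects [7] cannot occur.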

2.4 HILOMOT with Local Quadratic Models

The axis-oblique version of the method from Section 2.2; see Section 2.3.

3 Toolbox Inputs and Outputs

The toolbox is called with the input traindata and, optionally, valdata and/or testdata, and it delivers two outputs, LMNBest and AllLMN:

[LMNBest, AllLMN] = LMNTrain(traindata, valdata, testdata)


Each method builds its models incrementally, so the choice of a good model complexity is of paramount importance. Commonly, no separate validation data set is available; then the function is just called as LMNTrain(data), and the complexity is determined for each method individually with the help of the AICc criterion [1, 2]. The aim is not only an optimal bias/variance trade-off; other benefits of parsimonious models are also taken into account, such as low computational demand, interpretability, and ease of handling. Furthermore, a too complex model with significant overfitting is considered more dangerous than a too simple model, because it gives the inexperienced user the illusion of a good model fit.
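For reference, the corrected AIC has the following standard form for least-squares models [1, 2]; the sketch below is illustrative (e denotes the residual vector, n_eff the effective number of parameters, which the toolbox determines internally):

  e     = 0.1*randn(100, 1);          % residuals of some model (illustrative)
  n_eff = 12;                         % effective number of parameters (illustrative)
  N     = length(e);                  % number of training samples
  AICc  = N*log(mean(e.^2)) + 2*n_eff + 2*n_eff*(n_eff + 1)/(N - n_eff - 1);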

If validation data is available, the toolbox should be called as LMNTrain(traindata, valdata). Then the model is assessed, and the recommendation made, according to this separate validation data set, which is more reliable. Additionally, the user can provide a separate test data set by calling LMNTrain(traindata, [ ], testdata), which is then used only for testing the model. This ensures comparability with other models.
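In summary, the three calling conventions are (splitting the data into traindata, valdata, and testdata is the user's responsibility):

  [LMNBest, AllLMN] = LMNTrain(traindata);                % complexity chosen via AICc
  [LMNBest, AllLMN] = LMNTrain(traindata, valdata);       % chosen on validation data
  [LMNBest, AllLMN] = LMNTrain(traindata, [], testdata);  % additional test assessment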

4 Model Use

The model considered best is returned in the structure LMNBest. The model output can be evaluated for a single data point or a whole data set by

output = LMNBest.calculateModelOutput(input)

with input as 1 x p vector (p = number of inputs) or N x p matrix (N = number of data points), respectively. For a one- or two-dimensional visualization of the model the user can type LMNBest.plotModel into the Matlab command window. For further information type help plotModel. Furthermore, the partitioning can be plotted by typing LMNBest.plotPartition.
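For example, the recommended model can be evaluated on a regular grid and plotted manually (assuming p = 2; apart from calculateModelOutput, all names are illustrative):

  [U1, U2] = meshgrid(linspace(0, 1, 50));          % query grid over the input space
  input  = [U1(:) U2(:)];                           % N x p matrix of query points
  output = LMNBest.calculateModelOutput(input);     % N x 1 vector of model outputs
  surf(U1, U2, reshape(output, size(U1)))           % surface plot of the model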

If there are more than two inputs, a graphical user interface (GUI) can be used to visualize the model at a desired operating point. The user is free to choose the two input variables that span the visualized plane. While the operating point is changed, the resulting change in the model output is calculated and visualized simultaneously. Furthermore, different models, or different complexities of one model, can be compared. There are two possibilities to load a model into the GUI. Either you use the following syntax to import the trained local model networks lmn1 up to lmnN from the MATLAB workspace:

GUIvisualize(lmn1, lmn2, ..., lmnN)

or you store a trained local model network in a variable, save this variable to a mat-file, and import the mat-file from within the GUI.
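A minimal sketch of this second route (variable and file names are illustrative):

  lmn1 = LMNBest;                 % any trained local model network
  save('myNetwork.mat', 'lmn1')   % this mat-file can then be imported from within the GUI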

5 Toolbox Examples

The toolbox functionality is demonstrated on four static example data sets. The user just has to run one of the following functions:

LMNTrainDemo
hilomotDemo
lolimotDemo

After choosing an example, the training procedure starts; upon completion, the user is informed about the modeling results via the LMNTool output screen or Matlab's command window.

References

[1] H. Akaike. Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory, volume 1, pages 267-281. Springer Verlag, 1973.

[2] K.P. Burnham and D.R. Anderson. Model selection and multimodel inference: a practical information-theoretic approach. Springer Verlag, 2002.

[3] S. Ernst. Hinging hyperplane trees for approximation and identification. In IEEE Conference on Decision and Control (CDC), pages 1261-1277, Tampa, USA, 1998.

[4] B. Hartmann, T. Ebert, T. Fischer, J. Belz, G. Kampmann, and O. Nelles. LMNtool - Toolbox zum automatischen Trainieren lokaler Modellnetze. In Workshop Computational Intelligence, Dortmund, Germany, December 2012.

[5] B. Hartmann and O. Nelles. Automatic adjustment of the transition between local models in a hierarchical structure identification algorithm. In European Control Conference (ECC), Budapest, Hungary, August 2009.

[6] O. Nelles. Nonlinear System Identification. Springer, Berlin, Germany, 2001.

[7] R. Shorten and R. Murray-Smith. Side-effects of normalising basis functions in local model networks. In Multiple Model Approaches to Modelling and Control, chapter 8, pages 211-229. Taylor & Francis, London, 1997.