# Data-driven control system


Data-driven control systems are a broad family of control systems in which the identification of the process model and/or the design of the controller are based entirely on experimental data collected from the plant [1].

In many control applications, deriving a mathematical model of the plant is a hard task, demanding significant time and effort from process and control engineers. Data-driven methods overcome this problem by fitting a system model, chosen from a specific model class, to the experimental data collected. The control engineer can then exploit this model to design a proper controller for the system. However, it is still difficult to find a simple yet reliable model of a physical system that includes only those dynamics that are relevant to the control specifications. Direct data-driven methods instead tune a controller, belonging to a given class, without the need for an identified model of the system. In this way, one can also weight the process dynamics of interest directly inside the control cost function and exclude those dynamics that are of no interest.

## Overview

The standard approach to control systems design is organized in two steps:

1. Model identification aims at estimating a nominal model of the system ${\displaystyle {\widehat {G}}=G\left(q;{\widehat {\theta }}_{N}\right)}$, where ${\displaystyle q}$ is the unit-delay operator (for discrete-time transfer functions representation) and ${\displaystyle {\widehat {\theta }}_{N}}$ is the vector of parameters of ${\displaystyle G}$ identified on a set of ${\displaystyle N}$ data. Then, validation consists in constructing the uncertainty set ${\displaystyle \Gamma }$ that contains the true system ${\displaystyle G_{0}}$ at a certain probability level.
2. Controller design aims at finding a controller ${\displaystyle C}$ achieving closed-loop stability and meeting the required performance with ${\displaystyle {\widehat {G}}}$.

Typical objectives of system identification are to have ${\displaystyle {\widehat {G}}}$ as close as possible to ${\displaystyle G_{0}}$, and to have ${\displaystyle \Gamma }$ as small as possible. However, from an identification for control perspective, what really matters is the performance achieved by the model-based controller on ${\displaystyle G_{0}}$ and not the intrinsic quality of the model.
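As a toy illustration of the model-identification step, the parameter vector ${\displaystyle {\widehat {\theta }}_{N}}$ of a first-order ARX model can be estimated by least squares from ${\displaystyle N}$ input/output samples. The plant, noise level, and model structure below are assumptions made purely for this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000

# "True" plant assumed for the example: y(t) = a*y(t-1) + b*u(t-1) + e(t)
a_true, b_true = 0.9, 0.1
u = rng.standard_normal(N)            # persistently exciting input
e = 0.01 * rng.standard_normal(N)     # measurement noise
y = np.zeros(N)
for t in range(1, N):
    y[t] = a_true * y[t - 1] + b_true * u[t - 1] + e[t]

# Least-squares estimate of theta = [a, b] from regressors [y(t-1), u(t-1)]
Phi = np.column_stack([y[:-1], u[:-1]])
theta_hat, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
print(theta_hat)   # close to [0.9, 0.1]
```

The uncertainty set ${\displaystyle \Gamma }$ would then be built from the covariance of this estimate, shrinking as ${\displaystyle N}$ grows.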

One way to deal with uncertainty is to design a controller that has an acceptable performance with all models in ${\displaystyle \Gamma }$, including ${\displaystyle G_{0}}$. This is the main idea behind the robust control design procedure, which builds frequency-domain uncertainty descriptions of the process. However, being based on worst-case assumptions rather than on the idea of averaging out the noise, this approach typically leads to conservative uncertainty sets. Data-driven techniques instead deal with uncertainty by working directly on the experimental data, avoiding excessive conservatism.

In the following, the main classifications of data-driven control systems are presented.

### Indirect and direct methods

The fundamental distinction is between indirect and direct controller design methods. The former group of techniques retains the standard two-step approach: first a model is identified, then a controller is tuned based on that model. The main issue in doing so is that the controller is computed from the estimated model ${\displaystyle {\widehat {G}}}$ (according to the certainty equivalence principle), but in practice ${\displaystyle {\widehat {G}}\neq G_{0}}$. To overcome this problem, the idea behind the latter group of techniques is to map the experimental data directly onto the controller, without any model being identified in between.

### Iterative and noniterative methods

Another important distinction is between iterative and noniterative (or one-shot) methods. In the former group, repeated iterations are needed to estimate the controller parameters: at each iteration, the optimization problem is solved based on the results of the previous one, and the estimate is expected to become more and more accurate. This approach is also well suited to on-line implementation (see below). In the latter group, the (optimal) controller parametrization is obtained from a single optimization problem. This is particularly important for systems in which iterations or repetitions of data-collection experiments are limited or even not allowed (for example, due to economic constraints). In such cases, one should select a design technique capable of delivering a controller from a single data set. This approach is often implemented off-line (see below).

### On-line and off-line methods

In practical industrial applications, open-loop or closed-loop data are often available continuously; on-line data-driven techniques use these data to improve the quality of the identified model and/or the performance of the controller each time new information is collected on the plant. Off-line approaches instead work on batches of data, which may be collected only once, or multiple times at regular (but rather long) intervals.

## Iterative feedback tuning

The iterative feedback tuning (IFT) method was introduced in 1994 [2], starting from the observation that, in identification for control, each iteration is based on the (wrong) certainty equivalence principle.

IFT is a model-free technique for the direct iterative optimization of the parameters of a fixed-order controller; such parameters can be successively updated using information coming from standard (closed-loop) system operation.

Let ${\displaystyle y^{d}}$ be the desired response to the reference signal ${\displaystyle r}$; the error between the achieved and desired response is ${\displaystyle {\tilde {y}}(\rho )=y(\rho )-y^{d}}$. The control design objective can be formulated as the minimization of the objective function:

${\displaystyle J(\rho )={\frac {1}{2N}}\sum _{t=1}^{N}E\left[{\tilde {y}}(t,\rho )^{2}\right].}$

Given the objective function to minimize, a quasi-Newton method can be applied, i.e. gradient-based minimization using an update of the form:

${\displaystyle \rho _{i+1}=\rho _{i}-\gamma _{i}R_{i}^{-1}{\frac {d{\widehat {J}}}{d\rho }}(\rho _{i}).}$

The value ${\displaystyle \gamma _{i}}$ is the step size, ${\displaystyle R_{i}}$ is an appropriate positive definite matrix and ${\displaystyle {\frac {d{\widehat {J}}}{d\rho }}}$ is an approximation of the gradient; the true value of the gradient is given by the following:

${\displaystyle {\frac {dJ}{d\rho }}(\rho )={\frac {1}{N}}\sum _{t=1}^{N}E\left[{\tilde {y}}(t,\rho ){\frac {\delta y}{\delta \rho }}(t,\rho )\right].}$

The value of ${\displaystyle {\frac {\delta y}{\delta \rho }}(t,\rho )}$ is obtained through the following three-step methodology:

1. Normal Experiment: Perform an experiment on the closed loop system with ${\displaystyle C(\rho )}$ as controller and ${\displaystyle r}$ as reference; collect N measurements of the output ${\displaystyle y(\rho )}$, denoted as ${\displaystyle y^{(1)}(\rho )}$.
2. Gradient Experiment: Perform an experiment on the closed loop system with ${\displaystyle C(\rho )}$ as controller and 0 as reference; inject the signal ${\displaystyle r-y^{(1)}(\rho )}$ by adding it to the control variable produced by ${\displaystyle C(\rho )}$, which feeds the plant input. Collect the resulting output, denoted as ${\displaystyle y^{(2)}(\rho )}$.
3. Take the following as gradient approximation: ${\displaystyle {\frac {\delta {\widehat {y}}}{\delta \rho }}(\rho )={\frac {\delta C}{\delta \rho }}(\rho )y^{(2)}(\rho )}$.

A crucial factor for the convergence speed of the algorithm is the choice of ${\displaystyle R_{i}}$; when ${\displaystyle {\tilde {y}}}$ is small, a good choice is the approximation given by the Gauss–Newton direction:

${\displaystyle R_{i}={\frac {1}{N}}\sum _{t=1}^{N}{\frac {\delta {\widehat {y}}}{\delta \rho }}(\rho _{i}){\frac {\delta {\widehat {y}}^{T}}{\delta \rho }}(\rho _{i}).}$
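The three experiments and the Gauss–Newton update above can be sketched on a toy example: a first-order plant under a proportional controller ${\displaystyle C(\rho )=\rho }$ (so ${\displaystyle \delta C/\delta \rho =1}$). The plant, reference model, and step size below are assumptions of this sketch, not part of the method:

```python
import numpy as np

def closed_loop(rho, r, s=None, a=0.9, b=0.1):
    """Closed loop: plant y(t+1) = a*y(t) + b*u(t), controller u = rho*(r - y).
    The optional signal s is added to the controller output (gradient experiment)."""
    N = len(r)
    y = np.zeros(N)
    for t in range(N - 1):
        u = rho * (r[t] - y[t]) + (s[t] if s is not None else 0.0)
        y[t + 1] = a * y[t] + b * u
    return y

rng = np.random.default_rng(0)
N = 400
r = np.sign(rng.standard_normal(N))     # PRBS-like reference
yd = np.zeros(N)                        # desired output: y_d(t+1) = 0.5*y_d(t) + 0.4*r(t)
for t in range(N - 1):
    yd[t + 1] = 0.5 * yd[t] + 0.4 * r[t]

rho, gamma = 1.0, 0.5                   # initial gain and (damped) step size
for i in range(50):
    y1 = closed_loop(rho, r)                        # 1. normal experiment
    y2 = closed_loop(rho, np.zeros(N), s=r - y1)    # 2. gradient experiment
    dy = y2                                         # 3. gradient approx (dC/drho = 1)
    grad = np.mean((y1 - yd) * dy)                  # estimate of dJ/drho
    R = np.mean(dy ** 2)                            # scalar Gauss-Newton "matrix" R_i
    rho -= gamma * grad / R
# rho converges to 4, the gain for which the closed loop matches y_d exactly
```

Note that no model of the plant is used anywhere: both the gradient estimate and ${\displaystyle R_{i}}$ come from closed-loop experiments alone.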

## Noniterative correlation-based tuning

Noniterative correlation-based tuning (nCbT) is a noniterative method for data-driven tuning of a fixed-structure controller[3]. It provides a one-shot method to directly synthesize a controller based on a single dataset.

Suppose that ${\displaystyle G}$ denotes an unknown LTI stable SISO plant, ${\displaystyle M}$ a user-defined reference model and ${\displaystyle F}$ a user-defined weighting function. An LTI fixed-order controller is indicated as ${\displaystyle K(\rho )=\beta ^{T}\rho }$, where ${\displaystyle \rho \in \mathbb {R} ^{n}}$, and ${\displaystyle \beta }$ is a vector of LTI basis functions. Finally, ${\displaystyle K^{*}}$ is an ideal LTI controller of any structure, guaranteeing a closed-loop transfer function ${\displaystyle M}$ when applied to ${\displaystyle G}$.

The goal is to minimize the following objective function:

${\displaystyle J(\rho )=\left\|F{\bigg (}{\frac {K^{*}G-K(\rho )G}{(1+K^{*}G)^{2}}}{\bigg )}\right\|_{2}^{2}.}$

${\displaystyle J(\rho )}$ is a convex approximation of the objective function obtained from a model reference problem, supposing that ${\displaystyle {\frac {1}{(1+K(\rho )G)}}\approx {\frac {1}{(1+K^{*}G)}}}$.

When ${\displaystyle G}$ is stable and minimum-phase, the approximated model reference problem is equivalent to the minimization of the norm of ${\displaystyle \varepsilon (t)}$ in the scheme in the figure.


The input signal ${\displaystyle r(t)}$ is assumed to be persistently exciting and ${\displaystyle v(t)}$ to be generated by a stable data-generation mechanism. The two signals are thus uncorrelated in an open-loop experiment; hence, the ideal error ${\displaystyle \varepsilon (t,\rho ^{*})}$ is uncorrelated with ${\displaystyle r(t)}$. The control objective thus consists in finding ${\displaystyle \rho }$ such that ${\displaystyle r(t)}$ and ${\displaystyle \varepsilon (t,\rho )}$ are uncorrelated.

The vector of instrumental variables ${\displaystyle \zeta (t)}$ is defined as:

${\displaystyle \zeta (t)=[r_{W}(t+\ell _{1}),r_{W}(t+\ell _{1}-1),\ldots ,r_{W}(t),\ldots ,r_{W}(t-\ell _{1})]^{T}}$

where ${\displaystyle \ell _{1}}$ is large enough and ${\displaystyle r_{W}(t)=Wr(t)}$, where ${\displaystyle W}$ is an appropriate filter.

The correlation function is:

${\displaystyle f_{N,\ell _{1}}(\rho )={\frac {1}{N}}\sum _{t=1}^{N}\zeta (t)\varepsilon (t,\rho )}$

and the optimization problem becomes:

${\displaystyle {\widehat {\rho }}={\underset {\rho \in D_{k}}{\operatorname {arg\,min} }}\,J_{N,\ell _{1}}(\rho )={\underset {\rho \in D_{k}}{\operatorname {arg\,min} }}\,f_{N,\ell _{1}}^{T}(\rho )f_{N,\ell _{1}}(\rho ).}$

Denoting with ${\displaystyle \phi _{r}(\omega )}$ the spectrum of ${\displaystyle r(t)}$, it can be demonstrated that, under some assumptions, if ${\displaystyle W}$ is selected as:

${\displaystyle W(e^{-j\omega })={\frac {F(e^{-j\omega })(1-M(e^{-j\omega }))}{\phi _{r}(\omega )}}}$

then, the following holds:

${\displaystyle \lim _{N,\ell _{1}\to \infty ,\ell _{1}/N\to 0}{\widehat {\rho }}=\rho ^{*}.}$
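In the noise-free case the correlation criterion reduces to a linear least-squares problem in ${\displaystyle \rho }$, since ${\displaystyle \varepsilon (t,\rho )}$ is linear in the controller parameters. The sketch below is built on assumptions made purely for illustration: a first-order plant, the reference model ${\displaystyle M=0.4q^{-1}/(1-0.6q^{-1})}$, a PI controller class, the error signal ${\displaystyle \varepsilon (t,\rho )=(1-M)K(\rho )y(t)-Mr(t)}$ computed from open-loop data, and the simplifying choice ${\displaystyle W=1}$:

```python
import numpy as np

def filt(b, a, x):
    """Apply the filter B(q^-1)/A(q^-1) to x (a[0] == 1, zero initial conditions)."""
    y = np.zeros_like(x)
    for t in range(len(x)):
        y[t] = sum(b[k] * x[t - k] for k in range(len(b)) if t - k >= 0) \
             - sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
    return y

rng = np.random.default_rng(0)
N, l1 = 2000, 5

# Open-loop experiment on the (assumed) plant G = 0.1 q^-1 / (1 - 0.9 q^-1)
r = np.sign(rng.standard_normal(N))
y = filt([0.0, 0.1], [1.0, -0.9], r)

# Controller class: PI, K(rho) = (rho1 + rho2 q^-1)/(1 - q^-1).  The integrator
# cancels in (1 - M)K(rho), so eps(t, rho) = rho1*phi1 + rho2*phi2 - m with:
phi1 = filt([1.0], [1.0, -0.6], y)            # (1/(1 - 0.6 q^-1)) y
phi2 = np.concatenate(([0.0], phi1[:-1]))     # same signal, delayed one step
m = filt([0.0, 0.4], [1.0, -0.6], r)          # M r

# Instrumental variables: shifted copies of r (taking W = 1 for simplicity)
T = np.arange(l1, N - l1)
Z = np.stack([r[T + s] for s in range(l1, -l1 - 1, -1)])   # (2*l1+1) x len(T)

# Minimizing ||f||^2 = ||(1/N) Z eps||^2 is least squares in rho:
A = Z @ np.column_stack([phi1[T], phi2[T]])
b = Z @ m[T]
rho, *_ = np.linalg.lstsq(A, b, rcond=None)
# rho recovers the ideal PI gains (4, -3.6) exactly on noise-free data
```

With these choices the ideal controller ${\displaystyle K^{*}=M/\left((1-M)G\right)}$ happens to lie in the controller class, so the correlation criterion is zeroed exactly.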

### Stability constraint

There is no guarantee that the controller ${\displaystyle K}$ that minimizes ${\displaystyle J_{N,\ell _{1}}}$ is stable. Instability may occur in the following cases:

• If ${\displaystyle G}$ is non-minimum phase, ${\displaystyle K^{*}}$ may lead to cancellations in the right-half complex plane.
• If ${\displaystyle K^{*}}$ (even if stabilizing) is not achievable within the controller class, ${\displaystyle K(\rho )}$ may not be stabilizing.
• Due to measurement noise, even if ${\displaystyle K^{*}=K(\rho ^{*})}$ is stabilizing, the data-estimated ${\displaystyle K({\widehat {\rho }})}$ may not be so.

Consider a stabilizing controller ${\displaystyle K_{s}}$ and the closed loop transfer function ${\displaystyle M_{s}={\frac {K_{s}G}{1+K_{s}G}}}$. Define:

${\displaystyle \Delta (\rho ):=M_{s}-K(\rho )G(1-M_{s})}$
${\displaystyle \delta (\rho ):=\left\|\Delta (\rho )\right\|_{\infty }.}$

**Theorem.** The controller ${\displaystyle K(\rho )}$ stabilizes the plant ${\displaystyle G}$ if

1. ${\displaystyle \Delta (\rho )}$ is stable;
2. ${\displaystyle \exists \delta _{N}\in (0,1)}$ s.t. ${\displaystyle \delta (\rho )\leq \delta _{N}.}$

Condition 1 is enforced when:

• ${\displaystyle K(\rho )}$ is stable
• ${\displaystyle K(\rho )}$ contains an integrator (it is canceled).

The model reference design with stability constraint becomes:

${\displaystyle \rho _{s}={\underset {\rho \in D_{k}}{\operatorname {arg\,min} }}J(\rho )}$
${\displaystyle {\text{s.t. }}\delta (\rho )\leq \delta _{N}.}$

A convex data-driven estimation of ${\displaystyle \delta (\rho )}$ can be obtained through the discrete Fourier transform.

Define the following:

{\displaystyle {\begin{aligned}&{\widehat {R}}_{r}(\tau )={\frac {1}{N}}\sum _{t=1}^{N}r(t-\tau )r(t){\text{ for }}\tau =-\ell _{2},\ldots ,\ell _{2}\\[4pt]&{\widehat {R}}_{r\varepsilon }(\tau )={\frac {1}{N}}\sum _{t=1}^{N}r(t-\tau )\varepsilon (t,\rho ){\text{ for }}\tau =-\ell _{2},\ldots ,\ell _{2}.\end{aligned}}}

For stable minimum-phase plants, the following convex data-driven optimization problem is obtained:

{\displaystyle {\begin{aligned}{\widehat {\rho }}&={\underset {\rho \in D_{k}}{\operatorname {arg\,min} }}J_{N,\ell _{1}}(\rho )\\[3pt]&{\text{s.t.}}\\[3pt]&{\bigg |}\sum _{\tau =-\ell _{2}}^{\ell _{2}}{\widehat {R}}_{r\varepsilon }(\tau ,\rho )e^{-j\tau \omega _{k}}{\bigg |}\leq \delta _{N}{\bigg |}\sum _{\tau =-\ell _{2}}^{\ell _{2}}{\widehat {R}}_{r}(\tau ,\rho )e^{-j\tau \omega _{k}}{\bigg |}\\[4pt]\omega _{k}&={\frac {2\pi k}{2\ell _{2}+1}},\qquad k=0,\ldots ,\ell _{2}+1.\end{aligned}}}
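The frequency-domain constraint above can be checked directly from the estimated correlation functions. The sketch below evaluates it for one candidate controller; the excitation ${\displaystyle r(t)}$, the error signal standing in for ${\displaystyle \varepsilon (t,\rho )}$, and the bound ${\displaystyle \delta _{N}}$ are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, l2, delta_N = 5000, 10, 0.5

# Example signals: r is a PRBS-like excitation; eps stands in for the error
# eps(t, rho) of some candidate controller (a small filtered copy of r plus noise)
r = np.sign(rng.standard_normal(N))
eps = 0.25 * np.concatenate(([0.0], r[:-1])) + 0.01 * rng.standard_normal(N)

# Estimated auto- and cross-correlations for tau = -l2, ..., l2
taus = np.arange(-l2, l2 + 1)
t = np.arange(l2, N - l2)
R_r  = np.array([np.mean(r[t - tau] * r[t])   for tau in taus])
R_re = np.array([np.mean(r[t - tau] * eps[t]) for tau in taus])

# Check the constraint |DFT(R_re)| <= delta_N * |DFT(R_r)| at each omega_k
ok = True
for k in range(l2 + 2):
    w = 2 * np.pi * k / (2 * l2 + 1)
    e = np.exp(-1j * taus * w)
    ok = ok and abs(R_re @ e) <= delta_N * abs(R_r @ e)
# ok == True: this candidate satisfies the data-driven stability constraint
```

The ratio of the two discrete Fourier transforms acts as a data-driven estimate of ${\displaystyle |\Delta (e^{j\omega _{k}})|}$, which the constraint keeps below ${\displaystyle \delta _{N}}$.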

## Virtual reference feedback tuning

Virtual Reference Feedback Tuning (VRFT) is a noniterative method for data-driven tuning of a fixed-structure controller. It provides a one-shot method to directly synthesize a controller based on a single dataset.

VRFT was first proposed [4] as ${\displaystyle VRD^{2}}$ and later refined and extended to LTI [5] and LPV [6] systems.

The main idea is to define a desired closed loop model ${\displaystyle M}$ and to use its inverse dynamics to obtain a virtual reference ${\displaystyle r_{v}(t)}$ from the measured output signal ${\displaystyle y(t)}$.


The virtual signals are ${\displaystyle r_{v}(t)=M^{-1}y(t)}$ and ${\displaystyle e_{v}(t)=r_{v}(t)-y(t).}$

The optimal controller is obtained from noiseless data by solving the following optimization problem:

${\displaystyle {\widehat {\rho }}_{\infty }={\underset {\rho }{\operatorname {arg\,min} }}\lim _{N\to \infty }J_{vr}^{N}(\rho )}$

where the optimization function is given as follows:

${\displaystyle J_{vr}^{N}(\rho )={\frac {1}{N}}\sum _{t=1}^{N}\left(u(t)-K(\rho )e_{v}(t)\right)^{2}.}$
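On noiseless data with a linearly parametrized controller, minimizing ${\displaystyle J_{vr}^{N}}$ is plain least squares. A minimal sketch, in which the plant, the reference model ${\displaystyle M}$, and the PI controller class are assumptions made for the example (with these choices the ideal controller is a PI with gains 4 and −3.6, which the method recovers exactly):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# Open-loop experiment on the (assumed, "unknown") plant y(t+1) = 0.9 y(t) + 0.1 u(t)
u = np.sign(rng.standard_normal(N))
y = np.zeros(N)
for t in range(N - 1):
    y[t + 1] = 0.9 * y[t] + 0.1 * u[t]

# Reference model M: y(t+1) = 0.6 y(t) + 0.4 r(t).  Inverting M gives the
# virtual reference r_v(t) = (y(t+1) - 0.6 y(t)) / 0.4
rv = (y[1:] - 0.6 * y[:-1]) / 0.4     # defined for t = 0 .. N-2
ev = rv - y[:-1]                      # virtual tracking error e_v = r_v - y

# PI controller class K(rho) = (rho1 + rho2 q^-1)/(1 - q^-1), i.e.
# u(t) - u(t-1) = rho1*ev(t) + rho2*ev(t-1): fit rho by least squares
du = u[1:N - 1] - u[0:N - 2]
Phi = np.column_stack([ev[1:], ev[:-1]])
rho, *_ = np.linalg.lstsq(Phi, du, rcond=None)
print(rho)   # recovers [4.0, -3.6] on noiseless data
```

No iteration and no plant model are needed: the virtual signals turn the controller-tuning problem into an identification problem on the controller itself.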

## References

1. ^ Bazanella, A.S., Campestrini, L., Eckhard, D. (2012). Data-driven controller design: the ${\displaystyle H_{2}}$ approach. Springer, ISBN 978-94-007-2300-9, 208 pages.
2. ^ Hjalmarsson, H., Gevers, M., Gunnarsson, S., & Lequin, O. (1998). Iterative feedback tuning: theory and applications. IEEE Control Systems, 18(4), 26–41.
3. ^ van Heusden, K., Karimi, A. and Bonvin, D. (2011), Data-driven model reference control with asymptotically guaranteed stability. Int. J. Adapt. Control Signal Process., 25: 331–351. doi:10.1002/acs.1212
4. ^ Guardabassi, Guido O., and Sergio M. Savaresi. "Approximate feedback linearization of discrete-time non-linear systems using virtual input direct design." Systems & Control Letters 32.2 (1997): 63–74.
5. ^ Campi, Marco C., Andrea Lecchini, and Sergio M. Savaresi. "Virtual reference feedback tuning: a direct method for the design of feedback controllers." Automatica 38.8 (2002): 1337–1346.
6. ^ Formentin, S., Piga, D., Tóth, R., & Savaresi, S. M. (2016). Direct learning of LPV controllers from data. Automatica, 65, 98–110.