If you see this, something is wrong

Collapse and expand sections

To get acquainted with the document, the best thing to do is to select the "Collapse all sections" item from the "View" menu. This will leave visible only the titles of the top-level sections.

Clicking on a section title toggles the visibility of the section content. If you have collapsed all of the sections, this will let you discover the document progressively, from the top-level sections to the lower-level ones.

Cross-references and related material

Generally speaking, anything that is blue is clickable.

Clicking on a reference link (like an equation number, for instance) will display the reference as close as possible, without breaking the layout. Clicking on the displayed content or on the reference link hides the content. This is recursive: if the content includes a reference, clicking on it will have the same effect. These "links" are not necessarily numbers, as it is possible in LaTeX2Web to use full text for a reference.

Clicking on a bibliographical reference (i.e., a number within brackets) will display the reference.

Speech bubbles indicate a footnote. Click on the bubble to reveal the footnote (there is no page in a web document, so footnotes are placed inside the text flow). Acronyms work the same way as footnotes, except that you have the acronym instead of the speech bubble.

Discussions

By default, discussions are open in a document. Click on the discussion button below to reveal the discussion thread. However, you must be registered to participate in the discussion.

If a thread has been initialized, you can reply to it. Any modification to any comment, or a reply to it, in the discussion is signified by email to the owner of the document and to the author of the comment.

First published on Wednesday, Jul 23, 2025 and last modified on Wednesday, Jul 23, 2025 by François Chaplais.

Robust MPC for Uncertain Linear Systems – Combining Model Adaptation and Iterative Learning

arXiv
Published version: 10.48550/arXiv.2504.11261

Hannes Petrenz Student at the University of Stuttgart, 70174 Stuttgart, Germany, Visiting Student Researcher at the Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA 94701, USA Email

Johannes Köhler Institute for Dynamic Systems and Control, ETH Zurich, 8052 Zürich, Switzerland Email

Francesco Borrelli Department of Mechanical Engineering, University of California at Berkeley , Berkeley, CA 94701 Email

Abstract

This paper presents a robust adaptive learning Model Predictive Control (MPC) framework for linear systems with parametric uncertainties and additive disturbances performing iterative tasks. The approach iteratively refines the parameter estimates using set membership estimation. Performance enhancement over iterations is achieved by learning the terminal cost from data. Safety is enforced using a terminal set, which is also learned iteratively. The proposed method guarantees recursive feasibility, constraint satisfaction, and a robust bound on the closed-loop cost. Numerical simulations on a mass-spring-damper system demonstrate improved computational efficiency and control performance compared to an existing robust adaptive MPC approach.

1 INTRODUCTION

Model predictive control (MPC) [1] is an established optimization-based control technique, which is widely used for systems subject to input and state constraints. When dealing with iterative tasks, past data can be leveraged to enhance the safety and performance of the controller [2] and estimate model uncertainties. This paper investigates an MPC formulation that ensures robust constraint satisfaction while improving performance by adapting to unknown parameters and learning from previous iterations.

Robust MPC schemes [3, 4] ensure robust constraint satisfaction for bounded model mismatch. In particular, tube-based robust MPC formulations [5] provide such robustness guarantees with a favorable computational complexity.

Schematic of the proposed robust adaptive learning Model Predictive Control (RALMPC) framework: At time t) , the RALMPC controller receives the state x_t) and computes the control input u_t) . Furthermore, the parameter set ^{HC,h}_t) is updated by set membership estimation using x_t) . Using predicted states x_{t|k}^h) and s_{t|k}^h) , the terminal set CS^h_{robust}) and the terminal cost function Q^h) . — Figure 1. Schematic of the proposed robust adaptive learning Model Predictive Control (RALMPC) framework: At time \( t\) , the RALMPC controller receives the state \( x_t\) and computes the control input \( u_t\) . Furthermore, the parameter set \( \Theta^{\text{HC},h}_t\) is updated by set membership estimation using \( x_t\) . Using predicted states \( x_{t|k}^h\) and \( s_{t|k}^h\) , the terminal set \( \mathcal{CS}^h_{\text{robust}}\) and the terminal cost function \( Q^h\) .

These approaches achieve robust constraint satisfaction by enclosing all possible trajectories within a tube around the nominal trajectory using a local feedback law. While tube-based robust MPC performs well under disturbances, it can be overly conservative in the presence of constant parametric uncertainties. This limitation has increased interest in the development of robust adaptive MPC, where uncertain parameters are adapted online using past data [6].

In [7], safety and performance are decoupled using two separate models, where only the performance model is adapted. Set membership estimation techniques have been used to update models while establishing recursive feasibility and stability for time-invariant, time-variant, and stochastic finite impulse response systems [8, 9, 10]. However, these methods are applicable only to a limited class of asymptotically stable systems. This motivated the authors of [11] to extend these results to a broader class of linear state-space systems. However, the resulting robust adaptive MPC formulation requires optimization over an increasing number of variables and constraints. To address this, the authors in [12] and [13] developed robust adaptive MPC algorithms with fixed computational complexity during runtime. Furthermore, [14] demonstrates an experimental application of [12] to a quadrotor system.

In many applications, the same control task is repeatedly solved, making it beneficial to leverage information from previous iterations in the controller design. Iterative Learning Control [15] addresses this by refining the controller in each iteration. [16, 17] develop an iterative learning MPC scheme that utilizes past experiments to define a terminal cost and a terminal set. By including these terminal ingredients in the MPC formulation, convergence to an improved closed-loop cost function by learning a safety set and a cost-to-go function is shown. The recent work [18] combines iterative learning and system identification for Lipschitz-continuous nonlinear systems. However, this approach results in a large computational demand. A modular framework for learning-based control of uncertain linear systems is proposed in [19]. The approach ensures robust safety by iteratively refining a safe set using robust tubes and improved model estimates. In [10], set membership estimation and iterative learning are combined for an iterative task with a linear system. However, this method is limited to constant parametric offsets and cannot treat general parametric uncertainty and disturbances.

We propose a new robust adaptive learning MPC (RALMPC) algorithm that combines the robust adaptive MPC framework from[12] with the iterative learning approach of [17]. The main idea is to adaptively estimate the uncertain parameters online using set membership estimation while simultaneously refining the control policy for iterative tasks using data from previous iterations. The algorithm is designed for linear state-space systems subject to constant parametric uncertainties and additive bounded disturbances. Figure 1 shows a schematic overview of the different elements of the algorithm.

The key contributions of this work are as follows:

A robust adaptive learning MPC algorithm is developed for linear time-invariant systems with parametric uncertainties and additive disturbances performing iterative tasks.
A robust control invariant set is constructed using previously predicted trajectories. The set is iteratively expanded by adding new trajectories. Additionally, a terminal cost based on the worst-case cost to go of the predicted trajectories is used. This approach enables the controller to (robustly) improve the closed-loop cost over iterations.
The proposed method ensures recursive feasibility and robust constraint satisfaction. In addition, a robust bound on the closed-loop cost is shown.
A numerical example is provided to demonstrate the effectiveness of the proposed algorithm, comparing it with the approach in [12]. The results highlight the advantages of a shorter prediction horizon, leading to reduced computational burden, as well as improved closed-loop cost due to the improvement of terminal conditions over iterations.

The remainder of this paper is structured as follows. Section 2 presents the problem formulation. Section 3 introduces the parameter learning, constraint tightening, and terminal conditions. Section 4 provides a theoretical analysis of the resulting MPC scheme. Section 5 presents numerical simulations, followed by concluding remarks in Section 6.

Notation: We denote \( [A]_i\) as the \( i\) -th row of matrix \( A\) . The quadratic norm with the positive definite matrix \( Q \succ 0\) is defined as \( \|x\|_Q^2 = x^T Q x\) . The unit hypercube is denoted by \( \mathbb{B}_p = \{\theta \in \mathbb{R}^p | \lVert \theta \rVert_{\infty} \leq 1 \}\) , and the Minkowski sum is denoted by \( A \oplus B\) . The cardinality of set \( A\) is denoted by \( |A|\) . \( \mathcal{K}_\infty\) is a class of functions \( \alpha\) : \( \mathbb{R}_{\geq0}\rightarrow\mathbb{R}_{\geq0}\) , which are continuous, strictly increasing, unbounded and satisfy \( \alpha(0)=0\) .

2 PROBLEM STATEMENT

Consider the discrete-time linear uncertain system

\[ \begin{equation} x_{t+1} = A_\theta x_t + B_\theta u_t + d_t, ~ x_0=x_s \end{equation} \]

(1)

where \( x_t \in \mathbb{R}^n\) and \( u_t \in \mathbb{R}^m\) are state and input at time \( t\in\mathbb{N}\) . The system is affected by an additive disturbance \( d_t \in \mathbb{R}^n\) and an unknown constant parameter \( \theta=\theta^* \in \mathbb{R}^p\) .
We consider mixed state and input constraints of the form

\[ \begin{equation} (x_t, u_t) \in \mathcal{Z} ~ \forall t\in\mathbb{N}, \end{equation} \]

(2)

where \( \mathcal{Z}= \{(x,u) \in \mathbb{R}^{n+m} \mid F_j x + G_j u \leq 1, \, j = 1, \dots, q\}\) is a compact set. Moreover, the uncertainties and the disturbance satisfy the following assumption.

Assumption 1

The disturbance \( d_t\) lies in a compact, polytopic set \( \mathbb{D}=\{d \in \mathbb{R}^n \mid H_d d \leq h_d \}\) for all time \( t\in\mathbb{N}\) .
The uncertain parameter \( \theta \in \mathbb{R}^p\) enters affinely in the system matrices

\[ \begin{equation} A_\theta = A_0 + \sum_{i=1}^p [\theta]_i A_i, ~ B_\theta = B_0 + \sum_{i=1}^p [\theta]_i B_i. \end{equation} \]

(3)
The parameter \( \theta\) is bounded in a prior known convex hypercube set \( \Theta_0^{\text{HC}} \in \bar{\theta}_0 \oplus \eta_0 \mathbb{B}_p\) with scaling factor \( \eta_0\geq 0\) and center point \( \bar{\theta}_0\) .

Throughout this paper, we focus on robustly stabilizing the origin. We consider an iterative scenario and assume we start from the same initial state \( x_s\) at each iteration. Each successful steering of the system from \( x_s\) to the origin is considered a successful iteration, denoted by \( h\in\mathbb{N}\) . Our goal is to design a controller that improves the control performance over iterations while robustly ensuring safety. Conceptually, we consider the following infinite horizon control problem:

\[ \begin{align} &\min_{u^h_{t}}\sum_{t=0}^\infty\max_{x\in \mathbb{X}_t^h}\ell(x,u_t^h(x)) \end{align} \]

(4)

\[ \begin{align} &\text{subject to:} \\ &~\mathbb{X} ^h_{0}\ni x^h_0=x_s,\nonumber\\ &~\mathbb{X} ^h_{t+1} \ni A_\theta x + B_\theta u^h_{t}(x)+d \forall x \in \mathbb{X} ^h_{t}, \, d \in \mathbb{D}, \, \theta \in \Theta^{\text{HC},h},\nonumber \\ &~(x,u^h_{t}(x)) \in \mathcal{Z} ~ \forall x \in \mathbb{X} ^h_{t},\nonumber\\ &~t\in\mathbb{N} \nonumber \\\end{align} \]

where \( \mathbb{X} ^h_{t}\) is a tube that outer bounds all possible state trajectories, \( \Theta^{\text{HC},h} \subseteq \Theta_0^{\text{HC}}\) is the hypercubic parameter set at the \( h\) -iteration, and \( u^h_{t}(x)\) is some feedback policy. We consider a quadratic stage cost \( \ell(x, u) = \|x\|_Q^2 + \|u\|_R^2\) where Q and R are positive definite. Since problem (4) is not tractable, we present a robust adaptive learning MPC scheme that approximates the above problem in the next section.

3 Robust Adaptive Learning MPC

In Section 3.1, we introduce the framework for parameter adaptation. This is followed by the introduction of the polytopic tube based on the previous work of [12] in Section 3.2 and the worst-case stage cost in Section 3.3. The robust adaptive learning MPC algorithm is presented in Section 3.4. Section 3.5 deals with learning terminal conditions, which is the main theoretical contribution of this paper. Finally, Section 3.6 presents the offline and online computation algorithms.

3.1 Set membership estimation

Set membership estimation allows to reduce the size of the feasible parameter set \( \theta^*\in \Theta_t^{\text{HC}}\subseteq\Theta_0^{\text{HC}}\) by using past data. Since the controller must satisfy the constraints for all \( \theta \in \Theta_t^{\text{HC}}\) , reducing the size of \( \Theta_t^{\text{HC}}\) leads to a decrease in conservatism, indicated by reduced tube size. We define

\[ \begin{equation} D(x, u) = [A_1 x + B_1 u, \dots, A_p x + B_p u] \in \mathbb{R}^{n \times p}, \end{equation} \]

(5)

and the non-falsified parameter set

\[ \begin{equation} \Delta_t := \{\theta \in \mathbb{R}^p \mid x_{t} - A_\theta x_{t-1} - B_\theta u_{t-1} \in \mathbb{D}\}, \end{equation} \]

(6)

which is the polytopic set that contains all parameters that are feasible with the data point \( (x_{t-1},u_{t-1},x_{t})\) . In a moving horizon fashion, Algorithm 1 from [12] uses the past \( M\in\mathbb{N}\) non-falsified parameter sets and \( \Theta_{t-1}^{\text{HC}}\) to compute a smaller overapproximating hypercube \( \Theta_{t}^{\text{HC}}\) .

Algorithm

Input: \( \{\Delta_k\}_{k=t,\dots,t-M-1},\, \Theta_{t-1}^{\text{HC}}.\) Output: \( \Theta_t^{\text{HC}}=\bar{\theta}_t \oplus \eta_t \mathbb{B}_p\)

Algorithm 1 Moving window hypercube update [12, Alg. 1]

1.Define polytope \( \Theta_t^M := \Theta_{t-1}^{\textrm{HC}} \bigcap_{k=t-M-1}^{t} \Delta_k\)

2.Solve \( 2p\) LPs (\( i=1,…,p\) ):

3. \( \theta_{i,t,\min} := \min_{\theta \in \Theta_t^M} e_i^\top \theta\) , \( \theta_{i,t,\max} := \max_{\theta \in \Theta_t^M} e_i^\top \theta\)

4. with unit vector \( e_i = [0, \dots, 1, \dots, 0] \in \mathbb{R}^p, [e_i]_i = 1.\)

5.Set \( [\bar{\theta}_t]_i = 0.5(\theta_{i,t,\min} + \theta_{i,t,\max})\)

6.Set \( \eta_t = 0.5 \max_i (\theta_{i,t,\max} - \theta_{i,t,\min})\)

7.Project: \( \bar{\theta}_t\) on \( \bar{\theta}_{t-1} \oplus (\eta_{t-1} - \eta_t) \mathbb{B}_p\)

In contrast to the update rule \( \Theta_t := \Theta_{t-1}\cup \Delta_t\) , which can lead to an infinite number of non-redundant half-spaces, Algorithm 1 provides a fixed complexity description of \( \Theta_{t}^{\text{HC}}\) using a hypercube. Moreover, this fixed parameterization ensures that the parameter changes satisfies

\[ \begin{equation} \bar{\theta}_t - \bar{\theta}_{t-1} \in (\eta_{t-1} - \eta_t) \mathbb{B}_p, \end{equation} \]

(7)

where \( (\eta_{t-1} - \eta_t)\geq0\) . This is one of the key properties used in the analysis in Section 4. In addition, it allows us to state the following lemma.

Lemma 1

Let Assumption 1 hold and consider Algorithm 1. The recursively updated sets \( \Theta_t^{\text{HC}}\) contain the true parameters \( \theta^*\) and satisfy \( \Theta_t^{\text{HC}} \subseteq \Theta_{t-1}^{\text{HC}}\) \( \forall t\in\mathbb{N}\) .

More details on Algorithm 1 and the proof of Lemma 1 can be found in [12, Sec. 2.B]

3.2 Polytopic tube

Tube-based MPC bounds all possible realizations of \( d_t\in \mathbb{D}\) and \( \theta \in \Theta_t^{\text{HC}}\) in a tube \( \mathbb{X}\) around the nominal predicted trajectory. Thus, it describes the propagation of uncertainty and is crucial for ensuring safety. To reduce online computation, a fixed offline computed polytope

\[ \begin{equation} \mathcal{P} = \{x \in \mathbb{R}^n \mid H_i x \leq 1, \, i=1, \dots, r\} \end{equation} \]

(8)

is used to parametrize the tube \( \mathbb{X}\) . In particular, the tube is given by this fixed polytope scaled online by a scalar (dilation) \( s\geq 0\) and centered by a nominal prediction \( z\) , i.e. \( \mathbb{X}=z\oplus s \mathcal{P}\) .
To reduce conservatism, we use a pre-stabilizing feedback \( Kx\) that satisfies the following assumption.

Assumption 2

There exist a feedback matrix \( K \in \mathbb{R}^{m \times n}\) and a positive definite matrix \( P\) , such that \( A_{\text{cl},\theta} := A_{\theta} + B_{\theta}K\) is quadratically stable [20, Def. 1] and satisfies

\[ A_{\text{cl},\theta}^{\top} P A_{\text{cl},\theta} + Q + K^{\top} R K \preceq P, ~ \forall \theta \in \Theta_0^{\text{HC}}, \]

with the prior parameter set \( \Theta_0^{\text{HC}}\) from Assumption 1.

The technical properties of tube propagation can be found in [12] and are summarized by the following lemma.

Lemma 2

Let Assumptions 1 and 2 hold. Define the following constants:

\[ \begin{align} \rho_{\bar{\theta}} &\coloneq \underset{i}{\max} ~\underset{x \in \mathcal{P}}{\max} H_i A_{\text{cl},\bar{\theta}} x, \end{align} \]

(9)

\[ \begin{align} L_{\mathbb{B}} &\coloneq \underset{i,l}{\max} ~\underset{x \in \mathcal{P}}{\max} H_i D(x, Kx) \tilde{e}_l, \end{align} \]

(10)

\[ \begin{align} \bar{d} &\coloneq \underset{i}{\max} ~\underset{d \in \mathbb{D}}{\max} H_i d , \end{align} \]

(11)

where \( \tilde{e}_l\in\mathbb{R}^p\) denotes the \( 2^p\) vertices of the unit hypercube \( \mathbb{B}_p\) . Recalling that the effect of the model parameters \( \theta\) on the dynamics are given by \( D(x,u)(\theta^*-\bar{\theta})\) with \( D(x,u)\) according to (5), we define the function:

\[ \begin{align} w_{\eta}(z,v) \coloneq \eta \underset{i,l}{\max} H_i D(z, v) \tilde{e}_l. \end{align} \]

(12)

Then, for any \( z \in \mathbb{R}^n \) , \( v \in \mathbb{R}^m\), and \( \Theta^{\text{HC}} = \bar{\theta} \oplus \eta \mathbb{B}_p \) with \( \bar{\theta} \in \mathbb{R}^p \),\( \eta \geq 0 \), and for any \( x \in z \oplus s \mathcal{P} \) with \( s \geq 0 \), it holds that

\[ x^+ \in z^+ \oplus s^+ \mathcal{P} \]

with:

\[ \begin{align} z^+ &= A_{\text{cl},\bar{\theta}}z + B_{\bar{\theta}}v, \end{align} \]

(13)

\[ \begin{align} s^+ &=(\rho_{\bar{\theta}} + \eta L_{\mathbb{B}}) s + \bar{d} + w_{\eta}(z,v), \end{align} \]

(14)

\[ \begin{align} x^+ &= A_{\text{cl},\theta} x + B_{\theta} v + d, ~ \forall \theta \in \Theta^{\text{HC}}, ~ d \in \mathbb{D}. \end{align} \]

(15)

The lemma states that the state \( x^+\) described by the dynamic (15) is contained in the tube \( s^+\mathcal{P}\) around the nominal state \( z^+\) with the dynamic (13), where \( s^+\) follows the scalar tube dynamic (14). The following assumption ensures that the tube propagation from Lemma 2 yields a bounded scaling \( s\) , which is related to the choice of polytope \( \mathcal{P}\) .

Assumption 3

The polytope \( \mathcal{P}\) is chosen such that

\[ \begin{equation} 1 > \rho_{\bar{\theta}_0} + \eta_0 L_{\mathbb{B}}. \end{equation} \]

(16)

Finally, recall the mixed constraints \( F_j x + G_j u \leq 1\) , the following constants will be useful later in this paper:

\[ \begin{equation} c_j:= \underset{x \in \mathcal{P}}{\max} [F+GK]_j x, ~ j=1, \dots, q. \end{equation} \]

(17)

3.3 Worst-case stage cost

We define the worst-case stage cost as

\[ \begin{equation} \ell_{\text{max}}(x,v,s)=\ell(x,Kx+v)+L_{\text{cost}}s, \end{equation} \]

(18)

where \( L_{\text{cost}}\geq 0\) is computed such that \( \forall \,(x,Kx+v) \in \mathcal{Z}\) and \( (z,Kz+v) \in \mathcal{Z}\) with \( {s=\max_{i}H_i(x-z)}\) :

\[ \begin{equation} \ell(z,Kz+v)\leq\ell(x,Kx+v)+L_{\text{cost}}s. \end{equation} \]

(19)

Since the quadratic cost \( \ell\) is Lipschitz continuous on the compact set \( \mathcal{Z}\) , the constant \( L_{\text{cost}}\) exists and can be computed analogously to a Lipschitz constant. This cost satisfies the following monotonicity property

\[ \begin{equation} \ell_{\text{max}}(x,v,s)\leq \ell_{\text{max}}(\tilde{x},v,\tilde{s}) \end{equation} \]

(20)

for any \( x\) , \( s\) , \( \tilde{x}\) , \( \tilde{s}\) satisfying \( x \oplus s \mathcal{P} \subseteq \tilde{x} \oplus \tilde{s} \mathcal{P}\) . Moreover, we define the steady-state cost for \( x_{\text{steady}}=0\) and \( v_{\text{steady}}=0\) as

\[ \begin{equation} \ell_{\text{max,s}}=L_{\text{cost}}s_{\text{steady}}, \end{equation} \]

(21)

with \( s_{\text{steady}}=1/(1-(\rho_{\bar{\theta}_0}+\eta_0 L_{\mathbb{B}}))\bar{d}\) .

3.4 Robust adaptive learning MPC scheme

At each time \( t\) of the \( h\) -iteration, given the state \( x_t^h\) , the current hypercube \( \Theta_t^{\text{HC},h}=\bar{\theta}_t^h\oplus\eta_t^h \mathbb{B}_p\) , the constants \( \rho_{\bar{\theta}_t^h}\) , \( L_{\mathbb{B}}\) , \( \bar{d}\) , \( K\) and the function \( w_{\eta_t^h}(z,v)\) , the proposed robust adaptive learning MPC scheme is given by:

\[ \begin{align} &\min_{v_{.|t}^h,\lambda_{t}^h} \sum_{k=0}^{N-1} \ell_{\text{max}}(x_{k|t}^h,v_{k|t}^h,s_{k|t}^h) + Q^{h-1}(\lambda_{t}^h) \notag\\ &\text{subject to} \notag\\ & ~x_{0|t}^h = x_t^h, ~ s_{0|t}^h = 0, \end{align} \]

(22.a)

\[ \begin{align} & ~x_{k+1|t}^h = A_{\text{cl},\bar{\theta}_{t}^h} x_{k|t}^h + B_{\bar{\theta}_{t}^h} v_{k|t}^h, \end{align} \]

(22.b)

\[ \begin{align} & ~s_{k+1|t}^h = \rho_{\bar{\theta}_{t}^h} s_{k|t}^h + w_{k|t}^h, \end{align} \]

(22.c)

\[ \begin{align} & ~w_{k|t}^h=\bar{d}+\eta_{t}^hL_{\mathbb{B}}s_{k|t}^h+w_{\eta_{t}^h}(x_{k|t}^h,u_{k|t}^h), \end{align} \]

(22.d)

\[ \begin{align} & ~F_j x_{k|t}^h + G_j u_{k|t}^h + c_j s_{k|t}^h \leq 1, \end{align} \]

(22.e)

\[ \begin{align} & ~u_{k|t}^h = v_{k|t}^h + Kx_{k|t}^h, \end{align} \]

(22.f)

\[ \begin{align} & ~(x_{N|t}^h, s_{N|t}^h,\lambda_{t}^h) \in \mathcal{CS}_{\text{robust}}^{h-1}, \end{align} \]

(22.g)

\[ \begin{align} & ~j=1,\dots,q. \\\end{align} \]

(22.h)

The scalar dynamic in Equations (22.c)–(22.d) describe the propagation of the tube from Lemma 2. Based on \( x^h_{k|t}\) , \( s^h_{k|t}\) , and \( v_{k|t}^h\) , the cost function minimizes an upper bound on the stage cost. The tightened constraints (22.e) with \( c_j\) in (17) ensure that all trajectories in \( \mathbb{X}_{k|t}\) lie in the constraints.

The terminal cost \( Q^{h-1}(\lambda_{t}^h) \) and the terminal set \( \mathcal{CS}_{\text{Robust}}^{h-1}\) are constructed at each iteration using data from previous iterations as specified in the following section. The optimal solution is indicated by \( v_{k|t}^{h,*}\) , \( w_{k|t}^{h,*}\) , \( \lambda_{t}^{h,*}\) , \( x_{k|t}^{h,*}\) , \( s_{k|t}^{h,*}\) , \( u_{k|t}^{h,*}\) and the optimal cost function is \( J^{\text{LMPC},h,*}_t\) . Moreover, the closed-loop input is \( u_t^h=Kx_t^h+v_{0|t}^{h,*}\) .

3.5 Terminal conditions

The terminal condition (22.g) utilizes past prediction trajectories to mitigate the limitation of the finite prediction horizon \( N\) . To construct this set, we consider the following assumption.

Assumption 4

We have access to a finite-horizon trajectory \( [x_s,x_1^0,\dots,xy_{\bar{N}}^0]\) , \( [v_0^0,\dots,v_{\bar{N}-1}^0]\) and \( [0,s_1^0,\dots,s_{\bar{N}}^0]\) , which satisfies (22.b)–(22.e) for \( k=0,\dots,\bar{N}\) ,

\[ \begin{equation} x_{\bar{N}}^0=x_{\text{steady}}=0,~ s_{\bar{N}}^0=s_{\text{steady}}=\frac{1}{1-(\rho_{\bar{\theta}_0}+\eta_0L_{\mathbb{B}})}\bar{d}, \end{equation} \]

(23)

with \( x_k^0=0\) , \( s_k^0=s_{\text{steady}}\) , \( v_k^0=0\) for \( k\geq \bar{N}\)

The terminal constraint \( \mathcal{CS}^h_{\text{robust}}\) is constructed based on data to be a robust control invariant set, compare also [19]. By adding \( (x_{k|t}^h,s_{k|t}^h)\) for all \( k=1,\dots, N\) at time \( t\) of the \( h\) -iteration to the set, we can enlarge the set at runtime and therefore we enlarge the solution space which enables learning. In the iterative learning MPC [17], a terminal set is constructed with measured closed-loop trajectories, assuming no model mismatch. Instead, we utilize the tube prediction of (22.b) and (22.c). We define the sample set at \( h\) -iteration as

\[ \begin{equation} \mathcal{SS}^h = \left\{ \bigcup_{\hat{h}=0}^h \bigcup_{\hat{t}=0}^{\infty} \bigcup_{\hat{k}=0}^{N-1} (x_{\hat{k}|\hat{t}}^{\hat{h}}, s_{\hat{k}|\hat{t}}^{\hat{h}}) \right\}, \end{equation} \]

(24)

and the convex sample set

\[ \begin{align} \mathcal{CS}^h=& \left\{ (z,\lambda)\in\mathbb{R}^{n+1+|\mathcal{SS}^h|} \;\middle|\; \right.\\ &~ \left. z=\sum_{i=1}^{|\mathcal{SS}^h|} \lambda_iz_i, \lambda_i\geq0,\sum_{i=1}^{|\mathcal{SS}^h|} \lambda_i=1, z_i \in \mathcal{SS}^h ~\right\},\notag \\\end{align} \]

(25)

where Assumption 4 ensures that both sets are non-empty at \( h=0\) . The robust convex safe set

\[ \begin{equation} \begin{split} \mathcal{CS}^h_{\text{robust}} =& \left\{ (x,s,\lambda) \in \mathbb{R}^{n+1+|\mathcal{SS}^h|}\;\middle|\; \exists \; (x', s',\lambda) \in \mathcal{CS}^h\; \text{s.t.} \right.\\ &~ \left. \underset{i}{\max} H_i(x - x') \leq s'-s \; \right\}, \end{split} \end{equation} \]

(26)

is constructed such that \( \mathbb{X} = \{z | H_i (z - x) \leq s\} \subseteq \mathbb{X}^{\prime}= \{z | H_i (z - x^{\prime}) \leq s^{\prime}\}\) . Given the initial trajectory, we define the initial terminal cost

\[ \begin{equation} \begin{split} Q^{0}(\lambda)&= \sum_{\hat{k}=0}^{\infty} \lambda_{\hat{k}|0}^{0} \mathcal{J}^{0}_{\text{wc},\hat{k}|0}\\ \text{with}& ~\mathcal{J}^{0}_{\text{wc},\hat{k}|0}=\sum_{k=\hat{k}}^{\bar{N}-1}\ell_{\text{max}}(x^{0}_{k},v^{0}_{k},s^{0}_{k})-\ell_{\text{max,s}},\\ &~\lambda=[\dots,\lambda_{\hat{k}|0}^{0},\dots] \end{split} \end{equation} \]

(27)

where \( \lambda\in \mathbb{R}^{|\mathcal{SS}^0|}\) and \( \mathcal{J}^{0}_{\text{wc},\hat{k}|0}=0\) for \( k\geq \bar{N}\) . Furthermore, we define the worst-case cost to go for each trajectory in the set (24) recursively as

\[ \begin{equation} \begin{split} \mathcal{J}^{\hat{h}}_{\text{wc},N|\hat{t}} =& Q^{\hat{h}-1}(\lambda^{\hat{h}}_{\hat{t}}),\\ \mathcal{J}^{\hat{h}}_{\text{wc},\hat{k}|\hat{t}} =& \ell_{\text{max}}(x^{\hat{h}}_{\hat{k}|\hat{t}},v^{\hat{h}}_{\hat{k}|\hat{t}},s^{\hat{h}}_{\hat{k}|\hat{t}})-\ell_{\text{max,s}} + \mathcal{J}^{\hat{h}}_{\text{wc},\hat{k}+1|\hat{t}}, \end{split} \end{equation} \]

(28)

for \( \hat{k}=1,\dots,N-1\) . The terminal cost is the convex combination of the worst-case cost to go

\[ \begin{equation} \begin{split} Q^h(\lambda)&= \sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} \mathcal{J}^{\hat{h}}_{\text{wc},\hat{k}|\hat{t}}\\ \text{s.t} & ~ \lambda=[\dots,\lambda_{\hat{k}|\hat{t}}^{\hat{h}},\dots] \\ \end{split} \end{equation} \]

(29)

Remark 1

Given \( Q(\lambda)\) and \( \mathcal{CS}^h_{\text{robust}}\) , we can define the optimal cost to go as \( Q^{h,\star}(x,s)=\min_{(x,s,\lambda)\in\mathcal{CS}^h_{\text{robust}}}Q(\lambda)\) . This cost shows a more direct dependence on the terminal state \( (x_{N|t}^h, s_{N|t}^h)\) , similar to the notation in [16]. However, this notation would make some of the technical arguments in the following analysis more cumbersome, which is why we use both the cost \( Q^h\) and the set \( \mathcal{CS}^h_{\text{robust}}\) separately in the following analysis.

3.6 Offline and Online Algorithms

The computation is divided into an offline part and an online part. Complex computations are, as far as possible, outsourced to the offline computation in Algorithm 2. In the online algorithm 3, the convex quadratic program (22) is solved at each time \( t\) . The obtained online data is then used to refine \( \Theta^{\text{HC},h}\) and expand \( \mathcal{SS}^h\) .

Algorithm 2 Robust adaptive learning MPC - Offline

1.Compute feedback K (Assumption 2).

2.Design the polytope \( \mathcal{P}\) (8).

3.Compute \( \rho_{\bar{\theta}_0}\) , \( L_{\mathbb{B}}\) , \( \bar{d}\) , \( c_j\) (9), (10), (11), (17).

4.Compute an initial solution (Assumption 4) and construct \( \mathcal{CS}_{\textrm{Robust}}^{0} \) (26) and \( Q^{0}\) (27).

Algorithm 3 Robust adaptive learning MPC - Online

1.For each iteration h, starting at \( x_0^h=x_s\) , \( \Theta^{\textrm{HC},h-1}\) do:

2.For each time \( t\) , given \( x_t^h\) do:

3.Update \( \Theta_t^{\textrm{HC},h}\) (Alg. 1), \( \rho_{\bar{\theta}_t^h}\) (9)

4.Solve MPC optimization problem (22)

5.Apply control input \( u_t^h=v_{0|t}^{h,*}+Kx_t^h\)

6.Update terminal condition (26), (29)

4 Theoretical Analysis

In this section, we show that the proposed MPC scheme is recursively feasible, ensures constraint satisfaction, and achieves a desirable closed-loop performance bound. First, we ensure robust control invariance of the learned terminal set \( \mathcal{CS}^h_{\text{robust}}\) :

Proposition 1

Suppose Assumptions 1, 2, 3, and 4 hold. Then, the set \( \mathcal{CS}^h_{\text{robust}}\) , as defined in (26), is a robust control invariant set subject to the constraints (2), i.e., for all \( (x, s,\lambda) \in \mathcal{CS}^h_{\text{robust}}\) , \( h\in\mathbb{N}\) and any \( (\bar{\theta},\eta)\) with \( \bar{\theta}\oplus\eta \mathbb{B}_p\subseteq \Theta_0^{\text{HC}}\) , there exists a control input \( v\in \mathbb{R}^m\) and \( \lambda^+\in \mathbb{R}^{|\mathcal{SS}^h|}\) such that

\[ \begin{align} \big(x^+, s^+ ,\lambda^+\big) &\in \mathcal{CS}^{h}_{\text{robust}}, \\ \big(x , Kx+v \big) &\in \mathcal{Z},\\ Q^h(\lambda^+)\leq &Q^h(\lambda)-(\ell_{\mathrm{max}}(x,v,s)-\ell_{\mathrm{max,s}}). \end{align} \]

(30)

where \( x^+=A_{\text{cl},\bar{\theta}} x + B_{\bar{\theta}} v\) and \( s^+=(\rho_{\bar{\theta}}+\eta L_{\mathbb{B}}) s + \bar{d} + w_{\eta}(x, Kx+v)\) .

Proof

Part 1: Given that \( (x,s,\lambda) \in \mathcal{CS}^h_{\text{robust}}\) , there exists \( \lambda=[\dots, \lambda_{\hat{k}|\hat{t}}^{\hat{h}}, \dots]\) with \( \sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}}=1\) and \( \lambda_{\hat{k}|\hat{t}}^{\hat{h}}\geq0\) such that \( (x',v',s')=\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} (x_{\hat{k}|\hat{t}}^{\hat{h}},v_{\hat{k}|\hat{t}}^{\hat{h}},s_{\hat{k}|\hat{t}}^{\hat{h}})\) satisfies \( \underset{i}{\max} H_i(x-x')\leq s'-s\) . We denote \( ({\bar{t}},{\bar{h}})=\arg\min_{\hat{t},\hat{h}} \eta_{\hat{t}}^{\hat{h}}\) . Recall that \( (x_{N|\hat{t}}^{\hat{h}},s_{N|\hat{t}}^{\hat{h}},\lambda_{\hat{t}}^{\hat{h}})\in\mathcal{CS}^{\hat{h}-1}_{\text{robust}}\) , i.e., there exists a \( (\tilde{x}_{\hat{t}}^{\hat{h}},\tilde{s}_{\hat{t}}^{\hat{h}},\lambda_{\hat{t}}^{\hat{h}})\in\mathcal{CS}^{\hat{h}-1}\) such that \( \max_iH_i(x_{N|\hat{t}}^{\hat{h}}-\tilde{x}_{\hat{t}}^{\hat{h}})\leq\tilde{s}_{\hat{t}}^{\hat{h}}-s_{N|\hat{t}}^{\hat{h}}\) for all \( \hat{h}<h\) . Let us consider

\[ \begin{align} \notag &x'^{+}:=\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-2} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} x_{\hat{k}+1|\hat{t}}^{\hat{h}}+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} \tilde{x}_{\hat{t}}^{\hat{h}} \end{align} \]

(31)

\[ \begin{align} \notag =&\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} x_{\hat{k}+1|\hat{t}}^{\hat{h}}-\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (x_{N|\hat{t}}^{\hat{h}}-\tilde{x}_{\hat{t}}^{\hat{h}})\\ \notag \stackrel{~(22.b)}{=}& A_{\text{cl},\bar{\theta}} x' + B_{\bar{\theta}} v' - D(x',u') \Delta \bar{\theta}_{\bar{t}}^{\bar{h}} \\ &-\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \left( \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} D(x_{\hat{k}|\hat{t}}^{\hat{h}},u_{\hat{k}|\hat{t}}^{\hat{h}}) \Delta\bar{\theta}_{\hat{t}}^{\hat{h}}\right. \\ \notag &\left. +\lambda_{N-1|\hat{t}}^{\hat{h}} (x_{N|\hat{t}}^{\hat{h}}-\tilde{x}_{\hat{t}}^{\hat{h}})\right), \notag \\\end{align} \]

(32)

where \( \Delta \bar{\theta}_{\bar{t}}^{\bar{h}} = \bar{\theta} - \bar{\theta}_{\bar{t}}^{\bar{h}}\) , \( \Delta \bar{\theta}_{\hat{t}}^{\hat{h}} = \bar{\theta}_{\bar{t}}^{\bar{h}} - \bar{\theta}_{\hat{t}}^{\hat{h}}\) and \( u'=Kx'+v'\) .

Since it holds that \( \theta_{\bar{t}}^{\bar{h}}-\theta_{\hat{t}}^{\hat{h}} \in \Delta\eta_{\hat{t}}^{\hat{h}} \mathbb{B}_p\) with \( \Delta\eta_{\hat{t}}^{\hat{h}}=\eta_{\hat{t}}^{\hat{h}} - \eta_{\bar{t}}^{\bar{h}}\) , we can lower bound

\[ \begin{align} &s'^+ \end{align} \]

(33)

\[ \begin{align} ~\notag :=&\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-2} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} s_{\hat{k}+1|\hat{t}}^{\hat{h}}+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} \tilde{s}_{\hat{t}}^{\hat{h}}\\ \notag =&\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} s_{\hat{k}+1|\hat{t}}^{\hat{h}}-\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (s_{N|\hat{t}}^{\hat{h}}-\tilde{s}_{\hat{t}}^{\hat{h}})\\ \notag =&\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} \left[(\rho_{\bar{\theta}_{\hat{t}}^{\hat{h}}} +\eta_{\hat{k}|\hat{t}}^{\hat{h}} L_{\mathbb{B}})s + \bar{d} + w_{\eta}(x_{\hat{k}|\hat{t}}^{\hat{h}}, u_{\hat{k}|\hat{t}}^{\hat{h}})\right]\\ \notag &-\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (s_{N|\hat{t}}^{\hat{h}}-\tilde{s}_{\hat{t}}^{\hat{h}})\\ \notag \geq&\rho_{\bar{\theta}_{\bar{t}}^{\bar{h}}} s'+w'+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} w_{\Delta\eta_{\hat{t}}^{\hat{h}}}(x_{\hat{k}|\hat{t}}^{\hat{h}},u_{\hat{k}|\hat{t}}^{\hat{h}})\\ \notag &+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (\tilde{s}_{\hat{t}}^{\hat{h}}-s_{N|\hat{t}}^{\hat{h}}),\notag \\\end{align} \]

(34)

by using the inequality \( \rho_{\theta_{\bar{t}}^{\bar{h}}}-\Delta\eta_{\hat{t}}^{\hat{h}}L_{\mathbb{B}}\leq\rho_{\theta_{\hat{t}}^{\hat{h}}}\) [12, Prop. 1] and \( w'=\bar{d}+\eta_{\bar{t}}^{\bar{h}} L_{\mathbb{B}} s'+w_{\eta_{\bar{t}}^{\bar{h}}}(x',u')\) . Note that \( (\tilde{x}_{\hat{t}}^{\hat{h}},\tilde{s}^{\hat{h}}_{\hat{t}},\tilde{\lambda}^{\hat{h}}_{\hat{t}})\in\mathcal{CS}^{\hat{h}-1}\) and \( (x_{\hat{k}+1|\hat{t}}^{\hat{h}},s_{\hat{k}+1|\hat{t}}^{\hat{h}})\in \mathcal{SS}^{h}\) , \( k=0,\dots,N-2\) ensures that there exists a \( \lambda^+\) such that \( (x^{'+},s^{'+},\lambda^+)\in\mathcal{CS}^h\) . Additionally, we define the auxiliary tube dynamic

\[ \begin{equation} \begin{split} \tilde{s}'^+=&\rho_{\bar{\theta}}\tilde{s}'+w_{\Delta \eta_{\bar{t}}^{\bar{h}}}(x',u')\\ &+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} w_{\Delta \eta_{\hat{t}}^{\hat{h}}}(x_{\hat{k}|\hat{t}}^{\hat{h}},u_{\hat{k}|\hat{t}}^{\hat{h}})\\ &+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (\tilde{s}_{\hat{t}}^{\hat{h}}-s_{N|\hat{t}}^{\hat{h}}), \end{split} \end{equation} \]

(35)

with \( \tilde{s}'=s'-s\) . Next, we show that the auxiliary tube \( \tilde{s}'\cdot\mathcal{P}\) bounds the error between \( x'^+\) and \( x^+=A_{\text{cl},\bar{\theta}} x + B_{\bar{\theta}} v'\) , i.e., \( x^+-x'^+=e^+\in \tilde{s}'^+\cdot\mathcal{P}\) . Using \( \underset{i}{\max} H_i(x-x')\leq s'-s=\tilde{s}'\) , it holds that

\[ \begin{equation} \begin{split} &\underset{i}{\max}H_i(x^+-x'^+)\\ \stackrel{~(31)}{=}&\underset{i}{\max} H_i\left(A_{\text{cl},\bar{\theta}}(x-x')+ D(x',u') \Delta \bar{\theta} _{\bar{t}}^{\bar{h}}\right.\\ &+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} D(x_{\hat{k}|\hat{t}}^{\hat{h}},u_{\hat{k}|\hat{t}}^{\hat{h}}) \Delta \bar{\theta}_{\hat{t}}^{\hat{h}}\\ &\left. +\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (x_{N|\hat{t}}^{\hat{h}}-\tilde{x}_{\hat{t}}^{\hat{h}})\right)\\ \leq&\rho_{\bar{\theta}}\tilde{s}'+w_{\Delta \eta_{\bar{t}}^{\bar{h}}}(\bar{x}',\bar{u}')\\ &+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} w_{\Delta \eta_{\hat{t}}^{\hat{h}}}(x_{\hat{k}|\hat{t}}^{\hat{h}},u_{\hat{k}|\hat{t}}^{\hat{h}})\\ &+\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \lambda_{N-1|\hat{t}}^{\hat{h}} (\tilde{s}_{\hat{t}}^{\hat{h}}-s_{N|\hat{t}}^{\hat{h}})=s'^+. \end{split} \end{equation} \]

(36)

where the inequality uses (9), (12), (31), (33). Thus, \( (x^+,s^+,\lambda^+)\in \mathcal{CS}^h_{\text{robust}}\) reduces to \( s^++\tilde{s}'^+-s'^+\leq0\) with \( s^+=\rho_{\bar{\theta}} s + w\) . Similar to proof of [12, Th. 1], this can be shown by using (22.c), (33) and (35). Hence, we showed that \( (x^+,s^+,\lambda^+) \in \mathcal{CS}^h_{\text{robust}}\) for \( \lambda^+\) and \( v=v'\) . Finally, the constraint satisfaction (2) follows with

\[ \begin{equation} \begin{split} &F_j x + G_j u + c_j s\leq F_j x' + G_j u'+c_j s'\\ &=\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}}(F_j x_{\hat{k}|\hat{t}}^{\hat{h}} + G_j u_{\hat{k}|\hat{t}}^{\hat{h}} + c_j s_{\hat{k}|\hat{t}}^{\hat{h}} )\leq 0, \end{split} \end{equation} \]

(37)

using the fact that all data points satisfy the tightened constraints (22.e).
Part 2: Using the definition (29), we get

\[ \begin{equation} \begin{split} &Q^h(\lambda^+)-Q^h(\lambda)\\ =& \sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}} \big( \mathcal{J}^{\hat{h}}_{\text{wc},k+1|\hat{t}} \\ &- \left(\ell_{\text{max}}(z_{\hat{k}|\hat{t}}^{\hat{h}},v_{\hat{k}|\hat{t}}^{\hat{h}},s_{\hat{k}|\hat{t}}^{\hat{h}})-\ell_{\text{max,s}}\right)-\mathcal{J}^{\hat{h}}_{\text{wc},k+1|\hat{t}} \big) \\ =&\sum_{\hat{h}=0}^{h} \sum_{\hat{t}=0}^{\infty} \sum_{\hat{k}=0}^{N-1} \lambda_{\hat{k}|\hat{t}}^{\hat{h}}(-\ell_{\text{max}}(z_{\hat{k}|\hat{t}}^{\hat{h}},v_{\hat{k}|\hat{t}}^{\hat{h}},s_{\hat{k}|\hat{t}}^{\hat{h}})+\ell_{\text{max,s}})\\ \leq&-\ell_{\text{max}}(x',v,s')+\ell_{\text{max,s}}, \end{split} \end{equation} \]

(38)

where \( \ell_{\text{max}}(x',v,s')\) is a lower bound on the convex combination. Moreover, using (20) we get \( -\ell_{\text{max}}(x',v,s')\leq-\ell_{\text{max}}(x,v,s)\) for all\( (x,s)\in \mathbb{R}^{n+1}\) with \( x \oplus s \mathcal{P} \subseteq x' \oplus s' \mathcal{P}=\mathbb{X}'\) .

The following theorem utilizes the properties of the learned terminal set and cost (Prop. 1) to establish the closed-loop properties of the proposed MPC scheme.

Theorem 1

Suppose Assumption 1, 2, 3, and 4 hold. Then Problem (15) is feasible for all \( h\in\mathbb{N}\) , \( t\in\mathbb{N}\) for the closed-loop system with \( u_t^h=v_{0|t}^{h,*}+Kx_t^h\) and the constraints (2) are satisfied. Moreover, the closed-loop cost satisfies

\[ \begin{equation} \limsup_{T\rightarrow\infty}\dfrac{1}{T}\sum_{t=0}^{T-1} \ell(x_t^h,u_t^h)\leq \ell_{\mathrm{max,s}}. \end{equation} \]

(39)

Proof

Part 1: Recursive feasibility is proved by induction. At \( t=0\) , the initial solution (Ass. 4) is a feasible solution to the problem (22). At time \( t+1\) , given a feasible solution to (22) at \( t\) , we consider the candidate inputs \( v_{k|t+1}^h=v_{k+1|t}^{h,*}\) and \( v_{N-1|t+1}^h=v\) according to Proposition 1. Furthermore, we define \( x_{N+1|t}^{h,*}\) , \( s_{N+1|t}^{h,*}\) with (22.b) and (22.c)using \( v_{N|t}^*=v_{N-1|t+1}^h\) . The error between the candidate solution \( x_{k|t+1}^h\) and the previous optimal solution \( {x}_{k+1|t}^{h,*}\) is described by the error dynamics

\[ \begin{equation} e_{k+1|t+1}^h = A_{\text{cl},\bar{\theta}_{t+1}^h} e_{k|t+1}^h + D(x_{k+1|t+1}^{h,*}, u_{k+1|t+1}^{h,*}) \Delta \bar{\theta}_t^h, \end{equation} \]

(40)

with \( \Delta \bar{\theta}_t^h = \bar{\theta}_{t+1}^h - \bar{\theta}_t^h\) and \( e_{0|t+1}^h=x^h_{t+1}-x_{1|t}^{h,*}\) . Additionally, we define the dynamic of the auxiliary tube

\[ \begin{equation} \tilde{s}_{k+1|t+1}^h = \rho_{\bar{\theta}_{t+1}^h} \tilde{s}_{k|t+1}^h + w_{\Delta \eta_t^h} (x_{k+1|t}^{h,*}, u_{k+1|t}^{h,*}), \end{equation} \]

(41)

with \( \tilde{s}_{0|t+1}^h=s_{0|t+1}^h-s_{1|t}^{h,*}=w_{0|t}^{h,*}\) and \( \Delta \eta_t^h= \eta_t^h-\eta_{t+1}^h\) . The auxiliary tube dynamic is used to bound the error dynamic (40) between the candidate solution and the previous optimal solution, which leads to the condition \( x_{k|t+1}^h-x_{k+1|t}^{h,*}=:e_{k|t+1}^h \in \tilde{s}_{k|t+1}^h \cdot \mathcal{P}\) . This can be shown similarly to (36).

Next, we prove that the tube of the previous optimal solution contains the tube of the candidate solution, which results in the condition

\[ \begin{equation} s_{k+1|t+1}^h - s_{k+2|t}^{h,*} + \tilde{s}_{k+1|t+1}^h\leq 0. \end{equation} \]

(42)

Similar to proof of [12, Th. 1], this can be shown by using (22.c) and (41). Constraint satisfaction follows similarly to the proof of Proposition 1. The last step is to prove satisfaction of the terminal constraint (22.g). Recall that \( \underset{i}{\max} H_i (x_{N|t+1}^h-x_{N+1|t}^{h,*})\leq s_{N+1|t}^{h,*}-s_{N|t+1}^h\) . Using Proposition 1, \( (x_{N|t}^{h,*},s_{N|t}^{h,*},\lambda_{t}^{h,*})\in\mathcal{CS}^{h-1}_{\text{robust}}\) ensures that there exist a input \( v\) and a \( \lambda^+\) such that \( (x_{N+1|t}^{h,*},s_{N+1|t}^{h,*},\lambda^+)\in\mathcal{CS}^{h-1}_{\text{robust}}\) . Furthermore, the definition of \( \mathcal{CS}^{h-1}_{\text{robust}} \) (26) guarantees that there exist \( (x'^+,s'^+,\lambda^+)\in \mathcal{CS}^{h-1}\) such that \( \underset{i}{\max} H_i (x_{N+1|t}^{h,*} - x'^+) \leq s'^+-s_{N+1|t}^{h,*}\) . Combining the two inequalities results in

\[ \begin{equation} \underset{i}{\max} H_i (x_{N|t+1}^h - x'^+)\leq s'^+-s_{N|t+1}^h,\\ \end{equation} \]

(43)

which proves that \( (x_{N|t+1}^{h},s_{N|t+1}^{h},\lambda^+)\in\mathcal{CS}^h_{\text{robust}}\) . This completes the recursive feasibility proof.
Part 2: Using the candidate solution at time \( t+1\) from the recursive feasibility proof, we derive:

\[ \begin{equation} \begin{split} & J^{\text{LMPC},h,*}_{t+1}(x_{t+1}^h,\Theta_{t+1}^{\text{HC},h})-J^{\text{LMPC},h,*}_t(x_{t}^h,\Theta_t^{\text{HC},h})\\ \sum_{k=0}^{N-1} \ell_{\text{max}}(x_{k|t+1}^{h},u_{k|t+1}^{h},s_{k|t+1}^{h}) + Q^{h-1}(\lambda_{t+1}^{h})\\ &-\sum_{k=0}^{N-1} \ell_{\text{max}}(x_{k|t}^{h,*},u_{k|t}^{h,*},s_{k|t}^{h,*}) - Q^{h-1}(\lambda_{t}^{h,*}) \\ \leq &-\ell_{\text{max}}(x_{0|t}^{h,*},u_{0|t}^{h,*},s_{0|t}^{h,*})+ Q^{h-1}(\lambda_{t+1}^{h}) \\ &+ \ell_{\text{max}}(x_{N-1|t+1}^{h},u_{N-1|t+1}^{h},s_{N-1|t+1}^{h})- Q^{h-1}(\lambda_{t}^{h,*}), \end{split} \end{equation} \]

(44)

where the last inequality uses monotonicity (20). From Proposition 1, we further obtain:

\[ \begin{equation} \begin{split} & J^{\text{LMPC},h,*}_t(x_{t+1}^h,\Theta_{t+1}^{\text{HC},h})-J^{\text{LMPC},h,*}_t(x_{t}^h,\Theta_t^{\text{HC},h}) \\ \leq&-\ell(x_t^h,u_t^h)+\ell_{\text{max,s}} \end{split} \end{equation} \]

(45)

The average stage cost bound (39) directly follows by using the cost decay (45) in a telescopic sum and the fact that \( J_t^{\mathrm{LMPC},h,*}\) is bounded on the compact feasible set.

The decrease condition (45) of the optimal cost with the positive definiteness quadratic cost \( \ell\) mirrors the Lyapunov condition to show practical asymptotic stability of the closed-loop system [21, Th. 2.20]. Hence, given also suitable lower and upper bounds on the optimal cost \( J^{\text{LMPC},h,*}\) , Theorem 1 also implies practical asymptotic stability. However, a formal proof, including possible additional required technical conditions, is beyond the scope of this paper.

5 Simulation Results

The following example illustrates the performance and computational efficiency improvements of the proposed robust adaptive learning MPC scheme compared to robust adaptive MPC(RAMPC) [12] for an iterative task.
We consider a mass spring damper system

\[ \begin{equation} m \ddot{x}_1 = -c \dot{x}_1 - k x + u + d, \end{equation} \]

(46)

with the fixed mass \( m=1\) , the uncertain damper constant \( c \in [0.1, 0.3]\) and the spring constant \( k \in [0.5, 1.5]\) . Additionally, the system is affected by an additive disturbance \( |d_t| \leq 0.2\) . The unknown true values are denoted as \( c^* = 0.3\) and \( k^* = 0.5\) . Moreover, the moving window \( M\) is set to \( 10\) for all experiments. We use Euler discretization with a sampling time \( T_s=0.1\) . Transforming the second-order ODE into a state-space model yields the state \( x = (x_1, \dot{x}_1) \in \mathbb{R}^2\) . Then the constraints are \( \mathcal{Z} = [-0.2, 4.1] \times [-5, 5] \times [-15, 15]\) . The control goal is to iteratively steer the system from \( x_s=\begin{bmatrix}4 & 0\end{bmatrix}^T\) to the origin, while minimizing the quadratic stage cost with \( Q=\text{diag}(1,10^{-2})\) and \( R=10^{-1}\) .
The controller \( K\) is computed using linear matrix inequalities [12]. Furthermore, the polytope \( \mathcal{P}\) is the maximal \( \rho\) -contractive set for the constraints. The initial solution of the Assumption 4 for the convex safe set (26) is computed by solving Problem (22) with a finite horizon and robust positively invariant terminal set based on [12, Proposition 3]. In the terminal set, we append steps until we reach the steady state in Assumption 4.

First, we investigate how changing the horizon \( N\) affects the computation time and the resulting cost in comparison to RAMPC. To obtain a comparable result over the iterations, we set the disturbances \( d_t\) to a constant value \( d_t=0.1\) . Both algorithms are repeated over 20 iterations, where the set \( \Theta^{\text{HC}}\) is further reduced from iteration to iteration. We solve the problem using CasADi[22] and IPOPT, and the code is publicly available online . The RAMPC algorithm requires a horizon of at least \( N=22\) steps to be feasible and we set the horizon to \( N_{\text{RAMPC}}=25\) . Table 1 shows an overall improvement in the cost of the final iteration for \( N_{\text{RALMPC}}=\{8,12,18\}\) and average CPU time compared to RAMPC. The improvement in computation time is due to the reduced horizon, which has a larger impact than the additional optimization variable \( \lambda\) . Note that the dimension of \( \lambda\) grows with the size of the sample set, emphasizing the importance of reducing the sample set to only promising states. Table 1 shows the trade-off between cost improvement and computation time increase in comparison to the optimal solution (OS), which solves the robust infinite horizon optimization problem with knowledge of the true parameter, accounting only robustly for the unknown disturbance \( d_t\)

Table 1. Comparison of RAMPC and RALMPC in terms of average computation time solving the MPC problem and closed-loop cost of the last iteration.
{N}	Average Computation Time (s)		Cost
	RAMPC	RALMPC	OS	RAMPC	RALMPC
25	11.9	-	-	159	-
18	-	8.0	-	-	139
12	-	3.6	-	-	140
8	-	1.8	-	-	149
6	-	1.3	-	-	161
\( \infty\)	-	-	135	-	-

Figure 2 shows the closed-loop trajectories for the different controllers. The horizon length of the RALMPC is set to \( N_{\text{RALMPC}}=12\) because this is the best trade-off between computation time and cost. It clearly shows that the RLAMPC algorithm approaches the optimal solution through learning. Moreover, it reveals that even if the horizon \( N_{\text{RAMPC}}\) is twice times larger, the trajectory of RAMPC is far from the optimal trajectory, which demonstrates the advantage of the RALMPC.

Comparison of RALMPC ( N_{RALMPC}=8) ) and RAMPC trajectories for state x_1)
over time. The initial trajectory (blue) serves as the starting point for RALMPC learning. The optimal solution (OS, orange) represents the theoretical best achievable performance, which serves as a benchmark. RALMPC (green: first iteration, magenta: last iteration) illustrates the iterative improvement of RALMPC. — Figure 2. Comparison of RALMPC (\( N_{\text{RALMPC}}=8\) ) and RAMPC trajectories for state \( x_1\) over time. The initial trajectory (blue) serves as the starting point for RALMPC learning. The optimal solution (OS, orange) represents the theoretical best achievable performance, which serves as a benchmark. RALMPC (green: first iteration, magenta: last iteration) illustrates the iterative improvement of RALMPC.

6 Conclusion

We proposed a robust adaptive learning MPC framework for linear systems with parametric uncertainties and additive disturbances. We extended the robust adaptive MPC in [12] to iterative tasks by learning the terminal cost and the terminal set. The proposed algorithm is recursively feasible with robust constraint satisfaction and guarantees a robust bound on the closed-loop cost. In a numerical example, we show the advantages of the method for iterative tasks in comparison to [12]. The RALMPC decreases the closed-loop cost function by iterative learning. Moreover, it allows to reduce the horizon of the MPC, which leads to a reduced computational complexity.

References

[1] James Blake Rawlings and David Q Mayne and Moritz Diehl and others Model predictive control: theory, computation, and design Nob Hill Publishing Madison, WI 2017 2

[2] Ugo Rosolia and Francesco Borrelli Learning how to autonomously race a car: a predictive control approach IEEE Transactions on Control Systems Technology 2019 28 6 2713–2719

[3] M Bahadır Saltık and Leyla Özkan and Jobert HA Ludlage and Siep Weiland and Paul MJ Van den Hof An outlook on robust model predictive control algorithms: Reflections on performance and computational aspects Journal of Process Control 2018 61 77–102

[4] Basil Kouvaritakis and Mark Cannon Model predictive control Switzerland: Springer International Publishing 2016 38 13-56 7

[5] Wilbur Langson and Ioannis Chryssochoos and SV Raković and David Q Mayne Robust model predictive control using tubes Automatica 2004 40 1 125–133

[6] T-H Kim and Toshiharu Sugie Adaptive receding horizon predictive control for constrained discrete-time linear systems with parameter uncertainties International Journal of Control 2008 81 1 62–73

[7] Anil Aswani and Humberto Gonzalez and S Shankar Sastry and Claire Tomlin Provably safe and robust learning-based model predictive control Automatica 2013 49 5 1216–1226

[8] Marko Tanaskovic and Lorenzo Fagiano and Roy Smith and Manfred Morari Adaptive receding horizon control for constrained MIMO systems Automatica 2014 50 12 3019–3029

[9] Marko Tanaskovic and Lorenzo Fagiano and Vojislav Gligorovski Adaptive model predictive control for linear time varying MIMO systems Automatica 2019 105 237–245

[10] Monimoy Bujarbaruah and Xiaojing Zhang and Francesco Borrelli Adaptive MPC with chance constraints for FIR systems Proc. Annual American Control Conference (ACC) 2018 2312–2317 IEEE

[11] Matthias Lorenzen and Mark Cannon and Frank Allgöwer Robust MPC with recursive model update Automatica 2019 103 461–471

[12] Johannes Köhler and Elisa Andina and Raffaele Soloperto and Matthias A Müller and Frank Allgöwer Linear robust adaptive model predictive control: Computational complexity and conservatism Proc. IEEE 58th Conference on Decision and Control (CDC) 2019 1383–1388 IEEE

[13] Xiaonan Lu and Mark Cannon Robust adaptive tube model predictive control Proc. American Control Conference (ACC) 2019 3695–3701 IEEE

[14] Alexandre Didier and Anilkumar Parsi and Jeremy Coulson and Roy S Smith Robust adaptive model predictive control of quadrotors Proc. European Control Conference (ECC) 2021 657–662 IEEE

[15] Douglas A Bristow and Marina Tharayil and Andrew G Alleyne A survey of iterative learning control IEEE control systems magazine 2006 26 3 96–114

[16] Ugo Rosolia and Francesco Borrelli Learning model predictive control for iterative tasks. a data-driven control framework IEEE Transactions on Automatic Control 2017 63 7 1883–1896

[17] Ugo Rosolia and Francesco Borrelli Learning model predictive control for iterative tasks: A computationally efficient approach for linear system IFAC-PapersOnLine 2017 50 1 3142–3147

[18] Wataru Hashimoto and Kazumune Hashimoto and Masako Kishida and Shigemasa Takai Robust learning-based iterative model predictive control for unknown non-linear systems IET Control Theory & Applications 2024 18 18 2540–2554

[19] Alexandre Didier and Kim P Wabersich and Melanie N Zeilinger Adaptive model predictive safety certification for learning-based control Proc. 60th IEEE Conference on Decision and Control (CDC) 2021 809–815 IEEE

[20] AB Acikmese and Martin Corless Stability analysis with quadratic Lyapunov functions: a necessary and sufficient multiplier condition PROCEEDINGS OF THE ANNUAL ALLERTON CONFERENCE ON COMMUNICATION CONTROL AND COMPUTING 2003 41 3 1546–1555 Citeseer

[21] Lars Grüne and Jürgen Pannek and Lars Grüne and Jürgen Pannek Nonlinear model predictive control Springer 2017

[22] Joel A E Andersson and Joris Gillis and Greg Horn and James B Rawlings and Moritz Diehl CasADi Mathematical Programming Computation 2019 11 1 1–36 10.1007/s12532-018-0139-4