First published on Tuesday, Jun 18, 2024 and last modified on Friday, Jul 18, 2025 by François Chaplais.
École Normale Supérieure de Cachan 61, avenue du Président Wilson 94235 Cachan cedex France
Centre Automatique et Systèmes, Mines-ParisTech, 60 bvd Saint-Michel 75272 PARIS cedex 06, France
A windowed averaged scheme is defined for general control systems. The same method is used to average costs in optimal control problems (OCPs). A numerical parameter \( \alpha\) can be computed, which expresses the distance between the original system and the averaged system in a weak sense.
Then, if we use the optimal control of the averaged OCP in the original OCP, the suboptimality of the control is bounded by an expression of the form \( C\alpha^{2}\) .
We consider here optimal control problems which feature fast dependency on time. This is the case, for instance, if the problems depend on data which features high-frequency phenomena, i.e., features that happen at a much smaller time scale than the response time of the dynamics.
This time scaling has consequences on the numerical solution of these control problems. Typically, the discretization step used in a numerical method will be determined by the fastest phenomenon that is part of the system. In our setting, this is the sampling rate of the input data, which is much higher than the time constant of the dynamical system. We present in this paper a flexible and efficient way to approximately solve the original control problem using an averaging scheme. In this scheme, the state is sampled at a rate which is determined by the time constant of the system, and not by the fast evolving features of the input data.
When the fast data is periodic with respect to time, a classic approach to the solving of differential equations is averaging. Historically, the method of averaging was introduced to study the motion of celestial bodies by solving a simple two body equation which is perturbed by the influence of other bodies (see the section in [1] about the history of averaging). As developed in [2], the framework was that of the perturbation of an orbit by the small influence of a periodic input. Averaging was then generalized in a geometric framework, notably in [3, 4]. A comprehensive book on the subject is [1]. In this framework, the faster oscillating phenomenon is either periodic or satisfies some kind of ergodicity assumption.
The previous references consider uncontrolled dynamical systems. We now turn our attention to controlled systems and, in particular, optimal control problems (OCPs).
In this perspective, we add to the differential equation an extra parameter \( u\) which is the open loop control of the system. If we have non-linearities in the dynamics, common sense dictates that averaging be performed after the control is introduced, by contrast to the approach where the control would be introduced after the differential system is averaged. Indeed, the high frequency content of the control may interact with the high frequency content of the data; therefore the averaging must be applied once this interaction is taken into account.
In the case of ordinary differential equations, the quality of the approximation provided by the averaging method is measured by the proximity of the nominal and averaged trajectories.
However, in the case of controlled systems, there are as many trajectories to consider as there are different controls. Fortunately, in optimal control, there is a simple way to evaluate the performance of an approximation method:
Let us review some previous work on this topic.
An early work [5] applies averaging to two point boundary value problems, but its application to optimal control is limited to what is essentially the linear quadratic case. In [6], the method of averaging is applied to optimal control, both in open loop (the system is then periodic with respect to the fast time), and in closed loop (which is a study of the Hamilton-Jacobi equation under an ergodicity assumption). The study of the HJB equation is improved in [7]. Observe that, in these two references, the horizon is finite and the oscillatory input is fast.
References [8, 9, 10, 11] consider an optimal control problem “in the long run”, that is, with slow and “normal” time scales, using averaging techniques. The convergence of the optimal cost is proved, but there is no estimation of the loss of optimality that the approximation method produces.
Reference [12] constructs the method of averaging for controlled systems on time scales, and determines the algorithm of correspondence of controls over the original and averaged systems. It is an interesting work, but we shall prove in this article that the order of efficiency of the method is under-estimated (it should be two).
As we pointed out before, the application of averaging to celestial mechanics is a classic. It has been naturally transposed to the context of optimal control. Specifically, the optimal control of low thrust engines in space has been studied from a geometric viewpoint and in the periodic case. Such an approach has been adopted in [13, 14, 15]. Averaging has also been used for similar problems [16, 17] in a spirit that is close to [6]. All of these references assume that the fast dynamics are either periodic or satisfy an ergodicity assumption.
There are applications of averaging in optimal control that are more data driven (by contrast to autonomous celestial mechanics). Energy management of thermal systems considers the optimal control of the temperature of buildings [18]. This heavily depends on the weather data, which naturally features fast oscillating components (i.e. fast with respect to the building’s inertia).
Reference [19] considers a buoy which is subject to the oscillating influence of waves.
Averaging has also been used in stochastic optimal control, notably of Markov chains (see for instance [20, 21, 22]). Indeed, when there exists a cycle in a discrete state Markov chain that has high transition probabilities, then this cycle is gone through very fast and averaging can be applied. An alternative to the classical averaging scheme is ergodic theory [23].
In this article, we consider optimal control problems which are influenced by some fast perturbation data (with respect to the time constant of the system). The approach is somewhat different from what exists in the literature. In particular, there is no state space representation of the process that generates the fast phenomena featured in the system. No periodicity is assumed, either.
Here we consider that, as far as rapidly oscillating inputs are concerned, the behavior of the system is similar to the response of a single integrator. This is why we first define our averaging method by the effect of integration on general signals. The averaged signal is defined by its successive averages on consecutive intervals. This transformation can be interpreted from various points of view:
From this approximation technique we derive a similar one for ordinary differential equations, control systems, and ultimately optimal control problems.
In all of these contexts, the performance of the averaging method is evaluated by the difference between the output of the reference signal and the averaged one, through an integrator. The maximum value for this difference is called \( \alpha\) .
It is very simple to evaluate and, as a consequence, it may be applied to a wide range of situations.
An important feature of this problem is that the mesh size of the two point boundary value problem is not dictated by the fast data, but rather by the time constant of the system (which is larger than the width of the averaging interval). The minimization consists in the minimization of an averaged Hamiltonian.
This numerical technique has already been applied to the optimal control of a low thrust space engine that must reach its final orbit; the controls in the Keplerian case (where periodicity exists) and in the non Keplerian case (where the influence of the moon and of the sun breaks the periodicity) have been successfully computed [26].
The paper is organized as follows. Section 2 presents the original OCP. It then presents windowed averaging for functions, differential equations and control systems, followed by the concept of averaged OCP that is used in this paper. This is where the averaging error \( \alpha\) is defined. Section 3 and Appendix B compute formal expansions of the state and of the costate of the nominal problem with respect to the parameter \( \alpha\) . Auxiliary variables that will be used later in the course of the article are defined here. Section 4 introduces an auxiliary problem of optimization as well as new auxiliary variables. The main assumptions (bounded derivatives and convexity) are given before we state the auxiliary problem. The main result is exposed in section 5. It is a result on the control cost, trajectories and the optimal control. Section 6 is devoted to the proof of the main theorem. A conclusion is presented in section 7.
Note that, for a fluent reading of the paper, complicated computations are detailed in the appendices.
We wish to solve the following Optimal Control Problem (OCP) :
(1)
where \( x\) is a finite dimensional state which satisfies the dynamics :
(2)
where \( u\) is an integrable finite dimensional and unconstrained control. The assumptions on \( f\) and \( L\) are given in section 4.1.
Let us consider a house in winter. It is cold outside of the house, and the people inside of the house want to be warm. These people can use a device to heat the house inside.
A simple representation of the system uses the following
It should be emphasized that the outer temperature here is a weather forecast data, and that no tractable state space representation of the weather evolution is available to the control designer. Weather forecasting requires a global representation of the weather dynamics, and it must be updated regularly with measurements taken all around the globe. This is not an easy problem.
Let us go back to our house heating system. A very simplified state space representation of the house temperature evolution is the following
(3)
where \( \theta\) is the temperature inside the house, \( \theta_{o}\) is the outside temperature and \( u\) is the heating power.
The cost that should be minimized is typically
(4)
where \( c(t)\) is the time varying cost of the energy. Of course, the problem makes sense only if we have some constraint on the state, typically
(5)
which represents the comfort requirement of the people inside. These constraints vary in time because the house may be empty or not depending on time.
The control is also bounded. A nice presentation of this problem in context is in [18].
Observe that this problem is essentially determined by three datasets:
In this article we will not elaborate on the solution of this particular optimal control problem, notably because it involves state constraints, and we shall focus on the behavior of the system (3).
This system features two time scales:
Figure 1 shows the outer temperature data over two months.
The issue is that, while the time constant of the system is of several days, a discretization of the system should be performed with a time step of one hour because of the fast variations of the outer temperature. This is the problem that we shall address in this paper.
Before going on, let us recall a result on differential equations.
Lemma 1
Let \( x\) be a real-valued function of time that satisfies the differential equation
(6)
and such that, for any integrable signal \( z\) one has
(7)
with \( a\geq 0\) , and let \( y\) be the (nonnegative) solution of
(8)
Then
(9)
Proof
Using the Picard fixed point iterations, one checks that (8) has a unique solution. We prove the lemma using also the Picard sequences for \( x\) and \( y\) . The Picard iteration for \( x\) is
(10)
and for \( y\) :
(11)
We have \( y_{0}=|x_{0}|\) , and, assuming that \( |x_{k}|\leq y_{k}\) , we derive
and we obtain \( |x|\leq y\) at the limit.
Observe that we never use the differential formulation (6) of the ODE in the proof.
We use the previous lemma to estimate a kind of BIBO behavior using only the integral of the input.
Theorem 1
Let \( x\) be the solution of (6) and \( \lambda\) a Lipschitz constant of \( f\) . Let \( g\) be another vector field and \( y\) the solution of
(12)
Define \( \alpha\) by
(13)
Then, for \( t\leq T\) ,
(14)
Proof
We have
Using lemma 1 and solving (8) proves the result.
We assume that our nominal system is (6) and that the basic methods for solving it are not very well conditioned. This is the case, for instance, of our example house heating system. We wish to solve the system (12) instead. If we knew \( y\) in advance, we could use some kind of signal processing method on \( g\) to obtain a better conditioned system with a small error term \( \alpha\) . We cannot do this, however, because this would amount to defining the vector field \( g\) by using the solution \( y\) of (12), which itself requires the definition of \( g\) .
Instead we shall proceed as follows
We can now proceed with the definition of signal averaging in the next section, and then use it to approximate differential systems in the section thereafter.
We consider a signal defined on an interval and split this interval into \( N\) consecutive intervals. On each interval we compute the average of the signal.
Definition 1
Let \( N\) be an integer, and \( g\) an integrable function on \( [0,T]\) . Let us subdivide \( [0,T]\) into \( N\) intervals \( [t_{k},t_{k+1}]\) , with:
\[ t_{k} = k\,\frac{T}{N}, \qquad k=0,\dots,N \tag{15} \]
The averages of \( g\) on these intervals define the low pass filter \( LP\) on \( g\) :
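\[ LP[g](t) = \frac{N}{T}\int_{t_{k}}^{t_{k+1}} g(s)\,ds \qquad \text{for } t\in[t_{k},t_{k+1}[ \tag{16} \]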
The difference \( g-LP[g]\) , that is the averaging error, defines the high pass filter \( HP\) on \( g\) :
\[ HP[g](t) = g(t) - LP[g](t) \tag{17} \]
We then denote by \( I[g]\) the antiderivative of the averaging error:
\[ I[g](t) = \int_{0}^{t} HP[g](s)\,ds \tag{18} \]
We show now that the upper bound of the function \( I[g]\) is small when \( N\) is big and \( g\) is bounded.
Proposition 1
The following bound holds for any bounded function \( g\) and any \( t\in[0,T]\) :
\[ |I[g](t)| \leq 2\,\frac{T}{N}\,\|g\|_{\infty} \tag{19} \]
Proof
Let us first prove that for any \( k=0,\dots,N\)
Indeed:
Hence, for \( t\in[t_{k},t_{k+1}[\) :
But \( |LP[g](s)|\leq \|g\|_{\infty}\) , so that \( |HP[g](s)|\leq 2 \|g\|_{\infty}\) . This gives finally:
Thus we may define a small number \( \alpha\) that represents how close the signal \( g\) is to its averaged signal \( LP[g]\) :
\[ \alpha = \|I[g]\|_{\infty} = \sup_{t\in[0,T]} |I[g](t)| \tag{20} \]
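The quantities of Definition 1 are easy to evaluate on sampled data. The following is a minimal sketch, assuming a uniformly sampled signal whose length is a multiple of \( N\) , a left Riemann sum for the antiderivative, and a synthetic test signal; the function names `windowed_average` and `averaging_error` are hypothetical and chosen for illustration only.

```python
import numpy as np

def windowed_average(g, n_windows):
    """LP[g]: replace g by its mean on each of n_windows consecutive windows."""
    g = np.asarray(g, dtype=float)
    if g.size % n_windows != 0:
        raise ValueError("number of samples must be a multiple of n_windows")
    means = g.reshape(n_windows, -1).mean(axis=1)
    return np.repeat(means, g.size // n_windows)

def averaging_error(g, n_windows, dt):
    """Return (I[g], alpha), with I[g] a left Riemann sum of HP[g] = g - LP[g]."""
    hp = np.asarray(g, dtype=float) - windowed_average(g, n_windows)
    # I[g] at the sample times, with I[g](0) = 0 prepended
    integral = np.concatenate(([0.0], np.cumsum(hp) * dt))
    return integral, np.max(np.abs(integral))

# Synthetic signal (assumption): slow trend plus a fast oscillation
T, N, samples_per_window = 10.0, 5, 200
n = N * samples_per_window
dt = T / n
t = np.arange(n) * dt
g = 0.3 * t + np.sin(2 * np.pi * 8.3 * t)

I_g, alpha = averaging_error(g, N, dt)
crude_bound = 2 * np.max(np.abs(g)) * T / N
print(f"alpha = {alpha:.4f}, crude bound 2*T/N*sup|g| = {crude_bound:.4f}")
```

In this sketch \( I[g]\) vanishes at every window boundary, since the high-pass part has zero mean on each window, and \( \alpha\) is typically far below the crude bound built from the sup norm of \( g\) .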
Let us suppose that \( g\) is periodic with period \( \frac{T}{N}\) . Then \( LP[g]\) is constant, equal to the mean of \( g\) over a period. The function \( HP[g]\) is a periodic signal with \( 0\) average. The function \( I[g]\) is the periodic antiderivative of \( HP[g]\) with value \( 0\) at \( 0\) . Its upper bound \( \alpha\) is of order \( \frac{T}{N}\) .
Let us suppose that \( g\) is periodic with period \( \epsilon\) , a small divisor of \( T\) , and \( N=1\) . Then \( LP[g]\) is constant, equal to the mean of \( g\) over the many periods in \( [0,T]\) . The function \( HP[g]\) is a periodic signal of small period \( \epsilon\) with \( 0\) average. The function \( I[g]\) is the periodic antiderivative of \( HP[g]\) with value \( 0\) at \( 0\) . Its upper bound \( \alpha\) is of order \( \epsilon\) . This example shows that \( \alpha\) can be small with seemingly large intervals, so that (19) is a conservative estimate.
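As a worked instance of this second example, take the particular signal \( g(t)=\sin(2\pi t/\epsilon)\) with \( T=m\epsilon\) for some integer \( m\) , and \( N=1\) . This choice of \( g\) is an illustrative assumption. Its mean over \( [0,T]\) is zero, so that \( LP[g]=0\) and \( HP[g]=g\) , and therefore

\[ I[g](t) = \int_{0}^{t} \sin\left(\frac{2\pi s}{\epsilon}\right) ds = \frac{\epsilon}{2\pi}\left(1-\cos\left(\frac{2\pi t}{\epsilon}\right)\right), \qquad \alpha = \sup_{t\in[0,T]} |I[g](t)| = \frac{\epsilon}{\pi} . \]

Here \( \alpha\) is proportional to \( \epsilon\) , whereas \( \|g\|_{\infty}=1\) and the window length \( \frac{T}{N}=T\) do not depend on \( \epsilon\) at all.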
We use this averaging method on the temperature signal of figure 1 with an averaging window of 24 hours. Figure 2 shows the difference between the signal and its average; figure 3 shows the output of this difference through an integrator. Figure 3 indicates that the averaging error \( \alpha\) is much smaller than the sup norm obtained from figure 2.
Let us consider the ordinary differential equation (ODE):
(21)
with \( f\) Lipschitz with respect to \( x\) and integrable with respect to \( t\) .
The averaged ODE is defined as:
(22)
where the low-pass filter \( LP\) on a general function \( g(x,t)\) is defined as:
(23)
As a summary, \( LP[g]\) averages \( g\) with respect to time on rectangular adjacent windows, leaving the state variable unchanged.
The state \( x_{0}\) is well defined and bounded because \( f\) , and thus \( LP[f]\) , is Lipschitz.
We now estimate the distance between the trajectories obtained respectively from the nominal and averaged system.
Proposition 2
Let \( x\) be defined by the ODE (21) and let \( x_{0}\) be defined by the averaged ODE (22).
We define the signal \( I[f,x_{0}]\) by
(24)
Let \( \lambda\) be the Lipschitz constant of \( f\) and define
(25)
Then the following bound holds:
(26)
In other words, the averaging error \( \|x-x_{0}\|_{\infty}\) is of first order in \( \alpha\) , which is small when \( N\) is big because \( f\) is Lipschitz and thus bounded on \( [0,T]\) .
Proof
The result is proved by taking
in Theorem 1.
If we want to have an a priori error estimate, i.e. an estimate that does not depend on the trajectory \( x_{0}\) , we can use an error bound linked to the difference between the vector fields.
Proposition 3 (a priori error estimate)
Define, for a state \( \xi\) ,
(27)
and
Then
(28)
Proof
For \( t\in \left[k\frac{T}{N},(k+1)\frac{T}{N}\right]\) ,
Let us use \( \xi=x_{0}\left(k\frac{T}{N}\right)\) . Since both \( f\) and \( LP[f]\) are bounded by \( \| f \|_{\infty}\) , we derive that
Because the heating system is linear time invariant, we have \( \alpha = A\) . For the simulation we take a time constant of \( 3\) days = \( 72\) hours. Figure 4 displays the superposition of the inner temperature simulated with the original system (in black) and the averaged system (in red). For this system we have \( \alpha =0.2754\) , and the maximum error of the output temperature is \( 0.4613\) . Remember that the error estimate (26) is conservative since it does not take into account the stability of the system.
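An experiment of this kind can be sketched as follows. This is only an illustration under stated assumptions: the model \( \dot{\theta} = (\theta_{o}-\theta)/\tau\) without heating, the synthetic outer temperature, the initial temperature and the explicit Euler scheme are all hypothetical choices. It does not use the data of figure 1 and is not meant to reproduce the values quoted above.

```python
import numpy as np

tau = 72.0      # time constant in hours (value quoted in the text)
window = 24     # averaging window in hours
days = 60
t = np.arange(days * 24, dtype=float)   # hourly samples

# Synthetic outer temperature (assumption): slow drift plus a daily cycle
theta_o = (5.0 + 3.0 * np.sin(2 * np.pi * t / (24 * 20))
           + 4.0 * np.sin(2 * np.pi * t / 24))

# Windowed average LP[theta_o] on consecutive 24-hour windows
theta_o_avg = np.repeat(theta_o.reshape(days, window).mean(axis=1), window)

def simulate(inp, theta_init=15.0, dt=1.0):
    """Explicit Euler for d(theta)/dt = (inp - theta) / tau (assumed model, u = 0)."""
    theta = np.empty(inp.size + 1)
    theta[0] = theta_init
    for j, value in enumerate(inp):
        theta[j + 1] = theta[j] + dt * (value - theta[j]) / tau
    return theta

theta = simulate(theta_o)        # original system
theta_0 = simulate(theta_o_avg)  # averaged system

# For this linear model the high-pass part of the vector field is
# (theta_o - LP[theta_o]) / tau, independent of the state.
alpha = np.max(np.abs(np.cumsum(theta_o - theta_o_avg) / tau))
max_error = np.max(np.abs(theta - theta_0))
print(f"alpha = {alpha:.4f}, max trajectory error = {max_error:.4f}")
```

For this stable scalar model, a summation by parts on the Euler recursion shows that the trajectory error stays below \( 2\alpha\) , which illustrates why an estimate that ignores stability can be pessimistic.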
In the scope of this article, we define, for the integer \( N>0\) , and any functions \( u(t)\) and \( g(x,u(t),t)\) , an averaged function \( LP[g,u]\)
(29)
As a summary, \( LP[g,u]\) averages \( g\) with respect to time (this includes the open loop control) on rectangular adjacent windows, leaving the state (or costate) unchanged. The control is considered as an input signal among others.
We define the averaged OCP which consists in minimizing the cost
(30)
where \( y\) is the state defined by
(31)
The system (31) is a well defined differential equation.
Assumption 1
The averaged OCP admits an optimal control \( u_{0}\) with a corresponding trajectory \( x_{0}\) .
The optimal trajectory \( x_{0}\) is defined by the ODE:
(32)
Since this is a new kind of optimal control, it is wise to study its stationarity condition.
Theorem 2
Let \( H\) be the Hamiltonian of the nominal problem:
(33)
Let \( p_{0}\) be the costate of the averaged problem, defined along the optimal trajectory \( x_{0}\) by the backwards differential equation:
(34)
Then the following stationarity condition holds:
(35)
where \( a.e.\) stands for almost everywhere in \( t \in [0,T]\) .
Proof
see appendix A
We define an averaging error \( \alpha\) for the two point boundary value problem similarly to what was defined for initial boundary value problems.
For a function \( g(x,u,t)\) , the difference \( g(x_{0}(t),u_{0}(t),t)-LP[g,u_{0}](x_{0}(t),t)\) is the result of the high pass filtering \( HP[g,u_{0}](x_{0}(t),t)\) . Let’s define the antiderivative:
(36)
Let’s define similarly the backwards antiderivative
(37)
Definition 2
For the rest of this document, we consider the small number \( \alpha\) defined as:
(38)
We shall see that \( \alpha\) measures how close the nominal and averaged problems are. Observe that \( \alpha\) is small, for instance, when \( N\) is big and assumption 2 below holds (bounded functions and their derivatives).
Observe that \( \alpha\) depends only on the solution of the averaged problem. It is an a posteriori estimate like in definition (25), and not an a priori estimate.
Technically, the proof of our main approximation result is based on the Pontryagin principle. This is a theoretical result which holds independently of the actual numerical method used to solve the nominal (resp. averaged) problem.
Now that we have stated what the averaged optimal control problem is, we can begin to evaluate its benefit when solving numerically a general optimal control problem.
Let us consider a direct method for the solving of problem (1,2). To do so, we discretize the dynamics into
(39)
In order to represent accurately the influence of the data on the system, the discretization step \( \delta t\) must be a suitable sampling rate for the data. In this article’s framework, \( \delta t\) is much smaller than the time constant of the system. The approximation of the cost is obtained by using a quadrature rule for the integral. Without using extra knowledge on the system, the time step of the quadrature is \( \delta t\) .
In this framework, the optimal control problem becomes a finite dimensional optimization problem with a finite number of equality constraints.
We consider now a similar direct method to solve the averaged problem (30,31). To avoid confusion, we denote by capital \( T_{K}\) the boundaries of the averaging intervals, and \( X_{K}\) denotes the state of the discretized averaged system. The dynamics of \( X\) is
(40)
where a suitable quadrature rule is used to compute the integral. Consistently with the discretization (39), we consider the sampling rate of the control and data to be \( \delta t\) . The computation of the cost is handled similarly.
We see that, while (39) only considers the differential aspect of the dynamics, the averaged dynamics (40) incorporates the integral action of a dynamical system.
Let us now consider the benefit of solving the averaged problem versus solving the nominal problem. To make things even simpler, let us use a Riemann integral for the computation of (40) and define
We assume that \( R\) is a (large) integer. Indeed, \( \delta t\) is dictated by the sampling rate of the data, while \( T_{K+1}-T_{K}\) is dictated by the time constant of the system, which is much larger.
We see that the number of control variables \( u_{k}\) is the same in both problems. However, the number of constraints (40) for the averaged problem is \( R\) times smaller than the number of constraints (39) for the original problem. This is the main benefit of using averaging to solve the optimal control problem with direct methods.
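The bookkeeping above can be illustrated with a small sketch. It assumes an explicit Euler form for the nominal discretization and, for the averaged one, a Riemann sum over each window with the state frozen at its value at the start of the window. Both forms, the scalar dynamics `f` and the numerical values are assumptions made for illustration, not the schemes (39) and (40) themselves.

```python
import numpy as np

def nominal_euler(f, x_init, u, t, dt):
    """One new state, hence one equality constraint, per fine step dt."""
    x = [x_init]
    for uk, tk in zip(u, t):
        x.append(x[-1] + dt * f(x[-1], uk, tk))
    return np.array(x)

def averaged_euler(f, x_init, u, t, dt, R):
    """One new state per averaging window of R fine steps.

    The state is frozen inside the Riemann sum over the window.
    """
    X = [x_init]
    for K in range(len(t) // R):
        window = slice(K * R, (K + 1) * R)
        increment = sum(f(X[-1], uk, tk) for uk, tk in zip(u[window], t[window]))
        X.append(X[-1] + dt * increment)
    return np.array(X)

def f(x, u, t):
    """Hypothetical scalar dynamics with a fast (daily) input."""
    return -x / 72.0 + u + np.sin(2 * np.pi * t / 24.0) / 72.0

dt, R, n_windows = 1.0, 24, 30
t = np.arange(R * n_windows) * dt
u = np.zeros_like(t)             # same control samples in both problems

x = nominal_euler(f, 1.0, u, t, dt)
X = averaged_euler(f, 1.0, u, t, dt, R)
print(f"controls: {u.size}, nominal constraints: {x.size - 1}, "
      f"averaged constraints: {X.size - 1}")
```

Both discretizations use the same control samples, while the averaged one carries \( R\) times fewer state equations.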
Finally, observe that the definition of the averaged costate (34) indicates what should be the two point boundary problem used in an indirect method for solving the averaged problem. In this case, the benefit of using the averaging method is a coarser grid for the two point boundary value problem.
In what follows we shall take great care to give explicit expressions for the various constants that are used throughout the article to obtain error bounds. This is not for the sake of displaying complicated expressions, but for the following reason: the parameter \( \alpha\) , though very important in the scope of this work, is not the only one to influence the error bounds. Typically, the magnitudes of the dynamics, cost, and their derivatives are also very important. In fact, if some part of the dynamics is large, this means that we are close to the situation where we want to use averaging while also having singular perturbations, and this is a difficult framework to work with.
So yes, the details of the expressions are also important, and the tedious computations are necessary to obtain this kind of granularity in the expression of the final result. The homogeneity of the expressions of the bounds with respect to the numerical parameters of the various assumptions gives precious insight on their physical meaning.
We are now going to define auxiliary variables by means of formal expansions with respect to \( \alpha\) . In what follows, \( u_{0}\) , \( x_{0}\) and \( p_{0}\) denote the optimal control, state and costate of the averaged problem as described in section 2.8.
This section has no intrinsic numerical purpose. However, the expansions obtained here will be used to define intermediate variables that will actually be used in the numerical part of the article.
Definition 3
We denote the following variables from the state \( x\) (resp. \( x_{0}\) ), the control \( u\) (resp. \( u_{0}\) ) and the costate \( p\) (\( p_{0})\) :
where \( p\) is the costate of the nominal problem defined by the ODE with final condition:
(41)
Definition 4
We denote the derivatives up to second order of functions with respect to the variables \( x\) , \( u\) or \( \sigma \) with indexes, on the model:
The variable \( \alpha\) defined by the equation (38) is viewed as a small quantity. In Appendix B we compute a formal expansion of the state \( x\) and the costate \( p\) at the first order with respect to \( \alpha\) .
These formal expansions lead to the definition of auxiliary variables \( \tilde{x_{1}}\) and \( \tilde{p_{1}}\) as follows:
Definition 5
We define the auxiliary variables \( \tilde{x_{1}}\) and \( \tilde{p_{1}}\) the following way:
(42)
and
The error terms \( \tilde{x_{1}}\) and \( \tilde{p_{1}}\) correspond to the deviation error due to the approximation of the dynamics with respect to their fast oscillating behavior.
In definition 8 of section 4.3 we define deviations that are defined in magnitude, by contrast to the sole averaging error (at the optimum) which is considered here.
A consequence of the definition of \( \alpha \tilde{x_{1}}\) in definition 5 and \( \alpha\) in definition 2 is that:
(43)
And a consequence of the definition of \( \alpha \tilde{p_{1}}\) in definition 5 and \( \alpha\) in definition 2 is that:
(44)
We define an auxiliary problem which is similar to the study of the second variation in [24].
We make two kinds of assumption on the problem data; the first is related to the regularity of the functions involved, and the second is a convexity assumption (Legendre-Clebsch) that is common in optimal control [24, 24, 6].
Assumption 2 (Regularity)
The derivatives, up to the third order, of \( f\) and \( L\) with respect to \( x\) and \( u\) are bounded by some \( k>0\) .
A consequence of that assumption is that the value \( \alpha\) defined by equation (38) is well defined and small if \( N\) is sufficiently big (or if the problem is periodic of small period).
As a consequence, we can bound the second derivative of the dynamics \( f\) :
Proposition 4
(45)
for any \( (X,U,Y,V)\) .
Proof
Moreover, as \( p_{0}\) is the solution of the ODE with final condition (34), it is differentiable and thus continuous on the bounded interval \( [0,T]\) , so that it is bounded. Thus another consequence of assumption 2 involves the derivatives of the Hamiltonian:
Proposition 5
The Hamiltonian \( H(x,u,p_{0},t)\) and its derivatives up to the third order in \( u\) and \( x\) are bounded by a constant \( K\) .
Proof
Take \( K=(1+\|p_{0}\|_{\infty}) k\) .
As a consequence, the Hessian of the Hamiltonian is bounded:
Proposition 6
(46)
for any \( (X,U,Y,V)\) .
Proof
Similar to the proof for \( f_{\sigma \sigma}\) .
Assumption 3 (Strong Legendre-Clebsch condition [24])
There exists \( \beta > 0\) so that for any \( (x,u)\) , the following holds:
(47)
and
(48)
The consequence of equation (47) is that \( H_{uu}\) is invertible and \( \|H_{uu}^{-1}\|_{\infty} \le \frac{1}{\beta} \) on any \( (x,u,p_{0})\) .
The consequence of equation (48) is :
Proposition 7
(49)
Proof
see Appendix C
Definition 6
Let’s define the following notations:
and
Then we define the auxiliary problem as the linear quadratic OCP for the state \( y\) steered by the control \( v\) :
(50)
(51)
where \( \tilde{x}_{1}\) and \( \tilde{p}_{1}\) are defined in definition 5 as the averaging errors on the dynamics of the optimal state and costate at the optimal solution of the averaged problem (as defined in section 2.8).
Proposition 8
There exists an optimal control \( v_{1}\) for the auxiliary problem.
Proof
It is a convex linear quadratic problem because \( H_{0 \sigma \sigma}\) is non-negative (equation (49)).
Definition 7
We denote by \( v_{1}\) an optimal control of the auxiliary problem, \( y_{1}\) the trajectory corresponding to \( v_{1}\) and \( q_{1}\) the corresponding costate.
Proposition 9
\( y_{1}\) follows the dynamics:
(52)
Proof
This is the definition of the dynamics of the auxiliary problem.
Moreover, we have:
Proposition 10
The stationarity condition \( \frac{\partial H_{1}}{\partial u}(y_{1},v_{1},p_{1}) = 0\) may be written the following way:
(53)
Moreover, the costate \( q_{1}\) of the auxiliary problem follows the dynamics:
(54)
Proof
The Hamiltonian of the auxiliary problem expands as:
(55)
Finally, we have:
Proposition 11
\( y_{1}\) , \( v_{1}\) and \( q_{1}\) are bounded by a constant \( M\) .
Proof
The auxiliary problem is smooth and convex.
Definition 8
For any \( u \in \mathbf{L}_{[0,T]}^{2}\) , \( x\) is the trajectory of the nominal dynamics (2).
Let’s then define the following notations:
where \( (u_{0},x_{0})\) is a solution of the averaged problem and \( \alpha \tilde{x_{1}}\) is \( I[f,x_{0},u_{0}]\) (equation (42)).
The \( \delta\) variables are deviations in magnitude around the optimal control \( u_{0}\) , after a correction due to the pure averaging error on the various dynamics.
Definition 9
We define the following variables:
Here \( r\) and \( v\) are variations with respect to the solution of the averaged optimal control problem, with the extra corrective terms
The quantity \( z^{2}\) is repeatedly used in [24] to study perturbations in optimal control; it has been used later in [27] in a similar manner.
We will bound \( \|r\|_{\infty}^{2}\) and \( \|v\|_{2}^{2}\) with respect to \( z^{2}\) and \( \alpha^{2}\) . Bounding then \( z^{2}\) with respect to \( \alpha^{2}\) will lead to the main result.
Definition 10
We define the following constants:
with:
- \( k\) is introduced in assumption 2 about the boundedness of the derivatives of \( f(x,u,t)\) and \( L(x,u,t)\) .
- \( K=(1+\|p_{0}\|_{\infty}) k\)
- \( M\) is the upper bound of the optimal trajectory of the auxiliary problem introduced in section 4.2 (proposition 11).
- \( \alpha\) is the (hopefully) small quantity defined in equation (38).
- \( \beta\) is the convexity constant of \( H_{uu}\) introduced in equation (47) inside of the Legendre-Clebsch conditions of assumption 3.
Let us recall where we stand in the expansion of the state and of the control. We have
Proposition 12
The following inequalities hold:
(56)
and:
(57)
Proof
see appendix D.1
We now define an extra correction term that will be used further on. We will show that this correction term is also bounded by values that depend only on \( z\) and \( \alpha\) .
Definition 11
\( r_{1}\) is defined by the following dynamics:
(58)
The state \( r_{1}\) is driven by the difference between the dynamics with state \( r_{1}\) and input \( v\) , and the nominal velocity of the system for the value of the state and controls that include all the previously introduced correction terms.
We will bound \( \|r-r_{1}\|_{\infty}\) in \( \alpha^{2}\) and \( \|r_{1}\|_{\infty}^{2}\) in \( z^{2}\) and \( \alpha^{2}\) . As a consequence, we will “anchor” our error estimates to the nominal term \( r_{1}\) .
Definition 12
We define the following constants:
with \( k\) , \( K\) and \( M\) as in definition 10.
Proposition 13
The following inequalities hold:
(59)
and
(60)
Proof
see appendix D.2
Definition 13
We define the following constants:
where:
- \( k\) , \( K\) , \( M\) , \( \alpha\) and \( \beta\) are as in definition 10.
- \( k_{r1}\) , \( k_{r2}\) , \( k_{v1}\) and \( k_{v2}\) are defined in definition 10
- \( k_{r3}\) , \( k_{r4}\) and \( k_{r5}\) are defined in definition 12.
Note that these constants depend only on \( k\) , \( K\) , \( M\) , \( \alpha\) , \( \beta\) , and the horizon \( T\) .
Assumption 4
This condition appears because \( \alpha\) is a numerical value determined by data, and not a parameter that tends to zero. In fact, the previous assumption is always satisfied if we can choose \( \alpha\) arbitrarily small.
Theorem 3 (Main Theorem)
Considering the nominal problem in section 2.1, let \( H(x,u,p,t)\) be its Hamiltonian, and let \( J^{*}= \inf_{u}J(u)\) be its infimum cost.
Let \( u_{0}\) be a solution of the averaged problem described in section 2.8, with trajectory \( x_{0}\) . Such a solution exists by assumption 1.
Let \( \alpha\) be the small quantity defined in equation (38) and let \( \beta\) be the constant introduced in assumption 3.
Let \( (k_{J1}, k_{J}, k_{x}, k_{u})\) be the set of constants introduced in definition 13.
Then, under the set of assumptions listed in section 4.1 and the assumption 4, the following inequalities hold:
- the suboptimality of the real system commanded by \( u_{0}\) is limited to:
(61)
- any trajectory \( x\) of the nominal problem for a \( u\) better than \( u_{0}\) (\( J(u) \leq J(u_{0})\) ), is close to \( (x_{0},u_{0})\) , with:
(62)
(63)
To prove the main theorem, we proceed as follows.
The section 6.2 is devoted to the search of a lower bound of any real cost \( J(u)\) of the nominal problem. That lower bound contains two integral terms that do not depend on \( u\) , a term in \( z^{2}\) that is the only one depending on \( u\) , and a term in \( \alpha^{2}\) that does not depend on \( u\) either.
The section 6.3 is devoted to the search of an upper bound of the real cost \( J(u_{0})\) of the nominal problem controlled by \( u_{0}\) . That upper bound contains the same two integral terms as in the lower bound of \( J(u)\) and a term in \( \alpha^{2}\) .
Then the section 6.4.1 uses the assumption \( \alpha \leq \frac{\beta}{k_{J1}}\) to obtain a lower bound of \( J(u)\) independent of \( u\) , so that it is also a lower bound for \( J^{*}\) . That lower bound is combined with the upper bound of \( J(u_{0})\) to bound the suboptimality \( J(u_{0})-J^{*}\) as a function of \( \alpha^{2}\) , as stated in the equation (61) of the first part of the main theorem 3.
Then the section 6.4.1 uses the stronger assumption \( \alpha \leq \frac{\beta}{2k_{J1}}\) to obtain, for any \( u\) better than \( u_{0}\) , i.e. such that \( J(u) \leq J(u_{0})\) , an upper bound of \( z^{2}\) . With that bound of \( z^{2}\) , upper bounds for \( \|x-x_{0}\|_{\infty}\) and \( \|u-u_{0}\|_{2}\) are found in the equations (62) and (63) of the second part of the main theorem, with the help of the definitions and bounds of \( r\) and \( v\) in section 4.3.
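The comparison argument outlined above can be summarized schematically. In the sketch below, \( A\) is a placeholder for the two integral terms shared by both bounds and \( c(\alpha)\) is a placeholder for the coefficient of \( z^{2}\) . Both names, and the signs of the \( \alpha^{2}\) terms, are assumptions made for illustration, chosen to be consistent with \( k_{J}=k_{J0}+k_{J2}\) . The exact expressions are those of equations (68) and (76).

\[
J(u) \;\ge\; A + c(\alpha)\, z^{2} - k_{J2}\,\alpha^{2},
\qquad
J(u_{0}) \;\le\; A + k_{J0}\,\alpha^{2}.
\]

If \( c(\alpha) \ge 0\) , discarding the \( z^{2}\) term and taking the infimum over \( u\) gives

\[
J(u_{0}) - J^{*} \;\le\; (k_{J0}+k_{J2})\,\alpha^{2} \;=\; k_{J}\,\alpha^{2}.
\]

If moreover \( c(\alpha) \ge \frac{\beta}{2}\) and \( J(u) \le J(u_{0})\) , the same two bounds give

\[
\frac{\beta}{2}\, z^{2} \;\le\; J(u) - A + k_{J2}\,\alpha^{2} \;\le\; J(u_{0}) - A + k_{J2}\,\alpha^{2} \;\le\; k_{J}\,\alpha^{2}.
\]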
We use an expansion with integral remainder to compute the variation of the cost with respect to the nominal integral cost for the nominal system being driven by \( u_{0}\) .
Proposition 14
The cost \( J(u) = \int_{0}^{T} L(x,u,t) dt\) for any command \( u \in \mathbf{L}_{[0,T]}^{2}\) expands the following way:
(64)
Proof
see appendix E.
We now study carefully the various terms involved in the expansion (64).
Proposition 15
The following inequality holds:
(65)
Proof
See appendix F.
Proposition 16
The following inequality holds:
(66)
Proof
see appendix G.
Proposition 17
Let’s define \( R\) as the sum of the integral terms of the right hand sides of equations (65) and (66):
Then the following inequality holds:
(67)
Proof
see appendix H.
We capitalize on the previous results to obtain a lower bound on \( J(u)\) for any control \( u\) .
Lemma 2
A lower bound of the cost \( J(u)\) of the nominal system for any control \( u\) is given by:
(68)
where \( k_{J1}\) and \( k_{J2}\) are defined in definition 13.
Proof
This is a consequence of equations (65), (66) and (67).
As in the previous section, we first compute an expansion with integral remainder of the cost; then we study each term of the expansion to obtain a global bound in section 6.3.5.
Definition 14
Let \( x^{0}\) be the trajectory of the nominal problem controlled by \( u_{0}\) . It is defined by the dynamics:
(69)
For that trajectory, we set the notations:
where \( (u_{0},x_{0})\) is the solution of the averaged problem and \( \alpha \tilde{x_{1}}\) is \( I[f,x_{0},u_{0}]\) (equation (42)).
Proposition 18
The cost \( J(u_{0}) = \int_{0}^{T} L(x^{0},u_{0},t) dt\) for the optimal command \( u_{0}\) of the averaged problem expands the following way:
(70)
Proof
It is a consequence of proposition 14 with \( u=u_{0}\) , so that \( \delta u = 0\) .
Proposition 19
The following inequality holds for \( \delta x^{0} = x^{0}-x_{0}\) :
(71)
Proof
As \( \delta x^{0} = x^{0}-x_{0}\) , \( x^{0}\) follows the dynamics (69) and \( x_{0}\) follows the averaged dynamics (32), we have the following integral equation:
Thus the following inequality holds:
Equation (71) follows from Gronwall's lemma.
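Gronwall's lemma is used repeatedly in these proofs. As a rough numerical illustration (not part of the original argument), the sketch below checks, in the equality case \( w' = k w\) , \( w(0)=a\) , that an explicit Euler solution stays below the Gronwall bound \( a e^{kt}\) . The names `a`, `k`, `T`, `n` and both functions are hypothetical: `a` plays the role of an \( \alpha\) -sized forcing term and `k` that of a Lipschitz constant.

```python
import math

def gronwall_bound(a, k, t):
    """Gronwall bound a * exp(k t) for w(t) <= a + k * int_0^t w(s) ds."""
    return a * math.exp(k * t)

def euler_equality_case(a, k, T, n):
    """Explicit Euler solution of w' = k w, w(0) = a (the equality case)."""
    dt = T / n
    w = a
    values = [w]
    for _ in range(n):
        w += dt * k * w
        values.append(w)
    return values

# Hypothetical values chosen only for the illustration.
a, k, T, n = 0.01, 2.0, 1.0, 1000
w = euler_equality_case(a, k, T, n)

# The discrete solution never exceeds the continuous Gronwall bound,
# because 1 + x <= exp(x) for every real x.
bound_ok = all(
    w[i] <= gronwall_bound(a, k, i * T / n) + 1e-12 for i in range(n + 1)
)
print(bound_ok, w[-1], gronwall_bound(a, k, T))
```

The point of the check is that the bound scales linearly with `a`: a forcing term of size \( \alpha\) yields a state error of size \( \alpha\) , up to the factor \( e^{kT}\) .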
Proposition 20
The following inequality holds:
(72)
Proof
By definition of \( \tilde{x}^{0}\) , we have
so that:
Thus, by Taylor expansion of \( f(x^{0},u_{0},t)\) with integral remainder, we have:
Thus, thanks to proposition 19, we have:
(73)
But an integration by parts, together with the fact that \( HP[H_{x},u_{0}](x_{0},p_{0},t) =-\alpha \frac{d \tilde{p_{1}}}{dt}\) and that \( \tilde{x}^{0}(0)=\tilde{p_{1}}(T)=0\) , leads to:
(74)
Including equation (74) and the fact that \( \|\tilde{p_{1}}\|_{\infty} \leq 1\) into equation (73) proves equation (72).
Proposition 21
The following inequality holds:
(75)
Proof
This is a consequence of the proposition 19.
Inserting equations (72) and (75) into equation (70) leads to:
Lemma 3
An upper bound of the cost \( J(u_{0})\) of the nominal system controlled by \( u_{0}\) is given by:
(76)
where \( k_{J0}\) is defined in definition 13.
The main theorem is proved in two steps. The first step compares the costs to estimate the suboptimality of the real system controlled by \( u_{0}\) . The second step compares the trajectories and controls which outperform \( u_{0}\) , to estimate how close they are to the trajectory and control driven by \( u_{0}\) .
Let’s now use the assumption \( \alpha \leq \frac{\beta}{2k_{J1}}\) of the first part of the Main Theorem 3 in equation (68) of Lemma 2. The term in \( z^{2}\) is then nonnegative, and we get the following lower bound, which is independent of \( u\) :
As that lower bound holds for any \( u\) , it is also a lower bound for the infimum cost \( J^{*}= \inf_{u}J(u)\) :
so that:
Inserting this equation into the equation (76) of Lemma 3, together with the fact that \( J(u_{0}) \geq J^{*}\) , by definition of \( J^{*}\) , proves the suboptimality equation (61) in the Main Theorem 3, since \( k_{J}=k_{J0}+k_{J2}\) .
Let’s consider a control \( u\) better than \( u_{0}\) , i.e. such that \( J(u) \leq J(u_{0})\) .
Let’s now make stronger use of the assumption \( \alpha \leq \frac{\beta}{2k_{J1}}\) of the second part of the Main Theorem 3 in equation (68) of Lemma 2. The coefficient of \( z^{2}\) is then at least \( \frac{\beta}{2}\) , and we get a lower bound that depends on \( u\) through \( z^{2}\) :
Thus, together with the equation (76) of Lemma 3, we have the list of inequalities:
So that, with the definition of \( k_{J}=k_{J0}+k_{J2}\) , we have
(77)
On the other hand, by definition of \( r\) , we have:
so that, together with the equation (56):
Introducing equation (77) into that equation leads to equation (62) of the second part of the Main Theorem 3.
Now let’s consider the fact that, by definition of \( v\) :
so that:
Introducing equation (77) into that equation leads to equation (63) of the second part of the Main Theorem 3.
We have shown that averaging on consecutive intervals is a flexible and efficient way to approximate signals while getting rid of rapidly varying features. We have extended this method to ordinary differential equations, control systems and optimal control problems. This provides a very flexible framework for the approximation of optimal control problems with rapidly varying features.
Indeed, if \( \alpha\) is a measure of the averaging error on the dynamics, we have shown that the level of suboptimality due to the use of the solution of the averaged problem in the original control system is bounded by some \( k\alpha^{2}\) , where \( k\) depends on very general bounds on the dynamics and on the convexity of the system.
This makes it possible to use a coarser time mesh in the numerical computation of the solution of the optimal control problem, as explained in section 2.11.
However, this does not decrease the number of optimizations of the Hamiltonian. This stresses, if needed, the requirement for fast minimization routines in this kind of optimal control problem solving.
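To illustrate averaging on consecutive intervals, here is a minimal sketch under the assumption that the low-pass part of a signal on a window \( [t_{k},t_{k+1})\) is simply its mean over that window; the operator actually used in the paper is the one defined in equation (29), which is not reproduced here. The signal `fast_signal`, the grid `t_grid` and the helper `window_average` are made-up examples.

```python
import math

def window_average(signal, t_grid, samples_per_window=200):
    """Mean of `signal` on each window [t_k, t_{k+1}) via the midpoint rule."""
    means = []
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        h = (t1 - t0) / samples_per_window
        total = sum(signal(t0 + (j + 0.5) * h) for j in range(samples_per_window))
        means.append(total / samples_per_window)
    return means

def fast_signal(t):
    """Slow affine trend plus a fast oscillation of period 0.01 (hypothetical data)."""
    return 1.0 + 0.5 * t + 0.3 * math.sin(200.0 * math.pi * t)

# Coarse mesh: ten windows of length 0.1 on [0, 1].
t_grid = [0.1 * k for k in range(11)]
means = window_average(fast_signal, t_grid)
print(means)
```

Each window spans ten periods of the fast term, so the windowed mean retains only the slow trend evaluated at the window midpoint. The coarse mesh of ten windows replaces a mesh that would otherwise have to resolve the 0.01 period.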
As a final note, we can observe that the control here is not constrained. However, using an interior penalty approach as in [27] should be a reasonable method to generalize this article’s result to the constrained input case.
Let \( u_{0}\) be the optimal control of the averaged problem (31) and let \( x_{0}\) be the corresponding trajectory. \( x_{0}\) is defined by the ODE with initial condition (32). Let \( p_{0}\) be the costate of the optimal trajectory, defined by the ODE with final condition (34), with \( H\) the Hamiltonian (33).
The system consisting of equations (32) and (34) is a two-point boundary value problem. It is defined as the two-point boundary value problem corresponding to the averaged optimal control problem.
Let \( \delta u \in \mathbf{L}_{[0,T]}^2\) be a scalar square integrable function on \( [0,T]\) .
Let \( \epsilon > 0\) and let \( u_{\epsilon} = u_{0} + \epsilon \delta u\) be the variation of \( u_{0}\) in the direction of \( \delta u\) .
Since \( J_{0}(u_{0})=\min_{u}J_{0}(u)\) , the following stationarity condition holds:
(78)
Let \( x_{\epsilon}\) be the trajectory corresponding to \( u_{\epsilon}\) , defined by the dynamics equation (31) with \( v=u_{\epsilon}\) :
(79)
Let \( \delta x\) be the variation trajectory corresponding to the direction \( \delta u\) given by \( \delta x = \left(\frac{dx_{\epsilon}}{d \epsilon}\right)_{\epsilon =0}\) .
Lemma 4
\( \delta x\) satisfies the following dynamics:
(80)
Proof
Let \( k\) be such that \( t \in \left[t_{k},t_{k+1}\right)\) . Then:
Let’s make use of Equation (78) in developing \( \left(\frac{dJ_{0}(u_{\epsilon})}{d \epsilon}\right)_{\epsilon=0}\) for a given \( \delta u\) .
Lemma 5
The derivative at \( 0\) of \( J_{0}(u_{\epsilon})\) in \( \epsilon\) is related to the Hamiltonian by the following equation:
(81)
Proof
Let’s use the equation (30) and then commute the differentiation and integration:
For any \( k \in [0,N-1]\) and for any \( t \in [t_{k},t_{k+1})\) , we have the definition (29) of \( LP\) :
Thus if we commute again the integration and the differentiation:
But by definition of \( x_{\epsilon}\) , \( u_{\epsilon}\) , \( \delta x\) and \( \delta u\) , we have:
Thus by averaging on \( [t_{k},t_{k+1}]\) , the result is (\( LP\) is linear):
But because of the dynamics (34) of the averaged costate \( p_{0}\) , we have, with the definition (33) of the Hamiltonian:
Thus integrating \( \left(\frac{d}{d \epsilon}\left[ LP[L,u_{0}](x_{\epsilon},t)\right]\right)_{\epsilon=0}\) between \( 0\) and \( T\) , we obtain:
(82)
Let’s integrate the first term of that equation by parts:
The variation of \( p_{0} \delta x\) between \( 0\) and \( T\) is zero because \( \delta x(0) = 0\) and \( p_{0}(T) =0\) . Thus, with the dynamics of \( \delta x\) given by Lemma 4, the following holds:
Let’s insert this equation in the first term of equation (82). It results in:
This proves the equation (81) by definition of the Hamiltonian.
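The integration by parts used in this proof follows a pattern that recurs throughout the appendices: a state-like variable vanishing at \( 0\) is paired with a costate-like variable vanishing at \( T\) , so that the boundary term disappears. As a generic reminder in the scalar case, assuming both functions are absolutely continuous with \( \delta x(0)=0\) and \( p_{0}(T)=0\) :

\[
\int_{0}^{T} p_{0}\,\frac{d\,\delta x}{dt}\,dt
\;=\; \Big[\,p_{0}\,\delta x\,\Big]_{0}^{T} - \int_{0}^{T} \frac{dp_{0}}{dt}\,\delta x\,dt
\;=\; -\int_{0}^{T} \frac{dp_{0}}{dt}\,\delta x\,dt .
\]

The vector case is identical, with products replaced by inner products.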
Let’s now make use of equation (81). Let’s first fix \( t\) and let \( k\) be such that \( t \in [t_{k},t_{k+1})\) . Then we have:
Let’s specialize \( \delta u\) as a “needle variation”:
with \( \eta > 0\) so that \( t+\eta < t_{k+1}\) and \( \delta v \in \mathbf{L}_{[0,T]}^2\) .
Then we have:
More precisely, the limit is the value of the function at \( t\) wherever the function is continuous, that is for any \( t\) except possibly for a countable number of “jumps”. As any countable set is negligible, the limit holds almost everywhere.
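The limit used here is the familiar fact that the mean of a function over a shrinking interval \( [t,t+\eta]\) tends to its value at \( t\) wherever it is continuous. The sketch below illustrates this numerically. The function `g`, the point `0.2` and the helper `needle_average` are hypothetical choices made for the example, not objects from the paper.

```python
def needle_average(g, t, eta, n=1000):
    """(1/eta) * integral of g over [t, t + eta], computed by the midpoint rule."""
    h = eta / n
    return sum(g(t + (j + 0.5) * h) for j in range(n)) / n

def g(s):
    """Piecewise continuous test function with a jump at s = 0.5."""
    return s * s if s < 0.5 else 2.0 + s

# Error between the windowed mean and the point value as eta shrinks.
errors = [abs(needle_average(g, 0.2, eta) - g(0.2)) for eta in (0.1, 0.01, 0.001)]
print(errors)
```

The error decreases roughly in proportion to `eta`, as expected at a point where `g` is smooth.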
Thus, because of the equations (78) and (81), we have:
and this is true for any \( \delta v \in \mathbf{L}_{[0,T]}^2\) . This proves the stationarity result:
To prove this result, we have used the fact that the control is unconstrained. The extension of the main result to the constrained case can be considered using the same penalty approach as in [27]. In this approach, the constrained problem is approximated by a sequence of unconstrained problems, where the stationarity condition holds.
We formally expand the state, costate and control variables with respect to \( \alpha\) . This step leads to the definitions of the various terms which are used throughout the article to finally yield the bounds stated in the main theorem.
The state variable \( x\) is the solution of the original dynamics equation (2) driven by the control \( u\) . Let us define a formal expansion of \( x\) and \( u\) with respect to \( \alpha\) :
(83)
(84)
Note that the redundant definitions of \( x_{0}\) and \( u_{0}\) are consistent, as will be seen later.
We now separate the expansion terms \( x_{0}\) and \( x_{1}\) of the state into low (\( LP\) ) and high (\( HP\) ) frequencies components:
(85)
Because of the equations (83) and (85), and because the derivative of the high frequency signal \( \tilde{x_{1}}\) is in \( \frac{1}{\alpha}\) , the derivative of \( x\) has the following expansion in \( \alpha\) :
(86)
But, because \( x\) is the solution of the original dynamics equation (2), using the formal expansions in \( \alpha\) of \( x\) and \( u\) , and thanks to assumption 2, we have another expansion of \( \frac{dx}{dt}\) at the first order in \( \alpha\) :
(87)
Consequently, identifying the zero order terms in equations (86) and (87), we obtain:
But by definition of \( \bar{x_{0}}\) as the low frequency part of \( x_{0}\) , we have:
Consequently, by definition of \( HP\) , we have:
with initial value \( 0\) . Integrating the previous equation yields
But \( I[f,\bar{x_{0}},u_{0}]\) is of order 1 in \( \alpha\) and \( \tilde{x_{0}}\) is of order 0. Thus \( \tilde{x_{0}}=0\) , which gives \( x_{0}=\bar{x_{0}}\) , the solution of the averaged problem. Consequently, both definitions of \( x_{0}\) and \( u_{0}\) are consistent.
Moreover, we have a definition of \( \tilde{x_{1}}\) consistent with definition 5:
so that the derivative of \( \tilde{x_{1}}\) is \( -\frac{1}{\alpha}HP[f_{x},u_{0}](x_{0},p_{0},t)\) , that is in \( \frac{1}{\alpha}\) .
The costate variable \( p\) is the solution of the costate dynamics equation with ending condition (41). We expand it at the first order with respect to \( \alpha\) :
(88)
We then separate the expansion terms \( p_{0}\) and \( p_{1}\) into low (\( LP\) ) and high (\( HP\) ) frequencies components:
(89)
Because of the equations (88) and (89), and because the derivative of the high frequency signal \( \tilde{p_{1}}\) is in \( \frac{1}{\alpha}\) , the derivative of \( p\) has the following expansion in \( \alpha\) :
(90)
On the other hand, \( p\) is the solution of the costate dynamics equation with ending condition (41). Moreover, we have defined the developments in \( \alpha\) of \( x\) , \( p\) and \( u\) .
Thus we have another development of \( \frac{dp}{dt}\) at the first order in \( \alpha\) :
(91)
Consequently, identifying the zero order terms in equations (90) and (91), we have:
But by definition of \( \bar{p_{0}}\) as the low frequency part of \( p_{0}\) , we have:
Consequently, by definition of \( HP\) , we have:
with final value \( 0\) . Integration of the previous equation yields
But \( I[-H_{x},x_{0},p_{0},u_{0}]\) is of order 1 in \( \alpha\) and \( \tilde{p_{0}}\) is of order 0. Thus \( \tilde{p_{0}}=0\) , which gives \( p_{0}=\bar{p_{0}}\) , the solution of the averaged costate equation. Consequently, both definitions of \( p_{0}\) are consistent.
Moreover, we have a definition of \( \tilde{p_{1}}\) consistent with definition 5:
so that the derivative of \( \tilde{p_{1}}\) is \( -\frac{1}{\alpha}HP[H_{x},u_{0}](x_{0},p_{0},t)\) , that is in \( \frac{1}{\alpha}\) .
We proceed by contradiction.
Let’s suppose that (49) is not true. Then there exists a negative eigenvalue \( - \gamma\) of \( H_{\sigma \sigma}(x,u,p_{0},t)\) , that is, there exists an eigenvector \( \left[ \begin{array}{c} y \\ v \\ \end{array} \right]\) such that:
This implies that the two following equations hold:
(92)
(93)
But \( (\gamma Id + H_{uu}) \ge ( \gamma + \beta ) Id > 0 \) , so that it is invertible and the equation (93) can be solved for \( v\) , giving:
(94)
Then, replacing \( v\) by its value in the equation (92), we have:
(95)
But as \( \gamma > 0\) , we have the succession of inequalities:
then
and then
Thus the equation (95) cannot hold, because \( -\gamma < 0\) cannot be an eigenvalue.
Proposition 7 is thus proved by contradiction.
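The argument has the structure of a Schur-complement test. As a sanity check in the simplest setting, the sketch below takes scalar blocks \( a=H_{xx}\) , \( b=H_{xu}\) , \( c=H_{uu}\) and assumes (this is an assumption about the form of the hypotheses, which are not reproduced in this section) that \( c \ge \beta > 0\) and \( a - b^{2}/c \ge 0\) . It then verifies that the smallest eigenvalue of the \( 2\times 2\) matrix is nonnegative. All function names and numerical values are hypothetical.

```python
import math

def smallest_eigenvalue(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) - math.sqrt((0.5 * (a - c)) ** 2 + b * b)

def schur_complement(a, b, c):
    """Schur complement of the block c in [[a, b], [b, c]] (requires c > 0)."""
    return a - b * b / c

beta = 0.5  # hypothetical convexity constant
samples = [(1.0, 1.0, 2.0), (4.0, -2.0, 1.0), (0.5, 0.5, 0.5)]

results = []
for a, b, c in samples:
    # Assumed hypotheses: strong convexity in u and nonnegative Schur complement.
    assert c >= beta and schur_complement(a, b, c) >= 0.0
    results.append(smallest_eigenvalue(a, b, c))
print(results)
```

The second and third samples have a zero Schur complement, which is the borderline case where the smallest eigenvalue is exactly zero.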
Proposition 12 is important since it bounds \( r\) and \( v\) with respect to \( z\) and \( \alpha\) . To obtain a bound on \( r\) and \( v\) which only depends on \( \alpha\) , it will be enough to provide a bound of \( z\) involving \( \alpha\) only.
Proposition 22
The dynamics of \( r\) is the following:
(96)
Proof
By definition of \( r\) and \( v\) , we have:
Moreover:
(97)
But a Taylor expansion of \( f(x_{0}+ \alpha ( \tilde{x_{1}} + y_{1}), u_{0} + \alpha v_{1}, t)\) gives:
(98)
Introducing equation (98) in equation (97) proves equation (96).
Proposition 23
The following inequality holds:
(99)
Proof
Equation (45) about the upper bound of \( f_{\sigma \sigma}\) leads to:
Hence:
(100)
Moreover, the function \( f\) is Lipschitz in \( x\) and \( u\) , with Lipschitz constant given by the bound \( k\) on the derivatives, so that:
(101)
The equations (100) and (101) in the equation (96) give, together with the fact that \( r(0)=0\) the following inequality:
(102)
But by definition of \( Z[\lambda,\mu]\) , we have:
(103)
so that:
But the assumption 3 proves that \( H_{uu}\) is invertible and that \( H_{uu}^{-1}\) is bounded by \( \frac{1}{\beta}\) , so that:
(104)
Including equation (104) in equation (102) leads to:
(105)
But the Cauchy-Schwarz inequality leads to:
(106)
Including equation (106) in equation (105) leads to:
(107)
Equation (107), together with Gronwall's lemma, proves equation (99).
Let’s take the square of equation (99):
Let’s now multiply by \( \lambda\) and integrate with respect to \( \lambda\) and \( \mu\) between \( 0\) and \( 1\) :
Multiplying that equation by 2 proves equation (56).
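The factor 2 comes from the weight \( \lambda\) . As a short check, for any quantity \( B\) that does not depend on \( (\lambda,\mu)\) (here \( B\) is a placeholder for such a term, for instance the squared sup norm on the left-hand side):

\[
\int_{0}^{1}\!\!\int_{0}^{1} \lambda \, d\lambda \, d\mu = \frac{1}{2},
\qquad\text{hence}\qquad
2\int_{0}^{1}\!\!\int_{0}^{1} \lambda \, B \, d\lambda \, d\mu = B .
\]

Terms that do depend on \( (\lambda,\mu)\) keep their weighted double integral, which is presumably how \( z^{2}\) enters the final bound.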
Let’s apply the triangle inequality for the \( \mathbf{L}_{2}\) norm to the expression of \( v\) (103):
But:
Thus, taking the squares:
Let’s multiply by \( \lambda\) and integrate with respect to \( \lambda\) and \( \mu\) between \( 0\) and \( 1\) :
Multiplying that equation by 2 proves equation (57).
We now turn our attention to the variable \( r\) , which can be approximated by the variable \( r_{1}\) , up to an error which is bounded by a term that depends only on \( z\) and \( \alpha\) .
The variable \( r\) follows the dynamics (96) with \( r(0)=0\) , and \( r_{1}\) follows the dynamics (58).
Thus \( r-r_{1}\) follows the dynamics:
Thus, in a similar way as for the upper bound of \( r\) , the following inequality holds:
The equation (59) follows from Gronwall's lemma.
The equation (60) is then the consequence of equations (56) and (59), together with:
Proposition 24
\( L(x,u,t)\) expands the following way:
(108)
Proof
It is a Taylor expansion of \( L(x,u,t)\) with integral remainder.
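For reference, the scalar version of a second-order Taylor expansion with integral remainder, written with the double integral in \( (\lambda,\mu)\) and the weight \( \lambda\) that appear in the bounds above, is recalled below. It is a standard identity for a twice continuously differentiable function \( g\) . The symbols \( g\) , \( a\) and \( h\) are placeholders, and matching the evaluation point \( a+\lambda\mu h\) with the paper's \( \rho(\lambda,\mu)\) is an assumption.

\[
g(a+h) \;=\; g(a) + g'(a)\,h + h^{2}\int_{0}^{1}\!\!\int_{0}^{1} \lambda \, g''(a+\lambda\mu h)\, d\mu \, d\lambda .
\]

For instance, with \( g''\equiv 1\) the remainder equals \( \frac{h^{2}}{2}\) , and with \( g(s)=s^{3}\) it equals \( 3ah^{2}+h^{3}\) , as expected.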
Proposition 25
The dynamics \( \frac{d \tilde{x}}{dt}\) of \( \tilde{x}\) expands the following way:
(109)
Proof
By definition of \( \tilde{x}\) , we have
so that:
The equation (109) follows by Taylor expansion of \( f(x,u,t)\) with integral remainder.
Proposition 26
\( L(x,u,t)\) can be rewritten as follows:
(110)
Proof
We have the following identities:
and:
Moreover, because of the stationarity condition of the averaged problem, we have:
Finally we change the integral remainder of the expansion of \( L(x,u,t)\) in equation (108) with:
Thus equation (108) leads to:
Inserting equation (109) in this equation proves proposition 26.
Proposition 27
The following equality holds:
(111)
Proof
Let’s integrate by parts:
This leads to the equation (111) because \( \tilde{x}(0)=0\) , \( p_{0}(T)=0\) and:
Now inserting the equation (111) in the equation () integrated between \( 0\) and \( T\) leads to equation (64), which ends the proof of proposition 14.
The remaining appendix sections are devoted to the derivation of a lower bound on \( J(u)\) . This is achieved by obtaining a lower bound on each term of the expansion (64) of \( J(u)\) . Only the complicated terms require an extensive study; this study is detailed in sections F, G, H and finally I.
Proposition 28
The following inequality holds:
(112)
Proof
By definition of \( r\) , we have:
So that
But because of equation (59), we have
Moreover, by definition of \( \alpha \tilde{p_{1}}\) , we have:
So that the following inequality holds:
If we integrate by parts and use the fact that \( \tilde{p_{1}}(T)=\tilde{x_{1}}(0)=y_{1}(0)=0\) , we get:
\( y_{1}\) follows the dynamics (52) and \( \|\tilde{p_{1}}\|_{\infty} \leq 1\) , so that:
So that the inequality (112) is proved.
Proposition 29
The dynamics of \( r_{1}\) expands the following way:
(113)
Proof
The dynamics of \( r_{1}\) (equation (58)) is the difference between the quantities \( f(r_{1}+x_{0}+\alpha(x_{1}+y_{1}),v+u_{0}+\alpha v_{1},t)\) and \( f(x_{0}+\alpha(x_{1}+y_{1}),u_{0}+\alpha v_{1},t)\) .
Let’s make the Taylor expansions of these quantities at \( (x_{0},u_{0})\) :
and
Taking the difference of the two expansions and simplifying gives the dynamics (113) of \( r_{1}\) .
The consequence of that dynamics expansion is that:
(114)
Proposition 30
Let’s define the terms:
and
Then the following inequalities hold:
(115)
and
(116)
Proof
Equation (45) about the upper bound of \( f_{\sigma \sigma}\) leads to:
Thus, together with the fact that \( \|\tilde{p_{1}}\|_{\infty} \leq 1\) , we have:
Equations (60) and (57) then lead to equation (115).
Equation (45) about the upper bound of \( f_{\sigma \sigma}\) leads also to:
That proves equation (116) by triple integration and multiplication by \( \alpha^{3}\) .
The equations (115) and (116) included in equation (114) lead to:
Introducing that equation into equation (112) leads to equation (65) and thus the proposition 15.
Proposition 31
The following inequality holds:
(117)
Proof
Let’s use the fact that:
to expand \( H_{\sigma \sigma}(\rho(\lambda,\mu),p_{0},t)(\delta \sigma)^{2}\) :
(118)
The second term is easily upper bounded in absolute value (equation (46) about the upper bound of \( H_{\sigma\sigma}\) ):
so that:
(119)
To upper bound the first term, let’s expand it in its components:
(120)
Now let’s use the definition of \( Z[\lambda,\mu](t) = v +[H_{uu}^{-1}H_{ux}](\rho(\lambda,\mu),p_{0},t)(r+\alpha \tilde{x_{1}})\) to expand the terms in \( v\) and \( v^{2}\) :
(121)
and
(122)
Introducing equations (121) and (122) in equation (120) leads to:
Because of assumption 3 on the convexity, this proves that:
so that:
(123)
Including equations (123) and (119) into equation (118) leads to equation (117).
Proposition 32
Let’s denote:
Then the following bound holds:
(124)
Proof
Let’s expand \( \left[y_{1},v_{1}\right] [H_{\sigma \sigma}(\rho(\lambda,\mu),p_{0},t)-H_{\sigma \sigma}(w_{0},t)] \left[ \begin{array} {c} r+\alpha\tilde{x_{1}} \\ v \\ \end{array} \right]\) into its coordinates:
(125)
But the second derivatives of \( H(x,u,p_{0},t)\) are Lipschitz in \( (x,u)\) of Lipschitz constant \( K\) because the third derivatives of \( H(x,u,p_{0},t)\) are bounded by \( K\) . Thus with the definition of \( \rho(\lambda,\mu)\) and \( w_{0}\) , equation (125) leads to:
Thus, with a triple integration after multiplication by \( \lambda\) , we get:
Then, using equations (56) and (57) leads to equation (124).
Applying propositions 31 and 32 leads to:
That proves proposition 16 because \( H_{0 \sigma \sigma}=H_{\sigma \sigma}(w_{0},t)\) does not depend on \( \lambda\) and \( \mu\) .
Thanks to the fact that \( r+\alpha\tilde{x_{1}}= r_{1}+((r-r_{1})+\alpha\tilde{x_{1}})\) , we have \( R=R_{3}+R_{4}\) , where \( R_{3}\) and \( R_{4}\) are defined as:
and
Proposition 33
The following inequality holds:
(126)
Proof
Let’s develop \( \left[y_{1},v_{1}\right] H_{0 \sigma \sigma} \left[ \begin{array} {c} r_{1} \\ v \\ \end{array} \right]\) into coordinates:
so that:
Now let’s use the costate dynamics (54) of the auxiliary problem, together with its stationarity condition (53). Then we get:
An integration by parts, together with the fact that \( r_{1}(0)=q_{1}(T)=0\) leads to:
But \( r_{1}\) follows the dynamics (58), so that a Taylor expansion of \( f(r_{1}+x_{0}+\alpha(\tilde{x_{1}}+y_{1}),v+u_{0}+\alpha v_{1},t)\) at \( (x_{0}+\alpha(\tilde{x_{1}}+y_{1}),u_{0}+\alpha v_{1})\) leads to:
so that, thanks to equation (45) about the bound of \( f_{\sigma\sigma}\) :
An integration on \( [0,T]\) and a multiplication by \( \alpha\) lead to
That equation, together with the bounds on \( r_{1}\) (60) and \( v\) (57), leads to equation (126).
Proposition 34
The following inequality holds:
(127)
Proof
Let’s develop \( \left[y_{1},v_{1}\right] H_{0 \sigma \sigma} \left[ \begin{array} {c} (r-r_{1})+\alpha\tilde{x_{1}} \\ 0 \\ \end{array} \right]\) into coordinates:
Thus, thanks to the bound on \( r-r_{1}\) (59), its absolute value can be bounded:
An integration between \( 0\) and \( T\) and a multiplication by \( \alpha\) lead to equation (127).
This is a consequence of equations (126) and (127), together with the fact that \( R=R_{3}+R_{4}\) .
[1] Averaging Methods in Nonlinear Dynamical Systems Springer-Verlag 1985
[2] Les méthodes nouvelles de la mécanique céleste Gauthier-Villars 1892
[3] Mathematical Methods of Classical Mechanics Springer-Verlag 1978
[4] Geometrical Methods in the Theory of Ordinary Differential Equations Springer-Verlag 1983
[5] An Averaging Theorem for Two-Point Boundary Value Problems with Applications to Optimal Control Journal of Mathematical Analysis and Applications 1976 55 46-60 10.1016/0022-247X(76)90277-8
[6] Averaging and Deterministic Optimal Control SIAM Journal on Control and Optimization 1986 25 3 767-780 May 10.1137/0325044
[7] Averaging In Lagrange And Minimax Problems Of Optimal Control SIAM Journal on Control and Optimization 1993 31 6 1630-1652 November 10.1137/0331077
[8] Occupational Measures Formulation And Linear Programming Solution Of Deterministic Long Run Average Problems Of Optimal Control Proceedings of the 45th IEEE Conference on Decision and Control 2006 5012-5017 10.1109/CDC.2006.377568
[9] Linear Programming Approach To Deterministic Long Run Average Problems Of Optimal Control SIAM Journal on Control and Optimization 2006 44 6 2006–2037 10.1137/040616802
[10] Duality In Linear Programming Problems Related To Deterministic Long Run Average Problems Of Optimal Control SIAM Journal on Control and Optimization 2008 47 4 1667–1700 10.1137/060676398
[11] Duality in Linear Programming Problems Related to Deterministic Long Run Average Problems of Optimal Control with Applications to Periodic Optimization Proc. of the Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference 2009 1207-1211 10.1109/CDC.2009.5400261
[12] Averaging of the Problem of Optimal Control on Time Scales Journal of Mathematical Sciences 2016 212 3 290-304 January 10.1007/s10958-015-2665-1
[13] Energy minimization of single input orbit transfer by averaging and continuation Bull. Sci. math. 2006 130 707–719 10.1016/j.bulsci.2006.03.005
[14] Averaging and optimal control of elliptic Keplerian orbits with low propulsion Systems and Control Letters 2006 55 755–760 10.1016/j.sysconle.2006.03.004
[15] Riemannian metric of the averaged energy minimization problem in orbital transfer with low thrust Ann. Inst. H. Poincaré, Analyse non linéaire 2007 24 395–411 10.1016/j.anihpc.2006.03.013
[16] Optimal Low-Thrust Transfers With Constraints-Generalization Of Averaging Techniques Acta Astronautica 1997 41 3 133-149 10.1016/S0094-5765(97)00208-7
[17] Optimal Control for Engines with Electro-Ionic Propulsion Under Constraint of Eclipse Acta Astronautica 2001 48 4 181–192 10.1016/S0094-5765(00)00158-2
[18] Model reduction and model predictive control of energy-efficient buildings for electrical heating load shifting Journal of Process Control 2019 74 23-34 10.1016/j.jprocont.2018.03.007
[19] Optimal bang-bang control of a mechanical double oscillator using averaging methods 9th IFAC Vienna International Conference on Mathematical Modelling, MATHMOD 2018 2018 10.1016/j.ifacol.2018.03.009
[20] Average cost optimal policies for Markov control processes with Borel state space and unbounded costs Systems and Control Letters 1990 15 349-356 10.1016/0167-6911(90)90108-7
[21] Average optimal stationary policies: convexity and convergence conditions in linear stochastic control systems Proc. of the joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference 2009 3388-3393 10.1109/CDC.2009.5400501
[22] Averaging, Aggregation And Optimal Control Of Stochastic Hybrid Systems With Singularly Perturbed Markovian Switching Behavior Proceedings of the American Conference 1994 1994 1868-1872
[23] On representation formulas for long run averaging optimal control problem Journal of Differential Equations 2015 259 5554–5581 10.1016/j.jde.2015.06.039
[24]
[25] Redundant wavelet processing on the half-axis with applications to signal denoising with small delays: Theory and experiments International Journal on Adaptive Control and Signal Processing 2006 20 9 447 - 474 10.1002/acs.911
[26] Transfert orbital d'un engin a faible poussee MINES ParisTech 2015
[27] Impact of regular perturbations in input constrained optimal control problems Optimal Control, Applications and Methods 2020 41 4 1321-1351 July 10.1002/oca.2605