Abstract
In this paper, the consensus problem for general linear time-invariant (LTI) multi-agent systems (MASs) with a single input is studied in a new optimal control framework. The optimal cooperative control law is designed from a modified linear quadratic regulator (LQR) method and an inverse optimal control formulation. Three cost function terms are constructed to address consensus, control effort, and cooperative tracking, respectively. This approach has three distinct features. First, the optimal feedback control law is derived analytically without involving any numerical solution. Second, the formulation guarantees both asymptotic stability and optimality. Third, the cooperative control law is distributed and requires only local information, determined by the communication topology, to enable the agents to achieve consensus and track a desired trajectory. The performance of this optimal cooperative control method is demonstrated through an example of attitude synchronization of multiple satellites.
1 Introduction
The cooperative control problem for multi-agent systems (MASs) has received tremendous attention in the last two decades owing to its wide range of applications in mobile robots, unmanned aerial vehicles, autonomous underwater vehicles, spacecraft, and automated highway systems [1]. Distributed multi-agent cooperative systems are enabled by rapid progress in communication, sensing, and actuation. Compared to single-agent systems, cooperative teamwork provides much more flexibility and robustness in accomplishing missions. A comprehensive overview of the progress in multi-agent coordination has been provided by Cao et al. [2].
The technical core of cooperative control of MASs is the concept of consensus. Consensus among agents is equivalent to achieving a common state by neighbor-to-neighbor interaction and information exchange, which can be modeled by graph theory. The classical consensus problem and the associated graph and matrix theories were introduced and extensively studied for single- or double-integrator systems [3–7]. The basic consensus algorithm was generalized to higher-order systems in Refs. [8–10]. Other recent progress on average consensus algorithms includes novel weighting strategies [11] and new classes of fixed-time protocols [12]. Considering the constraints of real applications, such as limited communication range and bandwidth, the concept of switching topology (instead of time-fixed topology) emerged, and the conditions for achieving consensus under switching or time-varying topologies were investigated [13,14]. In addition, the leader–follower algorithm, the most straightforward approach for solving the tracking problem, has been studied [15–17].
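The neighbor-to-neighbor interaction described above can be sketched for the simplest setting, a single-integrator MAS under the classical protocol. The topology and initial states below are illustrative assumptions, not taken from the paper, which treats general LTI agents:

```python
import numpy as np

# Hypothetical 4-agent undirected ring topology (illustrative only;
# the paper covers general LTI agents, not just single integrators).
Adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj      # graph Laplacian: L = D - A

# Classical consensus protocol for single integrators: x_dot = -L x.
x = np.array([1.0, -2.0, 0.5, 3.0])     # arbitrary initial states
dt = 0.01
for _ in range(5000):                   # forward-Euler integration
    x = x + dt * (-L @ x)

print(np.round(x, 3))  # all states near the average of the initial states
```

For an undirected connected graph this protocol drives every state to the average of the initial conditions, which is the "common state" the consensus literature refers to.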
Multi-agent cooperative control has also been investigated from the optimization perspective. The main research along this line has focused on finding the most efficient network configuration to attain a desired property. Kim and Mesbahi [18] derived the fastest convergence rate by maximizing the second smallest eigenvalue of the Laplacian matrix. Cao and Ren [19] investigated the optimal topology (optimal Laplacian matrix) for a given performance index via the linear quadratic regulator (LQR) method; however, only single-integrator systems were considered. That work addressed the consensus problem from an inverse optimal control perspective in the sense that any symmetric Laplacian matrix is optimal with respect to a properly chosen cost function. In Ref. [20], synchronization of identical general linear systems based on the LQR was proposed, with an active leader node or control node generating the desired tracking trajectories. The solution to the algebraic Riccati equation (ARE) was used to design the feedback control gain such that the synchronization error dynamics is asymptotically stable; however, the cost function was not clearly defined. Tuna [21] showed that a linear feedback control law obtained from the LQR-based ARE can synchronize agents if the identical linear agent dynamics are stabilizable. Similar to Ref. [20], there was no discussion of the optimality or the physical meaning of the cost function associated with the ARE. Wang and Xin [22] proposed an inverse optimal control (introduced by Bernstein [23]) for a system of agents to address not only consensus but also obstacle/collision avoidance. A similar approach was applied to cooperative control of multiple autonomous robots [24] and to the flocking problem [25]; however, the same simplified double-integrator dynamics was assumed for the MAS model in these works. In Ref. [26], optimal Laplacian matrices were obtained for second-order systems under independent position and velocity topologies using the LQR. Recently, many innovations have been made in solving the consensus problem by optimal control theory. Xie and Lin [27] solved the problem of global optimal consensus for higher-order integrators with bounded controls, starting from arbitrary initial states and in the presence of actuator saturation. Dehshalie et al. [28] used optimal control theory to design a fault-tolerant control law for MASs with single- and multi-input actuators under both directed and undirected communication topologies. A numerical approach was developed by Bailo et al. [29] to solve the consensus problem for nonlinear multi-agent systems of Cucker–Smale type, with the first-order optimality conditions solved by the Barzilai–Borwein gradient descent method. More recently, model predictive control has been used to solve cooperative control problems such as controlling connected and autonomous vehicles in intelligent transportation systems [30].
In this study, the main contribution is generalizing the approach in Refs. [22], [24], and [25] to solve the consensus problem for general single-input linear time-invariant (LTI) systems. The proposed method is developed from a modified LQR and an inverse optimal control formulation to achieve optimal consensus and trajectory tracking, respectively. The main features of this work can be summarized as follows: (1) the proposed optimal cooperative control can be applied to general LTI systems (rather than integrator systems only); (2) the control law is guaranteed to be both stabilizing and optimal; (3) the optimal control law is obtained in an analytical form without using any iterative numerical algorithm; and (4) the cooperative control law is a linear function of the Laplacian matrix and is thus distributed, needing only neighboring agents' information.
The remainder of this paper is organized as follows: In Sec. 2, some preliminaries and fundamental concepts of graph theory are reviewed. The system is described and the problem is formulated in Sec. 3. Main results are presented in Sec. 4. The performance of the proposed method is demonstrated via an attitude synchronization example in Sec. 5, and concluding remarks are given in Sec. 6.
2 Preliminaries
3 Problem Statement and Formulation
A general controllable LTI system is said to achieve consensus when all agents' states converge to the same common value (i.e., x_i(t) - x_j(t) → 0 as t → ∞ for all i, j) [1], where x_i is the state vector of agent i.
where x_{j,i} represents the ith state of the jth agent.
where n is the dimension of a single agent's system and U is the stacked control input vector for the agents.
Consensus is said to be reached if the system (8) is asymptotically stable.
where the a_i's are the coefficients of the characteristic equation of the state matrix A.
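The coefficients of the characteristic equation of a state matrix can be checked numerically. A minimal sketch, with an arbitrary illustrative matrix standing in for the agent's state matrix:

```python
import numpy as np

# Illustrative single-agent state matrix (values are assumptions, not
# from the paper); its characteristic equation is det(sI - A) = 0.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# np.poly applied to a square matrix returns the coefficients of its
# monic characteristic polynomial, highest degree first.
coeffs = np.poly(A)
print(coeffs)  # s^2 + 3s + 2  ->  approximately [1., 3., 2.]
```

Here the coefficients 3 and 2 play the role of the a_i's that enter the companion-form consensus-error dynamics.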
The consensus problem becomes finding a feedback control input such that the system (20) is asymptotically stable.
where Q is a diagonal and positive semidefinite matrix. An approach for constructing this matrix will be shown in Sec. 4.
where R is positive definite and r is the weighting parameter.
where is constructed by an inverse optimal control approach and contains the tracking penalty function, which will be described in Sec. 4.
4 Main Results
4.1 Optimal Control Solution.
The following Lemma is used in this paper to prove the asymptotic stability and optimality of the proposed cooperative control law.
where H is the Hamiltonian function. A superscript denotes partial differentiation with respect to that variable.
The solution of the closed-loop system is globally asymptotically stable.
Proof. Refer to Ref. [23].▪
and is the reference trajectory that the system is expected to follow. G will be shown to be positive semidefinite after Lemma 4.2. Combining this tracking penalty function with the consensus cost function enables the system of agents to follow a specified desired trajectory consensually. Note that if the communication topology is connected, it is sufficient for only one agent to have access to the reference in order for the entire system to follow the desired trajectory.
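The claim that one informed agent suffices under a connected topology can be sketched with a pinned single-integrator model. The path topology, pinning gain, and constant reference below are illustrative assumptions, much simpler than the paper's general LTI tracking formulation:

```python
import numpy as np

# Path graph 1-2-3-4: connected, but only agent 1 sees the reference.
Adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj
G = np.diag([1.0, 0.0, 0.0, 0.0])   # pinning matrix: reference access

r = 2.0                              # constant reference state (assumed)
x = np.array([5.0, -1.0, 0.0, 3.0])
dt = 0.01
for _ in range(10000):
    # Consensus term plus a tracking penalty felt only by agent 1.
    x = x + dt * (-(L + G) @ x + G @ (r * np.ones(4)))

print(np.round(x, 3))  # every agent settles at the reference value 2.0
```

Because L + G is positive definite when the graph is connected and at least one agent is pinned, the tracking information propagates from agent 1 to the rest of the team.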
In order to investigate the eigenvalues of G, the following lemma from exterior algebra is required.
where Λ^k(G) is the kth exterior power of G, regarded as a linear mapping on the kth exterior power of the underlying space. The eigenvalues of Λ^k(G) are the k-fold products of the eigenvalues of G, and Λ^k(G) = 0 for k > r, where r is the rank of G.
which implies that the set of eigenvalues of G contains only zeros and positive values; therefore, G is positive semidefinite.
Before providing the main results, the following two Lemmas are introduced.
Lemma 4.3. L is positive semidefinite and has exactly one zero eigenvalue if the graph is undirected and connected.
Proof. Refer to Ref. [22].▪
where α and β are constant numbers, and λ_i is the ith eigenvalue of L.
where λ_i is the ith eigenvalue of L. Since L is the Laplacian matrix for a connected and undirected graph, it is positive semidefinite and λ_i ≥ 0. Therefore, αL + βI is positive definite if α ≥ 0 and β > 0.
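The Laplacian properties invoked above can be verified numerically for any particular graph. A small check, using a hypothetical connected, undirected 5-agent topology chosen arbitrarily:

```python
import numpy as np

# Hypothetical connected, undirected 5-agent adjacency matrix.
Adj = np.array([[0, 1, 1, 0, 0],
                [1, 0, 1, 0, 0],
                [1, 1, 0, 1, 0],
                [0, 0, 1, 0, 1],
                [0, 0, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj

eig = np.linalg.eigvalsh(L)   # symmetric matrix: real sorted eigenvalues
print(np.round(eig, 4))
# Smallest eigenvalue is 0 (eigenvector of all ones); the rest are
# positive because the graph is connected, so L is positive semidefinite
# with exactly one zero eigenvalue.
```

The second smallest eigenvalue is the algebraic connectivity mentioned in the literature review; it is strictly positive exactly when the graph is connected.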
▪
The main result of this paper is presented in the following theorem.
where and is the derivative of the tracking penalty function with respect to the last states of the agents.
For V to be a valid Lyapunov function, it should be continuously differentiable with respect to the state, which is obvious from the definition of V.
The importance of Eq. (60) is that it allows us to determine and thus the cost function term .
The system of algebraic equations (62) allows us to solve the ARE analytically. P is determined by the following steps.
- The equations for the entries on the main diagonal are solved first, yielding Eqs. (63) and (64).
- The equations for each row in the system (62) are solved one by one; for example, the first row leads to Eq. (65).
All the row equations are solved in a similar way to obtain all the entries of P. Note that all the entries are obtained analytically, without any numerical approximation or iteration.
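For contrast, the conventional numerical route that the analytical solution avoids can be sketched with SciPy's ARE solver. The system matrices below are an illustrative double integrator with unit weights, assumptions standing in for the paper's consensus-error dynamics:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Conventional numerical ARE solution:
#   A^T P + P A - P B R^{-1} B^T P + Q = 0
# for an arbitrary illustrative single-input system.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])          # double integrator (assumption)
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)     # optimal gain K = R^{-1} B^T P

# The ARE residual vanishes at the solution.
res = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print(np.abs(res).max())            # ~0
```

For this particular system the ARE also has the closed-form solution P = [[√3, 1], [1, √3]], which is the kind of analytical answer the paper's construction of the weighting matrix makes available for the whole multi-agent system.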
From the above solutions, it can be seen that every entry of P can be written as a linear combination of the Laplacian matrix and the identity matrix (i.e., of the form αL + βI, where both α and β are functions of the state and control weighting parameters). Therefore, according to Lemma 4.4, one can always choose the weighting parameters such that the required positive semidefiniteness and positive definiteness conditions hold.
Since is positive definite and is positive semidefinite, the condition (29) is satisfied.
where is the vector of the kth state of all the agents. Equation (70) shows that one can always find proper weights such that . Specifically, for a given set of state weighting parameters (i.e., ), the control weight parameter can be chosen small enough such that the positive term is always greater than the other terms that are sign-indefinite.
The conditions (28) and (30) are satisfied if and when . Note that after reaching consensus, when the agents go along a desired trajectory, we have: , which indicates when . Therefore, according to the definition of , when , and the condition (28) holds.
It is obvious that when all the agents go along the desired trajectory, Eq. (71) is zero and the condition (30) holds.
Now, all the conditions (28)–(33) in Lemma 4.1 are verified; therefore, the control law is an optimal control that minimizes the cost in Eq. (22). Meanwhile, the closed-loop system (20) is asymptotically stable. Furthermore, the Lyapunov function (54) is radially unbounded (V → ∞ as the state norm tends to infinity); thus, the closed-loop system is globally asymptotically stable.
where and .
where is defined by Eq. (50).
▪
Remark 4.1. In the conventional optimal control approach, the cost functional is given a priori and the optimal control law is derived by minimizing it. The approach described in Theorem 4.1 is instead an inverse optimal control approach: a Lyapunov function is constructed first based on the stability conditions (28), (29), and (31), and then the optimal control law is derived from the optimality condition. By satisfying the optimality condition (32), the corresponding term in the cost functional is constructed while the stability condition (31) is satisfied. In other words, the cost function is not specified a priori as in the conventional optimal control design; it is constructed inversely from the stability and optimality conditions (28)–(33). The benefit of this design is that the resulting control law is guaranteed to be both stabilizing and optimal. In addition, according to Lemma 4.1, the minimum cost achieved by the derived control law equals the introduced Lyapunov function, as shown in Eq. (35).
4.2 Distributed Feature of the Optimal Control Law.
where are the entries of . Note that is in the form of and thus the product of in (Eq. (50)) only keeps the last column of , i.e., , which is calculated by Eq. (64). Since 's are always linear functions of the Laplacian matrix as shown in Eq. (64), i.e., , when they are multiplied by the state vector X in Eq. (79), the feedback information exchange to implement the optimal control occurs only between the agent and its neighbors with whom it has the communication links defined by the Laplacian matrix . The other terms consisting of 's (coefficients of the characteristic equation of ) and identity matrices in Eq. (79) relate to the agent's own state. The second term of in Eq. (49) only relates to the state of the agent that has access to the reference. Therefore, from the above discussions, the entire optimal cooperative control law is distributed since each agent's control law only needs local information from its own and its communicating neighbors.
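The distributed property argued above can be demonstrated concretely: when the feedback contains LX, perturbing a non-neighbor's state leaves an agent's control unchanged. The scalar-agent gain structure below (a scalar weight times L) is an illustrative assumption, not the paper's full gain from Eq. (79):

```python
import numpy as np

# Path topology 1-2-3-4: agents 1 and 4 are NOT neighbors.
Adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj
c = 0.8                              # illustrative feedback weight

def control(X):
    # Feedback of the form -c * L X: u_i uses only agent i's own state
    # and the states of its communication neighbors.
    return -c * (L @ X)

X = np.array([1.0, 2.0, 3.0, 4.0])
u = control(X)

# Perturb agent 4, a non-neighbor of agent 1: u_1 is unchanged,
# while agent 3 (a neighbor of agent 4) sees a different control.
X2 = X.copy()
X2[3] += 10.0
u2 = control(X2)
print(u[0] == u2[0])
```

Row i of L is nonzero only in the columns of agent i and its neighbors, which is exactly why a Laplacian-structured gain can be implemented with local communication only.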
4.3 Discussion of Optimal Cooperative Control Design.
In this section, the principle of the proposed optimal cooperative control design, including the motivation for the inverse optimal control, is recapitulated. The optimal control formulation includes three cost function terms (Eqs. (23)–(25)). Note that the first two are based on the normal LQR formulation to achieve the optimal consensus objective, while only the third is designed using the inverse optimal control approach to achieve the optimal tracking. This formulation is based on the following objectives: (1) the control law is in an analytical form, obtained without iterative numerical algorithms; (2) the control law must be distributed so that each agent only needs information from its neighbors according to the communication topology; (3) the control law must guarantee both stability and optimality; and (4) the agents can consensually track a designated desired trajectory.
The mission to achieve the above objectives is accomplished in two parts: in the first, consensus is reached among the agents; in the second, the agents that have access to the reference trajectory communicate with those that do not, so that the entire system tracks the desired trajectory. The first cost function term is responsible for the consensus objective, while the third term accounts for the tracking objective by including the tracking penalty function (38). Note that in the absence of the tracking penalty function, only consensus can be achieved, without tracking capability. It is worth mentioning that the consensus state converges to the reference trajectory asymptotically, but during the mission the reference is only known to a subgroup of the agents and is not known a priori to the rest.
The motivation for adopting the inverse optimal control to construct the cost function can be recapitulated as follows: (1) Optimizing both consensus and tracking in order to achieve the above four objectives is difficult to formulate with one unified cost function. The difference between the two forms of the error vectors, i.e., Eq. (7) for consensus and the tracking error for tracking, makes it hard to combine the consensus cost and the tracking cost into one unified cost function and then use the conventional optimal control approach to derive an analytical distributed cooperative control law. (2) A distributed cooperative control law must be a function of the Laplacian matrix L that specifies the communication topology, and LX should appear in the feedback term of the control law. The specially constructed weighting matrix (Eq. (61)) in the consensus cost function enables the solution to the ARE, i.e., P, to contain the Laplacian matrix L. Specifically, Eq. (64) allows LX to appear in the final control law, as discussed in Sec. 4.2. However, if the tracking penalty (38) were included in the consensus cost in a conventional way, the control law would not contain LX in the feedback term, and this crucial distributed property would be destroyed. (3) The inverse optimal control formulation can guarantee both stability and optimality, as shown in Theorem 4.1. The Lyapunov function guaranteeing closed-loop stability is also the optimal cost of the controlled system, as shown in Eq. (35); this is the main benefit of this design. (4) The inverse optimal control enables us to derive an analytical optimal control law that does not need any numerical iteration. Normally, for the conventional optimal control, numerical methods are needed to solve the ARE. With the proposed approach, constructing the weighting matrix in the novel form of Eq. (61) allows one to solve the ARE analytically, as shown in Eqs. (62)–(65).
If the conventional optimal control approach is applied with the tracking penalty function directly incorporated into the cost function, the above discussed benefits would not be achieved.
5 An Illustrative Example
In this section, the performance of the optimal cooperative control design is demonstrated through an example of attitude synchronization of a group of five identical satellites. A sketch of the satellite attitude control system and the model are shown in Fig. 1 [35,36]. Each satellite can be treated as two connected masses (a large mass called "the body" and an "attached mass") and modeled as a mass-spring-damper system.
where T is the control torque, J1 and J2 are the moments of inertia, k is the spring constant, and b is the viscous damping constant.
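A state-space sketch of this two-mass model can be written as follows. The symbol names and the numerical parameter values are illustrative assumptions, not those used in the paper:

```python
import numpy as np

# Two-mass satellite model: body (inertia J1, driven by torque T) and
# attached mass (inertia J2), coupled by spring k and damper b.
# Parameter values below are arbitrary assumptions for illustration.
J1, J2, k, b = 1.0, 0.1, 0.09, 0.04

# State x = [theta1, omega1, theta2, omega2]; input = torque on the body.
A = np.array([[0.0,      1.0,    0.0,     0.0],
              [-k/J1,  -b/J1,   k/J1,    b/J1],
              [0.0,      0.0,    0.0,     1.0],
              [ k/J2,   b/J2,  -k/J2,   -b/J2]])
B = np.array([[0.0], [1.0/J1], [0.0], [0.0]])

# Sanity check: with zero torque, total angular momentum
# J1*omega1 + J2*omega2 is conserved (its time derivative is zero).
m = np.array([0.0, J1, 0.0, J2])    # momentum row vector
print(np.abs(m @ A).max())          # ~0: momentum is invariant
```

This is the kind of general LTI single-input agent dynamics the proposed framework accepts, in contrast to the integrator models in the earlier literature.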
The communication topology is shown in Fig. 2. It is assumed that only Satellite 1 has access to the reference trajectory. Satellites 4 and 5 receive information from Satellite 1, but the other two satellites do not have direct communication with Satellite 1. The angular rates are expected to reach a constant value of 0.35 rad/s; therefore, the angles must increase along the ramp 0.35t rad after reaching consensus.
Figures 3–7 show the results. Figures 3 and 5 show the rotational angles of the attached masses and the bodies, respectively. It can be seen that the rotational angles converge to the reference trajectory consensually and increase along the ramp of 0.35t rad. Figures 4 and 6 show the angular rates of the attached masses and the bodies, respectively. The angular rates converge to the constant value of 0.35 rad/s. It is worth noting that the responses of Satellite 1 are very smooth and free of oscillations because it has access to the reference trajectory. Figure 7 presents the control torque responses, which show that the optimal control does not require a large control effort to achieve consensus and tracking.
6 Conclusions
In this paper, the multi-agent consensus tracking problem for a class of general linear time-invariant systems is investigated in an optimal control framework. Consensus is reached by a modified LQR formulation, and cooperative tracking is achieved by applying an inverse optimal control design to derive a proper cost function. The optimal control law has an analytical form and is a linear function of the Laplacian matrix, so its implementation is distributed: each agent needs only its own state and the states of the neighbors with which it has communication links. Both optimality and stability of the control law are proved. An attitude synchronization example is utilized to illustrate the effectiveness of the proposed optimal cooperative control.
Funding Data
USDA NIFA (Grant No. 2019-67021-28993; Funder ID: 10.13039/100005825).