Interest in fuel-efficient transport solutions continues to grow because of the major environmental impact of the transportation industry. Platooning systems offer a fuel-saving approach that is relatively simple to deploy. This paper addresses the reduction of fuel consumption attainable in platooning systems by dynamically switching between two control policies: Adaptive Cruise Control (ACC) and Cooperative Adaptive Cruise Control (CACC). The switching rule is learned with a Deep Reinforcement Learning (DRL) technique so as to cope with unpredictable platoon disturbances and to identify appropriate transient switching times while maximizing fuel efficiency. To mitigate the safety and convergence issues of DRL, the algorithm determines switching times and minimum periods of operation for the ACC and CACC controllers rather than controlling the vehicles directly. Numerical experiments show that the DRL agent outperforms both static ACC and CACC configurations as well as a threshold-logic controller in terms of fuel efficiency, while remaining robust to perturbations and satisfying safety requirements.
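As a rough illustration of the supervisory idea (a sketch, not the paper's implementation), the snippet below shows a learned agent that only requests which controller should run next, with switches gated by a minimum period of operation; the controller gains, time gaps, and dwell time are assumed values.

```python
# Minimal sketch of the ACC/CACC supervisory switching idea; gains, time gaps,
# and the minimum dwell time are illustrative assumptions, not the paper's values.

def acc_accel(gap, ego_v, lead_v, time_gap=1.2, kp=0.4, kv=0.6):
    """Constant time-gap ACC: regulate the gap towards d0 + h * ego_v."""
    desired_gap = 5.0 + time_gap * ego_v
    return kp * (gap - desired_gap) + kv * (lead_v - ego_v)

def cacc_accel(gap, ego_v, lead_v, lead_a, time_gap=0.6, ka=0.5):
    """CACC: ACC with a shorter time gap plus the leader's communicated acceleration."""
    return acc_accel(gap, ego_v, lead_v, time_gap) + ka * lead_a

def supervisor_step(agent_action, current_mode, elapsed_in_mode, min_dwell=5.0):
    """The DRL agent only requests a mode ('ACC' or 'CACC'); a switch is granted
    only after the minimum period of operation has elapsed, so the agent never
    commands the vehicles directly."""
    requested = "CACC" if agent_action == 1 else "ACC"
    if requested != current_mode and elapsed_in_mode >= min_dwell:
        return requested, 0.0
    return current_mode, elapsed_in_mode
```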
Fuel efficiency in platooning systems is a central topic of interest because of its significant economic and environmental impact on the transportation industry. In platooning systems, Adaptive Cruise Control (ACC) is widely adopted because it can guarantee string stability while requiring only radar or lidar measurements. A key parameter in ACC is the desired time gap between neighboring vehicles of the platoon. A small time gap results in a short inter-vehicular distance, which is fuel efficient when the vehicles travel at constant speed thanks to reduced air drag. When the vehicles frequently accelerate and brake, however, a larger time gap is more fuel efficient. This motivates finding a policy that minimizes fuel consumption by switching between two desired time gap parameters at suitable moments. The formulation can thus be interpreted as a dynamical system controlled by a switching ACC, and the learning problem reduces to finding a fuel-efficient switching rule. To this end, we apply a Reinforcement Learning (RL) algorithm, adopting proximal policy optimization (PPO) to learn the transient switching times that minimize the platoon's fuel consumption under stochastic traffic conditions. Numerical simulations show that the PPO policy outperforms both static time gap ACC and a threshold-based switching control in terms of average fuel efficiency.
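A minimal environment sketch of this switching formulation is given below, assuming a single follower, a crude fuel proxy, and two illustrative time-gap values; a PPO implementation from an off-the-shelf RL library would then be trained on the negative fuel reward. None of the parameters are taken from the paper.

```python
# Sketch of a gym-style environment whose discrete action picks one of two ACC
# time-gap parameters; the platoon model, fuel proxy, and parameter values are
# illustrative assumptions, not the paper's.

import numpy as np

class TimeGapSwitchEnv:
    TIME_GAPS = (0.6, 1.5)      # assumed "small" and "large" desired time gaps [s]

    def __init__(self, dt=0.1):
        self.dt = dt
        self.reset()

    def reset(self):
        self.gap, self.ego_v, self.lead_v = 30.0, 20.0, 20.0
        return np.array([self.gap, self.ego_v, self.lead_v])

    def step(self, action):
        h = self.TIME_GAPS[action]                      # chosen time gap
        a = 0.4 * (self.gap - (5.0 + h * self.ego_v)) \
            + 0.6 * (self.lead_v - self.ego_v)          # constant time-gap ACC
        a = float(np.clip(a, -3.0, 2.0))
        self.lead_v += np.random.normal(0.0, 0.3)       # stochastic traffic
        self.ego_v = max(self.ego_v + a * self.dt, 0.0)
        self.gap += (self.lead_v - self.ego_v) * self.dt
        # crude fuel proxy: drag grows with speed and gap, plus acceleration effort
        fuel = 0.01 * self.ego_v**2 * (1.0 + 0.02 * self.gap) + 0.5 * max(a, 0.0)
        reward = -fuel                                  # PPO maximizes fuel efficiency
        return np.array([self.gap, self.ego_v, self.lead_v]), reward, False, {}
```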
Imitation dynamics in population games are a class of evolutionary game-theoretic models widely used to study decision-making processes in social groups. Unlike other models, imitation dynamics require players to have only minimal information on the structure of the game they are playing, which makes them suitable for many applications, including traffic management, marketing, and disease control. In this work, we study a general setting in which both the structure of the game and the imitation mechanisms change over time due to external factors, such as weather conditions or social trends. These changes are modeled by a continuous-time Markov jump process. We present analytical tools to identify, from the model parameters, the dominant strategy that emerges from the dynamics. Numerical simulations are provided to support our theoretical findings.
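The sketch below simulates one standard imitation protocol (pairwise proportional imitation, which yields replicator-type dynamics) for two strategies while the payoff matrix jumps between two modes according to a continuous-time Markov chain; the payoff matrices and switching rates are illustrative assumptions, not taken from the paper.

```python
# Illustrative simulation: imitation (replicator-type) dynamics for two strategies
# under a payoff matrix that switches as a continuous-time Markov jump process.

import numpy as np

A = {0: np.array([[2.0, 0.0],    # payoff matrix in environmental mode 0
                  [3.0, 1.0]]),
     1: np.array([[1.0, 3.0],    # payoff matrix in environmental mode 1
                  [0.0, 2.0]])}
RATES = {0: 0.5, 1: 0.5}         # mode-leaving rates of the Markov jump process

def simulate(x0=0.5, T=50.0, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    x, mode, t = x0, 0, 0.0      # x = population share of the first strategy
    while t < T:
        p = np.array([x, 1.0 - x])
        payoffs = A[mode] @ p
        # imitation dynamics: growth proportional to payoff advantage over the average
        x += dt * x * (payoffs[0] - p @ payoffs)
        # environment jumps with (approximately) exponential holding times
        if rng.random() < RATES[mode] * dt:
            mode = 1 - mode
        t += dt
    return x                     # long-run share indicates the dominant strategy

print(simulate())
```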
This paper proposes new conditions for the design of a robust partial sampled-data state feedback control law for Markov jump linear systems (MJLS). Although, as usual, the control structure depends on the Markov mode, only the state variable is sampled, in order to cope with a specific networked control structure. For analysis, an equivalent hybrid system is proposed and a two-point boundary value problem (TPBVP) ensuring minimum H∞ or H2 cost is defined. For control synthesis, the TPBVP is rewritten as a set of sufficient convex conditions leading to a minimum guaranteed cost for the mentioned performance criteria. The optimality conditions are expressed through differential linear matrix inequalities (DLMIs), a useful mathematical device that can be handled by any available LMI solver. Examples are included for illustration.
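Since the synthesis is expressed through DLMIs, one practical way to handle them numerically is sketched below for a toy, mode-free differential Lyapunov inequality (not the paper's actual H∞/H2 conditions): grid the interval, take the matrix variable piecewise linear, and impose one LMI per grid point. The system data and horizon are assumed, and cvxpy is used as a generic SDP front end.

```python
# Sketch: handling a DLMI  dP/dt + A'P + PA + Q < 0  on an interval by gridding
# and imposing one LMI per grid point (illustrative data, not the paper's conditions).

import numpy as np
import cvxpy as cp

A = np.array([[0.0, 1.0], [-2.0, -1.0]])   # assumed Hurwitz system matrix
Q = np.eye(2)
T, N = 1.0, 20                             # interval length and number of grid points
h = T / N

P = [cp.Variable((2, 2), symmetric=True) for _ in range(N + 1)]
eps = 1e-6 * np.eye(2)
cons = [P[i] >> eps for i in range(N + 1)]
for i in range(N):
    dP = (P[i + 1] - P[i]) / h             # piecewise-linear approximation of dP/dt
    cons += [dP + A.T @ P[i] + P[i] @ A + Q << -eps,
             dP + A.T @ P[i + 1] + P[i + 1] @ A + Q << -eps]

# Any objective (or pure feasibility) can be used; trace(P[0]) just selects one solution.
prob = cp.Problem(cp.Minimize(cp.trace(P[0])), cons)
prob.solve(solver=cp.SCS)
print(prob.status)
```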
Changing the Perception of Social Power in Opinion Dynamics
Rafael Fernandes Cunha, Ben Ye, and Ming Cao
In 38th Benelux Meeting on Systems and Control, 2019
This paper aims at designing a partial sampled-data state feedback control law for Markov jump linear systems (MJLS). The interesting feature of the control structure is that only the state variable is sampled, while the stochastic parameter defining the Markov mode used for control purposes is free to change at any time between samples. The main goal is to provide sufficient convex conditions for the existence of a solution to this class of control design problems in the context of H∞ and H2 performance, expressed through Differential Linear Matrix Inequalities (DLMIs). The proposed method is implemented using standard LMI solvers and provides a minimum guaranteed cost control in one shot. An example is solved for illustration and comparison.
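For concreteness, the partial sampled-data structure described here can be written, in our own notation (not necessarily the paper's), as a mode-dependent gain acting on the most recently sampled state:

$$ u(t) = K_{\theta(t)}\, x(t_k), \qquad t \in [t_k, t_{k+1}), $$

so the gain tracks the Markov mode $\theta(t)$ in continuous time while the state information is refreshed only at the sampling instants $t_k$.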