DeepControl: Energy-Efficient Control of a Quadrotor using a Deep Neural Network

Pratyush Varshney, Gajendra Nagar, and Indranil Saha

Abstract— Synthesis of a feedback controller for nonlinear dynamical systems like a quadrotor requires dealing with the trade-off between the performance of the controller and its online computation requirement. Model predictive controllers (MPC) provide excellent control performance, but at the cost of high online computation. In this paper, we present our experience in approximating the behavior of an MPC for a quadrotor with a feed-forward neural network. To facilitate the collection of training data, we create a faithful model of the quadrotor and use the Gazebo simulator to collect sufficient training data. The deep neural network (DNN) controller learned from the training data has been tested on various trajectories to compare its performance with a model predictive controller. Our experimental results show that our DNN controller can provide almost similar trajectory-tracking performance at a lower control computation cost, which helps in increasing the flight time of the quadrotor. Moreover, the hardware requirement for our DNN controller is significantly less than that for the MPC controller. Thus, the use of a DNN-based controller also helps in reducing the overall price of a quadrotor.

I. INTRODUCTION

Autonomous dynamical systems like quadrotors rely on the efficacy of their feedback controllers for correct operation in uncertain environments. Synthesizing feedback controllers for nonlinear dynamical systems poses tremendous challenges to control engineers, as they need to deal with the trade-off between the performance of the controller and the amount of online computation required for its efficient operation. For example, a Proportional-Integral-Derivative (PID) controller [1] has a simple structure leading to very low online computation overhead, but the performance of PID controllers for complex nonlinear systems is often not satisfactory. On the other hand, a Model Predictive Controller (MPC) [2] can provide excellent performance at the cost of a high online computation overhead, even for a system with linear dynamics. The optimization problems solved in the MPC become further complicated when the system is nonlinear [3]. Although there exist optimization techniques like Sequential Convex Optimization [4] that can solve the nonlinear optimization problems, they suffer from a lack of scalability for complex systems with a large number of state variables.

This work was supported by SERB Early Career Research Award ECR/2016/000474. Pratyush Varshney is with the Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, India (pratyushvarshney@cse.iitk.ac.in). Gajendra Nagar is with the Department of Aerospace Engineering, Indian Institute of Technology Kanpur, India (gajena@iitk.ac.in). Indranil Saha is with the Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, India (isaha@cse.iitk.ac.in).

To use an MPC as an on-board controller for a quadrotor, we need to mount a high-performance computer on the quadrotor. However, the high energy requirement of the MPC control computation on the on-board processor reduces the flight time. Adding an extra battery may help in increasing the flight time, but reduces the payload capacity. Employing the explicit MPC [5], [6] strategy could help in reducing the online computation. However, as observed in [7], the computation time to synthesize an explicit controller grows exponentially both in the number of time steps and the size of the state
and input vectors, thus making explicit MPC unsuitable for a high-dimensional nonlinear dynamical system like a quadrotor.

In this paper, we explore the possibility of synthesizing a feedback controller for a quadrotor in the form of a neural network. We employ supervised learning [8], where we generate training data capturing the state-control mapping from the execution of a model predictive controller. However, generation of training data by flying a quadrotor is tedious, as the battery of the quadrotor needs to be charged several times in the process. To alleviate this problem, we create a faithful model of the quadrotor which can be used for simulation in the Gazebo simulator [9] to generate the required training data. We train a two-layer Deep Neural Network (DNN) to approximate the behavior of an MPC.

Through a series of experiments, we evaluate the efficacy of the DNN feedback controller. Our experimental results demonstrate that the trajectory-tracking performance of the DNN-based feedback controller is close to that of the MPC controller. However, the control cost of the DNN controller is consistently better than the control cost of the MPC controller, as the DNN represents a smooth function. We also measure the flight time of the quadrotor under both the MPC controller and the DNN controller. Though our test vehicle is quite heavy and the power consumption to run the motors is significantly more than the power required for computation, our DNN controller helped in increasing the flight time by up to 12.5% for various reference trajectories. Finally, one major advantage of the DNN controller is that its hardware requirement is low: the cost of the hardware required to run the DNN controller is only 20% of the cost of the hardware required to run the MPC. This helps us reduce the overall cost of a quadrotor.

Related Work: Various control mechanisms have been introduced for quadrotors, for example, sliding mode control [10], feedback linearization [11], H-infinity robust control [12], adaptive control [13], and minimum-snap trajectory based control [14]. All these control mechanisms assume the availability of a faithful mathematical model of the quadrotor. To address this limitation, Bansal et al. [15] have recently proposed a mechanism to model the dynamics of a quadrotor using a neural network and showed how this neural-network-based model can be used for control computation. Li et al. [16] have used a DNN to generate a feasible reference trajectory from a user-given desired trajectory for a quadrotor. Sanchez et al. [17] have shown that deep neural networks can be used as controllers for spacecraft landing. Their results are restricted to simulations with the assumption that they have access to perfect states without any sensor errors.

There has been some recent work on synthesizing neural-network-based feedback controllers for dynamical systems. Recent work has shown that reinforcement-learning-based policy search can be successfully used to synthesize a controller for a nonlinear dynamical system like a quadrotor [18]. The authors have used the Actor-Critic algorithm [19] to synthesize the controller. Zhang et al. [20] proposed a methodology to train a neural network through policy search guided by the data generated from an MPC under full state observations. However, the time required for such policy search is large compared to that of supervised-learning-based algorithms.
Recently, Chen et al. [21] proposed a methodology for approximating an MPC using a neural network. The authors show that using Dykstra's projection they are able to reduce the policy search space. However, their approach is restricted to linear systems. To the best of our knowledge, we, for the first time, approximate the behavior of an MPC for a quadrotor with a deep feed-forward neural-network-based feedback controller and carry out a detailed analysis of its performance.

Paper Organization: The rest of the paper is organized as follows. In Section II, we provide the detailed background required for this paper. In Section III, we present the methodology for approximating the behavior of an MPC with a DNN controller. In Section IV, we describe our experimental setup and explain the experimental results on the performance of the controller with detailed analysis. Finally, we conclude the paper in Section V.

II. BACKGROUND

A. Quadrotor Dynamics

The quadrotor is a nonlinear dynamical system that can be represented with a discrete-time state-space equation of the form s_{t+1} = f(s_t, u_t), where s_t is the state of the quadrotor and u_t is the control applied at time t. The applied control is generated by a feedback controller based on the state. A feedback controller can be defined as a function \pi that takes the reference state s^{ref}_t and the current state s_t as inputs and generates the control u_t = \pi(s_t, s^{ref}_t) at time t, which, when applied to the system, reduces the error between the reference state and the current state.

Fig. 1: Controller architecture of an autopilot. The high-level controller takes a trajectory as input and outputs roll, pitch, yaw rate, and thrust. The low-level controller runs in the cascaded loop and outputs the motor speeds required to maintain the desired orientation.

The state s_t of the quadrotor is a 12-dimensional vector [22] defined as s_t = [p, v, \Theta, \omega]^T, where p = (p_x, p_y, p_z) is the position of the quadrotor (p_x, p_y, and p_z are the x, y, and z coordinates of the position), v = (v_x, v_y, v_z) is the velocity, \Theta captures the three attitude Euler angles [23] (roll \phi, pitch \theta, yaw \psi), and \omega contains the body rates. The dynamics of the quadrotor are such that the roll and pitch movements are decoupled from the yaw movement [22].

B. Flight Stack Architecture

The controller of the quadrotor is generally a two-level cascaded controller: a high-level controller and a low-level controller. The high-level controller takes as input the reference trajectory, given as P^{ref} = (p^{ref}_1, p^{ref}_2, ...), and generates the corresponding high-level controls (reference roll \phi^{ref}, pitch \theta^{ref}, yaw rate \dot{\psi}^{ref}, and thrust) to reach the target position. Here, p^{ref}_t denotes the reference position of the quadrotor at time t. The orientation <\phi^{ref}, \theta^{ref}, \psi^{ref}> is collectively termed the attitude of the quadrotor. The low-level controller takes the reference attitude and thrust as input and outputs the motor speeds that maintain the reference attitude and position. The high-level controller generally runs at a much slower rate than the low-level controller.

C. High-Level Controller

The high-level controller can be implemented with many types of control algorithms from classical control theory. There is a wide spectrum of controllers that can be synthesized for a quadrotor. On one side of the spectrum is the PID controller, which has a low computation cost but also provides poor performance. On the other side is the MPC, based on optimal control theory. The MPC provides high performance but has a high computational cost.

1) Proportional-Integral-Derivative Controller: The PID controller is one of the simplest linear controllers. It consists of three constants k_P, k_I, and k_D. For the state s_t and desired state s^{ref}_t at time t, the control u_t is generated by the PID controller as

u_t = k_P e_t + k_I \int e_t \, dt + k_D \frac{de_t}{dt},

where e_t = s_t - s^{ref}_t. Generally, the three constants are found by observing the behavior of the system and fine-tuning them accordingly. The run-time computation cost of the PID controller is very low compared to optimal controllers. This comes with a drawback in performance: as a PID controller does not explicitly take the dynamics of the system into account, the performance of the controller depends on the fine-tuning of the three constants, which may be extremely time consuming.
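To make the PID update concrete, the following minimal sketch implements the control law above with a forward-Euler integral and a finite-difference derivative. It is an illustrative sketch only; the gains, sampling period, and the scalar-state interface are assumptions, not values or code from the paper.

```python
import numpy as np

class PIDController:
    """Minimal discrete-time PID sketch for
    u_t = kP*e_t + kI*integral(e) + kD*de/dt, with e_t = s_t - s_ref_t
    (the sign convention follows the paper's error definition)."""

    def __init__(self, kP, kI, kD, dt):
        self.kP, self.kI, self.kD, self.dt = kP, kI, kD, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def control(self, s, s_ref):
        error = s - s_ref                                   # e_t = s_t - s_ref_t
        self.integral += error * self.dt                    # forward-Euler integral of e
        derivative = (error - self.prev_error) / self.dt    # finite-difference de/dt
        self.prev_error = error
        return self.kP * error + self.kI * self.integral + self.kD * derivative

# Example use with illustrative gains: regulate a scalar state toward s_ref = 0
pid = PIDController(kP=1.2, kI=0.05, kD=0.3, dt=0.01)
u = pid.control(s=0.4, s_ref=0.0)
```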
2) Model Predictive Controller (MPC): The MPC is an optimization-based controller that generates the control signals by solving a constrained optimization problem that minimizes the trajectory cost over a finite horizon. The optimization problem solved by the MPC is defined as

min_{u_t} J = \sum_{t=0}^{H_T} (s_t - s^{ref}_t)^T Q (s_t - s^{ref}_t) + u_t^T R u_t

subject to
s_{t+1} = f(s_t, u_t),
u_t = \pi(s_t, s^{ref}_t),
u_t \in U,  s_t \in S,
u_{min} \le u_t \le u_{max},

where U is the control space and S is the state space of the dynamical system, u_{min} and u_{max} denote the lower and upper bounds on the controls, respectively, and Q and R are two weight matrices. The MPC generates the control signals for the horizon H_T using the error between the current state s_t and the reference point s^{ref}_t. The system then applies the first control and feeds back its observation s_{t+1} (we can assume the state is fully observable to the MPC). In the next step, an optimization problem is again solved starting from state s_{t+1}. This is repeated for the complete trajectory. As the MPC requires solving the optimization problem at every step, its computation cost is significantly higher than that of the PID. If the dynamics of the system s_{t+1} = f(s_t, u_t) are nonlinear, the problem becomes a nonlinear optimization problem. This type of MPC is termed a Nonlinear Model Predictive Controller (NMPC).
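The receding-horizon idea can be sketched compactly as follows: optimize a control sequence over a short horizon, apply only the first control, then re-solve from the newly observed state. The sketch below uses a toy scalar dynamics function f and a simple single-shooting solve with scipy.optimize.minimize; the dynamics, horizon, weights, and bounds are all illustrative assumptions and not the solver or model used in the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for the nonlinear dynamics s_{t+1} = f(s_t, u_t);
# the real quadrotor model is far richer than this illustrative example.
def f(s, u):
    return s + 0.1 * u - 0.01 * s ** 3

def mpc_control(s0, s_ref, H=10, Q=1.0, R=0.1, u_max=1.0):
    """Single-shooting receding-horizon step: optimize the control
    sequence over horizon H and return only the first control."""
    def cost(u_seq):
        s, J = s0, 0.0
        for u in u_seq:
            s = f(s, u)
            J += Q * (s - s_ref) ** 2 + R * u ** 2    # tracking error + control effort
        return J

    bounds = [(-u_max, u_max)] * H                     # u_min <= u_t <= u_max
    res = minimize(cost, np.zeros(H), bounds=bounds, method="L-BFGS-B")
    return res.x[0]                                    # apply only the first control

# Receding-horizon loop: re-solve at every step from the observed state.
s = 0.0
for t in range(50):
    u = mpc_control(s, s_ref=1.0)
    s = f(s, u)                                        # plant applies the control
```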
The high-level flight controller, when implemented as an MPC, provides satisfactory performance but requires additional hardware. This hardware is generally a high-end computation processor that consumes additional power. As a result, the flight time of the quadrotor is reduced. To address the trade-off between performance and computation cost, we propose a Deep Neural Network (DNN) based controller that approximates the MPC controller using supervised learning. Below, we briefly introduce the DNN architecture.

D. Neural Networks

Neural networks [24] have emerged as one of the most efficient function approximators. The advancements in deep learning [25] have shown that neural networks are effective in areas like speech recognition [26], image recognition [27], etc. We use a neural network to approximate the online optimizer of an MPC. A deep neural network consists of one or more hidden layers. Each hidden layer consists of neurons that are fully connected to all the neurons in the next layer. The connections between the neurons are weighted. The aim is to learn the weights in such a fashion that the network is able to output the control signal generated by the MPC for a given state as input. For a layer l with input vector X, the output y_l of the layer is defined as

y_l = \sigma(W_l^T X + B_l),

where W_l is the weight matrix between layers l and l+1, B_l is the bias, and the activation function \sigma performs a nonlinear transformation. Examples of activation functions are tanh, sigmoid [28], the Rectified Linear Unit (ReLU) [29], etc. The parameters W and B are learned by minimizing the loss between the value f(x) predicted by the neural network and the desired output y. One of the frequently used loss functions for regression is the Huber loss [30], defined as

L(f(x), y) = \frac{1}{2} (y - f(x))^2                      for |y - f(x)| \le \delta,
L(f(x), y) = \delta |y - f(x)| - \frac{1}{2} \delta^2      otherwise,

where \delta is the threshold at which the Huber loss changes from quadratic to linear. The gradients of the loss are used to learn the weights W by gradient-descent methods [31] as follows:

W \leftarrow W - \alpha \nabla_W L(f(x), y).

Here, \alpha is the learning rate that controls how strongly the weights are adjusted with respect to the loss gradients.
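A minimal PyTorch sketch of this supervised setup is shown below. The layer sizes, learning rate, and Huber-loss threshold are illustrative assumptions rather than the paper's exact architecture, and the random tensors merely stand in for the (state, MPC-control) pairs logged from the simulator.

```python
import torch
import torch.nn as nn

# Small feed-forward network mapping a 12-dim state vector to the 4 high-level
# controls (roll, pitch, yaw rate, thrust); sizes are illustrative only.
net = nn.Sequential(
    nn.Linear(12, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 4),
)
loss_fn = nn.HuberLoss(delta=1.0)                  # quadratic near 0, linear beyond delta
opt = torch.optim.SGD(net.parameters(), lr=1e-3)   # W <- W - alpha * grad

# Placeholders for logged training data: MPC input states and the controls it produced.
states = torch.randn(1024, 12)
controls = torch.randn(1024, 4)

for epoch in range(100):
    opt.zero_grad()
    pred = net(states)
    loss = loss_fn(pred, controls)                 # Huber loss between prediction and MPC output
    loss.backward()                                # gradients of the loss w.r.t. the weights
    opt.step()                                     # gradient-descent weight update
```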
III. METHODOLOGY

In our method, we learn the optimizer of an MPC offline using a Deep Neural Network (DNN). Neural networks are capable of approximating polynomials of high degree, and we utilize this capability to learn an optimizer for the MPC. Our method involves the following two steps: (1) create a simulation model of the quadrotor using the learned system dynamics, and (2) learn the controller using supervised learning from simulation data.

A. Simulation Model of the Quadrotor

We first create a model of the quadrotor in the simulator whose behavior is close to that of the real-world quadrotor, as there are multiple benefits of learning the system model. For example, to perform supervised learning in the absence of a model, we would need to collect data by flying the quadrotor in the real world while ensuring that enough data is collected to learn a generalized deep-neural-network controller. This can prove to be a tedious task, as the battery life of the quadrotor with an on-board processor running on it is quite low. To overcome this limitation, we create a simulation model of the real-world quadrotor by using the transfer function H(s) [32]. The transfer function gives an approximation of the effect of the control on the low-level dynamics of the quadrotor. We create a model of the quadrotor such that the transfer function of the simulated quadrotor matches that of the real-world quadrotor.

To learn the transfer function, we manually fly the quadrotor for approximately one minute using a joystick that directly provides roll, pitch, yaw rate, and thrust to the autopilot. We record the commanded roll \phi_{cmd} and pitch \theta_{cmd} along with the resultant orientation of the quadrotor. We maintain the flight pattern in such a way that the effect of roll, pitch, and thrust is captured sufficiently. From the recorded data, we calculate the transfer function of the real quadrotor, defined as H(s) = Y(s)/X(s), where Y(s) and X(s) are the output and input functions, respectively. For the quadrotor, Y(s) is the current orientation <\phi, \theta>, and the input X(s) is the commanded roll and pitch <\phi_{cmd}, \theta_{cmd}>.

We attempt to match the calculated transfer function with the simulator model. For the simulator model, we implement a high-level MPC controller and a low-level PID controller, with the inner-loop attitude dynamics defined by Kamel et al. [33] as

\dot{\phi} = \frac{1}{\tau_\phi} (k_\phi \phi_{cmd} - \phi),
\dot{\theta} = \frac{1}{\tau_\theta} (k_\theta \theta_{cmd} - \theta),

where k_\phi and k_\theta are the roll and pitch gains, and \tau_\phi and \tau_\theta are the roll and pitch time constants. The inner-loop dynamics capture the model of the quadrotor, obtained through system identification, in the form of time constants and gains. The gain constant is the ratio of the observed attitude to the commanded attitude and captures the responsiveness of the dynamical system: high gains indicate that the system is highly responsive to small perturbations, while low gains indicate that the system is less responsive to changes. We calculate the gains k_\phi and k_\theta as

k_\phi = \phi / \phi_{cmd},    k_\theta = \theta / \theta_{cmd}.

The time constant captures how quickly the system follows the applied controls. The time constants are computed from the poles of the identified transfer functions using MATLAB's pole function; for the first-order model above, the pole lies at s = -1/\tau, so

\tau_\phi = -1 / \mathrm{pole}(H_\phi(s)),    \tau_\theta = -1 / \mathrm{pole}(H_\theta(s)).

For the dynamics of the simulation model of the quadrotor, some quantities, like the mass and the arm length, can be directly obtained from the real-world quadrotor. However, there exist some other quantities that are difficult to model precisely.
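The gain and time constant of the first-order inner-loop model can also be recovered directly from logged attitude data with a simple least-squares fit of the discretized model. The sketch below illustrates this idea with synthetic data standing in for a recorded manual flight; the function name, the made-up ground-truth values, and the fitting approach are assumptions for illustration, whereas the paper derives these quantities from the transfer function and MATLAB's pole function.

```python
import numpy as np

def identify_first_order(att, att_cmd, dt):
    """Fit the first-order inner-loop model  d(att)/dt = (k*att_cmd - att)/tau
    by linear least squares on its discretization
    att[n+1] = (1 - dt/tau)*att[n] + (k*dt/tau)*att_cmd[n].
    Returns the gain k and the time constant tau."""
    X = np.column_stack([att[:-1], att_cmd[:-1]])
    y = att[1:]
    (a, b), *_ = np.linalg.lstsq(X, y, rcond=None)
    tau = dt / (1.0 - a)          # a = 1 - dt/tau
    k = b / (1.0 - a)             # b = k*dt/tau  =>  k = b*tau/dt
    return k, tau

# Synthetic roll log standing in for a recorded flight
# (k = 0.9, tau = 0.15 are made-up ground-truth values for this demo).
dt, T = 0.01, 2000
cmd = np.sin(0.5 * np.arange(T) * dt)              # commanded roll
att = np.zeros(T)
for n in range(T - 1):
    att[n + 1] = att[n] + dt / 0.15 * (0.9 * cmd[n] - att[n])
print(identify_first_order(att, cmd, dt))           # recovers approximately (0.9, 0.15)
```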