Control System Design of an STM32-Based Hexapod Reconnaissance Robot (with 4 CAD Drawings)
Content summary:
Gait planning of a hexapod robot based on reinforcement learning under constrained conditions

Abstract: The hexapod robot has multiple redundant degrees of freedom in its structure and adapts well to varied terrain. How to improve the mobility of a hexapod robot in unstructured environments has long been a challenging subject. In this paper, the walking problem of a hexapod robot in an unstructured environment is modeled as discrete foothold selection and path optimization under constrained conditions. The gait planning and path optimization problems on unknown, unstructured, and complex road conditions are solved by reinforcement learning, and the approach is verified experimentally through MATLAB-ADAMS co-simulation.

Keywords: hexapod robot; reinforcement learning; gait planning; path optimization

1 Introduction

Mountainous areas present irregular and rugged terrain, high obstacles, and wading pavement, which limit the application of wheeled and tracked vehicles. The hexapod robot has many redundant degrees of freedom in its structure, so it adapts well to terrain and environment, and it has broad application prospects in forest harvesting, mining, underwater construction, the nuclear industry, military transportation and reconnaissance, planetary exploration, and other fields. Research on hexapod robots has therefore attracted experts and scholars worldwide, but how to improve the mobility of a hexapod robot in unstructured environments remains a challenging subject.

(Authors: Tang Kaiqiang (1992-), male, Luzhou, Sichuan, Master's degree, research field: intelligent robots; Hong Jun (1995-), male, Fuzhou, Jiangxi, undergraduate, research field: reinforcement learning; Liu Canghai (1966-), male, Quanzhou, Fujian, Professor, Ph.D., research field: robotics; Chen Chunlin (1979-), male, Bozhou, Anhui, Professor, Ph.D., research field: intelligent control.)

In this paper, the walking problem of the hexapod robot in an unstructured environment is reduced to the selection of discrete footholds and the optimization of the centroid's moving path under constrained conditions. To realize gait planning and path optimization, the footholds must be optimized under these constraints. Unstructured road conditions are complex and hard to handle with traditional pre-programmed methods, so machine learning techniques are needed. Reinforcement learning gradually accumulates experience and obtains optimal strategies through continuous interaction with the environment, and it is therefore applied here to the gait planning and path optimization of the hexapod robot. Through learning, the robot can quickly solve for the optimal path and select appropriate footholds along it, realizing efficient walking in unstructured environments.

Traditional hexapod robots adopt fixed gaits such as the tripod gait, the quadruped gait, and the wave gait: researchers prepare several sets of gait data from different gait parameters and call them up as different requirements arise. These gaits show good walking ability on flat ground, but on irregular terrain, and especially in unknown environments, the robot cannot walk stably, so a gait planning method that can adapt to unknown terrain is urgently needed. In recent years, research on robot control methods has mainly covered local-rule-based control and Central Pattern Generator (CPG) control.
The local-rule-based method realizes motion control according to the motion rules between the robot's legs and the interaction between the robot and its environment, and it offers a degree of flexibility and robustness. Fielding et al. proposed inter-leg rules for a hexapod robot based on leg detection constraints, determining the state of each leg according to measured values from the robot's legs. Cruse et al. in Germany proposed six basic rules acting between adjacent legs; that is, the state of each leg of a hexapod robot must satisfy the six basic rules with respect to its adjacent legs. By adjusting the phase sequence between the limit positions of the hexapod robot's legs, a gait suitable for walking can be generated that also satisfies the stability of robot motion.

The CPG-based control method simulates the biological low-level nerve centers so that the robot spontaneously generates a regular gait; that is, the legged robot is controlled through periodic rhythm signals. Its computational cost is relatively low, which suits online gait generation, but the control signal is not related to the robot's actual motion, so precise motion control is difficult, and the method is limited for gait planning on complex unstructured road conditions. At the end of the 20th century, Venkataraman first introduced this method into the control of hexapod robots. More recently, Kassim proposed a CPG with an adaptive function that effectively integrates the robot's kinematic information into the control model, thereby realizing a continuously, autonomously hopping robot.

To adapt the hexapod robot to complex unstructured road conditions, machine learning and intelligent control are needed. Machine learning helps robots actively adapt to new environments, sparing researchers from programming for every scenario. A control method based on machine learning improves the robot's performance by constantly interacting with the environment and gradually accumulating experience. It has the following advantages: first, parameters in the motion model that are difficult to tune can be adjusted through learning; second, some actions that cannot be achieved by manual adjustment can be realized by setting appropriate learning rules; finally, through continuous learning and accumulation of knowledge, the robot can respond correctly even to untrained scenes. Such methods have been successfully applied to robot path planning, for example on wheeled and quadruped robots. Kolter et al., for instance, used a signed-derivative policy-gradient method on the step-climbing task of the LittleDog robot to optimize the backward displacement of the body: after the body tilts backward to a certain angle, the robot moves forward quickly and lifts its front leg onto the next step to complete the climb.

Reinforcement learning is a technique based on environmental feedback. A reinforcement learning agent identifies its own state, decides on an action according to some policy, and adjusts the policy toward the optimum according to the reward provided by the environment. Because of its autonomous-learning and online-learning characteristics, reinforcement learning is an important branch of machine learning.
In this paper, the strategy of optimal foothold selection and path planning is obtained by a reinforcement learning method, so as to complete the autonomous walking strategy of the hexapod robot on rugged terrain and enhance its mobility in unstructured environments.

2 Problem description

In this paper, the walking problem of the hexapod robot in an unstructured environment is reduced to the selection of discrete footholds and the optimization of the path under constrained conditions. Based on the terrain information collected by the robot's vision system, the footholds of the hexapod robot are planned with full consideration of the robot's own constraints. That is, in a complex unstructured environment, according to the parameters of the hexapod robot and the stability and constraints of the tripod gait, a series of suitable footholds is selected to ensure that the robot can move smoothly from the starting position to the target position.

2.1 Movement planning

The hexapod robot has three degrees of freedom per leg; with six legs, it has 18 degrees of freedom in total. Through the coordinated motion of these 18 degrees of freedom, the robot produces different gaits. The gait of a hexapod robot refers to the set of relationships between the sequencing and timing of the supporting phase and the swing phase of its legs. Because the hexapod robot has a relatively complex structure, careful gait planning is needed for the robot to adapt to different working environments. To ensure stability of the prototype while walking, it is stipulated that adjacent legs cannot be in the swing phase at the same time; that is, adjacent feet cannot start swinging simultaneously, and at least three feet must remain in the supporting phase.

The tripod gait is the fastest gait at which insects walk stably, and it is also the most widely used gait for hexapod robots. Its key characteristic is that at each step three legs support the ground, forming a stable triangular support structure, while the other three legs lift, swing, and land to form a new triangular support, alternating in this way. Because the tripod gait has only two leg states, support and swing, it is relatively simple to implement and therefore widely used.

Figure 1 shows the gait planning diagram of a hexapod robot walking in a straight line on a plane. The rectangular box in the middle represents the robot body, with the center of mass set at the geometric center; the six dots represent the foot tips, hollow dots denoting swing feet and solid dots denoting support feet. The initial state of the robot is shown in Figure 1(a). As the robot moves forward: in Figure 1(b), with feet 2, 4, 6 as support points, feet 1, 3, 5 lift, swing forward, and are put down; in Figure 1(c), with feet 1, 3, 5 as support points, feet 2, 4, 6 lift, swing forward, and are put down; in Figure 1(d), the robot body moves forward; and in Figure 1(e), the robot returns to its initial state, completing one cycle of forward walking. Figure 1 thus shows one complete tripod gait cycle. It can be seen that in the tripod gait the robot's center of gravity always remains within the triangular region formed by the supporting feet, so the robot is stable and will not tip over. If the center of gravity leaves the support triangle, however, the robot becomes unstable and tips over.
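To make the alternation in Figure 1 concrete, the following is a minimal Python sketch of one tripod gait cycle; the leg numbering follows Figure 1, but the phase representation and the step length are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch of the tripod gait cycle of Figure 1 (illustrative only).
# Legs are numbered 1-6; tripod A = {1, 3, 5}, tripod B = {2, 4, 6}.

TRIPOD_A = (1, 3, 5)
TRIPOD_B = (2, 4, 6)

def tripod_gait_cycle(step_length):
    """Yield (support_legs, swing_legs, body_shift) for one full cycle.

    body_shift is the forward displacement of the center of mass in that
    phase: legs swing while the body is stationary (Figures 1(b)-(c)),
    then the body translates over the new support triangle (Figure 1(d)).
    """
    yield TRIPOD_B, TRIPOD_A, 0.0               # (b): support on B, swing A
    yield TRIPOD_A, TRIPOD_B, 0.0               # (c): support on A, swing B
    yield TRIPOD_A + TRIPOD_B, (), step_length  # (d): body moves forward

if __name__ == "__main__":
    for support, swing, shift in tripod_gait_cycle(step_length=0.05):
        print(f"support={support} swing={swing} body moves {shift} m")
```

At every phase at least three legs support the body, which is exactly the stability condition stated above.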
Because all the legs in the tripod gait swing forward by equal distances, and considering the size limitations and the stability criterion of the hexapod robot, when the hexapod robot walks in an unstructured environment with the tripod gait, its footholds are constrained by both the gait and stability.

Figure 1 Schematic illustration of the tripod gait

2.2 Foothold planning

In this paper, the walking problem of the hexapod robot in an unstructured environment is reduced to the selection of discrete footholds and the optimization of the path under constrained conditions. To this end, foothold planning must be completed first. All the supportable positions in the unstructured environment are regarded as discrete footholds, and the footholds satisfying the tripod gait of the hexapod robot are then selected from among them. During screening, the footholds are constrained by the distance and height difference between adjacent footholds and by the optimal position.

Figure 2 shows the unstructured terrain environment. First, the supportable positions in the environment are marked as discrete footholds and classified, as shown in Figure 3. According to the leg-length and joint-angle constraints of the hexapod robot, the blue footholds are those whose height does not meet the constraints. The optimal footholds are then screened further according to the constraint on foothold position: the yellow footholds meet the height requirement but not the motion requirement of the tripod gait, and the red footholds are the starting and ending positions of the robot. The remaining footholds not only satisfy the leg motion parameters but are also optimal in the four directions of the robot. Once the footholds meeting the motion requirements in the front, rear, left, and right directions have been determined, the hexapod robot can independently complete the foothold screening according to its vision system, as sketched in the code below.

Figure 2 Unstructured terrain environment
Figure 3 Foothold modeling: (a) main view; (b) overhead view
Figure 4 Modeling of the walking problem: (a) region selection; (b) robot placement model
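The following is a minimal sketch of this screening step; the threshold values max_reach and max_height_diff are illustrative assumptions, not parameters from the paper.

```python
# Minimal sketch of foothold screening by distance and height-difference
# constraints; all threshold values are illustrative assumptions.
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Foothold:
    x: float  # planar coordinates on the terrain map (m)
    y: float
    z: float  # terrain height at this point (m)

def reachable(src, dst, max_reach=0.20, max_height_diff=0.08):
    """Check the distance and height-difference constraints between
    adjacent footholds (standing in for leg-length and joint-angle limits)."""
    planar = hypot(dst.x - src.x, dst.y - src.y)
    return planar <= max_reach and abs(dst.z - src.z) <= max_height_diff

def screen_footholds(candidates, current):
    """Keep only candidate footholds reachable from the current foothold;
    rejected points correspond to the blue/yellow markers of Figures 2-3."""
    return [p for p in candidates if reachable(current, p)]

if __name__ == "__main__":
    current = Foothold(0.0, 0.0, 0.0)
    candidates = [Foothold(0.15, 0.0, 0.05),   # reachable
                  Foothold(0.15, 0.0, 0.30),   # too high -> rejected
                  Foothold(0.50, 0.0, 0.00)]   # too far  -> rejected
    print(screen_footholds(candidates, current))
```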
3 Gait planning based on reinforcement learning

3.1 Overall scheme design

Under the constraints of the tripod gait, the robot's foothold selection and the path planning of its centroid motion must be solved. In the tripod gait, all legs swing by the same distance in the same direction, and according to the motion characteristics of the hexapod robot, the directions of motion in path planning are forward, backward, left, and right. The goal of the motion is to select the optimal path from the starting position to the end position. Through foothold planning, this paper designs a set of discrete footholds that meet the motion requirements. To further reduce the experimental difficulty, the robot's step length in the forward, backward, left, and right directions is set to the same value. In the control system, the footholds are identified and sorted, those that do not meet the motion requirements are removed, and a series of footholds satisfying the tripod gait is retained. After foothold screening is complete, the specific foothold selection and path planning are carried out through reinforcement learning.

3.2 Reinforcement learning

Reinforcement learning is a machine learning method that learns optimal strategies by interacting with the environment. Trial-and-error search and delayed reward are its two key features. To obtain a large return, the hexapod robot tends during reinforcement learning to choose actions that were tried in the past and brought large returns; but to discover new strategies, it must also choose actions that have not been tried before. The robot thus needs to exploit the information it already has to obtain returns, while also exploring new parts of the space so as to make better action choices in the future.

The basic model of reinforcement learning is the Markov decision process (MDP). Its model M consists of a quadruple ⟨S, A, T, R⟩, where S is a finite set of states, and the effective position of the hexapod robot at the current moment is a state s; A is a finite set of actions a, where a is a legal action that the hexapod robot can choose at its current position according to the tripod gait; T: S × A × S → [0, 1] is the state transition function, describing the probability P(s′ | s, a) that the robot transfers to state s′ after performing action a in state s; and R: S × A × S → ℝ is the reward function, describing the real-time reward r(s, a, s′) obtained as the robot performs action a in state s and transitions to s′.

The goal of reinforcement learning is to find an optimal policy π: S → A such that, when action a_t is selected in the given state s_t at discrete time step t, the accumulated expected discounted return

$$R_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \cdots = \sum_{i=0}^{\infty} \gamma^i r_{t+i}$$

is maximized, where γ ∈ [0, 1) is the discount factor.

Suppose the environment of dynamic programming is a finite Markov decision process; that is, the state space S and the action spaces A(s), s ∈ S, are finite. The dynamics of the problem are given by a set of transition probabilities

$$P_{ss'}^{a} = \Pr\{\, s_{t+1} = s' \mid s_t = s,\ a_t = a \,\} \tag{1}$$

and expected immediate rewards

$$R_{ss'}^{a} = E\{\, r_{t+1} \mid a_t = a,\ s_t = s,\ s_{t+1} = s' \,\} \tag{2}$$

Dynamic programming is suited to discrete state-action spaces; in a continuous state-action space, a common approach is to quantize the space and then apply finite-state methods. The core idea of dynamic programming is value iteration,

$$V_{k+1}(s) = \max_{a} \sum_{s'} P_{ss'}^{a} \left[ R_{ss'}^{a} + \gamma V_k(s') \right] \tag{3}$$

or policy iteration with policy evaluation. For any policy π, a state-value function V^π is computed; this is called policy evaluation in dynamic programming theory. The value function V^π also satisfies the Bellman equation:

$$V^{\pi}(s) = E_{\pi}\{\, r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots \mid s_t = s \,\} = E_{\pi}\{\, r_{t+1} + \gamma V^{\pi}(s_{t+1}) \mid s_t = s \,\} = \sum_{a} \pi(s, a) \sum_{s'} P_{ss'}^{a} \left[ R_{ss'}^{a} + \gamma V^{\pi}(s') \right] \tag{4}$$
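To ground equations (1)-(4), here is a minimal value-iteration sketch in Python on a toy deterministic grid of footholds; the grid size, rewards, and discount factor are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal value-iteration sketch for equation (3) on a toy 4x4 grid of
# footholds; transitions are deterministic, so the sum over s' collapses.
GAMMA = 0.9
ACTIONS = {"forward": (1, 0), "backward": (-1, 0),
           "left": (0, -1), "right": (0, 1)}  # the four directions of 3.1
N = 4
GOAL = (3, 3)

def step(state, action):
    """Deterministic transition P(s'|s,a): move if in bounds, else stay."""
    dx, dy = ACTIONS[action]
    nx, ny = state[0] + dx, state[1] + dy
    return (nx, ny) if 0 <= nx < N and 0 <= ny < N else state

def reward(next_state):
    """R(s,a,s'): bonus for reaching the goal foothold, cost per move."""
    return 10.0 if next_state == GOAL else -1.0

V = {(x, y): 0.0 for x in range(N) for y in range(N)}
for _ in range(100):  # sweep until (approximately) converged
    for s in V:
        if s == GOAL:
            continue
        V[s] = max(reward(step(s, a)) + GAMMA * V[step(s, a)]
                   for a in ACTIONS)

print(V[(0, 0)])  # value of the start foothold under the optimal policy
```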
where π(s, a) is the probability of taking action a in state s under policy π, and the subscript π on the expectation E indicates that the expectations are taken under policy π.

The Q-learning algorithm is used to realize the foothold selection of the hexapod robot. Q-learning can obtain the optimal policy from delayed rewards even without prior knowledge of the environment. For discrete problems, the Q-learning algorithm assumes that the state set S and the action set A can be divided into discrete values. The hexapod robot receives a reward after performing action a_t in state s_t, which reflects the short-term value of performing a_t; after a_t is performed, the system moves from state s_t to state s_{t+1}, and then selects the next action a_{t+1} based on the best knowledge currently available. The goal of Q-learning is to learn a policy π: S × A → [0, 1] that maximizes the sum of the expected discounted returns obtained from each state:

$$Q(s, a) = r(s, a) + \gamma \max_{a'} Q(s', a')$$
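The following is a minimal tabular Q-learning sketch consistent with the update rule above; the grid environment, learning rate, and ε-greedy exploration are illustrative assumptions, not the paper's implementation.

```python
# Minimal tabular Q-learning sketch for foothold selection on the same toy
# grid; ALPHA, EPSILON, and the episode count are illustrative assumptions.
import random

GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1
ACTIONS = ["forward", "backward", "left", "right"]
MOVES = {"forward": (1, 0), "backward": (-1, 0),
         "left": (0, -1), "right": (0, 1)}
N, GOAL = 4, (3, 3)
Q = {((x, y), a): 0.0 for x in range(N) for y in range(N) for a in ACTIONS}

def step(s, a):
    dx, dy = MOVES[a]
    nx, ny = s[0] + dx, s[1] + dy
    s2 = (nx, ny) if 0 <= nx < N and 0 <= ny < N else s
    r = 10.0 if s2 == GOAL else -1.0  # reward: reach goal vs. cost per move
    return s2, r

for _ in range(2000):  # training episodes
    s = (0, 0)
    while s != GOAL:
        # epsilon-greedy: explore new footholds, exploit known good ones
        a = (random.choice(ACTIONS) if random.random() < EPSILON
             else max(ACTIONS, key=lambda act: Q[(s, act)]))
        s2, r = step(s, a)
        # Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

print(max(ACTIONS, key=lambda act: Q[((0, 0), act)]))  # greedy start action
```

After a few thousand episodes, the greedy policy read off the Q-table drives the robot from the start foothold toward the goal along a shortest path, which is the foothold-selection behavior the paper seeks from reinforcement learning.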