End-to-End Driving Model for Steering Control of Autonomous Vehicles with Future Spatiotemporal Features

Tianhao Wu1, Ao Luo1, Rui Huang1, Member, IEEE, Hong Cheng1, Senior Member, IEEE, Yang Zhao1

Abstract— End-to-end deep learning has gained considerable interest for autonomous driving vehicles in both academic and industrial fields, especially in the decision making process. One critical issue in the decision making process of autonomous driving vehicles is steering control. Researchers have already trained different artificial neural networks to predict the steering angle from a front-facing camera data stream. However, existing end-to-end methods only consider the spatiotemporal relation on a single layer and lack the ability to extract future spatiotemporal information. In this paper, we propose an end-to-end driving model based on a Convolutional Long Short-Term Memory (Conv-LSTM) neural network with a Multi-scale Spatiotemporal Integration (MSI) module, which aims to encode the spatiotemporal information from different scales for steering angle prediction. Moreover, we employ future sequential information to enhance the spatiotemporal features of the end-to-end driving model. We demonstrate the efficiency of the proposed end-to-end driving model on the public Udacity dataset with comparison to some existing methods. Experimental results show that the proposed model performs better than other existing methods, especially in some complex scenarios. Furthermore, we evaluate the proposed driving model on a real-time autonomous vehicle, and the results show that the proposed driving model is able to predict the steering angle with high accuracy compared to a skilled human driver.

Index Terms— End-to-End Driving Model, Future Spatiotemporal Features, Multi-Scale Spatiotemporal Integration Module, Convolutional LSTM.

I. INTRODUCTION

Autonomous driving techniques have gained considerable interest in both academia and industry.

2) Future sequential information is employed in the training process of the driving model, aiming to enhance spatiotemporal features;
3) The proposed end-to-end driving model has been tested on both the public Udacity dataset and a real-time autonomous vehicle.

The proposed end-to-end driving model is validated on the public Udacity dataset with comparison to some existing methods. Furthermore, we also evaluate the proposed driving model on a real-time autonomous vehicle on our campus with a collected UESTC dataset. Experimental results show that the proposed end-to-end driving model performs better than existing end-to-end methods, and achieves good steering angle prediction with high accuracy compared to a skilled human driver on the real-time autonomous vehicle.

Fig. 1. The architecture of the proposed end-to-end driving model.

The remainder of this paper is organized as follows. Section II introduces the proposed end-to-end driving model, with details of the MSI module and the training process with future spatiotemporal features. In Section III, experimental results on both the public Udacity dataset and a real-time autonomous vehicle are presented and discussed. This paper ends with conclusions and future work in Section IV.

II. METHOD

This section presents the methodology details of the proposed end-to-end driving model.
Section II-A lays down the architecture of the proposed end-to-end driving model, which combines an MSI module with Conv-LSTM. Then the training process of the proposed model is introduced in Section II-B, in which future sequential information is employed to enhance future spatiotemporal features.

A. Architecture of the End-to-End Driving Model

The architecture of the proposed end-to-end driving model is illustrated in Fig. 1. As shown in Fig. 1, the past n frames (from frame t-n+1 to frame t) are the inputs of the proposed driving model. The proposed driving model can be divided into two parts: the feature extracting network and the steering angle prediction network. The feature extracting network consists of an encoder (6 convolution layers) and an MSI module with Conv-LSTM.

As shown in Fig. 1, the proposed MSI module is composed of 4 spatiotemporal modules, and each module has 3 convolution layers and 1 convolutional LSTM. The spatiotemporal modules are successively added on top of the 3rd, 4th, 5th and 6th layers of the encoder, respectively, aiming to encode the spatiotemporal information from different scales. In each spatiotemporal module, the first convolution layer is designed to filter the redundant spatial information of the input features, then the convolutional LSTM is employed to generate the temporal information of the past n frames, and the other two convolution layers aim to obtain key features for steering angle prediction. After each spatiotemporal module, we employ a fully-connected layer to regulate the dimension and then merge the extracted spatiotemporal features from the different scales. With the obtained spatiotemporal features, fully-connected layers are utilized to finally predict the steering angle for the current time step t. We also exploit the ground truths of time steps t+1 to t+k to guide the network during training (each predictor consists of fully-connected layers), but do not use them to make predictions (more details are provided in the next subsection).

Table I gives the output sizes of all layers of the proposed end-to-end driving model (the size of the input images is 3×480×640). Corresponding to Fig. 1, we give a short name in Table I for all layers. For the encoder module, the 6 convolution layers are numbered from left to right (Conv 1 to Conv 6). The four spatiotemporal branches of the MSI module are numbered from bottom to top (Scale 1 to Scale 4). After designing the architecture of the proposed end-to-end driving model, the training details for network optimization will be given in the next subsection.

TABLE I
OUTPUT SIZES OF ALL LAYERS OF THE PROPOSED END-TO-END DRIVING MODEL.

Module        Name          Output size    Name          Output size
Encoder       Conv 1        24×238×318     Conv 2        36×117×157
              Conv 3        48×57×77       Conv 4        64×28×38
              Conv 5        76×13×18       Conv 6        98×6×8
MSI Module    Conv 1-1      24×57×77       Conv 2-1      32×28×38
              Conv-LSTM 1   24×57×77       Conv-LSTM 2   32×28×38
              Conv 1-2      24×28×38       Conv 2-2      32×13×18
              Conv 1-3      12×28×38       Conv 2-3      16×13×18
              FC 1          3744           FC 2          912
              Conv 3-1      38×13×18       Conv 4-1      49×6×8
              Conv-LSTM 3   38×13×18       Conv-LSTM 4   49×6×8
              Conv 3-2      38×6×8         Conv 4-2      49×2×3
              Conv 3-3      19×6×8         Conv 4-3      24×2×3
              FC 3          144            FC 4          64
Predictor     FC 1          16             FC 2          1
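To make the layout of one spatiotemporal module concrete, the following PyTorch sketch follows the description above: a convolution that filters redundant spatial information, a convolutional LSTM applied over the past n frames, two further convolutions, and a fully-connected layer that regulates the output dimension. The ConvLSTM cell, kernel sizes, strides and channel counts are assumptions for illustration; they are not the authors' implementation and do not reproduce the exact output sizes in Table I.

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: the four gates are computed by one
    convolution over the concatenation of the input and the hidden state."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        self.hidden_channels = hidden_channels
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


class SpatiotemporalModule(nn.Module):
    """One branch of the MSI module (layout assumed): Conv -> ConvLSTM over
    the past n frames -> two Convs -> FC that regulates the dimension."""

    def __init__(self, in_channels, mid_channels, fc_in, fc_out):
        super().__init__()
        self.spatial_filter = nn.Conv2d(in_channels, mid_channels, 3, padding=1)
        self.conv_lstm = ConvLSTMCell(mid_channels, mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, 3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(mid_channels, mid_channels // 2, 3, padding=1)
        self.fc = nn.Linear(fc_in, fc_out)

    def forward(self, feats):
        # feats: (batch, n, C, H, W) encoder features of the past n frames.
        b, n, _, h, w = feats.shape
        state = (feats.new_zeros(b, self.conv_lstm.hidden_channels, h, w),
                 feats.new_zeros(b, self.conv_lstm.hidden_channels, h, w))
        for t in range(n):
            x = torch.relu(self.spatial_filter(feats[:, t]))
            state = self.conv_lstm(x, state)
        x = torch.relu(self.conv2(state[0]))   # hidden state after the last frame
        x = torch.relu(self.conv3(x))
        return self.fc(x.flatten(1))           # fixed-length feature of this scale
```

In the full MSI module, four such branches would be attached to the 3rd to 6th encoder layers and their fully-connected outputs merged before the steering-angle predictor.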
B. Spatiotemporal Features Enhancement with Future Sequential Information

With the designed end-to-end driving model with the MSI module, future sequential information is utilized to enhance spatiotemporal features during the training process. Fig. 2 shows the training process of the driving model with future sequential information. As depicted in Fig. 2, the past n frames (from frame t-n+1 to the current frame t) of camera images are set as the input of the driving model.

Fig. 2. The training process of the end-to-end driving model.

In order to enhance the spatiotemporal features of the proposed driving model, the ground truths of the steering angles from the current frame t to the k-th frame after the current frame are used as auxiliary supervision. The final cost function is designed as follows:

    J = Loss(t) + \sum_{i=1}^{k} \lambda(i) \, Loss(t+i),    (1)

where Loss(t) and Loss(t+i) denote the loss of the angle prediction at time steps t and t+i, respectively, and \lambda(i) = 1/i (i \in [1, k]) are the weight parameters of the losses at different time steps. In this paper, we adopt a simple form of squared loss as follows:

    Loss(t) = \frac{1}{N} \sum_{n=1}^{N} |\hat{s}_{t,n} - s_{t,n}|^2,    (2)

where N indicates the number of samples used for each model update, \hat{s}_{t,n} denotes the learned model's prediction at time t for sample n, and s_{t,n} is the ground truth of the steering angle. With the cost function described in Eqn. (1), the model can be trained through back propagation. Note that the ground truths of the t+1 to t+k frames are only auxiliary labels for training; after training, the corresponding predictions are not used in the testing procedure. Once trained, the end-to-end driving model generates a single steering angle, i.e., the steering angle of time step t, from the camera images of every past n frames. Fig. 3 shows this configuration.

Fig. 3. The trained end-to-end driving model is utilized to generate a single steering angle from the camera images of every past n frames.
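As a concrete reading of Eqns. (1) and (2), the short PyTorch sketch below combines the squared loss at the current step t with the 1/i-weighted losses on the k auxiliary future predictions. The function names and tensor layout are illustrative assumptions, not the authors' code.

```python
import torch


def steering_loss(pred, target):
    """Eqn. (2): mean squared error over the N samples of a batch."""
    return torch.mean((pred - target) ** 2)


def total_cost(preds, targets):
    """Eqn. (1): preds and targets are sequences of length k+1 holding the
    steering angles for time steps t, t+1, ..., t+k; the auxiliary loss at
    time t+i is weighted by lambda(i) = 1/i."""
    cost = steering_loss(preds[0], targets[0])                    # Loss(t)
    for i in range(1, len(preds)):
        cost = cost + (1.0 / i) * steering_loss(preds[i], targets[i])
    return cost


# Example with k = 4 auxiliary future steps and a batch of N = 8 samples.
preds = [torch.randn(8, requires_grad=True) for _ in range(5)]    # model outputs
targets = [torch.randn(8) for _ in range(5)]                      # ground truths
total_cost(preds, targets).backward()                             # train via back propagation
```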
III. EXPERIMENTS

In this section, we first validate the proposed end-to-end driving model on the public Udacity dataset with comparison to some existing end-to-end methods. Then we also validate the proposed driving model on a real-time autonomous vehicle on our campus with a collected UESTC dataset. The next two subsections present the experimental results and discussions in detail.

A. Experiments on the Udacity Dataset

1) The Udacity Dataset: The Udacity dataset was originally provided for a series of self-driving challenges [13]. In this paper, we adopt a subset of the Udacity dataset, Udacity Challenge-II, for experimental purposes. The Udacity Challenge-II dataset contains a total of 33808 frames for model training and 5614 frames for model testing, in which vehicle speed, torque, steering angle and video streams from three front view cameras are recorded. The resolution of the images in Udacity is 480×640.

2) Experiment Details: The experiments are implemented on a workstation with 4 GeForce GTX Titan GPUs. All code is written and implemented under the PyTorch framework. We randomly sample 15% of the training data for validating models and always choose the best model on this validation set. The number of past frames n is chosen as 10. ADAM is utilized as the optimizer. The initial learning rate is set to 1×10^-4 for the experiments on the Udacity dataset. In order to elevate the generalization ability of end-to-end driving models, we employ a widely-adopted data augmentation scheme on the Udacity dataset by mirroring images [12].

3) Experimental Results and Analysis: In the experiments on the public Udacity dataset, we first evaluate the proposed end-to-end driving model with different future sequential information. In this experiment, the proposed end-to-end model without future sequential information is set as the baseline of our model (called MSINet in the experiments). We evaluate both the MSINet and the MSINet with future t+k frames. For models with future frames, we set the steering angle prediction of the future frames as side tasks of our model during the training process. Considering the computational cost for model training, we set the number of future frames to 1 ≤ k ≤ 5. Table II gives the comparison of the MSINet and the MSINet with future t+k frames. In this paper, the Root Mean Square Error (RMSE) is utilized as the evaluation index of the driving models.

TABLE II
EXPERIMENTAL RESULTS OF THE PROPOSED DRIVING MODEL ON THE UDACITY CHALLENGE-II DATASET.

Model                  RMSE (rad)
MSINet                 0.0613
with t+1 frame         0.0574
with t+2 frames        0.0545
with t+3 frames        0.0519
with t+4 frames        0.0491
with t+5 frames        0.0504

As shown in Table II, the experimental results indicate that the models involving future sequential information achieve better steering angle prediction than the MSINet. From the results we can see that the MSINet with t+4 frames achieves the best steering angle prediction among all models on the Udacity Challenge-II dataset (RMSE is 0.0491 rad).

Fig. 4. Comparison of the MSINet and the MSINet with t+4 frames in three curve road situations.

To further analyze the proposed driving model with future spatiotemporal features enhanced by future sequential information, we analyze the performance of our models in different road scenarios. We found that the proposed driving model with future spatiotemporal features gives better steering prediction in curve road situations. Fig. 4 shows the comparison of the MSINet and the MSINet with t+4 frames in three curve road situations.

Fig. 5. Comparison of steering angle prediction between the MSINet and the MSINet with t+4 frames when passing curve roads.

As depicted in Fig. 4, the MSINet with t+k frames achieves better steering angle prediction than the MSINet in curve road situations. We also extract the results of steering angle prediction when passing curve roads. Fig. 5 illustrates the comparison of steering angle prediction between the MSINet and the MSINet with t+4 frames when passing curve roads. The three figures in Fig. 5 (a, b, c) correspond to the three curve road situations of Fig. 4 (situations from top to bottom). The results depicted in Fig. 5 show that the MSINet with future spatiotemporal features (t+4 frames) performs better when passing curve roads.

In the experiments on the Udacity dataset, we also compare our model with some existing end-to-end methods. We first reproduced CgNet, NVIDIA's PilotNet and the ST-LSTM network, then trained these models on the training set of the Udacity Challenge-II dataset. Brief introductions of these methods for comparison are given as follows:

1) CgNet: The CgNet was published as a baseline for Udacity Challenge-II [14], and consists of 3 convolution layers and a fully-connected layer;
2) PilotNet: The PilotNet was proposed by NVIDIA [8], and consists of 5 convolution layers and 4 fully-connected layers;
3) ST-LSTM network: The ST-LSTM network was proposed by Chi et al. [12], and combines spatiotemporal convolution layers with LSTM.

Table III gives the comparison of steering angle prediction between our models and other end-to-end driving models. The experimental results in Table III show that our models perform better than the other end-to-end driving models on the public Udacity Challenge-II dataset.

TABLE III
COMPARISON OF STEERING ANGLE PREDICTION BETWEEN OUR MODEL AND OTHER END-TO-END DRIVING MODELS.

Model                      RMSE (rad)
CgNet                      0.1779
NVIDIA's PilotNet          0.1589
ST-LSTM Network            0.0622
MSINet                     0.0631
MSINet with t+4 frames     0.0491
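For reference, the RMSE reported in Tables II and III can be computed from predicted and recorded steering angles (both in radians) as in the small sketch below; the function and array names are illustrative.

```python
import numpy as np


def steering_rmse(predicted, ground_truth):
    """Root mean square error between predicted and ground-truth steering
    angles, both given in radians."""
    predicted = np.asarray(predicted, dtype=np.float64)
    ground_truth = np.asarray(ground_truth, dtype=np.float64)
    return float(np.sqrt(np.mean((predicted - ground_truth) ** 2)))


# Usage: steering_rmse(model_predictions, recorded_test_angles)
```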
B. Experiments on a Real-time Autonomous Vehicle

In this subsection, the proposed end-to-end driving model is evaluated on a real-time autonomous vehicle on our campus. First, we give a brief introduction of the dataset collected on our campus, which is called the UESTC dataset. Then the driving system of the real-time autonomous vehicle is introduced, together with some experimental details of the model training process. Results and discussions are given at the end of this subsection.

1) The UESTC Dataset: In order to train our model for road testing on our campus, we build a driving dataset (called the UESTC dataset) with a total of 30878 frames of images and steering angles. The resolution of each image is 1280×1024, with a sampling rate of 15 frames per second. Fig. 6 shows some example video frames of the collected UESTC dataset. With the collected UESTC dataset, the training process of the driving model is set up in the same way as on the Udacity dataset (input images are resized to 3×480×640 during training).

Fig. 6. Example video frames in the UESTC dataset.

2) The Driving System of the Autonomous Vehicle: The driving system of our autonomous vehicle is built on the Robot Operating System (ROS). Fig. 7 shows the autonomous vehicle and the block diagram of our driving system. As shown in Fig. 7(b), two ROS nodes (green blocks) are implemented in our driving system, where the driving model is embedded in the steering angle prediction node. The input of the prediction node is the image stream from the front view camera, and its output is the predicted steering angle. Another CAN analysis node is designed to convert the predicted steering angles into control messages. The driving system is implemented on an on-board computing platform with a single GeForce GTX Titan GPU, and is built on an FAW A70E electric car.

Fig. 7. The autonomous vehicle with the driving system: (a) the autonomous driving system; (b) block diagram of the driving system.

3) Results and Discussions: In the experiment on the real-time autonomous vehicle, we choose the MSINet with t+4 frames for model training and on-road testing. For safety considerations, the whole experiment lasted about 50 minutes and the vehicle drove about 4.2 km (average speed 5 km/h) on our campus. As in the experiments on Udacity, we calculate the RMSE between the steering angles predicted by our model and those of the human driver, with a result of 0.0544 rad. Fig. 8 shows a fragment of the whole experiment, comparing our model's prediction with the human driver. As shown in Fig. 8, our model is able to achieve good prediction compared with the human driver.

Fig. 8. Comparison of our model's prediction with the human driver during a fragment of the on-road experiment.
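To make the two-node layout described above concrete, the following rospy sketch shows a minimal steering-angle-prediction node: it subscribes to the front-view camera stream, buffers the latest n frames, and publishes a predicted angle that a separate CAN analysis node would convert into control messages. The topic names, message type, preprocessing, and the load_driving_model helper are assumptions for illustration, not the authors' actual driving system.

```python
from collections import deque

import rospy
import torch
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from std_msgs.msg import Float32

N_FRAMES = 10  # number of past frames fed to the driving model


class SteeringPredictionNode:
    def __init__(self, model):
        self.model = model.eval()              # trained end-to-end driving model
        self.bridge = CvBridge()
        self.frames = deque(maxlen=N_FRAMES)   # rolling buffer of past frames
        # Topic names are placeholders for the camera driver and the CAN
        # analysis node of the real driving system.
        self.pub = rospy.Publisher("/steering_angle", Float32, queue_size=1)
        rospy.Subscriber("/front_camera/image_raw", Image,
                         self.on_image, queue_size=1)

    def on_image(self, msg):
        # Convert the ROS image to a (3, H, W) float tensor in [0, 1];
        # resizing to the model's 480x640 input would also happen here.
        img = self.bridge.imgmsg_to_cv2(msg, desired_encoding="rgb8")
        frame = torch.from_numpy(img).permute(2, 0, 1).float() / 255.0
        self.frames.append(frame)
        if len(self.frames) < N_FRAMES:
            return                              # wait until n frames are buffered
        clip = torch.stack(list(self.frames)).unsqueeze(0)  # (1, n, 3, H, W)
        with torch.no_grad():
            angle = float(self.model(clip))     # predicted steering angle (rad)
        self.pub.publish(Float32(data=angle))   # consumed by the CAN analysis node


if __name__ == "__main__":
    rospy.init_node("steering_angle_prediction")
    # load_driving_model() is a hypothetical helper returning the trained MSINet.
    SteeringPredictionNode(load_driving_model())
    rospy.spin()
```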
IV. CONCLUSIONS AND FUTURE WORK

This paper has proposed a novel end-to-end driving model for steering angle prediction of autonomous vehicles. The proposed driving model is designed based on the Conv-LSTM neural network, combined with an MSI module for encoding the spatiotemporal information on multiple layers. In order to enhance the spatiotemporal features of the driving model for better steering angle prediction, we employ future sequential information in the model training process. The performance of the proposed driving model has been validated on both the public Udacity dataset and a real-time autonomous vehicle. Experimental results show that our model performs better than other existing methods on the public Udacity dataset, and achieves good steering angle prediction in real-time autonomous vehicle testing.

In the future, we will improve our model for smooth steering control, aiming to improve vehicle stability at higher speeds. Moreover, visualization of the proposed model will be considered in future work to further improve the performance of the driving model.

ACKNOWLEDGMENT

This work was made possible by support from the National Key Research and Development Program of China (2017YFB1302300, 2017YFB0102603) and the National Natural Science Foundation of China.
