A Convolutional Network for Joint Deraining and Dehazing from A Single Image for Autonomous Driving in Rain

Hao Sun, Marcelo H. Ang Jr. and Daniela Rus

Abstract—In this paper, we focus on a rain removal task from a single image of the urban street scene for autonomous driving in rain. We develop a Convolutional Neural Network which takes a rainy image as input and directly recovers a clean image in the presence of rain streaks and the atmospheric veiling effect (haze, fog, mist) caused by distant rain streak accumulation. We propose a synthetic dataset containing images of urban street scenes with different rain intensities, orientations and haziness levels for training and evaluation. We evaluate our method quantitatively and qualitatively on the synthetic data. Experiments show that our model outperforms state-of-the-art methods. We also test our method qualitatively on real-world data. Our model is fast: it takes 0.05 s for an image of 1024 × 512. It can be seamlessly integrated with existing image-based high-level perception algorithms for autonomous driving in rain. Experiment results show that our deraining method largely improves semantic segmentation and object detection for autonomous driving in rain.

I. INTRODUCTION

Driving in rain is challenging for both humans and autonomous vehicles. For autonomous vehicles, vision-based perception functions, e.g. object detection, recognition, and semantic/instance segmentation, require accurate feature learning on images of urban street scenes. As the most common bad-weather condition, rain drastically degrades the visual quality of images and blocks the background objects. These visibility degradations harm image feature learning and make many computer vision systems likely to fail. Beyond autonomous driving, many other applications such as outdoor surveillance systems also degrade when presented with images containing artifacts such as rain and haze. All of this makes removing the undesired visual effects of rain from images a highly desirable capability.

While rain streaks create a blurring effect which occludes and deforms the background scene, distant rain streak accumulation generates an atmospheric veiling effect which further reduces visibility. In addition, rain and fog often occur at the same time, especially during heavy rain. The effect of fog is significant in images of street scenes and degrades the high-level perception functions of autonomous vehicles.

In the last few decades, many methods have been proposed for rain removal from a single image. Despite their success, most of them suffer from several limitations:

Hao Sun is with the Singapore-MIT Alliance for Research and Technology (SMART). Marcelo H. Ang Jr. is with the Department of Mechanical Engineering, National University of Singapore (NUS). Daniela Rus is with the Massachusetts Institute of Technology (MIT).

Fig. 1. A deraining example of our method, which removes rain streaks and haze to improve visibility for autonomous driving in rain. Left: the input image with rain effects. Right: the output image of our method.

- The effects of rain are complex. Most deraining methods [1]–[5] only address the effect of individual rain streaks, without considering the haze/fog caused by rain streak accumulation. Method [5] includes the global atmospheric light in its model but does not solve for it in its algorithm.
- Methods [6]–[8] concatenate their proposed deraining models with existing dehazing methods, which makes them not end-to-end.
- Existing methods are slow and cannot be used for real-time applications. Furthermore, they are computationally expensive, and many of them only work on low-resolution images, which is not sufficient for high-level perception.

Considering these limitations, we propose an end-to-end Convolutional Neural Network (CNN) for deraining from a single image for autonomous driving in rain. Our main contributions are:

1) Our network takes a rainy image of the urban street scene as input and directly recovers a clean image. Our model removes both individual rain streaks and the atmospheric veiling effect (haze, fog, mist) caused by distant rain streak accumulation.
2) Our network performs deraining and dehazing from the global context of an image. Compared to past methods which estimate the parameters of the rainy image model separately, we optimize the model parameters jointly and obtain a better solution.
3) Our method is fast. For an image of 1024 × 512, the processing time is only 0.05 s. It can be easily integrated with existing image-based high-level perception algorithms for autonomous driving in rain.
4) Based on the CityScapes dataset [9] and the Foggy CityScapes dataset [10], we propose a dataset containing synthetic rainy images of urban street scenes for training and evaluation.

Figure 1 shows an example of the deraining results of our model.

II. RELATED WORK

During the last few decades, many methods have been proposed for rain removal. They can be divided into two groups: video-based methods and single image-based methods. In particular, single image-based methods can be categorized into traditional methods and deep learning methods. We briefly review these methods as follows.

A. Video-Based Rain Removal

For video-based methods, rain can be removed by leveraging temporal information and analyzing the differences between adjacent frames. Compared to single image-based methods, it is relatively easy to remove rain from videos [11]–[14]. In [12], the average pixel values from neighboring frames are used to remove rain streaks. Garg and Nayar [12], [15], [16] use photometric properties and temporal dynamics to describe rain streaks. In [11], rain streaks are detected and removed by minimizing the registration error between frames. A review of video-based deraining is presented in [17].

B. Traditional Single Image-Based Rain Removal

Although video-based methods work well, they rely heavily on the temporal information of videos. In this paper, we focus on rain removal from a single image, which is much more challenging and leaves much room for improvement. In traditional methods, kernel regression, non-local mean filtering, dictionary learning, Gaussian mixture models (GMM), and low-rank representation are widely used. In [18], kernel regression and non-local mean filtering are used for rain streak detection and removal. Dictionary learning is used in methods [3], [19]–[22]. Method [3] uses discriminative sparse coding for rain removal. Method [20] decomposes a rainy image into a rain layer and a non-rain layer, and then sparse coding is applied to remove rain streaks from the rain layer. In [23], a GMM is used as a prior to separate the rain streaks.
C. Deep Learning Single Image-Based Rain Removal

Recently, many deep learning methods have been proposed for single image-based deraining and achieve superior performance [2], [4]–[8], [24]. The idea is to learn a mapping between input rainy images and their corresponding ground truths. Method [24] focuses on raindrop and dirt removal. Methods [2], [8] apply guided image filtering [25]–[27] to decompose a rainy image into a detail layer and a base layer. A CNN is applied for deraining in the detail layer, and the derained detail layer and the base layer are then combined to generate the output. These methods are far from real-time performance.

Rain and fog often occur at the same time. In [6], the authors propose a multi-task deep learning model that learns the rain streak mask and the appearance of rain streaks. They conduct deraining first, then dehazing using [28], and then deraining again. Method [7] proposes a convolutional and recurrent neural network for single image deraining; it performs deraining first, then uses the dark channel method [29] for haze removal. Many dehazing methods such as [28], [30], [31] also have potential use for deraining. Generative Adversarial Network (GAN) methods such as [32] could also be used for deraining; however, they usually generate unexpected artifacts in the output and are not fast enough for real-time processing.

III. APPROACH

Our network takes a rainy image as input and directly recovers a clean image. In this section, we first present the physical model for the rainy image with the effects of nearby rain streaks and distant fog/haze. Then we present the CNN model for estimating the model parameters and generating the clean image. We introduce a dataset containing synthetic rainy images of urban street scenes for training and evaluation. Finally, we integrate our model with existing semantic segmentation and object detection solutions for high-level perception tasks of autonomous driving in rain.

A. Rainy Image Model

Our rainy image model is developed based on the classical atmospheric scattering model of hazy image generation [33], [34]. Under rainy conditions, rain streaks have various shapes and directions which occlude, deform and blur the background scene. Meanwhile, distant rain streaks accumulate and generate a haze/fog effect which further reduces visibility. Our rainy image model takes all these effects into consideration and can be formally written as:

I(x) = (J(x) + R(x)) t(x) + A (1 - t(x))    (1)

where I(x) is the input rainy image and x indexes pixels in the image, J(x) is the clean image to be recovered, R(x) models the nearby rain streak effect, A is the global atmospheric light which models the haziness level, and t(x) is the medium transmission which describes the portion of light that is not scattered and reaches the camera. t(x) is defined as:

t(x) = e^{-β d(x)}    (2)

where d(x) is the distance from the scene point to the camera, and β is the scattering coefficient of the atmosphere. t(x) decreases as d(x) increases, indicating that the longer the distance from the scene point to the camera, the more rain streaks accumulate, and thus the higher the haziness level and the lower the visibility.

In order to recover J(x) from I(x), past methods estimate the values of t(x), R(x) and A separately using techniques such as sparse dictionaries and Gaussian mixture models. They regard A as a global constant and set its value heuristically. However, the global atmospheric light A and the medium transmission t(x) are correlated, and they should be learned together.
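To make the formation model concrete, here is a minimal NumPy sketch of Equations (1) and (2). The array names (clean, rain_streaks, depth) and the default parameter values are illustrative assumptions, not values from the paper.

```python
import numpy as np

def synthesize_rainy_image(clean, rain_streaks, depth, A=0.8, beta=0.01):
    """Apply the rainy image model of Eq. (1)-(2).

    clean        -- J(x), clean image, floats in [0, 1], shape (H, W, 3)
    rain_streaks -- R(x), additive rain streak layer, same shape as clean
    depth        -- d(x), per-pixel scene depth, shape (H, W)
    A            -- global atmospheric light (illustrative value)
    beta         -- atmospheric scattering coefficient (illustrative value)
    """
    # Eq. (2): transmission decays exponentially with depth.
    t = np.exp(-beta * depth)[..., np.newaxis]      # shape (H, W, 1)
    # Eq. (1): attenuate scene plus streaks, then add atmospheric veiling.
    rainy = (clean + rain_streaks) * t + A * (1.0 - t)
    return np.clip(rainy, 0.0, 1.0)
```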
Estimating them independently yields a suboptimal solution: the value of A is usually overestimated and overexposure can occur [28], [30].

To learn the global atmospheric light A jointly with the medium transmission t(x), inspired by [30], we absorb t(x), R(x) and A into two new variables K1(x) and K2(x). Equation (1) is re-expressed as:

J(x) = (1/t(x)) I(x) - A (1/t(x)) + A - R(x)    (3)

Equation (3) is then formulated as:

J(x) = (K1(x) - K2(x)) I(x) - (K1(x) - K2(x))    (4)

where

K1(x) = [ (1/t(x)) (I(x) - A) + A ] / (I(x) - 1),
K2(x) = R(x) / (I(x) - 1)    (5)

In this way, t(x) and A are learned jointly by estimating the value of K1(x), while R(x) is learned through K2(x).

B. Convolutional Joint Rain and Haze Removal

To learn the values of K1(x) and K2(x), we design a CNN model which takes the rainy image as input, outputs the optimal values of K1(x) and K2(x) from the global context of the image, and generates the clean image through end-to-end learning. Our network contains two branches: the K1(x) branch takes the rainy image I(x) as input and outputs the optimal value of K1(x), and the K2(x) branch takes I(x) as input and outputs the optimal value of K2(x). After estimating the values of K1(x) and K2(x), the network generates the clean image using Equation (4).

The K1(x) branch learns the medium transmission t(x) and the global atmospheric light A jointly by estimating the value of K1(x), where both t(x) and A depend on the global scene. We use dilated convolution to increase the receptive field of our network in order to learn contextual information. While a larger receptive field can encode more contextual features for learning, it leads to coarse features for details. To solve this, we fuse features at different resolutions by concatenating network responses under different receptive fields.

The K2(x) branch learns the value of R(x) by estimating the value of K2(x). Similar to the K1(x) branch, multi-receptive-field fusion is used for learning both global and local information. We use batch normalization after each convolutional layer to alleviate the problem of proper initialization. After each batch normalization layer, the leaky rectified linear unit is used as the activation function. After generating the values of K1(x) and K2(x), we perform elementwise addition and multiplication to generate the clean image. More details of the network are shown in Figure 2.

During training, we optimize the network parameters by minimizing the reconstruction error (mean squared error) between the generated clean image and the clean ground truth. We add additional supervision when training the K2(x) branch: minimizing the reconstruction error between the last convolutional map of the K2(x) branch and the ground-truth rain mask. The ground-truth rain mask is generated when we synthesize the rainy data.

Fig. 2. Network architecture. Taking a rainy image as input, our network directly recovers a clean image. The left branch is the K1(x) branch while the right branch is the K2(x) branch. The convolution parameters are shown as (number of input filters, number of output filters, kernel size, stride, padding, dilation). [The per-layer convolution specifications, multi-scale concatenations, rain streak mask supervision, and the final reconstruction J(x) = (K1(x) - K2(x)) I(x) - (K1(x) - K2(x)) listed in the original figure are omitted here.]
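As a sketch of how the final reconstruction stage can be wired up, the following hedged PyTorch fragment shows the two-branch structure; the branch internals are simplified stand-ins (the actual layer configuration is given in Fig. 2), and only the Equation (4) recombination follows the paper directly.

```python
import torch
import torch.nn as nn

class JointDerainDehazeNet(nn.Module):
    """Two-branch sketch of the model; branch internals are stand-ins."""
    def __init__(self):
        super().__init__()
        # Placeholder branches; the paper uses multi-receptive-field dilated
        # convolutions with concatenation, BN, and leaky ReLU (see Fig. 2).
        self.k1_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.LeakyReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )
        self.k2_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.LeakyReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, rainy):
        k1 = self.k1_branch(rainy)   # jointly encodes t(x) and A
        k2 = self.k2_branch(rainy)   # encodes the rain streak term R(x)
        k = k1 - k2
        # Eq. (4): J(x) = (K1(x) - K2(x)) I(x) - (K1(x) - K2(x))
        clean = k * rainy - k
        return clean, k2             # k2 map is also supervised by the rain mask
```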
C. Datasets

It is very expensive to collect a large number of real-world clean/rainy image pairs for training and benchmarking our model. Based on the CityScapes dataset [9] and the Foggy CityScapes dataset [10], we synthesize images with rain and haze for our experiments. The CityScapes images represent urban street scenes acquired by a vehicle camera in different cities. The Foggy CityScapes dataset is built on top of the CityScapes dataset and simulates a collection of foggy images generated by its proposed fog simulation. For each image of the CityScapes dataset, the Foggy CityScapes dataset provides three versions with three different fog densities, with constant attenuation coefficients of 0.005, 0.01 and 0.02 (light, medium, and heavy haziness levels). For each fog variant, we use Photoshop to create two versions of varied rain streak intensity (light and heavy rain). For each rain intensity, we further vary the rain streak orientation to obtain two versions. In total, for each clean CityScapes image, we generate 12 (3 × 2 × 2) rainy images of different haziness levels, rain streak intensities and orientations.

The Foggy CityScapes dataset provides 8925 images for training, 1500 images for validation and 4575 images for testing. After augmenting the Foggy CityScapes dataset with the different rain variants, our synthetic dataset contains 35700 images for training, 6000 images for validation and 18300 images for testing.

D. Training

We implement our approach in PyTorch. We train our network on the synthetic data using Adam optimization with an initial learning rate of 10^{-3}, weight decay of 0.0005, and momentum of 0.9, on an NVIDIA Titan X (Pascal) GPU for 140 epochs. We divide the initial learning rate by 10 at 60 and 80 epochs, respectively. The original images of our dataset have a dimension of 2048 × 1024; due to computation limitations, we resize them to 1024 × 512 for both training and testing. We train our network only on images with the heavy rain intensity, one rain streak orientation, and all haziness levels (light/medium/heavy). We evaluate our network on images with all rain orientations, both rain intensities (light/heavy) and all haziness levels (light/medium/heavy). So there are 8925 images for training, 6000 for validation and 18300 images for testing.

TABLE I
QUANTITATIVE DERAINING EVALUATIONS ON SYNTHETIC RAINY IMAGES.

Metrics | Fu et al. [8] | RESCAN [5] | AOD-Net [30] | Ours
PSNR    | 12.83         | 18.63      | 15.32        | 20.50
SSIM    | 0.61          | 0.84       | 0.83         | 0.84

TABLE II
QUANTITATIVE DERAINING EVALUATIONS ON SYNTHETIC IMAGES OF LIGHT RAIN.

Metrics | Fu et al. [8] | RESCAN [5] | AOD-Net [30] | Ours
PSNR    | 14.03         | 20.02      | 17.70        | 21.56
SSIM    | 0.64          | 0.88       | 0.87         | 0.88

TABLE III
QUANTITATIVE DERAINING EVALUATIONS ON SYNTHETIC IMAGES OF HEAVY RAIN.

Metrics | Fu et al. [8] | RESCAN [5] | AOD-Net [30] | Ours
PSNR    | 9.56          | 16.53      | 13.02        | 19.16
SSIM    | 0.52          | 0.79       | 0.79         | 0.80

E. Integration with High-Level Perception Algorithms

Robust perception under rainy conditions is important for the safety and sustainability of autonomous driving. Past deraining/dehazing methods focus only on evaluating image restoration performance, and there is little work studying the impact of rain and haze removal on high-level perception tasks.
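A minimal PyTorch sketch of the training setup described above, reusing the JointDerainDehazeNet stub from the earlier sketch. The dataloader is an assumed placeholder, and the loss weighting is an assumption; the optimizer settings and the learning-rate drops at epochs 60 and 80 follow the text (the stated momentum of 0.9 is expressed as Adam's beta1).

```python
import torch
from torch import nn, optim

# Assumed placeholders: `train_loader` yields (rainy, clean_gt, rain_mask_gt)
# batches of 1024x512 images; JointDerainDehazeNet is the stub defined above.
model = JointDerainDehazeNet().cuda()
optimizer = optim.Adam(model.parameters(), lr=1e-3,
                       betas=(0.9, 0.999), weight_decay=5e-4)
# Divide the learning rate by 10 at epochs 60 and 80; 140 epochs in total.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 80], gamma=0.1)
mse = nn.MSELoss()

for epoch in range(140):
    for rainy, clean_gt, rain_mask_gt in train_loader:
        rainy, clean_gt, rain_mask_gt = (
            rainy.cuda(), clean_gt.cuda(), rain_mask_gt.cuda())
        clean_pred, rain_mask_pred = model(rainy)
        # Reconstruction MSE plus the auxiliary rain-mask supervision on the
        # K2 branch; the 1.0 weight is an assumption, not from the paper.
        loss = mse(clean_pred, clean_gt) + 1.0 * mse(rain_mask_pred, rain_mask_gt)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```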
Method [30] integrates its proposed dehazing method with Faster R-CNN [35] for object detection, but that approach focuses only on hazy conditions. In this paper, we focus on single image-based rain removal for urban street scenes, with the goal of applying this deraining model in real-world autonomous driving. Consequently,