Edge-Preserving Camera Trajectories for Improved Optical Character Recognition on Static Scenes with Text

Rohan Katoch¹ and Jun Ueda¹

Abstract: Camera systems in fast motion suffer from motion blur, which degrades image quality and can have a large impact on the performance of visual tasks. The degradation in image quality can be mitigated through image reconstruction. Blur effects and the resulting reconstruction performance are highly dependent on the point spread function resulting from camera motion. This work focuses on the motion planning problem for a camera system with boundary conditions on time and position, with the objective of improving the performance of optical character recognition. Tuned edge-preserving trajectories are shown to result in higher recognition accuracy when compared to inverse error and linear trajectories. Simulation and experimental results provide quantitative measures to verify edge preservation and greater recognition performance.

Index Terms: Computer vision, motion planning, image processing.

I. INTRODUCTION

Camera sensors are widely used by robotic systems as a rich source of information about the surrounding environment, helping to resolve a broad array of tasks including localization, object recognition, path planning, and optical character recognition (OCR) [1], [2]. Performing these tasks successfully in unstructured environments requires processing visual signals at a high level and in an efficient manner. The quality of the captured signal has, in general, a significant impact on task performance.

Camera motion is one source of degradation in visual signals, especially when considering fast-moving systems. This setting may occur, for example, when scanning a large scene with sufficient image resolution in a short period of time [3], [4], [5]. In this case it is desirable to capture images sequentially without making frequent stops.

Images captured by systems in motion suffer from degradation in the visual signal due to two main factors:
(i) camera motion and (ii) scene motion [6]. The combination of these effects results in the phenomenon known as motion blur. Camera sensors have a finite exposure duration to allow sufficient development of charge in the array of photosensitive elements. Any relative motion between camera and scene during this exposure period causes different point sources to be integrated on an individual element. This results in motion blur that can be described by the path a point source takes over the array of elements.

Fig. 1. An example of an application of motion-controlled cameras: a drone taking a route panorama while flying along a street.

This work was supported by the National Science Foundation under Grant No. 1662029.
¹Rohan Katoch and Jun Ueda are with the Bio-Robotics and Human Modeling Lab, George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA. rkatoch3@gatech.edu

Images blurred under camera motion exhibit blur effects that are spatially invariant, as the motion is applied to the sensor globally. In contrast, objects in motion within a scene can cause spatially varying local blur. Only the case of camera motion is considered in this work; the scene being imaged is therefore assumed to be stationary, with negligible depth variations (orthographic scene). Under stationarity assumptions, a blurry image can be represented as the convolution of a point spread function (PSF) with a latent image, where the PSF, also known as a blur kernel, represents how the energy of an ideal point source disperses over the sensor array. This blur model is applicable only to cameras with a global shutter in planar motion and does not apply to cameras with a rolling shutter or in non-planar motion. Non-planar camera motion can exist; however, since exposure times are relatively short, the motion during exposure can be assumed to be planar. Also note that both the displacement and the velocity of the camera trajectory are required to determine the resulting PSF.

Mitigating motion blur effects is
possible using three methods: (i) controlling optical parameters (exposure, aperture, focal length), (ii) controlling camera motion, and (iii) motion deblurring.

Short exposure can be achieved with high-speed cameras, which have very fast shutter speeds and sensors with high photosensitivity. This eliminates the possibility of significant motion occurring at the exposure time scale. While capable of producing high-quality images, these camera systems are costly, require very high data rates, and do not operate well in low-intensity lighting conditions. Other methods that involve active control of optical parameters include coded exposure (flutter shutter) [7] and coded aperture [8]. These methods use intelligent control of optical parameters during image capture, followed by post-processing. The ability to actively control optical parameters is not present in most commercially available cameras. Furthermore, these methods generally require a stationary camera and are not applicable under the constraints of the motion planning problem considered in this paper.

IEEE Robotics and Automation Letters (RAL) paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019. Copyright 2019 IEEE.

Motion compensation involves providing feed-forward control signals to stabilize a camera sensor relative to the scene being captured [9]. Prior knowledge of how the imaging device will move without compensation is required in order to use this method. If performed successfully, the imaging sensor should remain stationary with respect to the scene. In practice this is rarely the case, due to disturbances or imperfect state knowledge, and additional image processing is required to further restore the latent image. Furthermore, the use of this method is constrained by hardware limitations of the compensation mechanism, i.e., speed and displacement limits.

Motion deblurring refers to the use of image processing techniques to remove blur effects after an image has
been captured. In general, this is an ill-posed inverse problem that requires either prior knowledge or feature-based information for application [10], [11], [12], [13]. Currently, all motion deblurring algorithms that deal with spatially invariant blur first evaluate or estimate the PSF and then utilize it in a deconvolution process.

The motion planning problem for a fast-moving camera system capturing an image is considered, with the objective of improving OCR performance. This is relevant when using a camera for scanning or taking route panoramas of a scene with text, as in Figure 1 [14]. Instead of focusing on stabilizing the camera relative to the scene, as in motion compensation, a scenario is considered where a system needs to move from its current state to a desired final state within a fixed time horizon. The problem is then to determine the camera trajectory that meets the boundary conditions while generating images that preserve salient features for OCR. The chosen camera trajectory can be used to evaluate the expected PSF prior to image capture, reducing computation time.

Prior literature has dealt with goal-oriented blind image reconstruction [15], [16], [17] and trajectory generation for enhanced image reconstruction [18], [13], [19], but not goal-oriented trajectory generation. While blind reconstruction methods offer improved OCR performance without requiring prior knowledge of camera motion, the computation time they require prohibits their use in real-time applications. The authors have previously studied the use of dynamics-based deblurring for optical character recognition [20], which provides real-time performance and outperforms other image reconstruction methods. However, that method attempts to stabilize the camera while taking the image rather than generating a specific trajectory. Levin et al. propose moving the camera using parabolic trajectories, which generate PSFs that are invariant to constant-velocity motion in one direction [18]. This method was extended by Cho et al.
to include all planar motion directions by taking two orthogonal parabolic exposures of the same scene [21]. Bando et al. propose using circular trajectories, which generate PSFs that are orientation invariant for linear motion [19].

This work first demonstrates the positive correlation between edge features and recognition accuracy, and proposes a parametric trajectory which can be tuned for edge preservation. The concept of residence time distributions (RTDs), introduced in prior work by the authors [22], is used as a mapping between trajectories and PSFs. Using RTDs, it was shown that inverse error trajectories result in Gaussian PSFs (see Appendix A). Images of natural scenes blurred under inverse error trajectories and then reconstructed resulted in lower mean square error (MSE) values when compared to parabolic and linear trajectories. Furthermore, Gaussian PSFs result in reconstructed images that are robust to additive noise. OCR performance, however, is dependent on image gradients, which are not well preserved by Gaussian PSFs. In contrast to inverse error functions, which generate PSFs robust to noise, this work investigates PSFs that preserve edges in text images.

II. MOTION BLUR ANALYSIS

A. Problem Setting

Consider a camera motion system in planar space $X \subset \mathbb{R}^2$ with position $\mathbf{x} = (x, y) \in X$. There are two objectives to accomplish: (i) reach the final position $\mathbf{x}_f$ at $t_f$, and (ii) capture image $I$ with a desired PSF. The orientation of the camera does not change. The exposure window $\mathcal{T} = [t, t + T]$ with duration $T$ is the time period during which the camera sensor captures information, and is considered to be prior information.

B. Image Formation

The formation of images in digital cameras can be modeled as a noisy integration process with two sources of noise: (i) shot noise and (ii) thermal noise [11]. Shot noise refers to the variance in the number of photons captured by photosensitive elements over time and is proportional to the square root of the signal intensity on a per-pixel basis, while thermal noise refers to the general
uncertainty of reading an electrical signal, which is thermally agitated. Shot noise is modeled as a stationary Poisson process $\mathcal{P}$ with intensity $\lambda$, and thermal noise as an additive zero-mean Gaussian process $N$ with variance $\sigma^2$. The Poisson process generates a blurry image $B$, to which Gaussian noise $N$ is added, resulting in the captured image $I$. Therefore, for latent image $L$ and exposure period $\mathcal{T}$, the captured image is described by (1) and (2).

\[ B = \mathcal{P}\left( \int_{\mathcal{T}} L(\mathbf{x}(t))\, dt \right), \qquad N \sim \mathcal{N}(0, \sigma^2) \tag{1} \]

\[ I = B(\mathbf{x}(t), \mathcal{T}) + N \tag{2} \]

Fig. 2. Example of a planar camera trajectory for a motion-controlled camera.

The exposure position $\bar{\mathbf{x}}$ is defined to be the average position of the trajectory $\mathbf{x}(t)$ during the exposure window $\mathcal{T}$:

\[ \bar{\mathbf{x}} = \frac{1}{T} \int_{\mathcal{T}} \mathbf{x}(t)\, dt \tag{3} \]

C. Non-blind Motion Deblurring

The reconstructed image $\hat{L}(\mathbf{x})$ is generated by deconvolution with an estimated blur kernel $\hat{K}$ representing the expected PSF. The real kernel $K$ is assumed to accurately represent the time-dependent motion blur process. This assumption holds when considering spatially invariant blur. The image model in (2) can now be expressed using the convolution operator, as shown in (4).

\[ I = K(\mathbf{x}(t), \mathcal{T}) * L(\mathbf{x}) + N \tag{4} \]

The PSF can be directly estimated from the command signals generated and then used for deconvolution. This process is called dynamics-based motion deblurring, as described in prior work by the authors [23], [24].

D. Residence Time Distributions

Consider a planar camera trajectory $\mathbf{x}(t)$ defined on exposure window $\mathcal{T}$, as shown in Figure 2. Residence time is the length of time spent at a particular position by the camera while moving along the trajectory, providing the mapping $r(\mathbf{x}): \mathbb{R}^2 \to \mathbb{R}$, which will be called an RTD. To guarantee a continuous and smooth trajectory, the RTD must be twice differentiable. The trivial case in which the camera is stationary leads to an RTD consisting of a Dirac delta function at position $\bar{\mathbf{x}}$ with value $T$. While this results in a clear image, such a trajectory would not meet the desired kinematic constraints. There are an infinite number of possible cyclic trajectories that map to the same RTD. Restricting
the analysis to noncyclic trajectories, that is, $\dot{\mathbf{x}} \neq 0 \ \forall t \in \mathcal{T}$, is necessary to guarantee a unique solution. The expression for $r(\mathbf{x})$ can then be found using (5).

\[ r(\mathbf{x}) = \frac{1}{\lVert \dot{\mathbf{x}}(\mathbf{x}) \rVert_2} \tag{5} \]

The relation above can be used to construct an RTD numerically or analytically. For the purposes of image reconstruction, the RTD is useful due to its proportionality with the expected PSF. In fact, a PSF can be generated by normalizing an RTD and discretizing it according to the kernel size. The compact support of the distribution is determined by the minimum and maximum position values of the trajectory during exposure. For monotonically increasing trajectories, the minimum and maximum values are defined by the displacement parameter $\Delta x$ and the exposure position $\bar{x}$.

III. EDGE PRESERVING TRAJECTORIES

The problem of character recognition in images has been widely studied and is an active area of research [25], [26]. Many approaches have been proposed, including unsupervised learning, convolutional neural networks, conditional random fields, and belief propagation [27]. In general, the text recognition process consists of character segmentation and classification. These processes perform best when the image has the following properties: (i) sharp edges, (ii) high contrast, (iii) well-aligned characters, and (iv) low pixel noise [28], [29]. While contrast and character alignment cannot be affected by camera motion, the presence of sharp edges and noise can. Post-processing methods for enhancing these features exist and are known as edge-preserving smoothing filters [30]. Examples of such filters include median, bilateral, and anisotropic diffusion filters [31]. However, these filters are spatially varying and therefore cannot be replicated by camera trajectories. They also require extensive computation and are not efficient enough for real-time use [32]. Instead, camera trajectories are desired that preserve edges and satisfy position and velocity constraints. This paper proposes to use fourth-order polynomial trajectories, which meet the desired constraints and
generate PSFs that preserve edges.

Since camera exposure periods are generally very short, the following trajectory generation method assumes that the mobile platform carrying the camera follows a straight-line path. The two-dimensional case can therefore be reduced to one dimension, where the new axis is aligned with the tangent of the general path shown in Figure 2.

A. Polynomial Trajectories

Fourth-order polynomial trajectories are considered as a candidate for generating tuned edge-preserving PSFs while meeting the desired kinematic constraints. This choice is made so that there are just enough parameters to meet the boundary conditions, plus one free parameter for tuning. These trajectories are expected to generate PSFs that have low variance, resulting in high RTD values near the exposure position. Equation (6) presents a parametric fourth-order polynomial with five parameters $c_i$. There are two position constraints and two velocity constraints that need to be satisfied, leaving one free parameter that must be defined. This last parameter is resolved by constraining the acceleration at the exposure position $\bar{x}$ to be equal to a user-defined value $\alpha \in \mathbb{R}$. All parameters can now be found using (7)-(9).

Fig. 3. Fourth-order polynomial trajectories for various values of $\alpha$ (axes: time in s, position in mm).

Fig. 4. RTDs corresponding to fourth-order polynomial trajectories for various values of $\alpha$ (axes: position in mm, residence time in ms).

\[ x(t) = \sum_{i=0}^{4} c_i (t - \bar{t})^i \tag{6} \]

\[ c_0 = \bar{x}, \qquad c_1 = \bar{v}, \qquad c_2 = \frac{\alpha}{2} \tag{7} \]

\[ c_3 = \frac{2}{T^2} \left( \frac{8 \Delta x}{T} - \alpha T - 6\bar{v} - 2 v_e \right) \tag{8} \]

\[ c_4 = \frac{2}{T^3} \left( -\frac{12 \Delta x}{T} + \alpha T + 8\bar{v} + 4 v_e \right) \tag{9} \]

The RTDs of fourth-order polynomials exhibit a double-peak structure, with maximal values at neighboring pixels rather than at the central pixel. As $\alpha$ is increased, the distance between the RTD peaks reduces and the maximal value increases. For the trajectories in Fig. 3, this trend is demonstrated in Fig. 4. Note that small changes in trajectories can result in large changes in RTDs due
to the inverse relationship between velocity and residence time.

B. Image Gradients

Text edges can be characterized by the gradient information of pixel intensity values. Image gradients can be evaluated using Sobel operators [33], which determine the directional change in image intensity values. Text images generally exhibit high gradient intensities and low gradient variances. Therefore, the mean and variance of image gradients can be used as metrics to characterize the degradation of textual edge information. The mean is evaluated as the average edge intensity over the two-dimensional image, while the variance is evaluated over the average edge intensities across each row.

Fig. 5. Image of an ideal edge (unit step) with one-dimensional representation for (a) clear image, (b) image blurred by linear RTD, (c) image reconstructed by Wiener deconvolution (axes: pixel, normalized intensity).

Fig. 6. Image of an ideal edge (unit step) with one-dimensional representation for (a) clear image, (b) image blurred by 4th-order polynomial RTD, (c) image reconstructed by Wiener deconvolution.

Fig. 7. Image of an ideal edge (unit step) with one-dimensional representation for (a) clear image, (b) image blurred by Gaussian RTD, (c) image reconstructed by Wiener deconvolution (axes: pixel, normalized intensity).

C. Spectral Analysis

In the frequency domain, image reconstruction by deconvolution is essentially a regularized inverse operation which involves the kernel $K$. The regularization term is dependent on the method used. For Wiener deconvolution, the reconstruction filter $R$ is given as

\[ R = \frac{K^{*} S}{|K|^{2} S + N} \tag{10} \]

where $K^{*}$ is the complex conjugate of the kernel, $S$ is the
mean power spect
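
As a rough illustration of the pipeline described in Sections II-D and III-C, the sketch below bins a sampled one-dimensional trajectory into per-pixel residence times, normalizes the result into a PSF, blurs a unit-step edge, and restores it with a Wiener-type filter. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`psf_from_trajectory`, `blur`, `wiener_deconvolve`), the constant-velocity test trajectory, the circular (FFT-based) convolution, and the constant noise-to-signal ratio `nsr` standing in for the ratio of noise to signal power in the Wiener filter are all hypothetical choices made for illustration.

```python
import numpy as np


def psf_from_trajectory(x, dt, pixel_size=1.0):
    """Accumulate residence time per pixel and normalize it into a 1-D PSF.

    x  : sampled camera positions during the exposure (same units as pixel_size)
    dt : sampling period; each sample contributes dt of residence time
    """
    pixels = np.floor(x / pixel_size).astype(int)
    pixels -= pixels.min()              # first visited pixel becomes index 0
    rtd = np.bincount(pixels) * dt      # residence time spent in each pixel
    return rtd / rtd.sum()              # normalize so the kernel sums to one


def blur(signal, psf):
    """Circular convolution of a 1-D signal with the PSF (noise-free)."""
    K = np.fft.fft(psf, n=signal.size)
    return np.real(np.fft.ifft(np.fft.fft(signal) * K))


def wiener_deconvolve(blurred, psf, nsr):
    """Wiener-type filter R = conj(K) / (|K|^2 + nsr).

    Assumption: nsr is a constant noise-to-signal power ratio.
    """
    K = np.fft.fft(psf, n=blurred.size)
    R = np.conj(K) / (np.abs(K) ** 2 + nsr)
    return np.real(np.fft.ifft(np.fft.fft(blurred) * R))


# Hypothetical linear (constant-velocity) trajectory: 8 ms exposure at
# 1000 pixels/s, sampled at bin midpoints so no sample sits on a pixel boundary.
dt = 1e-6
t = (np.arange(8000) + 0.5) * dt
x = 1000.0 * t
psf = psf_from_trajectory(x, dt)

# Periodic unit-step test signal (two ideal edges).
edge = np.zeros(256)
edge[64:192] = 1.0

blurred = blur(edge, psf)
restored = wiener_deconvolve(blurred, psf, nsr=1e-3)

mse_blurred = np.mean((blurred - edge) ** 2)
mse_restored = np.mean((restored - edge) ** 2)
print("PSF:", psf)
print("MSE blurred:", mse_blurred, "MSE restored:", mse_restored)
```

A constant-velocity trajectory spends equal time in every pixel it crosses, so the resulting PSF is a uniform box. Replacing `x` with samples of another monotonic trajectory changes only the RTD, which makes it possible to compare how different trajectories affect the restored edge.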