Printed Letter Recognition Based on a Neural Network (基于神经网络的印刷体字母识别.doc)

Uploaded by: 精***; IP location: Guangdong; upload date: 2020-01-01; format: DOC; pages: 18; size: 434 KB


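Printed Letter Recognition Based on a BP Neural Network

1 Background

With the development of society, English is used ever more widely as an international common language, so a large volume of English documents has to be organised, queried and tabulated. An English letter recognition system can easily handle much of this work, which was previously hard to imagine automating.

Intelligent control is an emerging interdisciplinary field that outperforms conventional control in many respects. Within it, artificial neural networks imitate human neural networks and exhibit abilities such as perception and recognition, learning, association, memory and reasoning, so they have broad prospects. Artificial neural network theory is applied mainly in artificial intelligence, automatic control, pattern recognition, robotics, information processing and CAD/CAM. For example:

(1) Space science. Autopilot and navigation systems for aircraft and cars, flight path simulation, aircraft guidance, and optimised management of flight procedures.
(2) Control and optimisation. Robot motion control and the control of industrial and manufacturing processes, such as integrated circuit routing design and production flow control.
(3) Pattern recognition and image processing. Face recognition, speech recognition, fingerprint recognition, signature recognition, handwritten and printed character recognition, target detection and recognition, image restoration, image compression and so on.
(4) Intelligent information management systems. Stock price forecasting, real estate price forecasting, price forecasting for bulk commodities such as foreign exchange and gold, corporate financial analysis, and forecasting of earthquakes and other natural disasters.

The most central model among these is the back-propagation network (Back Propagation Network), or BP network for short. This report describes how the MATLAB toolbox is used to determine the number of hidden-layer neurons and to construct a BP neural network, how the network is trained with two sets of samples, and how the trained network is then used to recognise letters.

2 Introduction to the BP Network

The BP neural network is also called the error back-propagation neural network. It is a network model built by repeatedly adjusting the connection weights between nodes according to a feedback value. Its architecture consists of an input layer, a hidden layer and an output layer; depending on the problem, the hidden part may be a single layer or several layers.

The basic idea of the BP algorithm is that learning consists of two passes: forward propagation of the signal and backward propagation of the error. In the forward pass, an input sample enters at the input layer, is processed layer by layer by the hidden layers, and is passed to the output layer. If the actual output of the output layer differs from the desired output (the teacher signal), the algorithm switches to the backward pass. Error back-propagation passes the output error back through the hidden layers towards the input layer in some form and apportions it among all units of each layer. The resulting error signal of each unit is the basis for correcting that unit's weights. This cycle of forward signal propagation and weight adjustment by back-propagated error is repeated again and again. The continual adjustment of the weights is the learning (training) process of the network, and it continues until the output error has fallen to an acceptable level or a preset number of learning iterations has been reached.

3 System Design

A letter recognition system generally has three parts: preprocessing, feature extraction and a classifier. Preprocessing covers converting the picture from an analog image into a binarised and normalised one; feature extraction and classifier design are the core of the whole system. Each part is programmed separately as a callable function, and the functions are then called together, which keeps the program clear and convenient.

3.1 Overall block diagram of letter recognition

a. BP neural network training process
b. BP neural network recognition process
Figure 2.1 BP neural network recognition system

3.2 Preprocessing and feature extraction

This work uses Fourier descriptors and their inverse transform to binarise the picture and extract the letter outline. The result is then normalised into a 1×120 feature matrix, from which 60 points are selected to give a 1×60 matrix.

Feature extraction program:

    function FD = Feature_Building(RGB)
    % RGB = imread('d:\A.bmp');
    % figure(1), imshow(RGB)
    B = outline(RGB);
    % figure(2)
    % subplot(221), draw_outline(B);
    % title('outline of object');
    [m, n] = size(B);
    FD = fsd(B, 30, m, 4);

Here outline and fsd are the Fourier descriptor and inverse transform programs. The outline program:

    % Function for extracting outline of object; Q.K., 2008.4.29
    % Department of Automation, Tsinghua Univ. Beijing 100084, China.
    function outline = outline(RGB)
    I = rgb2gray(RGB);
    [junk, threshold] = edge(I, 'sobel');
    fudgeFactor = .5;
    BWs = edge(I, 'sobel', threshold * fudgeFactor);
    % Step 3: Dilate the image
    se90 = strel('line', 3, 90);
    se0 = strel('line', 3, 0);
    BWsdil = imdilate(BWs, [se90 se0]);
    % Step 4: Fill interior gaps
    BWdfill = imfill(BWsdil, 'holes');
    % Step 5: Remove connected objects on border
    BWnobord = imclearborder(BWdfill, 4);
    % Step 6: Smoothen the object
    seD = strel('diamond', 1);
    BWfinal = imerode(BWnobord, seD);
    BWfinal = imerode(BWfinal, seD);
    bw = bwareaopen(BWfinal, 30);   % remove connected components smaller than 30 pixels
    [B, L] = bwboundaries(bw, 'noholes');
    outline = B{1};

The fsd program is given in the program listing.

3.3 BP neural network structure

3.3.1 Number of input-layer neurons

The feature vector of the image is the input of the neural network, so the number of input-layer neurons equals the dimension of the feature vector: 1×60 = 60 input neurons.

3.3.2 Number of hidden-layer neurons

The number of hidden nodes has a very important influence on the learning and computing behaviour of the network and is key to whether the structure succeeds. With too few hidden nodes the network struggles with complex problems; with too many, training time rises sharply and the network may over-learn, which reduces its robustness to noise. Based on experiments, the number of hidden-layer neurons is set to 15.

3.3.3 Number of output-layer neurons

Because 26 capital English letters are to be recognised, the output is a 26×1 matrix, that is, the output layer has 26 neurons. When one of the 26 letters is fed to the network, the output is 1 at the corresponding position and 0 at all other positions. During recognition, the letter whose position has the largest output value is taken as the recognised letter.

3.3.4 Construction of the BP neural network

A feed-forward BP neural network is created with the function newff:

    net = newff(minmax(P), [S1, S2], {'logsig', 'logsig'}, 'trainlm');
    net.LW{2,1} = net.LW{2,1} * 0.01;
    net.b{2} = net.b{2} * 0.01;

Here minmax(P) gives the minimum and maximum of each of the network's 60 input elements, and P is the training sample set. S1 and S2 are the numbers of neurons in the hidden layer and the output layer. {'logsig', 'logsig'} sets the transfer function of each layer to the log-sigmoid activation function. The training function is trainlm.

3.4 Training the BP neural network

3.4.1 Training sample set and target set

After normalisation each letter picture becomes a 60×1 matrix, and a 60×26 matrix forms one training sample set. The target vectors are chosen so that, when a letter is fed to the network, the output neuron at the corresponding position is 1 and all other positions are 0. The target matrix is therefore the 26×26 identity matrix, with ones on the diagonal, which in MATLAB is:

    T = eye(26);

3.4.2 Network training

The hidden-layer neurons use the log-sigmoid transfer function logsig, and so do the output-layer neurons. The training function is trainlm, the performance function is sse, the maximum number of training epochs is 5000, and the performance goal is 0.05.

BP training program:

    net.performFcn = 'sse';          % performance function
    net.trainParam.goal = 0.05;      % performance goal
    net.trainParam.show = 20;        % display interval (epochs)
    net.trainParam.epochs = 5000;    % maximum number of training epochs
    net.trainParam.mc = 0.95;        % momentum constant
    [net, tr] = train(net, P, T);

[Figure: BP network training flowchart]

Training result with the first sample set:

    TRAINLM, Epoch 0/5000, SSE 169.303/0.05, Gradient 39.0748/1e-010
    TRAINLM, Epoch 20/5000, SSE 9.07917/0.05, Gradient 0.647529/1e-010
    TRAINLM, Epoch 40/5000, SSE 5.45171/0.05, Gradient 0.465044/1e-010
    TRAINLM, Epoch 60/5000, SSE 3.85999/0.05, Gradient 1.13736/1e-010
    TRAINLM, Epoch 80/5000, SSE 3.37108/0.05, Gradient 0.970379/1e-010
    TRAINLM, Epoch 100/5000, SSE 1.43394/0.05, Gradient 0.27961/1e-010
    TRAINLM, Epoch 120/5000, SSE 1.13878/0.05, Gradient 0.661835/1e-010
    TRAINLM, Epoch 140/5000, SSE 0.561939/0.05, Gradient 0.497918/1e-010
    TRAINLM, Epoch 160/5000, SSE 0.537153/0.05, Gradient 0.0963243/1e-010
    TRAINLM, Epoch 180/5000, SSE 0.518194/0.05, Gradient 0.00990168/1e-010
    TRAINLM, Epoch 200/5000, SSE 0.461637/0.05, Gradient 11.4576/1e-010
    TRAINLM, Epoch 206/5000, SSE 0.0350697/0.05, Gradient 0.265104/1e-010
    TRAINLM, Performance goal met.

After 206 training epochs the network error meets the goal.

[Figure: error curve for the first sample set]

Training result with the second sample set:

    TRAINLM, Epoch 0/5000, SSE 168.635/0.05, Gradient 33.7987/1e-010
    TRAINLM, Epoch 20/5000, SSE 3.28669/0.05, Gradient 40.5407/1e-010
    TRAINLM, Epoch 32/5000, SSE 0.0441687/0.05, Gradient 0.0844925/1e-010
    TRAINLM, Performance goal met.

After 32 training epochs the network error meets the goal.

[Figure: error curve for the second sample set]

3.5 Letter recognition

Everything described so far is the learning phase of the network. Once learning is finished, the network enters its working phase and can recognise letters. The program for recognising a single letter is:

    RGB = imread('D:\Program Files\MATLAB71\work\新建文件夹 1\A11.bmp');  % working phase; A11 is a slightly noisy picture of capital A
    FDB = Feature_Building(RGB);     % extract the letter outline features
    FDB = reshape(FDB, 1, 120);
    FDB = FDB(1:2:120);              % normalisation step: keep every other element (1x120 -> 1x60)
    [a, b] = max(sim(net, (FDB)'))   % letter recognition

Here a is the largest value produced by the output layer, and b is the row index of the recognised letter: b = 1 if the letter is recognised as A, b = 2 for B, and so on. The result is:

    a = 0.8316
    b = 1

so the letter A is recognised correctly.

    RGB = imread('D:\Program Files\MATLAB71\work\新建文件夹 1\B11.bmp');  % working phase; B11 is a slightly noisy picture of capital B
    FDB = Feature_Building(RGB);     % extract the letter outline features
    FDB = reshape(FDB, 1, 120);
    FDB = FDB(1:2:120);              % normalisation step: keep every other element (1x120 -> 1x60)
    [a, b] = max(sim(net, (FDB)'))   % letter recognition

The result is:

    a = 0.9741
    b = 2

so the letter B is also recognised correctly.

Two sets of samples are used to train the BP neural network and one set is used for recognition. The recognition program and its results are:

    load('D:\Program Files\MATLAB71\work\新建文件夹 1\index1.mat')
    for i = 1:26
        RGB{i} = imread(['D:\Program Files\MATLAB71\work\新建文件夹 1\', index1{i}]);
        FD{i} = Feature_Building(RGB{i});
        FD{i} = reshape(FD{i}, 1, 120);
        FD{i} = FD{i}(1:2:120);
        [a, b] = max(sim(net, (FD{i})'))
    end

Results:

    Image  Letter  a       b
    1      A       0.8316  1
    2      B       0.9741  2
    3      C       0.8805  3
    4      D       0.9315  4
    5      E       0.6114  5
    6      F       0.9755  6
    7      G       0.9715  7
    8      H       0.9780  8
    9      I       0.9770  9
    10     J       0.9958  10
    11     K       0.8759  11
    12     L       0.9610  12
    13     M       0.9695  13
    14     N       0.8119  8
    15     O       0.9718  15
    16     P       0.9752  5
    17     Q       0.9039  17
    18     R       0.5457  18
    19     S       0.8177  6
    20     T       0.9728  20
    21     U       0.2953  18
    22     V       0.8482  22
    23     W       0.9092  23
    24     X       0.9743  24
    25     Y       0.9534  25
    26     Z       0.9764  26

These results show that 22 letters are recognised and four letters (N, P, S, U) are not recognised correctly. To raise the recognition accuracy, the network should be trained on more samples, preferably including slightly noisy pictures.

Program listing

BP network training program:

    clc
    clear
    load('D:\Program Files\MATLAB71\work\新建文件夹 1\index.mat')
    for i = 1:52
        RGB{i} = imread(['D:\Program Files\MATLAB71\work\新建文件夹 1\', index{i}]);
        FD{i} = Feature_Building(RGB{i});
        FD{i} = reshape(FD{i}, 1, 120);
        FD{i} = FD{i}(1:2:120);
    end
    P1 = vertcat(FD{1:26})';    % 60x26, same as [(FD{1})' (FD{2})' ... (FD{26})']
    P2 = vertcat(FD{27:52})';   % 60x26, same as [(FD{27})' (FD{28})' ... (FD{52})']
    P = {P1; P2};               % cell array so that P{n} below selects one sample set
    T = eye(26);
    S1 = 15;
    S2 = 26;
    for n = 1:2   % learning phase
        net = newff(minmax(P{n}), [S1 S2], {'logsig' 'logsig'}, 'trainlm');
        net.LW{2,1} = net.LW{2,1} * 0.01;
        net.b{2} = net.b{2} * 0.01;
        net.performFcn = 'sse';
        net.trainParam.goal = 0.05;
        net.trainParam.show = 20;
        net.trainParam.epochs = 5000;
        net.trainParam.mc = 0.95;
        [net, tr] = train(net, P{n}, T);
    end

Recognition program: identical to the 26-letter recognition program listed in Section 3.5.

Fourier transform program:

    function FD = Feature_Building(RGB)
    % RGB = imread('d:\A.bmp');
    % figure(1), imshow(RGB)
    B = outline(RGB);
    % figure(2)
    % subplot(221), draw_outline(B);
    % title('outline of object');
    [m, n] = size(B);
    FD = fsd(B, 30, m, 4);

    % Function for extracting outline of object; Q.K., 2008.4.29
    % Department of Automation, Tsinghua Univ. Beijing 100084, China.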
    function outline = outline(RGB)
    I = rgb2gray(RGB);
    [junk, threshold] = edge(I, 'sobel');
    fudgeFactor = .5;
    BWs = edge(I, 'sobel', threshold * fudgeFactor);
    % Step 3: Dilate the image
    se90 = strel('line', 3, 90);
    se0 = strel('line', 3, 0);
    BWsdil = imdilate(BWs, [se90 se0]);
    % Step 4: Fill interior gaps
    BWdfill = imfill(BWsdil, 'holes');
    % Step 5: Remove connected objects on border
    BWnobord = imclearborder(BWdfill, 4);
    % Step 6: Smoothen the object
    seD = strel('diamond', 1);
    BWfinal = imerode(BWnobord, seD);
    BWfinal = imerode(BWfinal, seD);
    bw = bwareaopen(BWfinal, 30);   % remove connected components smaller than 30 pixels
    [B, L] = bwboundaries(bw, 'noholes');
    outline = B{1};

    function rFSDs = fsd(outline, H, b, bN)
    % Forward elliptical Fourier transform - see Kuhl FP and Giardina CR
    % "Elliptic Fourier features of a closed contour" Computer Graphics and
    % Image Processing 18:236-258 1982 for theory.
    % Returns a shape spectrum of input x,y data "outline" with
    % iNoOfHarmonicsAnalyse elements.
    % The output FSDs will be normalised for location, size and orientation
    % if bNormaliseSizeState and bNormaliseOrientationState are TRUE

    % Pre-calculate some constant arrays
    % n * 2 * pi
    % n^2 * 2 * pi^2
    % where n is the number of harmonics to be used in the analysis

    % H = iNoOfHarmonicsAnalyse
    % b = bNormaliseSizeState
    % [m n] = size(outline), b = m;
    % bN = bNormaliseOrientationState

    rTwoNPi = (1:1:H) * 2 * pi;
    rTwoNSqPiSq = (1:1:H) .* (1:1:H) * 2 * pi * pi;

    iNoOfPoints = size(outline,1) - 1;   % hence there is 1 more data point in outline than iNoOfPoints
    rDeltaX = zeros(iNoOfPoints+1,1);    % pre-allocate some arrays
    rDeltaY = zeros(iNoOfPoints+1,1);
    rDeltaT = zeros(iNoOfPoints+1,1);

    for iCount = 2 : iNoOfPoints + 1
        rDeltaX(iCount-1) = outline(iCount,1) - outline(iCount-1,1);
        rDeltaY(iCount-1) = outline(iCount,2) - outline(iCount-1,2);
    end

    % Calculate time differences from point to point - actually distances, but we are
    % carrying on the fiction of a point running around the closed figure at constant speed.
    % We are analysing the projections on to the x and y axes of this point's path around the figure
    for iCount = 1 : iNoOfPoints
        rDeltaT(iCount) = sqrt((rDeltaX(iCount)^2) + (rDeltaY(iCount)^2));
    end
    check = (rDeltaT ~= 0);   % remove zeros from rDeltaT, rDeltaX, rDeltaY
    rDeltaT = rDeltaT(check);
    rDeltaX = rDeltaX(check);
    rDeltaY = rDeltaY(check);

    iNoOfPoints = size(rDeltaT,1) - 1;   % we have removed duplicate points

    % now sum the incremental times to get the time at any point
    rTime(1) = 0;
    for iCount = 2 : iNoOfPoints + 1
        rTime(iCount) = rTime(iCount - 1) + rDeltaT(iCount-1);
    end

    rPeriod = rTime(iNoOfPoints+1);   % rPeriod defined for readability

    % calculate the A-sub-0 coefficient
    rSum1 = 0;
    for iP = 2 : iNoOfPoints + 1
        rSum2 = 0;
        rSum3 = 0;
        rInnerDiff = 0;
        % calculate the partial sums - these are 0 for iCount = 1
        if iP > 1
            for iJ = 2 : iP-1
                rSum2 = rSum2 + rDeltaX(iJ-1);
                rSum3 = rSum3 + rDeltaT(iJ-1);
            end
            rInnerDiff = rSum2 - ((rDeltaX(iP-1) / rDeltaT(iP-1)) * rSum3);
        end
        rIncr1 = ((rDeltaX(iP-1) / (2*rDeltaT(iP-1))) * (rTime(iP)^2 - rTime(iP-1)^2) + rInnerDiff * (rTime(iP) - rTime(iP-1)));
        rSum1 = rSum1 + rIncr1;
    end

    rFSDs(1,1) = ((1 / rPeriod) * rSum1) + outline(1,1);   % store A-sub-0 in output FSDs array - this array will be 4 x iNoOfHarmonicsAnalyse

    % calculate the a-sub-n coefficients
    for iHNo = 2 : H
        rSum1 = 0;
        for iP = 1 : iNoOfPoints
            rIncr1 = (rDeltaX(iP) / rDeltaT(iP)) * (cos(rTwoNPi(iHNo-1)*rTime(iP+1)/rPeriod) - cos(rTwoNPi(iHNo-1)*rTime(iP)/rPeriod));
            rSum1 = rSum1 + rIncr1;
        end
        rFSDs(1,iHNo) = (rPeriod / rTwoNSqPiSq(iHNo-1)) * rSum1;
    end   % for iHNo = 1 : ...

    rFSDs(2,1) = 0;   % there is no 0th order sine coefficient

    % calculate the b-sub-n coefficients
    for iHNo = 2 : H
        rSum1 = 0;
        for iP = 1 : iNoOfPoints
            rIncr1 = (rDeltaX(iP) / rDeltaT(iP)) * (sin(rTwoNPi(iHNo-1)*rTime(iP+1)/rPeriod) - sin(rTwoNPi(iHNo-1)*rTime(iP)/rPeriod));
            rSum1 = rSum1 + rIncr1;
        end
        rFSDs(2,iHNo) = (rPeriod / rTwoNSqPiSq(iHNo-1)) * rSum1;
    end   % for iHNo = 1 : ...

    % calculate the C-sub-0 coefficient
    rSum1 = 0;
    for iP = 2 : iNoOfPoints + 1
        rSum2 = 0;
        rSum3 = 0;
        rInnerDiff = 0;
        % calculate the partial sums - these are 0 for iCount = 1
        if iP > 1
            for iJ = 2 : iP-1
                rSum2 = rSum2 + rDeltaY(iJ-1);
                rSum3 = rSum3 + rDeltaT(iJ-1);
            end
            rInnerDiff = rSum2 - ((rDeltaY(iP-1) / rDeltaT(iP-1)) * rSum3);
        end
        rIncr1 = ((rDeltaY(iP-1) / (2*rDeltaT(iP-1))) * (rTime(iP)^2 - rTime(iP-1)^2) + rInnerDiff * (rTime(iP) - rTime(iP-1)));
        rSum1 = rSum1 + rIncr1;
    end

    rFSDs(3,1) = ((1 / rPeriod) * rSum1) + outline(1,2);   % store C-sub-0 in output FSDs array - this array will be 4 x iNoOfHarmonicsAnalyse

    % calculate the C-sub-n coefficients
    for iHNo = 2 : H
        rSum1 = 0;
        for iP = 1 : iNoOfPoints
            rIncr1 = (rDeltaY(iP) / rDeltaT(iP)) * (cos(rTwoNPi(iHNo-1)*rTime(iP+1)/rPeriod) - cos(rTwoNPi(iHNo-1)*rTime(iP)/rPeriod));
            rSum1 = rSum1 + rIncr1;
        end
        rFSDs(3,iHNo) = (rPeriod / rTwoNSqPiSq(iHNo-1)) * rSum1;
    end   % for iHNo = 1 : ...

    rFSDs(4,1) = 0;   % there is no 0th order sine coefficient

    % calculate the D-sub-n coefficients
    for iHNo = 2 : H
        rSum1 = 0;
        for iP = 1 : iNoOfPoints
            rIncr1 = (rDeltaY(iP) / rDeltaT(iP)) * (sin(rTwoNPi(iHNo-1)*rTime(iP+1)/rPeriod) - sin(rTwoNPi(iHNo-1)*rTime(iP)/rPeriod));
            rSum1 = rSum1 + rIncr1;
        end
        rFSDs(4,iHNo) = (rPeriod / rTwoNSqPiSq(iHNo-1)) * rSum1;
    end   % for iHNo = 1 : ...

    % the non-normalised coefficients are now in rFSDs
    % if we want the normalised ones, this is where it happens
    if (b == 1) | (bN == 1)
        % rTheta1 is the angle through which the starting position of the first
        % harmonic phasor must be rotated to be aligned with the major axis of
        % the first harmonic ellipse
        rFSDsTemp = rFSDs;
        rTheta1 = 0.5 * atan(2 * (rFSDsTemp(1,2) * rFSDsTemp(2,2) + rFSDsTemp(3,2) * rFSDsTemp(4,2)) / ...
            (rFSDsTemp(1,2)^2 + rFSDsTemp(3,2)^2 - rFSDsTemp(2,2)^2 - rFSDsTemp(4,2)^2));
        % calculate the partially normalised coefficients - normalised for
        % starting point
        for iHNo = 1 : H
            rStarFSDs(1,iHNo) = cos((iHNo-1) * rTheta1) * rFSDsTemp(1,iHNo) + sin((iHNo-1) * rTheta1) * rFSDsTemp(2,iHNo);
            rStarFSDs(2,iHNo) = -sin((iHNo-1) * rTheta1) * rFSDsTemp(1,iHNo
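
Returning to the recognition results of Section 3.5: the winning indices b can be decoded back into letters and compared with the letters that were shown, which gives a quick check of the reported accuracy. The following is a minimal sketch in base MATLAB and is not part of the original programs. It assumes that the 26 test images were presented in alphabetical order, as the reported misses (N, P, S, U) imply. The variable names expected, predicted, wrong, missedLetters and confusedWith are illustrative only.

```matlab
% Winning output indices b reported in Section 3.5 for the 26 test images
b = [1 2 3 4 5 6 7 8 9 10 11 12 13 8 15 5 17 18 6 20 18 22 23 24 25 26];
expected = 1:26;                       % assumption: image i shows the i-th letter

predicted = char('A' + b - 1);         % index -> letter (1 -> 'A', 26 -> 'Z')
wrong = find(b ~= expected);           % positions of the misrecognised images
numCorrect = numel(b) - numel(wrong);  % number of correctly recognised letters

missedLetters = char('A' + wrong - 1); % letters that were shown
confusedWith = predicted(wrong);       % letters the network answered instead

fprintf('%d of %d correct\n', numCorrect, numel(b));
for k = 1:numel(wrong)
    fprintf('%c recognised as %c\n', missedLetters(k), confusedWith(k));
end
```

With the values above, this reports 22 of 26 correct and lists N recognised as H, P as E, S as F and U as R, in line with the summary given in Section 3.5.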
