Speech Signal Acquisition and Processing: Graduation Thesis (语音信号的采集与处理毕业论文.doc)

Uploaded by: 简***; IP location: Hubei; upload date: 2020-02-04; format: DOC; pages: 68; size: 1.63 MB


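Chongqing University of Posts and Telecommunications, Undergraduate Graduation Design (Thesis)

Speech Signal Acquisition and Processing

Contents

Foreword
Chapter 1 Introduction
  Section 1 Research Background and Significance
  Section 2 Research Status
  Section 3 Development Directions
  Section 4 Chapter Summary
Chapter 2 System Scheme Design
  Section 1 System Performance Specifications
  Section 2 Scheme Design
  Section 3 Chapter Summary
Chapter 3 System Hardware Design
  Section 1 Overall System Block Diagram
    1. Overall system block diagram
    2. Functional module design
  Section 2 Processor Module
    1. 51 microcontroller
    2. SPCE061A chip
    3. Power supply module
    4. Keypad circuit
  Section 3 Speech Acquisition Module
  Section 4 Speech Processing Chip
  Section 5 Display Module
  Section 6 Control Module
  Section 7 Chapter Summary
Chapter 4 System Software Design
  Section 1 System Software Structure
  Section 2 Main Program Flowchart
  Section 3 ISD1730 Speech Acquisition
  Section 4 Sunplus Microcontroller Speech Processing
    1. Sunplus audio compression coding
    2. Speech playback flowchart
  Section 5 LCD Display Subroutine
  Section 6 Chapter Summary
Chapter 5 System Testing
  Section 1 Simulation Testing
  Section 2 Hardware Testing
  Section 3 System Testing
  Section 4 Chapter Summary
Conclusion
Acknowledgments
References
Appendix
  1. English original
  2. Chinese translation
  3. Engineering design drawings
    Scheme A: 51 microcontroller
    Scheme B: Sunplus 61 microcontroller
  4. Source code
  5. Other
    Selected simulation screenshots

Foreword

Speech recognition has grown into a comprehensive technology that draws on acoustics, linguistics, digital signal processing, statistical pattern recognition, and other disciplines. Over several decades it has progressed from speaker-dependent, small-vocabulary, isolated-word recognition to speaker-independent, large-vocabulary recognition of natural speech, with remarkable results. In recent years the technology has advanced markedly and has gradually moved from the laboratory to the commercial market. As it continues to develop and mature, it will find more applications, and it is expected to be widely used in industrial control, household appliances, communications services, automotive electronics, medical services, and other fields.

The use of speech in robot control is especially prominent. As industrial technology advances, robots play an increasingly important role in production, and voice-controlled robots matter correspondingly more. Voice control has advantages that other control methods cannot match. Although other control algorithms are already quite mature, they still leave gaps when a robot has to handle unexpected situations in real applications, and introducing voice control can avoid some of these problems. A voice robot is one whose actions are controlled by speech, using speaker-dependent speech recognition to implement the voice control.

This thesis describes a competition design for a voice-controlled robot. To achieve better control of the target, two schemes are designed for comparison. Scheme A is based on the 51 microcontroller and the ISD1730 speech chip. Scheme B is based on the Sunplus (凌阳) speech-processing microcontroller. Comparing the two yields a scheme better suited to voice control of robots.

Chapter 1 Introduction

Section 1 Research Background and Significance

Speech is the most basic, most natural, and most direct form of language. Speech recognition begins by converting the sound waves that propagate through the air into an analog electrical signal that carries the speech information and records the physical properties of the sound wave. Speech signal processing is built on phonetics and digital signal processing, where digital signal processing means enhancing, compressing, filtering, transforming, and recognizing discrete signals by digital means. The development of speech signal processing has gone through roughly three stages:

Embryonic stage (mid-20th century): research relied mainly on phonetic knowledge, extracting characteristic parameters and using them to model human speech production in order to achieve simple speech processing.
Development stage (1970s): advances in integrated circuits and computer technology laid the foundation for speech recognition, and speech processing developed considerably and became more mature.
Practical stage (1980s to the present): the development of very-large-scale integrated circuits and the widespread use of PCs promoted computer technology and artificial intelligence, and with them speech recognition. Speech processing has become steadily more commercial and practical.

In recent years speech recognition has advanced markedly and has gradually moved from experiment to commercial use. Experts predict that within the next few years it will enter industrial control, household appliances, communications equipment manufacturing, automotive electronics, consumer electronics, the service sector, and other fields. The fields that speech recognition draws on include signal processing, pattern recognition, probability theory and information theory, the mechanisms of speech production and hearing, and artificial intelligence.

A robot is a mechanical and electronic device with some human-like traits. As technology develops, robots are becoming more intelligent and more human-like. A robot is defined as a machine that can perceive its environment, learn, and apply a degree of logical judgment to the outside world. Speech will inevitably become one of the most natural and most convenient means of communication between humans and robots. A voice robot is one whose actions are controlled by speech, using speaker-dependent speech recognition to implement the voice control [5].

Section 2 Research Status

Research on speech recognition began around the 1950s, when AT [...]

[...]
2. Wait for TPUD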
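(the power-up delay).
3. Send the SETPLAY command with address 00.
4. Send the PLAY command.
The device starts playback from address 00. When EOM occurs, an interrupt is raised immediately and playback stops.

To record from address 00, use the following sequence:
1. Send the POWER UP command.
2. Wait for TPUD (the power-up delay).
3. Send the POWER UP command.
4. Wait for 2 × TPUD.
5. Send the SETREC command with address 00.
6. Send the REC command.
The device records from address 00 until OVF (end of memory) occurs, at which point recording stops.

Instruction table

Instruction   5-bit control code   Operation summary
POWERUP       00100                Power up; the device is operational after TPUD
SET PLAY      11100                Start playback from the specified address; must be followed by PLAY for playback to continue
PLAY          11110                Play from the current address until EOM or OVF
SET REC       10100                Start recording from the specified address; must be followed by REC for recording to continue
REC           10110                Record from the current address until OVF or a stop
SET MC        11101                Start fast-forward from the specified address; must be followed by MC for fast-forward to continue
MC            11111                Fast-forward until EOM; if there is no further message, enter the OVF state
STOP          0X110                Stop the current operation
STOP PWRDN    0X01X                Stop the current operation and power down
RINT          0X110                Read status: OVF and EOM

3. SPI port control bits

Figure 4-3 SPI timing diagram

4. SPI control register

The SPI control register controls every function of the device, such as recording and playback, message search (fast-forward), power-up and power-down, starting and stopping operations, and ignoring the address pointer.

Section 4 Sunplus Microcontroller Speech Processing

1. Sunplus audio compression coding

The data volume in speech coding is calculated as:

data volume = (sampling frequency × quantization bits) / 8 × number of channels

where the division by 8 converts bits to bytes.

The purpose of compression coding is to store and transfer data efficiently, that is, to represent and transmit sound at the lowest data rate that still preserves a given sound quality. Compression is necessary because uncompressed audio data is very large in practice, which makes transmitting or storing it impractical. The data is therefore compressed by predicting the signal trend and removing redundant information, so that fewer resources carry more information.

The common kinds of audio compression coding are:

Waveform coding: converts the time-domain signal directly into digital code and tries to keep the reconstructed waveform the same shape as the original speech signal. The source text lists its characteristics as a high compression ratio, heavy computation, and modest sound quality at low cost, which repeats the description of parametric coding. As the discussion of hybrid coding below notes, waveform coding is the high-quality option, and it pays for that quality with a comparatively high bit rate.
Parametric coding: also called source coding. It extracts characteristic parameters from the source signal in the frequency domain or another orthogonal transform domain and converts them into digital code for transmission. Its characteristics are a high compression ratio, heavy computation, and modest sound quality, at low cost.
Hybrid coding: uses both parametric and waveform coding techniques. Advances in computing gave speech coding research a powerful tool, and large-scale and very-large-scale integrated circuits provided the basis for implementing it. Since the 1980s speech coding has made substantial progress and produced a new generation of methods, namely hybrid coding. It combines waveform coding and parametric coding, overcomes the weaknesses of each, and tries to keep the high quality of waveform coding together with the low bit rate of parametric coding.

2. Speech playback flowchart

This project uses the Sunplus SACM_S480 audio format, which is a hybrid coding method of this kind and combines the advantages of parametric and waveform coding. The algorithm has a compression ratio of 80:3, so it stores a large amount of audio. Its sound quality lies between A2000 and S240, and it is suitable for speech playback. The main program flow for SACM_S480 in automatic mode is shown in the figure.

Figure 4-4 S480 automatic playback flowchart

The related API functions are:

int SACM_S480_Initial(int Init_Index)    initialization
void SACM_S480_ServiceLoop(void)    fetch speech data and fill the decoding queue
void SACM_S480_Play(int Speech_Index, int Channel, int Ramp_Set)    play
void SACM_S480_Stop(void)    stop playback
void SACM_S480_Pause(void)    pause playback
void SACM_S480_Resume(void)    resume after a pause
void SACM_S480_Volume(Volume_Index)    volume control
unsigned int SACM_S480_Status(void)    get the module status
Call F_FIQ_Service_SACM_S480    interrupt service function

3. Sunplus speech API functions

The Sunplus SPCE061A packages speech recognition as a module that is accessed through API calls. The C-language definitions of the API are in the file BSRSD.H, and the assembly-format definitions are in BSRSD.H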
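Commonly used API functions are:

RAM initialization: int BSR_DeleteSDGroup(int SDGroupNo)
Training: int BSR_Train(int World, int TrainMode)
Recognizer initialization: int BSR_InitRecognizer(int AudioSource)
Get the recognition result: int BSR_GetResult(void)
Stop recognition: void BSR_StopRecognizer(void)
Enable real-time monitoring: void BSR_EnableCPUIndicator(void)

4. Speaker-dependent voice command recognition

Speaker-dependent recognition means that the speech model is trained by a single person. Recognition accuracy is high for the trainer's voice but comparatively low for other speakers, who may not be recognized at all. When a speaker-dependent system is chosen, the API library supplied with the microcontroller is enough to implement user-defined voice commands and the spoken replies to play back.

[Figure 4-5 Flowchart of speaker-dependent speech recognition. Nodes: start; initialization; play prompt; train commands; start recognition; enable real-time monitoring; call speech recognition; play recognition result; main loop function; get recognition result]

Section 5 LCD Display Subroutine

The controller inside the LCD1602 module has 11 control instructions, as shown in Table 10-14.

No.  Instruction                      RS  R/W  D7  D6  D5  D4  D3   D2   D1   D0
1    Clear display                    0   0    0   0   0   0   0    0    0    1
2    Cursor home                      0   0    0   0   0   0   0    0    1    *
3    Entry mode set                   0   0    0   0   0   0   0    1    I/D  S
4    Display on/off control           0   0    0   0   0   0   1    D    C    B
5    Cursor or display shift          0   0    0   0   0   1   S/C  R/L  *    *
6    Function set                     0   0    0   0   1   DL  N    F    *    *
7    Set CGRAM address                0   0    0   1   character generator RAM address
8    Set DDRAM address                0   0    1   display data RAM address
9    Read busy flag and address       0   1    BF  counter address
10   Write data to CGRAM or DDRAM     1   0    data to write
11   Read data from CGRAM or DDRAM    1   1    data read

Reading, writing, and all screen and cursor operations on the LCD1602 are carried out by programming these instructions. (1 is a high level, 0 is a low level, and * marks a don't-care bit.)

Instruction 1: clear display. Instruction code 01H; the cursor returns to address 00H.
Instruction 2: cursor home. The cursor returns to address 00H.
Instruction 3: cursor and display mode setting. I/D sets the cursor movement direction: high moves right, low moves left. S sets whether all text on the screen shifts: high enables the shift, low disables it.
Instruction 4: display on/off control. D switches the whole display: high is on, low is off. C switches the cursor: high shows the cursor, low hides it. B controls cursor blinking: high blinks, low does not.
Instruction 5: cursor or display shift. When S/C is high the displayed text moves; when it is low the cursor moves.
Instruction 6: function set. DL selects the bus width: high selects the 8-bit bus and low the 4-bit bus (the source text states the reverse, which does not match HD44780-compatible controllers). N selects the number of lines: low is one line, high is two lines. F selects the font: low is 5x7 dot characters, high is 5x10 dot characters.
Instruction 7: set the character generator RAM (CGRAM) address.
Instruction 8: set the DDRAM address.
Instruction 9: read the busy flag and cursor address. BF is the busy flag: high means busy, and the module cannot accept commands or data; low means not busy.
Instruction 10: write data.
Instruction 11: read data.

The chip timing table is as follows:

Operation          Input                                                      Output
Read status        RS = L, R/W = H, E = H                                     D0-D7 = status word
Write instruction  RS = L, R/W = L, D0-D7 = instruction code, E = high pulse  none
Read data          RS = H, R/W = H, E = H                                     D0-D7 = data
Write data         RS = H, R/W = L, D0-D7 = data, E = high pulse              none

The read and write timing is shown in the figures.

Figure 4-6 Read operation timing
Figure 4-7 Write operation timing

[Figure 4-8 Display flowchart. Nodes: start; initialization; set line 1 position; write line 1 data; set line 2 position; write line 2 data]

Section 6 Chapter Summary

Software is a key factor in whether a system runs properly. This chapter first described the overall structure of the system software, then focused on the software for the speech chip used for speech processing, explained how the Sunplus microcontroller is used for speech processing, and finally covered the design of the LCD display subroutine. Together these guide the writing of the source code and prepare for the system debugging that follows.

Chapter 5 System Testing

Section 1 Simulation Testing

At the very start of the design, before the circuits were laid out, the circuit simulation software Multisim was used to simulate the circuits and test their performance. After the hardware and software design was complete, the simulation software Proteus was used to simulate the whole system and check whether it met the design requirements, which made modifications easier.

Software overview

Multisim is a simulation tool released by National Instruments (NI) of the United States that runs on Windows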
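and is suited to the design of board-level analog and digital circuits. It supports both graphical schematic entry and hardware description language entry, and it has rich simulation and analysis capabilities.

Proteus is an EDA tool published by Labcenter Electronics of the United Kingdom. Besides the simulation functions of other EDA tools, it can simulate microcontrollers and their peripheral devices, and the source describes it as the best tool currently available for that purpose. Although its promotion in China has only just begun, it is already popular with microcontroller hobbyists, teachers of microcontroller courses, and engineers who develop microcontroller applications. Proteus covers schematic capture, code debugging, co-simulation of the microcontroller with its peripheral circuits, and one-click switching to PCB design, so it supports the complete path from concept to product. The source describes it as the only design platform that combines circuit simulation, PCB design, and virtual model simulation. Its processor models support 8051, HC11, PIC10/12/16/18/24/30, dsPIC33, AVR, ARM, 8086, and MSP430, with Cortex and DSP series processors announced for 2010 and further families to follow. For compilation it supports IAR, Keil, MPLAB, and other compilers.

At the start of the design, some of the analog circuits can be built inside the simulation tool and simulated with the resources the software provides, which guides the design of the hardware circuits.

Section 2 Hardware Testing

Once the hardware platform has been designed and built, it must be tested. The first step is the electrical connections: a multimeter is used to check every wire for short circuits and open circuits, and the chosen components are checked for damage after soldering.

Section 3 System Testing

System testing comprises unit tests and a whole-system test. The unit tests cover the control chip, the speech chip, and the display. The control chip test checks that the microcontroller works properly, that the oscillator circuit runs, and that the chip can control the peripheral circuits. The speech chip test checks that the chip works properly, including in push-button mode. The display test checks that the LCD shows the designed output under the normal control program, with the contrast adjusted so that the brightness sits at a suitable level.

Section 4 Chapter Summary

After the hardware and software design is complete, the system must be tested. This chapter covered that testing, including simulation early in the design and joint hardware and software testing later, with modifications made according to the test results so that the system functions are achieved.

References

[1] Li Jingjiao. Embedded Speech Technology and Applications of the Sunplus 16-bit Microcontroller [M]. Beijing: Beihang University Press, 2003.
[2] Zhang Peiren, Zhang Zhijian, Gao Xiufeng. Principles and Applications of 16-bit Single-chip Microprocessors: Sunplus SPCE061A [M]. Beijing: Tsinghua University Press, 2005.
[3] Lou Ranmiao, Li Guangfei. 51 Series Microcontroller Design Examples, 2nd ed. [M]. Beijing: Beihang University Press, 2005.
[4] Suzuki Masaomi. Transistor Circuit Design [M]. Beijing: Science Press, 2004.
[5] Liu Yu, Ma Yanli, Dong Beibei. An overview of speech recognition technology [J]. Computer CD Software and Applications, 2010(5): 98-99.
[6] Liu Huiqiang. Smooth control of multiple servos based on the Sunplus 61A board microcontroller [J]. Technical Exchange, 2008, 10(30): 58-59.
[7] Sun Xingwei, Jia Chunmei. Research and design of a positioning system based on isolated-word speech recognition [J]. Journal of Ningbo University of Technology, September 2010, Vol. 22, No. 3.
[8] Liu Kangkang, Ke You'an, Lin Maoyong. Design of a continuous speech acquisition system with variable sampling rate [J]. Data Acquisition and Application, March 1991, Vol. 6, No. 1.
[9] Chen Faxin, Chen Yajun. Principle and composition of a real-time speech acquisition and processing board based on the ADSP2181 [J]. Journal of Data Acquisition and Processing, 1999, Vol. 14, No. 1.
[10] Baidu Baike.
[11] Baidu Baike.
[12] Dou-Suk Kim, Soo-Young Lee. Intelligent judge neural network for speech recognition [J]. Neural Processing Letters, Vol. 1, No. 1, 17-20, 1994.
[13] Qi Ziyuan, Xie Guihai, Liu Yi. Design and implementation of a real-time speech signal acquisition and processing system [J]. Computer Engineering and Applications, 2005(9): 105-107.
[14] Zhang Zhiyong, Song Yang. Design and implementation of an embedded voice robot [J]. Journal of Changchun Normal University (Natural Science Edition), October 2008, Vol. 27, No. 5: 39-41.
[15] Yu Tiecheng. The current state of speech recognition [J]. Telecom Market, May 2005: 36-37.

Appendix

1. English original

Intelligent Judge Neural Network for Speech Recognition

Abstract

An intelligent judge neural network (IJNN) is developed to make decisions out of contradictory arguments, which may come from different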
classifiers with different characteristics and/or input features. For speech recognition applications, a multi-layer perceptron classifies the word as a spectro-temporal pattern, while a neural prediction model or hidden control neural network relies on the dynamic nature of the speech signal. The judge accepts input values from the lower-level neural network classifiers and provides ruling verdicts. Two intelligent judges have been investigated. The neuro-judge rules by extracting decision rules from training data, i.e., disputes between the two classifiers, while the fuzzy judge just utilizes min-max operations. The IJNN demonstrates better recognition rates. More importantly, its performance is much less sensitive to the choice of training data.

1 Introduction

Classification of complex patterns by adaptive learning makes neural networks very attractive for speech recognition applications. Also, due to their inherent parallelism, neural networks can take advantage of special hardware for real-time applications. A number of neural network models have been successfully applied to speech recognition problems [1]. Many of them regard the speech signal as a spectro-temporal pattern and utilize the classification function of neural networks with proper time alignment, while only a few utilize the dynamic nature of the speech signal. The former includes the multi-layer perceptron (MLP) [2], the Self-Organizing Feature Map [3], and Time-Delay Neural Networks [4]. These approaches assume that separate utterances of the same word should follow similar paths in the feature space and only the time taken to traverse the path should differ. The latter regards the speech signal as the output of a nonlinear dynamic system and tries to model the system with recurrent neural networks. Recurrent connections may come from the hidden layer [5, 6] or the output layer [7]. This time dynamics may also be modelled by hidden control neural networks (HCNN), which combine the MLP with the state transition of the hidden Markov model [8, 9].

Although both approaches
have been quite successful, they had started with different natures of the speech signal and showed advantages and disadvantages. It has been shown that spectro-temporal pattern approaches are better for certain phonemes, while dynamic temporal flow approaches provide better results for the others. In this paper we report a hierarchical neural network approach which combines both natures of the speech signal and demonstrates improved performance both in recognition rates and in robustness to training data. Unlike other modular neural networks, which utilize sub-modules with identical architecture for different sets of patterns [10], each lower-level classifier submodule here has a unique architecture to look at the problem with different insights.

2 Intelligent judge neural network architecture

The Intelligent Judge Neural Network (IJNN) is composed of a lower-level classifier module and an upper-level verdict module. The lower-level classifier module consists of several neural network submodules which try to classify the input patterns based on different aspects and features. For the speaker-independent isolated word recognition applications we choose only two classifier submodules, i.e., the multilayer perceptron (MLP) [2] for spectro-temporal patterns and the neural prediction model (NPM) [7] or hidden control neural networks (HCNN) [8, 9] for nonlinear system dynamics. As shown in Figure 1, output values of these two neural network classifiers are fed to the verdict module to combine both characteristics.

The lower-level MLP submodule classifies input speech signals into M words. One bipolar output neuron with sigmoidal nonlinearity is assigned to each word. For time alignment we adopted a simple trace segmentation scheme, which provides reasonable performance without serious computation time [11]. This simple time normalization procedure removes variations of speech periods, especially for steady, long-pronounced vowels.

The neural prediction model (NPM) is a recurrent neural network where
the output value is fed back to the input. Dynamics of each word is modelled by 10 NPMs in series, where each NPM identifies a subphoneme.

The hidden control neural network (HCNN) is a recurrent neural network with Viterbi segmentation and is quite similar to the HMM. The neural value x(t) represents a state of the speech signal at time t, and the network may provide a transition from one state to another at each time interval. To control the state transitions, another vector c(t) is added in the input layer and consists of hidden control neurons. For isolated word recognition applications, one HCNN is trained to identify the dynamics of one word category. Since the identification error is a function of both the synaptic weights and the control vector c(t), the adaptive learning for minimum identification error consists of re-estimation of the synaptic weights and segmentation of the control vector at each iteration epoch.

Provided each classifying submodule has a small misclassification rate, the possibility of misclassification by both submodules becomes much smaller. The upper module serves as an intelligent judge and provides a verdict for disputes between the classifier submodules. Two different intelligent judges are investigated. The neuro-judge itself is an MLP to extract the decision rules from dispute cases. The fuzzy judge first gets the smaller values from the two classifier modules for each pattern class and then selects the class with the maximum value. The output values of the lower-level classifiers are fed to the input of the upper-level judge network. For recognition of M words, the judge modules have 2M inputs and M outputs.

3 Experiments for Korean digit recognition

We had applied the IJNN to a Korean digit recognition application. Although only 10 words, from 0 to 9, are aimed for recognition, it turned out to be a very tough problem. All the Korean digits have only one syllable, which does not have enough features for accurate classification. To make things worse, some of them are very close to each other. For
example, number 1, pronounced as "il", and number 7, "chil", are only different at the initial consonant. Number 1 "il" and number 2 "i", and also number 3 "sam" and number 4 "sa", are only different at the last consonant. We had collected 60 sets of speech data from 12 speakers (5 sets each) and used 30 sets (6 speakers) for training and the other 30 sets for testing. This small number of training data also poses a very tough generalization problem. To check the sensitivity of the recognition system to training data, 3 experiments are conducted by randomly selecting different training data sets.

In the first experiment we use the IJNN with the MLP and NPM lower-level classifiers. Ten cepstrum coefficients are used as input features. The number of inputs for the upper-level judge is 20. All 3 neural modules have 10 outputs. The number of neurons for the hidden layer is set to 10. Although both the lower-level MLP and NPM classifiers are trained successfully, recognition rates for test patterns are poor, especially for the NPM network. The poor performance of the NPM may come from incorrect assignments of the sub-modules to sub-phonemes. The neuro-judge MLP is trained by error back-propagation with the same training data used for the lower-level classifiers. Before being applied to the inputs of the upper-level judge, the NPM output values are mapped into the [-1, 1] region. By applying the neuro-judge, as shown in Table 1, the misclassification rates are greatly reduced and much less sensitive to the choice of training set.

In the second experiment the IJNN with the MLP and HCNN lower-level classifiers is used. For the input features, 14 delta-cepstrum coefficients and 1 delta power magnitude are used. As shown in Table 2, the lower-level MLP and HCNN classifiers show 98.4% and 92.1% correct average recognition rates for the test data, respectively. The relatively low recognition rates of the HCNN may be attributed to the two-stage process, which first segments the word into intervals and later
calculates error at the subregions. Errors at the segmentation stage can never be corrected in this model. After training the judge MLP, the recognition rates increase to 98.8% on average. Although the reduction of the misclassification rate is only 25%, it clearly demonstrates the usefulness of the proposed IJNN. The fuzzy judge does a better job than the neuro-judge, i.e., a 98.9% average recognition rate, but the difference is too small to draw a conclusion. Actually, the MLP alone has a very small 1.6% misclassification rate, and it is difficult to increase the recognition rate in this range.

It is also worth noting that deviations of the recognition rates among different training sets become much smaller for the IJNNs. We believe this lower sensitivity to training data comes from higher generalization capability due to the combination of spectro-temporal pattern aspects and dynamic characteristics of the speech signal. In practical applications training data sets are always limited and no a priori knowledge is given about the validity of the data, so this low sensitivity to training data is extremely important.

Performance of this IJNN may be further improved by several techniques. First, performance of the lower-level classifiers can be improved by using more training data. The lower-level classifier networks and the upper-level judge network may also be trained with different data, which actually generates disputes among lower-level classifiers during training of the judge network. Most im
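
The fuzzy judge in Section 2 of the paper above takes, for each word class, the smaller of the two lower-level classifier outputs and then selects the class whose value is largest. The following is a minimal sketch of that min-max rule in C. The function name `fuzzy_judge`, the use of `double` arrays, and the tie-breaking rule (the lowest index wins) are assumptions made for illustration and do not come from the paper.

```c
#include <stddef.h>

/*
 * Min-max fuzzy judge (illustrative sketch; the name and types are hypothetical).
 * out_a and out_b hold the outputs of the two lower-level classifiers,
 * one value per word class. For each class the smaller output is kept (min),
 * and the index of the class with the largest kept value is returned (max).
 * Ties go to the lowest index; with num_classes == 0 the function returns 0.
 */
size_t fuzzy_judge(const double *out_a, const double *out_b, size_t num_classes)
{
    size_t best_class = 0;
    double best_score = 0.0;

    for (size_t k = 0; k < num_classes; ++k) {
        /* min step: a class is only as credible as its weaker vote */
        double score = (out_a[k] < out_b[k]) ? out_a[k] : out_b[k];

        /* max step: keep the class with the highest minimum so far */
        if (k == 0 || score > best_score) {
            best_class = k;
            best_score = score;
        }
    }
    return best_class;
}
```

With bipolar outputs, a class that one classifier rates highly but the other rejects gets a low combined score, so the rule favors classes on which both classifiers agree.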

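
Returning to the LCD1602 instruction set in Chapter 4, Section 5: the instruction bytes can be assembled from the bit fields in the instruction table. The sketch below assumes an HD44780-compatible controller, so DL high is taken to mean the 8-bit bus. The function names are hypothetical helpers, not part of the thesis source code, and the use of 40H as the start of the second display line in the usage note is a common LCD1602 convention that the thesis text does not state.

```c
#include <stdint.h>

/* Instruction 3, entry mode set: D7..D0 = 0 0 0 0 0 1 I/D S */
uint8_t lcd_entry_mode(int cursor_moves_right, int shift_display)
{
    return (uint8_t)(0x04 | (cursor_moves_right ? 0x02 : 0x00)
                          | (shift_display ? 0x01 : 0x00));
}

/* Instruction 4, display on/off control: D7..D0 = 0 0 0 0 1 D C B */
uint8_t lcd_display_control(int display_on, int cursor_on, int blink_on)
{
    return (uint8_t)(0x08 | (display_on ? 0x04 : 0x00)
                          | (cursor_on ? 0x02 : 0x00)
                          | (blink_on ? 0x01 : 0x00));
}

/* Instruction 6, function set: D7..D0 = 0 0 1 DL N F * *
 * (assumption: DL = 1 selects the 8-bit bus, as on HD44780-compatible parts) */
uint8_t lcd_function_set(int eight_bit_bus, int two_lines, int font_5x10)
{
    return (uint8_t)(0x20 | (eight_bit_bus ? 0x10 : 0x00)
                          | (two_lines ? 0x08 : 0x00)
                          | (font_5x10 ? 0x04 : 0x00));
}

/* Instruction 8, set DDRAM address: D7 = 1, D6..D0 = 7-bit address */
uint8_t lcd_set_ddram_addr(uint8_t addr)
{
    return (uint8_t)(0x80 | (addr & 0x7F));
}
```

For example, `lcd_function_set(1, 1, 0)` yields 38H (8-bit bus, two lines, 5x7 font), `lcd_display_control(1, 0, 0)` yields 0CH, `lcd_entry_mode(1, 0)` yields 06H, and `lcd_set_ddram_addr(0x40)` yields C0H. Each byte would then be sent as a "write instruction" cycle (RS low, R/W low, E pulsed high) as in the timing table.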