1 GUI-Based Audio Acquisition and Processing System

Note: this experiment recognises the isolated characters 东 北 大 学 中 荷 学 院.

The first step is to build the GUI: drag the required controls onto the layout, double-click each control, and edit its properties. The main ones are String and Tag; the Tag is what the callback function is named after. Other properties such as Value and Style also matter and should not be overlooked in practice.

One point needs explaining. The buttons shown in the figure all sit inside one button group and are child controls of that group, so the callback is added on the group rather than on the individual buttons. Right-click the border around the three buttons and choose View Callbacks, SelectionChangeFcn. The main file then shows the callback for the buttons:

    function uipanel1_SelectionChangeFcn(hObject, eventdata, handles)

The code is explained using the first button, Record, as the example, followed by the code for Play and Save. The complete callback is listed under record in the appendix.

That is all the code for speech acquisition. Running the program brings up the interface. Click Record; when the recording ends, the corresponding waveform is displayed. Click Save to store the sound in wav format. This completes the acquisition.

2 Sound Processing and Recognition

2.1 Opening a file

Speech processing starts by opening a file with the wav extension. This window uses independent push buttons rather than a button group. The callback of the Open button is:

    function pushbutton1_Callback(hObject, eventdata, handles)

where pushbutton1 is the Tag of the Open button. The code added under this callback is listed under processing in the appendix. The figure shows the result.

2.2 Pre-processing

The callback is:

    function pushbutton2_Callback(hObject, eventdata, handles)

The figure shows the result.

2.3 Short-time energy

The callback for short-time energy is:

    function pushbutton3_Callback(hObject, eventdata, handles)

2.4 Endpoint detection

One point first: so that a later callback never fails because a variable computed earlier is unavailable, every later callback repeats the steps of the earlier ones. This obviously makes the program long-winded and is worth revising later.

    function pushbutton4_Callback(hObject, eventdata, handles)

2.5 Generating templates

The parts that repeat the code above are omitted; only the added code is shown.

2.6 Speech recognition

The opened speech file is matched against the pre-recorded speech library using the DTW algorithm. When recognition finishes, the recognised character is displayed in the corresponding text box. The figures show the program before and after a run, and the overall appearance of the GUI.

Summary

The experiment recognises the characters 东 北 大 学 中 荷 学 院, provided that the template recordings themselves are used as the test samples against the speech library. Under that condition the accuracy reaches [...], which shows that the algorithm is correct and only needs optimising.

When a live recording is matched against the templates, a high accuracy cannot be guaranteed. This shows that feature extraction is not yet good enough. The guiding principle of feature extraction is to make the within-class distance as small as possible and the between-class distance as large as possible; this remains to be improved.

The program also needs optimising: generate the template library once, then match the speech under test against it. Keeping the template library separate avoids regenerating it for every test and speeds up the computation. Given the opportunity, continuous speech recognition could be implemented later.

Appendix

These are all the code files. The file mfcc.mat is generated while the program runs, and the test folder holds the recorded templates. The six M-files follow.

1 WienerScalart96.m

    function output = WienerScalart96(signal, fs, IS)
    % output = WIENERSCALART96(signal, fs, IS)
    % Wiener filter based on tracking a priori SNR using the Decision-Directed
    % method proposed by Scalart et al 96. In this method it is assumed that
    % SNRpost = SNRprior + 1. Based on this the Wiener filter can be adapted to
    % a model like Ephraim's model, in which we have a gain function which is a
    % function of a priori SNR, and a priori SNR is being tracked using the
    % Decision-Directed method.
    % Author: Esfandiar Zavarehei
    % Created: MAR-05

    if (nargin < 3 | isstruct(IS))
        IS = .25;              % initial silence or noise-only part, in seconds
    end
    W = fix(.025 * fs);        % window length is 25 ms
    SP = .4;                   % shift percentage is 40% (10 ms)
    wnd = hamming(W);

    % IGNORE FROM HERE ...
    if (nargin >= 3 & isstruct(IS))   % compatibility with another programme
        W = IS.windowsize;
        SP = IS.shiftsize / W;
        % nfft = IS.nfft;
        wnd = IS.window;
        if isfield(IS, 'IS')
            IS = IS.IS;
        else
            IS = .25;
        end
    end
    % ... UP TO HERE

    pre_emph = 0;
    signal = filter([1 -pre_emph], 1, signal);

    NIS = fix((IS * fs - W) / (SP * W) + 1);   % number of initial silence segments

    y = segment(signal, W, SP, wnd);           % chop the signal into frames
    Y = fft(y);
    YPhase = angle(Y(1:fix(end/2)+1, :));      % noisy speech phase
    Y = abs(Y(1:fix(end/2)+1, :));             % spectrogram
    numberOfFrames = size(Y, 2);
    FreqResol = size(Y, 1);

    N = mean(Y(:, 1:NIS)')';                   % initial noise power spectrum mean
    LambdaD = mean((Y(:, 1:NIS)').^2)';        % initial noise power spectrum variance
    alpha = .99;        % used in smoothing xi (Decision-Directed estimation of a priori SNR)
    NoiseCounter = 0;
    NoiseLength = 9;    % smoothing factor for the noise updating

    G = ones(size(N));  % initial gain used in calculation of the new xi
    Gamma = G;

    X = zeros(size(Y)); % initialize X (memory allocation)

    h = waitbar(0, 'Wait...');

    for i = 1:numberOfFrames
        % VAD and noise estimation START
        if i <= NIS     % if initial silence, ignore VAD
            SpeechFlag = 0;
            NoiseCounter = 100;
        else            % else do VAD (magnitude spectrum distance)
            [NoiseFlag, SpeechFlag, NoiseCounter, Dist] = vad(Y(:, i), N, NoiseCounter);
        end

        if SpeechFlag == 0   % if not speech, update noise parameters
            N = (NoiseLength * N + Y(:, i)) / (NoiseLength + 1);                   % update and smooth noise mean
            LambdaD = (NoiseLength * LambdaD + (Y(:, i).^2)) ./ (1 + NoiseLength); % update and smooth noise variance
        end
        % VAD and noise estimation END

        gammaNew = (Y(:, i).^2) ./ LambdaD;    % a posteriori SNR
        xi = alpha * (G.^2) .* Gamma + (1 - alpha) .* max(gammaNew - 1, 0);  % Decision-Directed a priori SNR

        Gamma = gammaNew;
        G = (xi ./ (xi + 1));

        X(:, i) = G .* Y(:, i);                % obtain the new cleaned value

        waitbar(i / numberOfFrames, h, num2str(fix(100 * i / numberOfFrames)));
    end
    close(h);

    output = OverlapAdd2(X, YPhase, W, SP * W);    % overlap-add synthesis of speech
    output = filter(1, [1 -pre_emph], output);     % undo the effect of pre-emphasis


    function ReconstructedSignal = OverlapAdd2(XNEW, yphase, windowLen, ShiftLen)
    % Y = OverlapAdd(X, A, W, S)
    % Y is the signal reconstructed from its spectrogram. X is a matrix with
    % each column being the fft of a segment of signal. A is the phase angle of
    % the spectrum, which should have the same dimension as X. If it is not
    % given, the phase angle of X is used, which in the case of real values is
    % zero (assuming that it is the magnitude). W is the window length of the
    % time domain segments; if not given, the length is assumed to be twice as
    % long as the fft window length. S is the shift length of the segmentation
    % process (for example, in the case of non-overlapping signals it is equal
    % to W and in the case of 50% overlap it is equal to W/2). If not given,
    % W/2 is used. Y is the reconstructed time domain signal.
    % Sep-04
    % Esfandiar Zavarehei

    if nargin < 2
        yphase = angle(XNEW);
    end
    if nargin < 3
        windowLen = size(XNEW, 1) * 2;
    end
    if nargin < 4
        ShiftLen = windowLen / 2;
    end
    if fix(ShiftLen) ~= ShiftLen
        ShiftLen = fix(ShiftLen);
        disp('The shift length has to be an integer as it is the number of samples.')
        disp(['shift length is fixed to ' num2str(ShiftLen)])
    end

    [FreqRes, FrameNum] = size(XNEW);

    Spec = XNEW .* exp(j * yphase);

    if mod(windowLen, 2)    % if FreqResol is odd
        Spec = [Spec; flipud(conj(Spec(2:end, :)))];
    else
        Spec = [Spec; flipud(conj(Spec(2:end-1, :)))];
    end
    sig = zeros((FrameNum - 1) * ShiftLen + windowLen, 1);
    weight = sig;
    for i = 1:FrameNum
        start = (i - 1) * ShiftLen + 1;
        spec = Spec(:, i);
        sig(start:start+windowLen-1) = sig(start:start+windowLen-1) + real(ifft(spec, windowLen));
    end
    ReconstructedSignal = sig;


    function Seg = segment(signal, W, SP, Window)
    % SEGMENT chops a signal to overlapping windowed segments
    % A = SEGMENT(X, W, SP, WIN) returns a matrix whose columns are segmented
    % and windowed frames of the input one-dimensional signal X. W is the
    % number of samples per window, default value W = 256. SP is the shift
    % percentage, default value SP = 0.4. WIN is the window that is multiplied
    % by each segment and its length should be W. The default window is the
    % hamming window.
    % 06-Sep-04
    % Esfandiar Zavarehei

    if nargin < 3
        SP = .4;
    end
    if nargin < 2
        W = 256;
    end
    if nargin < 4
        Window = hamming(W);
    end
    Window = Window(:);     % make it a column vector

    L = length(signal);
    SP = fix(W .* SP);
    N = fix((L - W) / SP + 1);   % number of segments

    Index = (repmat(1:W, N, 1) + repmat((0:(N-1))' * SP, 1, W))';
    hw = repmat(Window, 1, N);
    Seg = signal(Index) .* hw;


    function [NoiseFlag, SpeechFlag, NoiseCounter, Dist] = vad(signal, noise, NoiseCounter, NoiseMargin, Hangover)
    % [NOISEFLAG, SPEECHFLAG, NOISECOUNTER, DIST] = vad(SIGNAL, NOISE, NOISECOUNTER, NOISEMARGIN, HANGOVER)
    % Spectral Distance Voice Activity Detector
    % SIGNAL is the current frame's magnitude spectrum, which is to be labelled
    % as noise or speech. NOISE is the noise magnitude spectrum template
    % (estimation). NOISECOUNTER is the number of immediate previous noise
    % frames. NOISEMARGIN (default 3) is the spectral distance threshold.
    % HANGOVER (default 8) is the number of noise segments after which the
    % SPEECHFLAG is reset (goes to zero). NOISEFLAG is set to one if the
    % segment is labelled as noise. NOISECOUNTER returns the number of previous
    % noise segments; this value is reset (to zero) whenever a speech segment
    % is detected. DIST is the spectral distance.
    % Saeed Vaseghi
    % edited by Esfandiar Zavarehei
    % Sep-04

    if nargin < 4
        NoiseMargin = 3;
    end
    if nargin < 5
        Hangover = 8;
    end
    if nargin < 3
        NoiseCounter = 0;
    end

    FreqResol = length(signal);

    SpectralDist = 20 * (log10(signal) - log10(noise));
    SpectralDist(find(SpectralDist < 0)) = 0;

    Dist = mean(SpectralDist);
    if (Dist < NoiseMargin)
        NoiseFlag = 1;
        NoiseCounter = NoiseCounter + 1;
    else
        NoiseFlag = 0;
        NoiseCounter = 0;
    end

    % Detect noise-only periods
    if (NoiseCounter > Hangover)
        SpeechFlag = 0;
    else
        SpeechFlag = 1;
    end

2 mfcc.m

    function cc = mfcc(k)
    % cc = mfcc(k) computes the MFCC coefficients of the speech signal k.
    % M is the number of filters, N is the number of samples per frame.
    M = 24;
    N = 256;
    % Normalised mel filter bank coefficients
    bank = melbankm(M, N, 22050, 0, 0.5, 'm');
    figure;
    plot(linspace(0, N/2, 129), bank);
    title('Mel-Spaced Filterbank');
    xlabel('Frequency [Hz]');
    bank = full(bank);
    bank = bank / max(bank(:));
    % DCT coefficients, 12 x 24
    for i = 1:12
        j = 0:23;
        dctcoef(i, :) = cos((2*j + 1) * i * pi / (2*24));
    end
    % Normalised cepstral liftering window
    w = 1 + 6 * sin(pi * [1:12] ./ 12);
    w = w / max(w);
    % Pre-emphasis
    AggrK = double(k);
    AggrK = filter([1, -0.9375], 1, AggrK);
    % Framing
    FrameK = enframe(AggrK, N, 80);
    % Windowing
    for i = 1:size(FrameK, 1)
        FrameK(i, :) = (FrameK(i, :))' .* hamming(N);
    end
    FrameK = FrameK';
    % Power spectrum
    S = (abs(fft(FrameK))).^2;
    disp('Displaying the power spectrum')
    figure;
    plot(S);
    axis([1, size(S, 1), 0, 2]);
    title('Power Spectrum (M=24, N=256)');
    xlabel('Frame');
    ylabel('Frequency [Hz]');
    colorbar;
    % Pass the power spectrum through the filter bank
    P = bank * S(1:129, :);
    % Take the logarithm, then the discrete cosine transform
    D = dctcoef * log(P);
    % Cepstral liftering
    for i = 1:size(D, 2)
        m(i, :) = (D(:, i) .* w')';
    end
    % Delta coefficients
    dtm = zeros(size(m));
    for i = 3:size(m, 1) - 2
        dtm(i, :) = -2*m(i-2, :) - m(i-1, :) + m(i+1, :) + 2*m(i+2, :);
    end
    dtm = dtm / 3;
    % Combine the MFCC parameters with the first-order delta MFCC parameters
    cc = [m, dtm];
    % Drop the first and last two frames, whose delta parameters are 0
    cc = cc(3:size(m, 1) - 2, :);

3 getpoint.m

    function [StartPoint, EndPoint] = getpoint(k, fs)
    % GETPOINT  Endpoint detection on the Wiener-filtered signal.
    signal = WienerScalart96(k, fs);
    sigLength = length(signal);                     % signal length
    t = (0:sigLength-1) / fs;                       % time axis of the signal
    FrameLen = round(0.012 / max(t) * sigLength);   % frame length (12 ms)
    FrameInc = round(FrameLen / 3);                 % frame shift, chosen as 1/3 to 1/2 of the frame length
    tmp = enframe(signal(1:end), FrameLen, FrameInc);
    signal = signal / max(abs(signal));
    signal = double(signal);
    signal = filter([1 -0.9735], 1, signal);
    tmp1 = enframe(signal(1:end-1), FrameLen, FrameInc);
    tmp2 = enframe(signal(2:end), FrameLen, FrameInc);   % call the framing function
    Framesize = size(tmp1);
    window(1:Framesize(1), 1:Framesize(2)) = 0;
    a = hamming(Framesize(2));                      % window the signal with a hamming window
    for i = 1:Framesize(1)
        window(i, 1:Framesize(2)) = a;
    end
    tmp1 = tmp1 .* window;                          % windowed versions of tmp1, tmp2, tmp
    tmp2 = tmp2 .* window;
    tmp = tmp .* window;
    signs = (tmp1 .* tmp2) < 0;
    diffs = (tmp1 - tmp2) > 0.02;
    zcr = sum(signs .* diffs, 2) / FrameLen;        % zcr holds the zero-crossing rate
    FrameNB = Framesize(1);                         % number of frames
    clear tmp1 tmp2 signs diffs a window Framesize; % clear unused variables

    % Short-time magnitude of the speech signal
    amp = sum(abs(tmp), 2);

    % Endpoint detection: define the variables
    amp1 = 6;
    amp2 = 2;           % upper and lower magnitude thresholds
    maxsilence = 5;     % maximum number of silent frames (5 frames of 12 ms each)
    minlen = 15;        % minimum speech length (15 frames of 12 ms each)
    status = 0;         % initial state: 0 silence, 1 speech, 2 finished;
                        % this algorithm ignores the transition segment
    count = 0;          % speech length in frames

    % Mean magnitude and mean zero-crossing rate of the first and last 5
    % frames, which are assumed not to contain useful signal
    a = mean(amp(1:5)) + mean(amp(FrameNB-4:FrameNB));
    b = mean(zcr(1:5)) + mean(zcr(FrameNB-4:FrameNB));
    % Correct the zero-crossing rate and the magnitude
    amp = abs(amp - a);
    zcr = abs(zcr - b);
    % Set the two energy thresholds: amp1 is the high one, amp2 the low one
    amp1 = min(amp1, max(amp) / 4);
    amp2 = min(amp2, max(amp) / 8);
    zcr1 = 0.001;       % zero-crossing rate threshold

    for i = 6+maxsilence:FrameNB
        switch status
            case 0      % the signal is in the silence segment
                if amp(i) > amp1    % frame energy above the high threshold: speech has started
                    x1 = i;
                    count = count + 1;
                    for j = i-1:-1:6    % search backwards for the exact start point
                        if zcr(j) > zcr1
                            count = count + 1;
                        else
                            break;
                        end
                    end
                    status = 1;
                end
            case 1      % the signal is in the speech segment
                if zcr(i) > zcr1
                    count = count + 5;
                else
                    for j = i+1:i+4     % search forwards for the exact end point
                        if zcr(j) > zcr1
                            count = count + 1;
                        else
                            break;
                        end
                    end
                    if count < minlen   % shorter than the minimum speech length:
                                        % treat as noise and restart the search
                        status = 0;
                        count = 0;
                        x1 = 0;
                        x2 = 0;
                    else
                        status = 2;     % valid speech: enter the finished state
                    end
                end
            case 2
                break;
        end
    end
    % x2 is assigned only in the reset branch of this listing
    StartPoint = x1;
    EndPoint = x2;

4 dtw.m

    function dist = dtw(test, ref)
    global x y_min y_max
    global t r
    global D d
    global m n

    t = test;
    r = ref;
    n = size(t, 1);
    m = size(r, 1);

    d = zeros(m, 1);
    D = ones(m, 1) * realmax;
    D(1) = 0;

    % If the two templates differ too much in length, matching fails
    if (2*m - n < 3) | (2*n - m < 2)
        dist = realmax;
        return
    end

    % Compute the matching region
    xa = round((2*m - n) / 3);
    xb = round((2*n - m) * 2 / 3);

    if xb > xa
        % xb > xa: match over three regions
        %   1    : xa
        %   xa+1 : xb
        %   xb+1 : N
        for x = 1:xa
            y_max = 2*x;
            y_min = round(0.5*x);
            warp
        end
        for x = (xa+1):xb
            y_max = round(0.5*(x - n) + m);
            y_min = round(0.5*x);
            warp
        end
        for x = (xb+1):n
            y_max = round(0.5*(x - n) + m);
            y_min = round(2*(x - n) + m);
            warp
        end
    elseif xa > xb
        % xa > xb: match over three regions
        %   0    : xb
        %   xb+1 : xa
        %   xa+1 : N
        for x = 1:xb
            y_max = 2*x;
            y_min = round(0.5*x);
            warp
        end
        for x = (xb+1):xa
            y_max = 2*x;
            y_min = round(2*(x - n) + m);
            warp
        end
        for x = (xa+1):n
            y_max = round(0.5*(x - n) + m);
            y_min = round(2*(x - n) + m);
            warp
        end
    elseif xa == xb
        % xa == xb: match over two regions
        %   0    : xa
        %   xa+1 : N
        for x = 1:xa
            y_max = 2*x;
            y_min = round(0.5*x);
            warp
        end
        for x = (xa+1):n
            y_max = round(0.5*(x - n) + m);
            y_min = round(2*(x - n) + m);
            warp
        end
    end

    % Return the matching score
    dist = D(m);


    function warp
    global x y_min y_max
    global t r
    global D d
    global m n

    d = D;
    for y = y_min:y_max
        D1 = D(y);
        if y > 1
            D2 = D(y-1);
        else
            D2 = realmax;
        end
        if y > 2
            D3 = D(y-2);
        else
            D3 = realmax;
        end
        d(y) = sum((t(x, :) - r(y, :)).^2) + min([D1, D2, D3]);
    end
    D = d;

5 record

    function varargout = record(varargin)
    % RECORD MATLAB code for record.fig
    %      RECORD, by itself, creates a new RECORD or raises the existing
    %      singleton*.
    %
    %      H = RECORD returns the handle to a new RECORD or the handle to
    %      the existing singleton*.
    %
    %      RECORD('CALLBACK',hObject,eventData,handles,...) calls the local
    %      function named CALLBACK in RECORD.M with the given input arguments.
    %
    %      RECORD('Property','Value',...) creates a new RECORD or raises the
    %      existing singleton*. Starting from the left, property value pairs are
    %      applied to the GUI before record_OpeningFcn gets called. An
    %      unrecognized property name or invalid value makes property application
    %      stop. All inputs are passed to record_OpeningFcn via varargin.
    %
    %      *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
    %      instance to run (singleton)".
    %
    % See also: GUIDE, GUIDATA, GUIHANDLES

    % Edit the above text to modify the response to help record

    % Last Modified by GUIDE v2.5 01-Nov-2014 19:16:55

    % Begin initialization code - DO NOT EDIT
    gui_Singleton = 1;
    gui_State = struct('gui_Name',       mfilename, ...
                       'gui_Singleton',  gui_Singleton, ...
                       'gui_OpeningFcn', @record_OpeningFcn, ...
                       'gui_OutputFcn',  @record_OutputFcn, ...
                       'gui_LayoutFcn',  [], ...
                       'gui_Callback',   []);
    if nargin && ischar(varargin{1})
        gui_State.gui_Callback = str2func(varargin{1});
    end

    if nargout
        [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
    else
        gui_mainfcn(gui_State, varargin{:});
    end
    % End initialization code - DO NOT EDIT


    % --- Executes just before record is made visible.
    function record_OpeningFcn(hObject, eventdata, handles, varargin)
    % This function has no output args, see OutputFcn.
    % hObject    handle to figure
    % eventdata  reserved - to be defined in a future version of MATLAB
    % handles    structure with handles and user data (see GUIDATA)
    % varargin   command line arguments to record (see VARARGIN)

    % Choose default command line output for record
    handles.output = hObject;

    % Update handles structure
    guidata(hObject, handles);

    % UIWAIT makes record wait for user response (see UIRESUME)
    % uiwait(handles.figure1);


    % --- Outputs from this function are returned to the command line.
    function varargout = record_OutputFcn(hObject, eventdata, handles)
    % varargout  cell array for returning output args (see VARARGOUT);
    % hObject    handle to figure
    % eventdata  reserved - to be defined in a future version of MATLAB
    % handles    structure with handles and user data (see GUIDATA)

    % Get default command line output from handles structure
    varargout{1} = handles.output;


    % --- Executes when selected object is changed in uipanel1.
    function uipanel1_SelectionChangeFcn(hObject, eventdata, handles)
    % hObject    handle to the selected object in uipanel1
    % eventdata  structure with the following fields (see UIBUTTONGROUP)
    %   EventName: string 'SelectionChanged' (read only)
    %   OldValue: handle of the previously selected object or empty if none was selected
    %   NewValue: handle of the currently selected object
    % handles    structure with handles and user data (see GUIDATA)
    switch get(hObject, 'tag')      % use the tag to decide which button was pressed
        case 'radiobutton1'
            fs = 22050;             % sampling frequency
            duration = 1;           % recording time in seconds
            fprintf('Recording...');
            y = wavrecord(duration * fs, fs);   % duration*fs is the number of recorded samples
            handles.y = y;
            guidata(hObject, handles);          % these two lines make y available to other callbacks
            axes(handles.axes1);
            cla reset;
            box on;
            set(gca, 'XTickLabel', [], 'YTickLabel', []);
            axes(handles.axes1);
            plot(y);
            fprintf('Recording finished\n');
        case 'radiobutton2'
            y = handles.y;          % stored data is retrieved in this form
            fs = 22050;
            fprintf('Playing...');
            wavplay(y, fs);
            fprintf('Playback finished\n');
        case 'radiobutton3'
            [Filename, Pathname] = uiputfile('*.wav', 'Save the recorded sound');
            fullpath = strcat(Pathname, Filename);
            y = handles.y;
            fs = 22050;
            nbits = 16;             % 16 bits per sample
            wavwrite(y, fs, nbits, fullpath);
            fprintf('Saved to the current working directory');
    end

6 processing

    function varargout = processing(varargin)
    % PROCESSING M-file for processing.fig
    %      PROCESSING, by itself, creates a new PROCESSING or raises the existing
    %      singleton*.
    %
    %      H = PROCESSING returns the handle to a new PROCESSING or the handle to
    %      the existing singleton*.
    %
    %      PROCESSING('Property','Value',...) creates a new PROCESSING using the
    %      given property value pairs. Unrecognized properties are passed via
    %      varargin to processing_OpeningFcn. This calling syntax produces a
    %      warning when there is an existing singleton*.
    %
    %      PROCESSING('CALLBACK') and PROCESSING('CALLBACK',hObject,...) call the
    %      local function named CALLBACK in PROCESSING.M with the given input
    %      arguments.
    %
    %      *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
    %      instance to run (singleton)".
    %
    % See also: GUIDE, GUIDATA, GUIHANDLES

    % Edit the above text to modify the response to help processing

    % Last Modified by GUIDE v2.5 19-Nov-2014 20:49:02

    % Begin initialization code - DO NOT EDIT
    gui_Singleton = 1;
    gui_State = struct('gui_Name',       mfilename, ...
                       'gui_Singleton',  gui_Singleton, ...
                       'gui_OpeningFcn', @processing_OpeningFcn, ...
                       'gui_OutputFcn',  @processing_OutputFcn, ...
                       'gui_LayoutFcn',  [], ...
                       'gui_Callback',   []);
    if nargin && ischar(varargin{1})
        gui_State.gui_Callback = str2func(varargin{1});
    end

    if nargout
        [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
    else
        gui_mainfcn(gui_State, varargin{:});
    end
    % End initialization code - DO NOT EDIT


    % --- Executes just before processing is made visible.
    function processing_OpeningFcn(hObject, eventdata, handles, varargin)
    % This function has no output args, see OutputFcn.
    % hObject    handle to figure
    % eventdata  reserved - to be defined in a future version of MATLAB
    % handles    structure with handles and user data (see GUIDATA)
    % varargin   unrecognized PropertyName/PropertyValue pairs from the
    %            command line (see VARARGIN)

    % Choose default command line output for processing
    handles.output = hObject;

    % Update handles structure
    guidata(hObject, handles);

    % UIWAIT makes processing wait for user response (see UIRESUME)
    % uiwait(handles.figure1);


    % --- Outputs from this function are returned to the command line.
    function varargout = processing_OutputFcn(hObject, eventdata, handles)
    % varargout  cell array for returning output args (see VARARGOUT);
    % hObject    handle to figure
    % eventdata  reserved - to be defined in a future version of MATLAB
    % handles    structure with handles and user data (see GUIDATA)

    % Get default command line output from handles structure
    varargout{1} = handles.output;


    % --- Executes when selected object is changed in uipanel1.
    function uipanel1_SelectionChangeFcn(hObject, eventdata, handles)
    % hObject    handle to the selected object in uipanel1
    % eventdata  structure with the following fields (see UIBUTTONGROUP)
    %   EventName: string 'SelectionChanged' (read only)
    %   OldValue: handle of the previously selected object or empty if none was selected
    %   NewValue: handle of the currently selected object
    % handles    structure with handles and user data (see GUIDATA)


    % --- Executes on button press in pushbutton1.
    function pushbutton1_Callback(hObject, eventdata, handles)
    % hObject    handle to pushbutton1 (see GCBO)
    % eventdata  reserved - to be defined in a future version of MATLAB
    % handles    structure with handles and user data (see GUIDATA)

    % Open the speech file to be processed
    [filename, pathname] = uigetfile({'*.wav', 'All Wave Files'}, 'Select a speech file');
    if isequal(filename, 0)
        return
    end
    % Open the selected file
    file = fullfile(pathname, filename);
    [signal, fs, Bits] = wavread(file);     % read the signal, sampling rate and bit depth
    handles.signal = signal;
    handles.fs = fs;
    handles.Bits = Bits;
    guidata(hObject, handles);              % store them in handles for later callbacks
    axes(handles.axes1);
    cla reset;
    box on;
    set(gca, 'XTickLabel', [], 'YTickLabel', []);
    axes(handles.axes1);
    plot(signal);
    title('Original speech signal');        % show the waveform
    wavplay(signal, fs);                    % play the recording


    % --- Executes on button press in pushbutton2.
    function pushbutton2_Callback(hObject, eventdata, handles)
    % hObject    handle to pushbutton2 (see GCBO)
    % eventdata  reserved - to be defined in a future version of MATLAB
    % handles    structure with handles and user data (see GUIDATA)
    signal = handles.signal;
    fs = handles.fs;
    Bits = handles.Bits;                    % retrieve the signal, sampling rate and bit depth (the standard pattern)
    sigLength = length(signal);             % signal length
    t = (0:sigLength-1) / fs;               % time axis of the signal
    signal = signal / max(abs(signal));
    x = double(signal);
    x = filter([1 -0.9735], 1, x);
    axes(handles.axes2);
    cla reset;
    box on;
    % set ... (the listing is truncated at this point)
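The pre-processing and short-time energy steps of sections 2.2 and 2.3 can be illustrated without any toolbox. The following is a minimal sketch, not the report's own callback code: it assumes a synthetic test signal (half a second of silence followed by a 440 Hz tone), builds the frames by index arithmetic instead of calling enframe, and computes the Hamming window from its formula instead of calling hamming. The frame length 256 and shift 80 are the values passed to enframe in mfcc.m; all variable names are chosen for illustration.

```matlab
% Sketch: pre-emphasis, framing, windowing and short-time energy.
fs = 22050;                                   % sampling rate used in the report
t  = (0:fs/2-1)' / fs;                        % half a second of sample times
x  = [zeros(fs/2, 1); sin(2*pi*440*t)];       % assumed test signal: silence, then a tone

x = x / max(abs(x));                          % normalise the amplitude
x = filter([1 -0.9375], 1, x);                % pre-emphasis: y(n) = x(n) - 0.9375*x(n-1)

frameLen = 256;                               % samples per frame
frameInc = 80;                                % frame shift in samples
numFrames = floor((length(x) - frameLen) / frameInc) + 1;

% Row i of idx holds the sample indices of frame i.
idx = bsxfun(@plus, (0:numFrames-1)' * frameInc, 1:frameLen);
frames = x(idx);                              % numFrames-by-frameLen matrix

% Hamming window as a row vector, applied to every frame.
win = 0.54 - 0.46 * cos(2*pi*(0:frameLen-1) / (frameLen-1));
frames = bsxfun(@times, frames, win);

energy = sum(frames.^2, 2);                   % short-time energy, one value per frame
```

Plotting energy against the frame index shows a flat zero section for the silence followed by a plateau for the tone, which is the contrast that endpoint detection relies on.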
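The endpoint detection of section 2.4 is a small state machine driven by a high and a low threshold. The sketch below is a simplified, energy-only variant written for illustration; it is not the report's getpoint function, which also uses the zero-crossing rate. The function name endpoints_sketch and its parameter names are hypothetical. Speech starts when the magnitude exceeds the high threshold, continues while it stays above the low threshold, and ends after more than maxSilence consecutive quiet frames; bursts shorter than minLen frames are discarded as noise.

```matlab
function [startFrame, endFrame] = endpoints_sketch(amp, high, low, maxSilence, minLen)
% ENDPOINTS_SKETCH  Hypothetical helper: double-threshold endpoint detection
% on a vector of per-frame magnitudes. Returns 0, 0 if no speech is found.
startFrame = 0;
endFrame = 0;
count = 0;          % frames above the low threshold since speech started
silence = 0;        % consecutive quiet frames inside the speech segment
inSpeech = false;
for i = 1:numel(amp)
    if ~inSpeech
        if amp(i) > high                % confident start of speech
            inSpeech = true;
            startFrame = i;
            count = 1;
            silence = 0;
        end
    elseif amp(i) > low                 % speech continues
        count = count + 1;
        silence = 0;
    else                                % quiet frame inside speech
        silence = silence + 1;
        if silence > maxSilence
            if count < minLen           % too short: treat as noise and restart
                inSpeech = false;
                startFrame = 0;
                count = 0;
                silence = 0;
            else                        % long enough: last loud frame is the end
                endFrame = i - silence;
                return
            end
        end
    end
end
if inSpeech && count >= minLen          % speech ran to the end of the signal
    endFrame = numel(amp) - silence;
else
    startFrame = 0;
end
end
```

For example, endpoints_sketch([zeros(1,10), 5*ones(1,20), zeros(1,10)], 2, 1, 5, 15) returns the frame pair 11 and 30, while a three-frame burst with the same thresholds returns 0 and 0.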
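The DTW matching of section 2.6 can be reduced to its recurrence for study. The sketch below keeps the same local path rule as the warp routine in dtw.m, where each test frame may stay on the same reference frame or advance by one or two, but it drops the parallelogram search region and the global variables. It also assumes that the path starts at the first frame of both sequences, which is stricter than the listing in dtw.m. The name dtw_sketch is hypothetical.

```matlab
function dist = dtw_sketch(test, ref)
% DTW_SKETCH  Hypothetical helper: accumulated squared-Euclidean DTW distance
% between two feature matrices with one frame per row.
n = size(test, 1);
m = size(ref, 1);
D = inf(m, 1);                                  % accumulated cost for the previous test frame
D(1) = sum((test(1, :) - ref(1, :)).^2);        % assumption: the path starts at (1,1)
for x = 2:n
    d = inf(m, 1);
    for y = 1:m
        best = D(y);                            % stay on the same reference frame
        if y > 1
            best = min(best, D(y - 1));         % advance by one reference frame
        end
        if y > 2
            best = min(best, D(y - 2));         % advance by two reference frames
        end
        d(y) = sum((test(x, :) - ref(y, :)).^2) + best;
    end
    D = d;
end
dist = D(m);                                    % the path must end on the last reference frame
end
```

Recognition then amounts to calling the function once per template and picking the template with the smallest distance. Because there is no search region, the cost grows with the product of the two sequence lengths, which is the price of the simpler code.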
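The summary proposes building the template library once instead of regenerating it for every test. One way to do that, sketched here under stated assumptions, is to cache the extracted features in a MAT-file and load them on later runs. The file name template_library.mat, the stand-in feature function and the random stand-in signals are all hypothetical; in the real program the feature function would be the report's mfcc and the signals would come from the test folder.

```matlab
% Sketch: build the template library only if no cached copy exists.
libFile = fullfile(tempdir, 'template_library.mat');    % hypothetical cache file
featureFcn = @(sig) sig(:);                             % stand-in for the report's mfcc()

if exist(libFile, 'file') ~= 2
    rawTemplates = {rand(100, 1), rand(120, 1)};        % stand-in for the recorded templates
    templates = cellfun(featureFcn, rawTemplates, 'UniformOutput', false);
    save(libFile, 'templates');                         % write the library once
end

S = load(libFile, 'templates');                         % every later run starts here
templates = S.templates;
```

Deleting the cache file forces a rebuild, which is needed whenever the recordings or the feature extraction change.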
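The listings record, play, read and write audio with wavrecord, wavplay, wavread and wavwrite, which recent MATLAB releases no longer provide. The sketch below shows the same record, play, save and reload sequence with audiorecorder, sound, audiowrite and audioread. It assumes a working audio input device and uses a temporary file name chosen for illustration; the sampling rate, duration and bit depth are the values from the record callback.

```matlab
% Sketch: record, play, save and reload one second of audio.
fs = 22050;                                   % sampling frequency
nbits = 16;                                   % bits per sample
duration = 1;                                 % recording time in seconds

rec = audiorecorder(fs, nbits, 1);            % mono recorder (needs an input device)
recordblocking(rec, duration);                % returns when the recording is done
y = getaudiodata(rec);                        % samples as a double column vector

sound(y, fs);                                 % playback

outFile = [tempname '.wav'];                  % hypothetical output path
audiowrite(outFile, y, fs, 'BitsPerSample', nbits);

[y2, fs2] = audioread(outFile);               % replaces [signal, fs, Bits] = wavread(file)
```

audioread does not return the bit depth; where the Bits value is needed, audioinfo(outFile).BitsPerSample supplies it.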