




Appendix 1: Translation of the Foreign-Language Source

Improved Speech Recognition Method for Intelligent Robot

2. Overview of Speech Recognition

Speech recognition has recently received more and more attention because of its theoretical significance and practical value. Up to now, most speech recognition has been based on conventional linear system theory, such as the hidden Markov model (HMM) and dynamic time warping (DTW). As the study of speech recognition has deepened, researchers have found that the speech signal is a complex nonlinear process; if speech recognition research is to achieve a breakthrough, nonlinear system theory must be introduced. With the recent development of nonlinear system theories such as artificial neural networks (ANN), chaos, and fractals, it has become possible to apply these theories to speech recognition. This paper therefore presents the speech recognition process on the basis of neural network theory and chaos and fractal theory.

Speech recognition can be divided into two kinds: speaker-dependent and speaker-independent. In a speaker-dependent system the pronunciation model is trained by a single person; the recognition rate for that person's commands is high, but commands from other people are recognized at a low rate or not at all. In a speaker-independent system the pronunciation model is trained by people of different ages, sexes, and regions, so it can recognize the commands of a whole group. In general, speaker-independent systems are more widely used because the user does not have to carry out any training. Extracting speaker-independent features from the speech signal is therefore a fundamental problem for such a recognition system.

Speech recognition comprises training and recognition, and can be viewed as a pattern recognition task. Typically, the speech signal is regarded as a time sequence characterized by a hidden Markov model. Through feature extraction the speech signal is converted into feature vectors that serve as observations. In the training procedure these observations are fed into the estimation of the HMM model parameters. The parameters include the probability density functions of the observations for their corresponding states, the transition probabilities between states, and so on. After parameter estimation, the trained models can be used for the recognition task. The input signal is recognized as the resulting words, and the accuracy can be evaluated. The whole process is shown in Fig. 1.

Fig. 1 Block diagram of the speech recognition system

3. Theory and Method

Extracting speaker-independent features from the speech signal is a fundamental problem of a speech recognition system. The most popular approaches use linear predictive cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). Both are linear procedures based on the assumption that a speaker's speech characteristics are caused by vocal tract resonances. These features form the basic spectral structure of the speech signal. However, the nonlinear information in speech signals is not easily extracted by current feature extraction methods, so we use the fractal dimension to measure nonlinear speech turbulence. This paper investigates and implements a speech recognition system using both traditional LPCC and nonlinear multiscale fractal dimension feature extraction.

3.1 Linear Predictive Cepstral Coefficients

Linear prediction coefficients (LPC) are a parameter set obtained from linear prediction analysis of speech; they describe the correlation between adjacent speech samples. Linear prediction analysis rests on the following basic idea: a speech sample can be approximated by a linear combination of a number of past samples. Minimizing the sum of squared differences between the real speech samples in a given (short-time) analysis frame and the predicted samples determines a unique set of prediction coefficients.

LPC can be used to estimate the cepstrum of the speech signal; this is a special processing method in short-time cepstral analysis. The system function of the channel model is obtained by linear prediction analysis as in equation (1), where p is the linear prediction order, a_k (k = 1, 2, ..., p) are the prediction coefficients, and the impulse response is denoted h(n). Given the cepstrum of h(n), (1) can be expanded as (2). Substituting (1) into (2) and applying the same operation to both sides turns (2) into (3), which gives equation (4); the cepstrum is then obtained from (5). (Equations (1) to (5) were figures in the source and are not reproduced here.) The cepstral coefficients calculated as in (5) are called LPCC, and n is the LPCC order.

Before extracting the LPCC parameters, the speech signal should undergo pre-emphasis, framing, windowing, and endpoint detection. The endpoint detection of the Chinese command word "forward" is shown in Fig. 2, and the speech waveform of the same word after endpoint detection, together with the LPCC parameter waveform, is shown in Fig. 3.

Fig. 2 Endpoint detection of the Chinese command word "forward"

Fig. 3 Speech waveform and LPCC parameter waveform of the Chinese command word "forward" after endpoint detection

3.2 Speech Fractal Dimension Computation

The fractal dimension is a quantitative value derived from the scaling relation of a fractal, and it is also a measure of the self-similarity of its structure. The measure of a fractal is its fractal dimension [6, 7]. From the viewpoint of measurement, the fractal dimension extends dimension from integers to fractions, removing the restriction that the dimension of a general topological set must be an integer; the fractal dimension, mostly fractional, is an extension of dimension in Euclidean geometry.

There are many definitions of fractal dimension, for example the similarity dimension, Hausdorff dimension, information dimension, correlation dimension, capacity dimension, and box-counting dimension. Of these, the Hausdorff dimension is the oldest and also the most important; it is defined as in [3], where the covering quantity denotes how many units are needed to cover the subset F.

The speech waveform of the Chinese command word "forward" after endpoint detection, together with the fractal dimension waveform, is shown in Fig. 4.

Fig. 4 Speech waveform and fractal dimension waveform of the Chinese command word "forward" after endpoint detection

3.3 Improved Feature Extraction Method

Considering the respective advantages of LPCC and the fractal dimension in expressing the speech signal, we combine the two into one feature. The fractal dimension captures the self-similarity, periodicity, and randomness of the speech time waveform, while the LPCC feature performs well in terms of speech quality and recognition rate.

An artificial neural network is nonlinear, adaptive, and has a strong self-learning ability; these clear advantages, together with its good classification and input-output mapping ability, make it well suited to the speech recognition problem. Because the number of ANN input nodes is fixed, the feature parameters are time-regularized before being fed to the neural network [9]. In our experiments, the LPCC and the fractal dimension of each sample pass through time regularization separately. LPCC is regularized to 4 frames of data (LPCC1, LPCC2, LPCC3, LPCC4, each frame 14-dimensional), and the fractal dimension is regularized to 12 frames of data (FD1, FD2, ..., FD12, each frame 1-dimensional), so that the feature vector of each sample has 4 × 14 + 12 × 1 = 68 dimensions. The ordering is: the first 56 dimensions are LPCC and the remaining 12 are fractal dimensions. Such a mixed feature vector can therefore represent both the linear and the nonlinear characteristics of the speech signal.

Architectures and Features of ASR

Automatic speech recognition (ASR) is a cutting-edge technology that allows a computer, or even a hand-held PDA (Myers, 2000), to identify words that are read aloud or spoken into any sound-recording device. The ultimate purpose of ASR technology is 100% accuracy for all words intelligibly spoken by any person, regardless of vocabulary size, background noise, or speaker variables (CSLU, 2002). However, most ASR engineers admit that the current accuracy level for a large-vocabulary unit of speech is still below 90%. Dragon's Naturally Speaking and IBM's system, for example, show a baseline recognition accuracy of only 60% to 80%, depending on accent, background noise, and manner of speaking (Ehsani & Knodt, 1998). More expensive systems reported to outperform these two include Subarashii (Bernstein, et al., 1999), EduSpeak (Franco, et al., 2001), Phonepass (Hinks, 2001), ISLE Project (Menzel, et al., 2001), and RAD (CSLU, 2003). The accuracy of speech recognition is expected to improve.

Among the several kinds of speech recognition used in ASR products, the hidden Markov model (HMM) is regarded as the dominant algorithm and has been shown to be the most effective for large-vocabulary speech (Ehsani & Knodt, 1998). A detailed account of how HMMs work is beyond the scope of this paper but can be found in any text on language processing; among the best are Jurafsky & Martin (2000) and Hosom, Cole, and Fanty (2003). In short, an HMM computes the probability of a match between the received input signal and a database containing hundreds of recordings of native-speaker phonemes (Hinks, 2003, p. 5). That is, an HMM-based speech recognizer can compute how closely the phonemes of an input utterance match a corresponding probabilistic model. A high score indicates good pronunciation; a low score indicates poor pronunciation (Larocca, et al., 1991).

Although speech recognition has been widely used for purposes such as business dictation and special-needs access, its market presence in language learning has grown sharply in recent years (Aist, 1999; Eskenazi, 1999; Hinks, 2003). Early ASR-based software programs used template-based recognition systems that perform pattern matching with dynamic programming or other time normalization techniques (Dalby & Kewley-Port, 1999). These programs include Talk to Me (Auralog, 1995), the Tell Me More Series (Auralog, 2000), Triple-Play Plus (Mackey & Choi, 1998), New Dynamic English (DynEd, 1997), English Discoveries (Edusoft, 1998), and See it, Hear It, SAY IT! (CPI, 1997). Most of them provide no feedback on pronunciation accuracy beyond indicating which written dialogue choice the user has made, based on the closest pattern match. Learners are not told how accurate their pronunciation is. Neri (2002) in particular criticizes the waveform displays in products such as Talk to Me and Tell Me More because they appeal to buyers with flashy graphics but give users no meaningful feedback. The 2002 version of Talk to Me incorporates more of the features that Hinks (2003), for example, considers useful to learners:

- A visual signal lets learners compare their intonation with that of the model speaker.
- The accuracy of the learner's pronunciation is scored on a scale of seven (the higher the better).
- Words whose pronunciation is not recognized are identified and clearly marked.

Appendix 2: Foreign-Language Original (copy)

IMPROVED SPEECH RECOGNITION METHOD FOR INTELLIGENT ROBOT

2. Overview of Speech Recognition

Speech recognition has received more and more attention recently due to its important theoretical meaning and practical value [5]. Up to now, most speech recognition is based on conventional linear system theory, such as the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW). With the deep study of speech recognition, it is found that the speech signal is a complex nonlinear process. If the study of speech recognition wants to break through, nonlinear system theory methods must be introduced to it. Recently, with the development of nonlinear system theories such as artificial neural networks (ANN), chaos and fractal, it is possible to apply these theories to speech recognition. Therefore, the study of this paper is based on ANN, and chaos and fractal theories are introduced to process speech recognition.

Speech recognition is divided into two ways: speaker dependent and speaker independent. Speaker dependent refers to a pronunciation model trained by a single person; the identification rate of the training person's orders is high, while other persons' orders have a low identification rate or cannot be recognized. Speaker independent refers to a pronunciation model trained by persons of different age, sex and region; it can identify a group of persons' orders. Generally, the speaker independent system is more widely used, since the user is not required to conduct the training. So extraction of speaker independent features from the speech signal is the fundamental problem of a speaker recognition system.

Speech recognition can be viewed as a pattern recognition task, which includes training and recognition. Generally, the speech signal can be viewed as a time sequence and characterized by the powerful hidden Markov model (HMM). Through feature extraction, the speech signal is transferred into feature vectors which act as observations. In the training procedure, these observations are fed to estimate the model parameters of the HMM. These parameters include the probability density function for the observations and their corresponding states, the transition probability between the states, etc. After the parameter estimation, the trained models can be used for the recognition task. The input observations will be recognized as the resulting words and the accuracy can be evaluated. The whole process is illustrated in Fig. 1.

Fig. 1 Block diagram of speech recognition system

3. Theory and Method

Extraction of speaker independent features from the speech signal is the fundamental problem of a speaker recognition system. The standard methodology for solving this problem uses Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC). Both methods are linear procedures based on the assumption that speaker features have properties caused by the vocal tract resonances. These features form the basic spectral structure of the speech signal. However, the nonlinear information in speech signals is not easily extracted by the present feature extraction methodologies, so we use the fractal dimension to measure non-linear speech turbulence. This paper investigates and implements a speaker identification system using both traditional LPCC and non-linear multiscaled fractal dimension feature extraction.

3.1 Linear Predictive Cepstral Coefficients

The linear prediction coefficient (LPC) is a parameter set obtained when we do linear prediction analysis of speech. It describes correlation characteristics between adjacent speech samples. Linear prediction analysis is based on the following basic concept: a speech sample can be estimated approximately by a linear combination of some past speech samples. According to the minimal square sum principle of the difference between the real speech samples in a certain (short-time) analysis frame and the predicted samples, a unique group of prediction coefficients can be determined.

The LPC coefficients can be used to estimate the speech signal cepstrum. This is a special processing method in the short-time cepstrum analysis of speech signals. The system function of the channel model is obtained by linear prediction analysis as in (1), where p represents the linear prediction order, a_k (k = 1, 2, ..., p) represents the prediction coefficients, and the impulse response is represented by h(n). Given the cepstrum of h(n), (1) can be expanded as (2). (The equations were figures in the source and are not reproduced here.) The cepstrum coefficient calculated in the way of (5) is called LPCC, and n represents the LPCC order.

Before we extract the LPCC parameters, we should carry out speech signal pre-emphasis, framing, windowing and endpoint detection, etc. The endpoint detection of the Chinese command word "Forward" is shown in Fig. 2; next, the speech waveform of the Chinese command word "Forward" and the LPCC parameter waveform after endpoint detection are shown in Fig. 3.

3.2 Speech Fractal Dimension Computation

Fractal dimension is a quantitative value from the scale relation on the meaning of fractal, and also a measure of the self-similarity of its structure. The fractal measure is the fractal dimension [6, 7]. From the viewpoint of measuring, fractal dimension is extended from integer to fraction, breaking the limit of the general topological set dimension being an integer. Fractal dimension, mostly a fraction, is a dimension extension in Euclidean geometry.

There are many definitions of fractal dimension, e.g., similarity dimension, Hausdorff dimension, information dimension, correlation dimension, capacity dimension, box-counting dimension, etc., where the Hausdorff dimension is the oldest and also the most important; for any set, it is defined as in [3], where M(F) denotes how many units are needed to cover the subset F.

In this paper, the box-counting dimension D_B of F is obtained by partitioning the plane with square grids of a given side length and counting the number N of squares that intersect the set, and is defined as in [8].

The speech waveform of the Chinese command word "Forward" and the fractal dimension waveform after endpoint detection are shown in Fig. 4.

3.3 Improved Feature Extraction Method

Considering the respective advantages of LPCC and fractal dimension in expressing the speech signal, we mix both to form the feature signal; that is, the fractal dimension denotes the self-similarity, periodicity and randomness of the speech time wave shape, while the LPCC feature is good for speech quality and gives a high identification rate.

Due to the ANN's nonlinearity, self-adaptability, robustness and self-learning, its good classification and input-output mapping ability are suitable for resolving the speech recognition problem. Because the number of ANN input nodes is fixed, time regularization is carried out on the feature parameters before they are input to the neural network [9]. In our experiments, the LPCC and the fractal dimension of each sample need to pass through time regularization separately. LPCC is 4-frame data (LPCC1, LPCC2, LPCC3, LPCC4, each frame parameter is 14-D), and the fractal dimension is regularized to 12-frame data (FD1, FD2, ..., FD12, each frame parameter is 1-D), so that the feature vector of each sample has 4 × 14 + 1 × 12 = 68 dimensions. The order is: the first 56 dimensions are LPCC, and the remaining 12 dimensions are fractal dimensions. Thus, such a mixed feature parameter can show speech linear and nonlinear characteristics as well.

Architectures and Features of ASR

ASR is a cutting-edge technology that allows a computer or even a hand-held PDA (Myers, 2000) to identify words that are read aloud or spoken into any sound-recording device. The ultimate purpose of ASR technology is to allow 100% accuracy with all words that are intelligibly spoken by any person regardless of vocabulary size, background noise, or speaker variables (CSLU, 2002).
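The LPC-to-cepstrum step of Section 3.1 can be illustrated with a short sketch. Since equations (1) to (5) are not legible in this copy, the code below uses the standard textbook recursion for an all-pole model H(z) = 1 / (1 - sum_k a_k z^-k); that this matches the paper's equation (5) exactly is an assumption, and the function name `lpc_to_lpcc` and the notation c(n) for the cepstrum are illustrative only.

```python
import numpy as np

def lpc_to_lpcc(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c(1..n_ceps).

    Assumes the all-pole convention H(z) = 1 / (1 - sum_k a_k z^-k),
    with a[k-1] holding a_k. The gain term c(0) is omitted.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(n_ceps + 1)  # c[0] is left unused
    for n in range(1, n_ceps + 1):
        # Direct contribution a_n exists only while n <= p.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive contribution: sum over k of (k/n) * c(k) * a_(n-k),
        # restricted to indices where a_(n-k) is defined (1 <= n-k <= p).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

For a first-order model with a_1 = a, the cepstrum of 1 / (1 - a z^-1) is a^n / n, which gives a quick sanity check on the recursion. The 14-D frame parameter in Section 3.3 would correspond to calling the function with `n_ceps=14`.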
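The box-counting idea of Section 3.2 can be sketched as follows. This is a minimal estimate under stated assumptions, not the paper's procedure: the waveform is rescaled to the unit square, grids of side 1/2^j are laid over it, boxes touched by the curve are counted column by column, and the dimension is taken as the slope of log N against log(1/side). The function name, the choice of scales, and the column-wise counting rule are all illustrative.

```python
import numpy as np

def box_counting_dimension(x, scales=(1, 2, 3, 4, 5, 6)):
    """Estimate the box-counting dimension of a sampled waveform.

    Assumes len(x) is comfortably larger than 2**max(scales) so that
    every grid column contains several samples.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n) / (n - 1)                  # time axis scaled to [0, 1]
    span = x.max() - x.min()
    y = (x - x.min()) / span if span > 0 else np.zeros(n)
    log_inv_eps, log_count = [], []
    for j in scales:
        m = 2 ** j                              # m x m grid, box side 1/m
        col = np.minimum((t * m).astype(int), m - 1)
        row = np.minimum((y * m).astype(int), m - 1)
        count = 0
        for c in range(m):
            rows = row[col == c]
            if rows.size:
                # The curve covers every row between its min and max here.
                count += rows.max() - rows.min() + 1
        log_inv_eps.append(np.log(m))
        log_count.append(np.log(count))
    slope, _ = np.polyfit(log_inv_eps, log_count, 1)
    return slope
```

A straight line should come out with dimension close to 1, while a noisy, irregular waveform should fall between 1 and 2. Applying the function to successive short frames of speech would produce a fractal dimension track of the kind shown in Fig. 4.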
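The 68-dimensional mixed feature vector of Section 3.3 can be assembled as in the sketch below. The paper does not say how its time regularization works, so linear interpolation along the time axis is used here purely as a stand-in; `time_regularize` and `mixed_feature_vector` are hypothetical names, and only the 4 × 14 + 12 × 1 layout comes from the text.

```python
import numpy as np

def time_regularize(frames, n_out):
    """Resample a (T, D) or (T,) feature sequence to n_out frames.

    Linear interpolation is an assumption; the source only states that
    features are time-regularized to a fixed length.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim == 1:
        frames = frames[:, None]
    src = np.linspace(0.0, 1.0, frames.shape[0])
    dst = np.linspace(0.0, 1.0, n_out)
    cols = [np.interp(dst, src, frames[:, d]) for d in range(frames.shape[1])]
    return np.column_stack(cols)

def mixed_feature_vector(lpcc_frames, fd_frames):
    """Build the 68-D vector: 4 LPCC frames of 14-D, then 12 FD values."""
    lpcc = time_regularize(lpcc_frames, 4)    # shape (4, 14)
    fd = time_regularize(fd_frames, 12)       # shape (12, 1)
    return np.concatenate([lpcc.ravel(), fd.ravel()])
```

The fixed output length is what lets utterances of different durations feed a network with a fixed number of input nodes; the first 56 entries hold the LPCC part and the last 12 the fractal dimension part, matching the ordering stated in the text.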