




Chapter 3: Speech Perception

Overview of Questions
- Can computers perceive speech as well as humans?
- Why does an unfamiliar foreign language often sound like a continuous stream of sound, with no breaks between words?
- Does each word that we hear have a unique pattern of air pressure changes associated with it?
- Are there specific areas in the brain that are responsible for perceiving speech?

Speech perception refers to the processes by which humans are able to interpret and understand the sounds used in language. The study of speech perception is closely linked to the fields of phonetics and phonology in linguistics, and to cognitive psychology and perception in psychology.

Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech research has applications in building computer systems that can recognize speech, as well as in improving speech recognition for hearing- and language-impaired listeners.

Speech Perception
The first step in comprehending spoken language is to identify the words being spoken, which is performed in multiple stages:
1. Phonemes are detected (/b/, /e/, /t/, /e/, /r/)
2. Phonemes are combined into syllables (/be/ter/)
3. Syllables are combined into words ("better")
4. Word meaning is retrieved from memory

Spectrogram: I owe you a yo-yo

Speech perception: two problems
- Words are not neatly segmented (e.g., by pauses)
- Lack of phoneme invariance
  - Coarticulation: consecutive speech sounds blend into each other due to mechanical constraints on the articulators
  - Speaker differences: pitch is affected by age and sex; dialects, talking speeds, etc. differ

The speech input consists of:
- Frequency range: 50-5600 Hz
- Critical band filters
- Dynamic range: 50 dB
- Temporal resolution of 10 ms
- Smallest detectable change in F0: 2 Hz
- Smallest change in F1: 40 Hz
- Smallest change in F2: 100 Hz
- Smallest change in F3: 150 Hz

The Speech Stimulus
- Phoneme: the smallest unit of speech that changes meaning in a word
- In English there are 47 phonemes:
  - 23 major vowel sounds
  - 24 major consonant sounds
- The number of phonemes varies in other languages: 11 in Hawaiian and 60 in some African dialects

Table 13.1 Major consonants and vowels of English and their phonetic symbols

The Acoustic Signal
- Produced by air that is pushed up from the lungs through the vocal cords and into the vocal tract
- Vowels are produced by vibration of the vocal cords and changes in the shape of the vocal tract
- These changes in shape cause changes in the resonant frequency and produce peaks in pressure at a number of frequencies called formants

Figure 13.1 The vocal tract includes the nasal and oral cavities and the pharynx, as well as components that move, such as the tongue, lips, and vocal cords.

The Acoustic Signal - continued
- The first formant has the lowest frequency, the second has the next highest, etc.
- Sound spectrograms show the changes in frequency and intensity for speech
- Consonants are produced by a constriction of the vocal tract
- Formant transitions: rapid changes in frequency preceding or following consonants

Figure 13.3 Spectrogram of the word had showing the first (F1), second (F2), and third (F3) formants for the vowel /ae/. (Spectrogram courtesy of Kerry Green.)

Figure 13.4 Spectrogram of the sentence "Roy read the will," showing the formants such as F1, F2, and F3, and formant transitions such as T2 and T3. (Spectrogram courtesy of Kerry Green.)

The Relationship between the Speech Stimulus and Speech Perception
- The segmentation problem: there are no physical breaks in the continuous acoustic signal
  - How do we segment the individual words?
- The variability problem: there is no simple correspondence between the acoustic signal and individual phonemes
  - Variability from a phoneme's context
  - Coarticulation: overlap between the articulation of neighboring phonemes

Figure 13.5 Spectrogram of "I owe you a yo-yo." This spectrogram does not contain pauses or breaks that correspond to the words that we hear. The absence of breaks in the acoustic signal creates the segmentation problem. (Spectrogram courtesy of David Pisoni.)

Figure 13.6 Hand-drawn spectrograms for /di/ and /du/. (From "Perception of the Speech Code," by A. M. Liberman, 1967, Psychological Review, 74, 431-461, figure 1. Copyright 1967 by the American Psychological Association. Reprinted by permission of the author.)

The Relationship between the Speech Stimulus and Speech Perception - continued
- Variability from different speakers
  - Speakers differ in pitch, accent, speed of speaking, and pronunciation
  - This acoustic signal must be transformed into familiar words
- People perceive speech easily in spite of the segmentation and variability problems

Figure 13.7 (a) Spectrogram of "What are you doing?" pronounced slowly and distinctly. (b) Spectrogram of "What are you doing?" as pronounced in conversational speech. (Spectrogram courtesy of David Pisoni.)

Stimulus Dimensions of Speech Perception
- Invariant acoustic cues: features of phonemes that remain constant
- Short-term spectrograms are used to investigate invariant acoustic cues
- A sequence of short-term spectra can be combined to create a running spectral display
- Some invariant cues have been discovered from these displays

Figure 13.8 Left: a short-term spectrum of the acoustic energy in the first 26 ms of the phoneme /ga/. Right: sound spectrogram of the same phoneme. The sound for the first 26 ms is indicated in red. The peak in the short-term spectrum, marked a, corresponds to the dark band of energy, marked a in the spectrogram. The minimum in the short-term spectrum, marked b, corresponds to the light area, marked b in the spectrogram. The spectrogram on the right shows the energy for the entire 500 ms duration of the sound, whereas the short-term spectrum only shows the first 26 ms at the beginning of this signal. (Courtesy of James Sawusch.)

Figure 13.9 Running spectral displays for /pi/ and /da/. These displays are made up of a sequence of short-term spectra, like the one in Figure 13.8. Each of these spectra is displaced 5 ms on the time axis, so that each step we move along this axis indicates the frequencies present in the next 5 ms. The low-frequency peak (V) in the /da/ display is a cue for voicing. (From "Time-Varying Features of Initial Stop Consonants in Auditory Running Spectra: A First Report," by D. Kewley-Port and P. A. Luce, 1984, Perception and Psychophysics, 35, 353-360, figure 1. Copyright 1984 by Psychonomic Society Publications. Reprinted by permission.)

Categorical Perception
- This occurs when a wide range of acoustic cues results in the perception of a limited number of sound categories
- An example comes from experiments on voice onset time (VOT): the time delay between when a sound starts and when voicing begins
- Stimuli are /da/ (VOT of 17 ms) and /ta/ (VOT of 91 ms)

Categorical Perception - continued
- Computers were used to create stimuli with a range of VOTs from long to short
- Listeners do not hear the incremental changes; instead they hear a sudden change from /da/ to /ta/ at the phonetic boundary
- Thus, we experience perceptual constancy for the phonemes within a given range of VOT

Figure 13.10 Spectrograms for /da/ and /ta/. The voice onset time (the time between the beginning of the sound and the onset of voicing) is indicated at the beginning of the spectrogram for each sound. (Spectrogram courtesy of Ron Cole.)

Figure 13.11 The results of a categorical perception experiment indicate that /da/ is perceived for VOTs to the left of the phonetic boundary, and that /ta/ is perceived at VOTs to the right of the phonetic boundary. (From "Selective Adaptation of Linguistic Feature Detectors," by P. Eimas and J. D. Corbit, 1973, Cognitive Psychology, 4, 99-109, figure 2. Copyright 1973 Academic Press, Inc. Reprinted by permission.)

Figure 13.12 In the discrimination part of a categorical perception experiment, two stimuli are presented, and the listener indicates whether they are the same or different. The typical result is that two stimuli with VOTs on the same side of the phonetic boundary (solid arrows) are judged to be the same, and that two stimuli on different sides of the phonetic boundary (dashed arrows) are judged to be different.

Figure 13.13 Perceptual constancy occurs when all stimuli on one side of the phonetic boundary are perceived to be in the same category even though their VOT is changed over a substantial range. This diagram symbolizes the constancy observed in the Eimas and Corbit (1973) experiment, in which /da/ was heard on one side of the boundary and /ta/ on the other side.

Speech Perception is Multimodal
- Auditory-visual speech perception
- The McGurk effect
  - Visual stimulus shows a speaker saying "ga-ga"
  - Auditory stimulus has a speaker saying "ba-ba"
  - An observer watching and listening hears "da-da", which is the midpoint between "ga" and "ba"
  - An observer with eyes closed will hear "ba"

McGurk Effect

Figure 13.14 The McGurk effect. The woman's lips are moving as if she is saying /ga-ga/, but the actual sound being presented is /ba-ba/. The listener, however, reports hearing the sound /da-da/. If the listener closes his eyes, so that he no longer sees the woman's lips, he hears /ba-ba/. Thus, seeing the lips moving influences what the listener hears.

Cognitive Dimensions of Speech Perception
- Top-down processing, including knowledge a listener has about a language, affects perception of the incoming speech stimulus
- Segmentation is affected by context and meaning
  - I scream you scream we all scream for ice cream

Figure 13.15 Speech perception is the result of top-down processing (based on knowledge and meaning) and bottom-up processing (based on the acoustic signal) working together.

Meaning and Phoneme Perception
- Experiment by Turvey and Van Gelder
  - Short words (sin, bat, and leg) and short nonwords (jum, baf, and teg) were presented to listeners
  - The task was to press a button as quickly as possible when they heard a target phoneme
  - On average, listeners were faster with words (580 ms) than with nonwords (631 ms)

Meaning and Phoneme Perception - continued
- Experiment by Warren
  - Listeners heard a sentence that had a phoneme covered by a cough
  - The task was to state where in the sentence the cough occurred
  - Listeners could not correctly identify the position, and they also did not notice that a phoneme was missing; this is called the phonemic restoration effect

Phonemic restoration

Auditory presentation                              Perception
Legislature                                        legislature
Legi_lature                                        legi lature
Legi*lature                                        legislature
It was found that the *eel was on the axle.        wheel
It was found that the *eel was on the shoe.        heel
It was found that the *eel was on the orange.      peel
It was found that the *eel was on the table.       meal

Warren, R. M. (1970). Perceptual restorations of missing speech sounds. Science, 167, 392-393.

Meaning and Word Perception
- Experiment by Miller and Isard
  - Stimuli were three types of sentences:
    - Normal grammatical sentences
    - Anomalous sentences that were grammatical
    - Ungrammatical strings of words
  - Listeners were to shadow (repeat aloud) the sentences as they heard them through headphones

Meaning and Word Perception - continued
- Results showed that listeners were
  - 89% accurate with normal sentences
  - 79% accurate for anomalous sentences
  - 56% accurate for ungrammatical word strings
- Differences were even larger if background noise was present

Speaker Characteristics
- Indexical characteristics: characteristics of the speaker's voice such as age, gender, emotional state, level of seriousness, etc.
- Experiment by Palmeri, Goldinger, and Pisoni
  - Listeners were to indicate when a word was new in a sequence of words
  - Results showed that they were much faster if the same speaker was used for all the words

Speech Perception and the Brain
- Broca's aphasia: individuals have damage in Broca's area (in the frontal lobe)
  - Labored and stilted speech and short sentences, but they understand others
- Wernicke's aphasia: individuals have damage in Wernicke's area (in the temporal lobe)
  - They speak fluently, but the content is disorganized and not meaningful
  - They also have difficulty understanding others

Figure 13.16 Broca's and Wernicke's areas, which are specialized for language production and comprehension, are located in the left hemisphere of the brain in most people.

Speech Perception and the Brain - continued
- Measurements from cats' auditory fibers show that the pattern of firing mirrors the energy distribution in the auditory signal
- Brain scans of humans show that there are areas of the human "what" stream that are selectively activated by the human voice

Figure 13.17 (a) Short-term spectrum for /da/. This curve indicates the energy distribution in /da/ between 20 and 40 ms after the beginning of the signal. (b) Nerve firing of a population of cat auditory nerve fibers to the same stimulus. (From "Encoding of Speech Features in the Auditory Nerve," by M. B. Sachs, E. D. Young, and M. I. Miller, 1981. In R. Carlson and B. Granstrom (Eds.), The Representation of Speech in the Peripheral Auditory System, pp. 115-130. Copyright 1981 by Elsevier Science Publishing, New York. Reprinted by permission.)

Experience Dependent Plasticity
- Before age 1, human infants can tell the difference between the sounds used in all languages
- The brain becomes "tuned" to respond best to speech sounds that are in the environment
- Other sound differentiation disappears when there is no reinforcement from the environment

Motor Theory of Speech Perception
- Liberman proposed that motor mechanisms responsible for producing sounds activate mechanisms for perceiving sound
- Evidence from monkeys comes from the existence of mirror neurons
- Experiment by Watkins et al.
  - Participants had their motor cortex for face movements stimulated by transcranial magnetic stimulation (TMS)

Motor Theory of Speech Perception - continued
- Results showed small movements of the mouth, called motor evoked potentials (MEPs)
- This response became larger when the person listened to speech or watched someone else's lip movements
- In addition, the "where" stream may work with the "what" stream for speech perception

Figure 13.18 The transcranial magnetic stimulation experiment that provides evidence for a link between speech perception and production in humans. See text for details. (R
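
The categorical perception results described for Figures 13.11 to 13.13 can be illustrated with a minimal sketch. It is an idealized toy model, not the procedure used by Eimas and Corbit: every stimulus is labeled only by which side of a phonetic boundary its VOT falls on, and two stimuli are judged "same" exactly when their labels match. The boundary value of 35 ms and the names ASSUMED_BOUNDARY_MS, identify, and judged_same are assumptions made for illustration. The slides give only the endpoint stimuli (/da/ at 17 ms, /ta/ at 91 ms).

```python
# Toy model of categorical perception along a voice onset time (VOT) continuum.
# ASSUMED_BOUNDARY_MS is a hypothetical value chosen for this sketch; the source
# states only the endpoints (/da/ at 17 ms VOT, /ta/ at 91 ms VOT).
ASSUMED_BOUNDARY_MS = 35.0


def identify(vot_ms, boundary_ms=ASSUMED_BOUNDARY_MS):
    """Identification task: label a stimulus by its side of the phonetic boundary."""
    return "/da/" if vot_ms < boundary_ms else "/ta/"


def judged_same(vot_a_ms, vot_b_ms, boundary_ms=ASSUMED_BOUNDARY_MS):
    """Idealized discrimination task: 'same' if and only if both labels match."""
    return identify(vot_a_ms, boundary_ms) == identify(vot_b_ms, boundary_ms)


if __name__ == "__main__":
    # Sweep the continuum: the label changes abruptly, not gradually.
    for vot in range(10, 100, 10):
        print(f"VOT {vot:2d} ms -> {identify(vot)}")

    # Equal 20 ms steps, but only the pair that straddles the boundary differs.
    print("10 vs 30 ms judged same:", judged_same(10, 30))
    print("30 vs 50 ms judged same:", judged_same(30, 50))
```

In this sketch, the pair at 10 and 30 ms is judged the same and the pair at 30 and 50 ms is judged different, even though both pairs are 20 ms apart. This mirrors the solid and dashed arrows in Figure 13.12. Real listeners show a steep but not perfectly sharp transition, so the hard threshold here is a simplification.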