How big data can help small data?
Department of Computer Science, University of Southern California

Edge case study
- "We often run into occasional large …"
- Quant finance: the "left tail" / "right tail" do not have enough data

Timescales
- Short-term: Which school? Extracurricular activities: sports, arts, etc.?
- Mid-term: Which university? Which speciality?
- Long-term: What kind of career path?

What else makes this hard?
- Individual differences: genetic, environmental
- Harsh constraints: a one-shot game; costly to recover from mistakes
[Bill & Melinda Gates Foundation on Personalized Learning] [CNN on Personalized Learning]

Can you build a model of you?
- Sadly, not 100% yet
- Individualized models need individual-specific data
- The amount of data is fundamentally limited, hence Small
- Most modern learning algorithms require Big Data about the individual

3 learning settings
- Multi-task learning
- Domain adaptation
- Zero-shot learning
Primary application focus: computer vision

Vignette 1: Multi-task Learning (MTL)
"When many people gather firewood, the flames rise high"

Problem setting
- M tasks, each with its own data
- Need to find solutions for all of them
Traditional framework for supervised learning: solve each task independently
  $\arg\min_{w_m} \ell(D_m; w_m) + \lambda_m R(w_m)$

Main idea
- Jointly learn multiple related tasks
- Force knowledge sharing
- Combine small data into big data
Benefits
- Improves generalization performance
- Requires less data
- Works in both deep and shallow learning models
Joint objective
  $\min_{w_1, w_2, \cdots, w_M} \sum_{m=1}^{M} \ell(D_m; w_m) + \lambda R(w_1, w_2, \cdots, w_M)$
[… et al.; Argyriou et al. 08; Daumé 09; …]

Exploiting task relatedness
- Encode prior knowledge by selecting the regularizer $R(w_1, w_2, \cdots, w_M)$
- Constrain the hypothesis space for all tasks
Choices of regularizer
- All parameters are similar to each other
- Parameters should have similar sparsity patterns

[Figure: input visual features are mapped to shared features that feed both an attributes classifier (e.g. "white", "spots") and an object class classifier (e.g. "polar bear")] [… object categories and attributes, CVPR, 2011]

Analogies
- leopard : cat = wolf : dog
- leopard : tiger = horse : zebra
[Figure: regularization that carries analogies from the visual feature space into the semantic embedding space] [Analogy-preserving embedding, ICML, 2013]
[Sharing ontologies, NIPS 2…]

Not all tasks are beneficial
- How to discover groups of related subtasks?
  - "Learning with whom to share" (ICML, 2011)
  - "Resisting the temptation to share" (CVPR, 2014)
- Why is this useful?
  - Learning in noisy task data
  - Learning from a set of irrelevant tasks
  - Ex: comp bio, noisy labels
[Figure: task parameters w1 to w4 partitioned into Group 1 and Group 2]

Vignette 2: Domain adaptation
Classification task: given a face image, determine man or woman?
- Collect a lot of labeled images (training data: man / woman)
- Infer a classification boundary
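
Looking back at the joint objective of Vignette 1, the following is a minimal sketch of one possible instance, not the method of any paper cited on the slides. It assumes a squared loss for $\ell$ and takes the first regularizer choice listed ("all parameters are similar to each other") as a penalty on each task's distance to the mean weight vector. The names `fit_multitask` and `spread`, the step size, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def fit_multitask(tasks, lam=1.0, lr=0.05, steps=2000):
    """Gradient descent on
        sum_m (1 / (2 n_m)) ||X_m w_m - y_m||^2 + (lam / 2) sum_m ||w_m - w_bar||^2
    where w_bar is the mean of the task weights.
    lam = 0 recovers the traditional framework (each task solved independently).
    tasks: list of (X_m, y_m) pairs; returns W with one row per task."""
    n_tasks = len(tasks)
    n_features = tasks[0][0].shape[1]
    W = np.zeros((n_tasks, n_features))
    for _ in range(steps):
        w_bar = W.mean(axis=0)
        G = np.empty_like(W)
        for m, (X, y) in enumerate(tasks):
            grad_loss = X.T @ (X @ W[m] - y) / len(y)   # data-fit term of task m
            grad_reg = lam * (W[m] - w_bar)             # pull towards the shared mean
            G[m] = grad_loss + grad_reg
        W -= lr * G
    return W

def spread(W):
    """Sum of squared distances of the task weights to their mean."""
    return float(((W - W.mean(axis=0)) ** 2).sum())

# Toy data (assumed): 4 related tasks, only 10 noisy samples each.
rng = np.random.default_rng(0)
w_shared = np.array([1.0, -2.0, 0.5])
tasks = []
for _ in range(4):
    w_m = w_shared + 0.1 * rng.normal(size=3)
    X = rng.normal(size=(10, 3))
    y = X @ w_m + 0.5 * rng.normal(size=10)
    tasks.append((X, y))

W_indep = fit_multitask(tasks, lam=0.0)   # independent tasks
W_joint = fit_multitask(tasks, lam=1.0)   # tasks share knowledge via w_bar
print("spread, independent:", spread(W_indep))
print("spread, joint:      ", spread(W_joint))
```

Raising `lam` ties the tasks more tightly together; a sparsity-sharing regularizer, the second choice on the slide, would need a different penalty and is not covered by this sketch.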
- Classify on test image
- Success: training and test data share statistical properties that are useful for classification
  - Tell-tale feature: length of hair
[Figure: training data and test data]

Mismatch between training and testing
- "Length of hair" is no longer effective!
[Figure: training data, test data, unseen data]

Unrealistic, oversimplifying assumptions
- Learning environment is stationary
- Training, testing and future data are sampled i.i.d. from the same distribution
- Works well in academic / well-controlled settings
In real life
- Learning environment changes
- Training, testing and future data are sampled from different distributions
- We suffer from poor cross-distribution generalization: accuracy drops significantly across disparate domains

Computer vision
- Object recognition: train & test on different datasets
- Vehicle pedestrian avoidance systems: train & test in different vehicular / city environments
Natural language processing
- Syntactic parsing: trained on business articles but applied to medical journals
- Speech recognition: trained on native speakers but applied to accented voices

Challenges
- Many exogenous factors affect visual appearance: pose, illumination, camera quality, etc.
- Collecting data under all possible combinations of those factors is expensive
- Labeling those data is even more costly
[Figure: example images from the 4 domains in our empirical studies: Caltech-256, Amazon, DSLR, Webcam]
[Figure: accuracy; effect of using bigger datasets (larger source) for adaptation: Amazon, Webcam, ImageNet, Adapted Amazon, Adapted ImageNet] [Anonymous source, 2014]

How to adapt?
- Linear subspaces
- Domain-invariant features
- Theoretical motivation
- Exploit intrinsic structures
- Learn kernels
- Discriminative clustering
[Figure: Grassmann manifold of subspaces; the geodesic flow from the source domain to the target domain captures a domain-invariant representation (for visual recognition)] (ICML 13, NIPS 13)

Shared representation [Ben-David et al. '06, Blitzer et al. '06, Daumé III '07, Pan et al. 09, …]
- Existence of a (latent) feature space
- The marginals of source and target are the same (or similar) in this space
- There exists a single classifier that works well on both domains
  $e_T[h] \le e_S[h] + A(P_S, P_T) + \inf_{h \in H} \left[ e_T[h] + e_S[h] \right]$
  - $A(P_S, P_T)$: whether the two distributions are similar
  - $\inf_{h \in H} [e_T[h] + e_S[h]]$: how well a single classifier can do

Domain-invariant features
- Parameterized as a linear kernel mapping of the original features
- Constructed to minimize the discrepancy between the two domains
- Model domains with subspaces
- Compute discrepancy as differences between subspaces, i.e. points on the Grassmann manifold $G(d, D)$

[Figure: accuracy of No adaptation, SGF (Gopalan et al., ICCV 2011), Geodesic flow kernel (GFK, ours) and Landmark on the domain pairs C → A, A → W, W → C, D → A, C → D, A → C]

Vignette 3: Zero-shot learning
Classical machine learning framework
- Multiway classification
- Labeling space is determined a priori
- A large number of annotated training samples for every class
Challenges for recognition in the wild
- Labeling space grows arbitrarily large with the emergence of new classes
- Collecting data for new classes is not always cost-effective
- Some classes have too few or zero labeled images
[Figure: images labeled "cat", "flower", "bench", "dog", "bear", "bird"]

Number of species (total: 1,589,361)
- Birds: 9,956
- Fish: 30,000
- Mammals: 5,416
- Reptiles: 8,240
- Insects: 950,000
- Corals: 2,175
- Plants: 297,326
- Mushrooms: 16,000
[Figure: "Skywalker" gibbon]
Objects: similarly, in ImageNet

Two types of classes
- Seen: with a lot of labeled examples
- Unseen: without any examples
[Figure: example classes Cat, Horse]
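
Returning to the subspace view of Vignette 2 ("model domains with subspaces, compute discrepancy as differences between subspaces"), a minimal sketch of one common way to compare two domains is to fit a PCA subspace to each and measure the principal angles between them. This is not the geodesic flow kernel itself, only the subspace comparison it builds on. The function names `pca_basis` and `principal_angles` and the toy data are assumptions.

```python
import numpy as np

def pca_basis(X, d):
    """Orthonormal basis (D x d) of the top-d PCA subspace of the rows of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

def principal_angles(Ps, Pt):
    """Principal angles (radians, ascending) between two subspaces,
    each given by a matrix with orthonormal columns."""
    cosines = np.linalg.svd(Ps.T @ Pt, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Toy domains (assumed): the source varies mostly along axes 0 and 1,
# the target along axes 0 and 2, so they share one direction only.
rng = np.random.default_rng(0)
source = rng.normal(size=(200, 4)) * np.array([3.0, 2.0, 0.1, 0.1])
target = rng.normal(size=(200, 4)) * np.array([3.0, 0.1, 2.0, 0.1])

Ps = pca_basis(source, 2)
Pt = pca_basis(target, 2)
angles = principal_angles(Ps, Pt)
print("principal angles (degrees):", np.degrees(angles))
```

Small angles indicate directions the two domains share; large angles indicate directions where they disagree, which is the kind of discrepancy an adapted feature mapping would try to reduce.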
[Figure: an unknown object, labeled "?"] Figures from Derek Hoiem's slides

What is it: bear-like, with black and white stripes and often with bamboo?

Class labels ≠ discrete numbers
- Need to assign semantic meanings to class labels
- Need to define relationships among class labels
Key assumptions
- There is a common semantic space shared by both types of classes
- The configuration of the embeddings enables "transfer"
[Figure: seen class and unseen class in the semantic space]

Semantic embeddings
- Attributes (Farhadi et al. 09, Lampert et al. 09, Parikh & Grauman 11, …)
- Word vectors (Mikolov et al. 13, Socher et al. 13, Frome et al. 13, …)

[Figure: seen objects and an unseen object described by attributes: Brown, Muscular, Has Snout, Has Mane (like horse), Has Snout (like dog)] Figures from Derek Hoiem's slides
How to effectively construct a model for zebra?

Training
- Seen classes and their semantic embeddings: $\mathcal{S} = \{1, 2, \cdots, S\}$, $A_{\mathcal{S}} = \{a_1, a_2, \cdots, a_S\}$
- Annotated training samples: $D = \{(x_n, y_n)\}_{n=1}^{N}$
Goal
- Unseen classes and their semantic embeddings: $\mathcal{U} = \{S+1, \cdots, S+U\}$, $A_{\mathcal{U}} = \{a_{S+1}, a_{S+2}, \cdots, a_{S+U}\}$
- Classifier: $f: x \to y \in \mathcal{U}$

Synthesized classifiers for zero-shot learning
[Figure: semantic space with class embeddings $a_1, a_2, a_3$ (e.g. Cardinal) and phantom bases $b_1, b_2, b_3$; model space with class classifiers $w_1, w_2, w_3$ and base classifiers $v_1, v_2, v_3$]
- Introduce phantom classes as bases
- Learn the bases' semantic embeddings as well as models for the bases
- Graph structures encode "relatedness"
  - Define how classes are related in the semantic embedding space
  - Define how classes are related in the model space

[Figure: semantic representations (e.g. $a_{\text{Gadwall}}$, $a_{\text{Cedar Waxwing}}$, $a_{\text{House Wren}}$) are mapped into a semantic embedding space built from PCA of the visual features; the predicted class exemplars for unseen classes $a_u$ are used for NN classification or to improve existing ZSL approaches]

[Table: datasets, total number per dataset: AwA, CUB]
[Table: classification accuracy on AwA, CUB, SUN, ImageNet]
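
The synthesized-classifier idea above can be sketched in a simplified form: the classifier of any class, seen or unseen, is a weighted combination of base classifiers, with weights given by how close the class embedding is to each base in the semantic space. In this sketch the bases and their models are fixed by hand, whereas the slides describe learning them; the softmax weighting, the bandwidth `sigma`, the function names, and the toy attribute vectors are all assumptions for illustration.

```python
import numpy as np

def synthesize_classifiers(A, B, V, sigma=1.0):
    """Build one linear classifier per class from base classifiers.
    A: (C, k) semantic embeddings of the classes to synthesize.
    B: (R, k) semantic embeddings of the bases.
    V: (R, D) linear models of the bases in the visual feature space.
    Returns W of shape (C, D)."""
    # Squared distances between every class and every base in the semantic space.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / sigma ** 2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    S = np.exp(logits)
    S /= S.sum(axis=1, keepdims=True)             # each row sums to 1
    return S @ V

def predict(X, W):
    """Assign each row of X to the class whose classifier scores highest."""
    return np.argmax(X @ W.T, axis=1)

# Toy setup (assumed): attributes are [has_stripes, has_mane].
# The bases are placed at the two seen classes, horse and tiger.
B = np.array([[0.0, 1.0],    # horse
              [1.0, 0.0]])   # tiger
V = np.array([[0.0, 1.0],    # horse model: responds to the "mane" feature
              [1.0, 0.0]])   # tiger model: responds to the "stripes" feature

zebra = np.array([[1.0, 1.0]])                 # unseen class, no training images
W_zebra = synthesize_classifiers(zebra, B, V)
print("zebra classifier:", W_zebra)

# Scoring a striped, maned image against horse, tiger and the synthesized zebra.
W_all = np.vstack([V, W_zebra])
x = np.array([[0.9, 0.9]])
print("scores:", x @ W_all.T)
```

Because the zebra embedding is equally far from both bases, its classifier is an even mix of the horse and tiger models; no zebra images are used at any point.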