
Machine Learning Question Bank

I. Maximum Likelihood

1. ML estimation of exponential model (10 points)

The Gaussian distribution is often used to model data on the real line, but it is sometimes inappropriate when the data are often close to zero but constrained to be nonnegative. In such cases one can use the exponential distribution, whose probability density function is given by

$p(x) = \frac{1}{b}\,e^{-x/b}, \qquad x \ge 0.$

Given $N$ observations $x_i$ drawn from such a distribution:

(a) Write down the likelihood as a function of the scale parameter $b$.

(b) Write down the derivative of the log likelihood.

(c) Give a simple expression for the ML estimate for $b$.

(a) $L(X; b) = \prod_{i=1}^{N} \frac{1}{b}\,e^{-x_i/b}$

(b) $\frac{\partial}{\partial b}\log L(X;b) = \frac{\partial}{\partial b}\left(-N\log b - \frac{1}{b}\sum_{i=1}^{N}x_i\right) = -\frac{N}{b} + \frac{1}{b^2}\sum_{i=1}^{N}x_i$

(c) Setting the derivative to zero gives $\hat{b} = \frac{1}{N}\sum_{i=1}^{N}x_i$.

2. Repeat for the Poisson distribution $P(x\mid\theta) = \frac{\theta^{x}e^{-\theta}}{x!}$:

$l(\theta) = \sum_{i=1}^{N}\log P(x_i\mid\theta) = \sum_{i=1}^{N}x_i\log\theta - N\theta - \sum_{i=1}^{N}\log(x_i!)$

Setting $\frac{dl}{d\theta} = \frac{1}{\theta}\sum_{i=1}^{N}x_i - N = 0$ gives $\hat{\theta} = \frac{1}{N}\sum_{i=1}^{N}x_i$.
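Both derivations reduce to the same closed form: the MLE is the sample mean. A minimal numerical check (not part of the original answer key; Python/scipy and the chosen parameters are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

# Exponential with scale b: the ML estimate derived above is the sample mean.
x_exp = rng.exponential(scale=2.0, size=10_000)
b_hat = x_exp.mean()  # closed-form MLE: (1/N) * sum(x_i)

# Cross-check by numerically minimizing the negative log-likelihood:
# -log L(b) = N log b + (1/b) sum(x_i)
neg_ll = lambda b: len(x_exp) * np.log(b) + x_exp.sum() / b
b_num = minimize_scalar(neg_ll, bounds=(1e-6, 100.0), method="bounded").x

# Poisson with rate theta: the MLE is again the sample mean.
x_poi = rng.poisson(lam=3.5, size=10_000)
theta_hat = x_poi.mean()

print(f"exponential: closed form {b_hat:.4f}, numerical {b_num:.4f}")
print(f"poisson:     closed form {theta_hat:.4f}")
```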

II. Bayes

1. Applying Bayes' rule

2. Suppose that on a multiple-choice exam a student knows the correct answer with probability $p$ and guesses with probability $1-p$. Assume that a student who knows the answer answers correctly with probability 1, while a student who guesses answers correctly with probability $\frac{1}{m}$, where $m$ is the number of choices. Given that the student answered the question correctly, find the probability that he actually knew the answer.
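The original leaves the answer blank; a worked solution by Bayes' rule (writing $K$ for "knows the answer" and $C$ for "answers correctly") would be:

$P(K\mid C) = \frac{P(C\mid K)\,P(K)}{P(C\mid K)\,P(K) + P(C\mid \bar K)\,P(\bar K)} = \frac{p\cdot 1}{p\cdot 1 + (1-p)\cdot\frac{1}{m}} = \frac{mp}{mp + 1 - p}.$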

3. Conjugate priors

Given a likelihood $p(X\mid\theta)$ for a class of models with parameters $\theta$, a conjugate prior is a distribution $p(\theta\mid\gamma)$ with hyperparameters $\gamma$, such that the posterior distribution

$p(\theta\mid X, \gamma) = \alpha\, p(X\mid\theta)\, p(\theta\mid\gamma) = p(\theta\mid\gamma')$

belongs to the same family of distributions as the prior.

(a) Suppose that the likelihood is given by the exponential distribution with rate parameter $\lambda$:

$p(x\mid\lambda) = \lambda\exp(-\lambda x)$

Show that the gamma distribution

$\mathrm{Gamma}(\lambda\mid\alpha,\beta) = \frac{\beta^{\alpha}\lambda^{\alpha-1}\exp(-\beta\lambda)}{\Gamma(\alpha)}$

is a conjugate prior for the exponential. Derive the parameter update given observations $x_1,\dots,x_N$ and the prediction distribution $p(x_{N+1}\mid x_1,\dots,x_N)$.

(a) Exponential and Gamma

The likelihood is $P(X\mid\lambda) = \prod_{i=1}^{N}\lambda\exp(-\lambda x_i)$ and the prior is $p(\lambda\mid\alpha,\beta) = \mathrm{gamma}(\lambda\mid\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\lambda^{\alpha-1}\exp(-\beta\lambda)$. Let $X$ denote the observations $x_1,\dots,x_N$ and let $s_N$ denote their sum. Then the posterior is

$p(\lambda\mid X) \propto \lambda^{\alpha-1}\exp(-\beta\lambda)\prod_{i=1}^{N}\lambda\exp(-\lambda x_i) = \lambda^{\alpha+N-1}\exp\bigl(-\lambda(\beta+s_N)\bigr) \propto \mathrm{gamma}(\lambda\mid\alpha+N,\ \beta+s_N).$

Therefore the parameter updates are as follows:

$\alpha \leftarrow \alpha + N$
$\beta \leftarrow \beta + s_N$

For the prediction distribution we compute the following integral:

$P(x_{N+1}\mid x_1,\dots,x_N) = \int P(x_{N+1}\mid\lambda)\,p(\lambda\mid x_1,\dots,x_N)\,d\lambda$
$= \int \lambda\exp(-\lambda x_{N+1})\,\mathrm{gamma}(\lambda\mid\alpha+N,\ \beta+s_N)\,d\lambda$
$= \frac{(\beta+s_N)^{\alpha+N}}{(\beta+s_N+x_{N+1})^{\alpha+N}}\int \lambda\,\mathrm{gamma}(\lambda\mid\alpha+N,\ \beta+s_N+x_{N+1})\,d\lambda$
$= \frac{(\beta+s_N)^{\alpha+N}}{(\beta+s_N+x_{N+1})^{\alpha+N}}\cdot\frac{\alpha+N}{\beta+s_N+x_{N+1}}$

where the penultimate step uses the standard formula $\alpha/\beta$ for the expected value of a gamma distribution.
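A minimal sketch of this update and predictive distribution in code (not part of the original; Python/scipy and the hyperparameter values are assumptions), with a Monte Carlo cross-check of the closed form:

```python
import numpy as np
from scipy import stats

# Exponential likelihood with a gamma(alpha, beta) prior on the rate lambda.
alpha, beta = 2.0, 1.0                       # prior hyperparameters (arbitrary)
rng = np.random.default_rng(1)
x = rng.exponential(scale=1 / 0.7, size=50)  # data with true rate 0.7

# Conjugate update derived above: alpha <- alpha + N, beta <- beta + s_N.
alpha_post = alpha + len(x)
beta_post = beta + x.sum()

# Closed-form posterior predictive derived above.
def predictive(x_new):
    ratio = (beta_post / (beta_post + x_new)) ** alpha_post
    return ratio * alpha_post / (beta_post + x_new)

# Monte Carlo check: average the likelihood over posterior samples of lambda.
lam = stats.gamma.rvs(alpha_post, scale=1 / beta_post, size=200_000,
                      random_state=2)
x_new = 1.3
mc = np.mean(lam * np.exp(-lam * x_new))
print(f"closed form: {predictive(x_new):.5f}, Monte Carlo: {mc:.5f}")
```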

(b) Show that the beta distribution is a conjugate prior for the geometric distribution

$p(x = k\mid\theta) = (1-\theta)^{k-1}\theta$

which gives the probability that the first head appears on toss $k$ when the probability of a head on each toss is $\theta$. Derive the parameter update rule and the prediction distribution.

(b) Geometric and Beta

The likelihood for a single observation of value $k$ is $P(X = k\mid\theta) = (1-\theta)^{k-1}\theta$, and the prior is $p(\theta\mid a,b) = \mathrm{Beta}(a,b) = \alpha\,\theta^{a-1}(1-\theta)^{b-1}$, where $\alpha$ is the normalization constant. Then the posterior is

$p(\theta\mid X) \propto \theta^{a-1}(1-\theta)^{b-1}(1-\theta)^{k-1}\theta = \theta^{a}(1-\theta)^{b+k-2} \propto \mathrm{Beta}(\theta\mid a+1,\ b+k-1).$

Therefore the parameter updates are

$a \leftarrow a + 1$
$b \leftarrow b + k - 1$

For the prediction distribution we compute the following integral:

$P(x_2=\ell\mid x_1=k) = \int p(x_2=\ell\mid\theta)\,p(\theta\mid x_1=k)\,d\theta$
$= \int \theta(1-\theta)^{\ell-1}\,\mathrm{Beta}(\theta\mid a+1,\ b+k-1)\,d\theta$
$= \frac{\Gamma(a+b+k)}{\Gamma(a+1)\Gamma(b+k-1)}\cdot\frac{\Gamma(a+1)\Gamma(b+k+\ell-2)}{\Gamma(a+b+k+\ell-1)}\int \theta\,\mathrm{Beta}(\theta\mid a+1,\ b+k+\ell-2)\,d\theta$
$= \frac{\Gamma(a+b+k)\Gamma(b+k+\ell-2)}{\Gamma(b+k-1)\Gamma(a+b+k+\ell-1)}\cdot\frac{a+1}{a+b+k+\ell-1}$
$= \frac{\Gamma(a+b+k)\Gamma(b+k+\ell-2)}{\Gamma(b+k-1)\Gamma(a+b+k+\ell)}\,(a+1)$

where the penultimate step uses the standard formula $a/(a+b)$ for the expected value of a Beta distribution.
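As with part (a), the closed form can be checked numerically; a short sketch (not part of the original; Python/scipy and the chosen values of $a$, $b$, $k$, $\ell$ are assumptions):

```python
import numpy as np
from scipy import integrate, stats
from scipy.special import gamma as G

a, b, k, l = 2.0, 3.0, 4, 2  # arbitrary hyperparameters and observations

# Closed-form predictive P(x2 = l | x1 = k) derived above.
closed = (G(a + b + k) * G(b + k + l - 2)
          / (G(b + k - 1) * G(a + b + k + l)) * (a + 1))

# Direct numerical integration of the defining integral.
integrand = lambda t: t * (1 - t) ** (l - 1) * stats.beta.pdf(t, a + 1, b + k - 1)
numeric, _ = integrate.quad(integrand, 0.0, 1.0)

print(f"closed form: {closed:.6f}, numerical: {numeric:.6f}")
```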

(c) Suppose $p(\theta\mid\gamma_m)$ is a conjugate prior for the likelihood $p(X\mid\theta)$; show that the mixture prior

$p(\theta\mid\gamma_1,\dots,\gamma_M) = \sum_{m=1}^{M} w_m\, p(\theta\mid\gamma_m)$

is also conjugate for the same likelihood, assuming the mixture weights $w_m$ sum to 1.

(c) Mixture Prior

The prior is given by the mixture

$p(\theta\mid\gamma_1,\dots,\gamma_M) = \sum_{m=1}^{M} w_m\, p(\theta\mid\gamma_m).$

Moreover, we are given that $p(\theta\mid\gamma_m)$ is a conjugate prior for the likelihood $p(X\mid\theta)$; in other words,

$p(\theta\mid X,\gamma_m) = \alpha_m\, p(X\mid\theta)\, p(\theta\mid\gamma_m) = p(\theta\mid\gamma_m').$

When we multiply the mixture prior with the likelihood, we get the following posterior:

$p(\theta\mid X,\gamma_1,\dots,\gamma_M) = c\, p(X\mid\theta)\sum_{m=1}^{M} w_m\, p(\theta\mid\gamma_m)$
$= \sum_{m=1}^{M} c\, w_m\, p(X\mid\theta)\, p(\theta\mid\gamma_m)$
$= \sum_{m=1}^{M} \frac{c\, w_m}{\alpha_m}\, p(\theta\mid\gamma_m')$
$= \sum_{m=1}^{M} w_m'\, p(\theta\mid\gamma_m')$

Therefore we observe that the posterior has the same form as the prior, i.e., a mixture distribution with updated weights and hyperparameters.
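A short sketch of this result in code (not part of the original; it reuses the exponential-gamma pair from part (a), and all numbers are illustrative assumptions):

```python
import numpy as np
from scipy.special import gammaln

# Mixture-of-gammas prior over the rate of an exponential likelihood;
# component m has hyperparameters (alpha_m, beta_m) and weight w_m.
alphas = np.array([2.0, 20.0])
betas = np.array([1.0, 5.0])
w = np.array([0.5, 0.5])

rng = np.random.default_rng(3)
x = rng.exponential(scale=1 / 4.0, size=30)  # data with true rate 4.0
N, sN = len(x), x.sum()

# Per-component conjugate updates, exactly as in part (a).
alphas_post = alphas + N
betas_post = betas + sN

# Updated mixture weights w'_m are proportional to w_m times the marginal
# likelihood of the data under component m (computed in log space):
# p(X | gamma_m) = beta^alpha / Gamma(alpha) * Gamma(alpha+N) / (beta+sN)^(alpha+N)
log_marg = (alphas * np.log(betas) - gammaln(alphas)
            + gammaln(alphas_post) - alphas_post * np.log(betas_post))
log_w = np.log(w) + log_marg
w_post = np.exp(log_w - log_w.max())
w_post /= w_post.sum()
print("posterior mixture weights:", w_post.round(4))
```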

(d) Repeat part (c) for the case where the prior is a single distribution and the likelihood is a mixture, and the prior is conjugate for each mixture component of the likelihood.

Some priors can be conjugate for several different likelihoods; for example, the beta is conjugate for the Bernoulli and the geometric distributions, and the gamma is conjugate for the exponential and for the gamma with fixed $\alpha$.

(e) (Extra credit, 20) Explore the case where the likelihood is a mixture with fixed components and unknown weights; i.e., the weights are the parameters to be learned.

Problem 2

Consider the probability density function (or mass function, if $X$ is discrete) for the exponential family:

$p(x;\eta) = h(x)\exp\{\eta^{T}u(x) - a(\eta)\}.$

(a) Show that the univariate normal and the multinomial distributions belong to this family.

(b) Show that, in a generative classification model, if the class-conditional densities belong to the exponential family, then the posterior distribution for a class is a softmax function of a linear function of the feature vector $x$.

(c) Considering $\eta$ to be a scalar, find an expression for $\frac{da}{d\eta}$. (Where will this expression be used?)

(d) (For extra credit) A statistic $T(x)$ is said to be sufficient for a parameter $\eta$ if $p(x\mid T(x),\eta) = p(x\mid T(x))$, or in other words, if it is independent of $\eta$. Show that for a random variable $X$ drawn from an exponential family density $p(x;\eta)$, $u(x)$ is a sufficient statistic for $\eta$. (Show that a factorization $p(x;\eta) = g(u(x);\eta)\,h(x)$ is necessary and sufficient for $u(x)$ to be a sufficient statistic for $\eta$.)

(e) (For extra credit) Suppose $x_1,\dots,x_n$ are drawn i.i.d. from an exponential family density $p(x;\eta)$. What is now the sufficient statistic $T(x_1,\dots,x_n)$ for $\eta$?
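The original gives no solutions for Problem 2; a sketch of part (a) for the univariate normal (the multinomial case is analogous) may help:

$\mathcal{N}(x\mid\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}}\exp\left\{ \frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2 - \left(\frac{\mu^2}{2\sigma^2} + \log\sigma\right) \right\}$

so $h(x)=\frac{1}{\sqrt{2\pi}}$, $u(x)=(x,\ x^2)^T$, $\eta=\left(\frac{\mu}{\sigma^2},\ -\frac{1}{2\sigma^2}\right)^T$, and $a(\eta) = -\frac{\eta_1^2}{4\eta_2} - \frac{1}{2}\log(-2\eta_2)$. For part (c), differentiating $a(\eta)$ and using the normalization of $p(x;\eta)$ gives $\frac{da}{d\eta} = \mathbb{E}[u(x)]$, which is the expression used when matching moments in ML estimation.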

III. True/False

(1) Given $n$ data points, if half are used for training and half for testing, the difference between the training error and the test error will decrease as $n$ increases.

(2) The maximum likelihood estimate is unbiased and has the smallest variance among all unbiased estimates, so the maximum likelihood estimate has the smallest risk.

(3) For two regression functions A and B, if A is simpler than B, then A will almost certainly perform better than B on the test set.

(4) Global linear regression uses all of the sample points to predict the output for a new input, whereas locally weighted linear regression uses only the samples near the query point; therefore global linear regression is computationally more expensive than locally weighted linear regression.

(5) Boosting and Bagging both combine multiple classifiers by voting, and both determine the weight of an individual classifier according to its accuracy.

(6) In the boosting iterations, the training error of each new decision stump and the training error of the combined classifier vary roughly in concert. (F)

While the training error of the combined classifier typically decreases as a function of boosting iterations, the error of the individual decision stumps typically increases, since the example weights become concentrated at the most difficult examples.

(7) One advantage of Boosting is that it does not overfit. (F)

(8) Support vector machines are resistant to outliers, i.e., very noisy examples drawn from a different distribution. (F)

(9) In regression analysis, best-subset selection can be used for feature selection, but it is computationally expensive when the number of features is large; ridge regression and the Lasso are computationally cheaper, and the Lasso can also perform feature selection.

(10) Overfitting is more likely to occur when the training data are scarce.

(11) Gradient descent sometimes gets trapped in local minima, but the EM algorithm does not.

(12) In kernel regression, the parameter that most affects the balance between overfitting and underfitting is the width of the kernel.

(13) In the AdaBoost algorithm, the weights of all the misclassified points will go up by the same multiplicative factor. (T)

For a misclassified point, $y_i h_t(x_i) = -1$, so its weight is multiplied by $\exp(-\alpha_t\, y_i h_t(x_i)) = \exp(\alpha_t)$.
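A minimal sketch of one AdaBoost reweighting step (not from the original; plain Python/numpy, toy labels assumed):

```python
import numpy as np

def adaboost_reweight(w, y, pred):
    """One AdaBoost weight update: every misclassified point is
    multiplied by the same factor exp(alpha_t)."""
    eps = np.sum(w[y != pred]) / np.sum(w)   # weighted training error
    alpha = 0.5 * np.log((1 - eps) / eps)    # weak classifier weight
    w_new = w * np.exp(-alpha * y * pred)    # y, pred take values in {-1, +1}
    return w_new / w_new.sum(), alpha

# Toy example: five points, the weak learner misclassifies the last two.
w = np.full(5, 0.2)
y = np.array([1, 1, -1, -1, 1])
pred = np.array([1, 1, -1, 1, -1])
w_new, alpha = adaboost_reweight(w, y, pred)
print(w_new, alpha)  # the two misclassified points share one, larger weight
```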

7. [2 points] true/false: In AdaBoost, the weighted training error $\epsilon_t$ of the $t$-th weak classifier on training data with weights $D_t$ tends to increase as a function of $t$.

★SOLUTION: True. In the course of boosting iterations the weak classifiers are forced to try to classify more difficult examples. The weights will increase for examples that are repeatedly misclassified by the weak classifiers. The weighted training error of the $t$-th weak classifier on the training data therefore tends to increase.

9. [2 points] Consider a point that is correctly classified and distant from the decision boundary. Why would SVM's decision boundary be unaffected by this point, but the one learned by logistic regression be affected?

★SOLUTION: The hinge loss used by SVMs gives zero weight to these points, while the log-loss used by logistic regression gives a little bit of weight to these points.

(14) True/False: In a least-squares linear regression problem, adding an L2 regularization penalty cannot decrease the L2 error of the solution $\hat{w}$ on the training data. (F)

(15) True/False: In a least-squares linear regression problem, adding an L2 regularization penalty always decreases the expected L2 error of the solution $\hat{w}$ on unseen test data. (F)

(16) Besides the EM algorithm, gradient descent can also be used to estimate the parameters of a Gaussian mixture model. (T)

(20) Any decision boundary that we get from a generative model with class-conditional Gaussian distributions could in principle be reproduced with an SVM and a polynomial kernel.

True! In fact, since class-conditional Gaussians always yield quadratic decision boundaries, they can be reproduced with an SVM with kernel of degree less than or equal to two.

(21) AdaBoost will eventually reach zero training error, regardless of the type of weak classifier it uses, provided enough weak classifiers have been combined.

False! If the data is not separable by a linear combination of the weak classifiers, AdaBoost can't achieve zero training error.

(22) The L2 penalty in a ridge regression is equivalent to a Laplace prior on the weights. (F)

(23) The log-likelihood of the data will always increase through successive iterations of the expectation maximization algorithm. (F)

(24) In training a logistic regression model by maximizing the likelihood of the labels given the inputs, we have multiple locally optimal solutions. (F)

IV. Regression

1. Consider a regularized regression problem. The figure below (Figure 2) shows the log-likelihood (mean log-probability) on the training and test sets for a quadratic regularization penalty as the regularization parameter C takes different values. (10 points)

(1) Is the claim "the log-likelihood on the training set in Figure 2 never increases as C increases" correct? Explain why.

(2) Explain why the log-likelihood on the test set in Figure 2 decreases when C takes large values.

2. Consider the linear regression model $y = w_0 + w_1x + \varepsilon$, $\varepsilon\sim\mathcal N(0,\sigma^2)$; the training data are shown in the figure below. (10 points)

(1) Estimate the parameters by maximum likelihood and sketch the resulting model in figure (a). (3 points)

(2) Estimate the parameters by regularized maximum likelihood, i.e., add a quadratic regularization penalty to the log-likelihood objective, and sketch in figure (b) the model obtained when the parameter C takes a very large value. (3 points)

(3) After regularization, does the variance of the Gaussian become larger, smaller, or stay the same? (4 points)
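The original includes no answer key for this problem. For part (3), a small sketch (synthetic data; the model $y = w_0 + w_1x + \varepsilon$ is assumed) suggests the variance grows: heavy regularization shrinks the weights, the fit degrades, and the ML estimate of the Gaussian noise variance is the mean squared residual:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 3, size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)
X = np.column_stack([np.ones_like(x), x])

for C in (0.0, 1e4):  # no regularization vs. very strong regularization
    # Regularized (ridge) solution: (X^T X + C I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + C * np.eye(2), X.T @ y)
    resid = y - X @ w
    print(f"C={C:g}: w={w.round(3)}, noise variance estimate={resid.var():.3f}")
```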

[Figures (a) and (b): scatter plots of the training data; the vertical axis runs from -0.5 to 3.5.]

3. Consider a regression problem over points in a two-dimensional input space, where x lies in the unit square. Training and test samples are distributed uniformly in the unit square, and the outputs y are generated from a fixed noisy model. We learn the relationship between x and y by linear regression with polynomial features of orders 1 through 10 (each higher-order feature model contains all lower-order features), using squared-error loss.

(1) We train models with 1st-, 2nd-, 8th- and 10th-order features on a small number of samples, and then test them on a large, independent test set. For each of the three columns below, choose the appropriate model(s) (there may be more than one), and explain why the model you chose in the third column has the smallest test error. (10 points)

- Smallest training error: the 8th- and 10th-order models (with few samples they have enough parameters to fit the training data essentially perfectly).
- Largest training error: the 1st-order model.
- Smallest test error: the 2nd-order model; with few samples the higher-order models overfit, so the low-order model generalizes best.

(2) We now train the same 1st-, 2nd-, 8th- and 10th-order models on a much larger number of samples, and again test on a large, independent test set. For each of the three columns, choose the appropriate model(s) (there may be more than one), and explain why the model you chose in the third column has the smallest test error. (10 points)

- Smallest training error: the 10th-order model (its feature set contains all the others').
- Largest training error: the 1st-order model.
- Smallest test error: the 8th- and 10th-order models; with enough data their superfluous coefficients are estimated to be near zero, so they no longer overfit and approximate the target best.

(3) The approximation error of a polynomial regression model depends on the number of training points. (T)

(4) The structural error of a polynomial regression model depends on the number of training points. (F)

4. We are trying to learn regression parameters for a dataset which we know was generated from a polynomial of a certain degree, but we do not know what this degree is. Assume the data was actually generated from a polynomial of degree 5 with some added Gaussian noise (that is, $y = w_0 + w_1x + \dots + w_5x^5 + \varepsilon$, $\varepsilon\sim\mathcal N(0,\sigma^2)$).

For training we have 100 {x, y} pairs and for testing we are using an additional set of 100 {x, y} pairs. Since we do not know the degree of the polynomial, we learn two models from the data. Model A learns parameters for a polynomial of degree 4 and model B learns parameters for a polynomial of degree 6. Which of these two models is likely to fit the test data better?

Answer: Degree 6 polynomial. Since the model is a degree 5 polynomial and we have enough training data, the model we learn for a six-degree polynomial will likely fit a very small coefficient for $x^6$. Thus, even though it is a six-degree polynomial, it will actually behave in a very similar way to a fifth-degree polynomial, which is the correct model, leading to a better fit to the data.
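A quick simulation of this answer (not part of the original; Python/numpy, arbitrary degree-5 coefficients assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
w_true = np.array([1.0, -2.0, 0.5, 1.0, -0.8, 0.3])  # w0 ... w5, degree 5

def sample(n):
    x = rng.uniform(-1, 1, size=n)
    y = np.polyval(w_true[::-1], x) + rng.normal(scale=0.1, size=n)
    return x, y

x_tr, y_tr = sample(100)   # 100 training pairs, as in the problem
x_te, y_te = sample(100)   # 100 test pairs

for deg in (4, 6):
    coef = np.polyfit(x_tr, y_tr, deg)  # least-squares polynomial fit
    mse = np.mean((np.polyval(coef, x_te) - y_te) ** 2)
    print(f"degree {deg}: test MSE = {mse:.5f}")
# The degree-6 fit typically wins: its x^6 coefficient comes out near zero.
```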

5. Input-dependent noise in regression

a) Ordinary least-squares regression is equivalent to assuming that each data point is generated according to a linear function of the input plus zero-mean, constant-variance Gaussian noise. In many systems, however, the noise variance is itself a positive linear function of the input (which is assumed to be non-negative, i.e., x ≥ 0).

b) Which of the following families of probability models correctly describes this situation in the univariate case? (Hint: only one of them does.)

i. $p_1(y\mid x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(y-(w_0+w_1x))^2}{2\sigma^2}\right)$

ii. $p_2(y\mid x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(y-(w_0+(w_1+w_2)x))^2}{2\sigma^2}\right)$

iii. $p_3(y\mid x) = \frac{1}{\sqrt{2\pi x}\,\sigma}\exp\left(-\frac{(y-(w_0+w_1x))^2}{2x\sigma^2}\right)$

(iii) is correct. In a Gaussian distribution over y, the variance is determined by the coefficient of y²; so by replacing σ² with xσ², we get a variance that increases linearly with x. (Note also the change to the normalization "constant," which now depends on x.) (ii) does not change the variance at all; it just renames w1.

c) Circle the plots in Figure 1 that could plausibly have been generated by some instance of the model family(ies) you chose.

(ii) and (iii). Plot (i) exhibits a large variance at x ≈ 0, and its variance appears independent of x.

d) True/False: Regression with input-dependent noise gives the same solution as ordinary regression for an infinite data set generated according to the corresponding model.

True. In both cases the algorithm will recover the true underlying model.

e) For the model you chose in part (b), write down the derivative of the negative log likelihood with respect to w1.

The negative log likelihood is

$L = \sum_{i=1}^{N}\left[\frac{(y_i-(w_0+w_1x_i))^2}{2x_i\sigma^2} + \frac{1}{2}\log(2\pi x_i\sigma^2)\right]$

and the derivative w.r.t. w1 is

$\frac{\partial L}{\partial w_1} = -\frac{1}{\sigma^2}\sum_{i=1}^{N}\bigl(y_i-(w_0+w_1x_i)\bigr).$

Note that for lines through the origin (w0 = 0), the optimal solution has the particularly simple form

$\hat{w}_1 = \frac{\sum_i y_i}{\sum_i x_i}.$

It is possible to take the derivative of the log without noticing that log exp(x) = x; we use log likelihoods for a good reason! Plus, they simplify the handling of multiple data points, because the product of probabilities becomes a sum of log probabilities.
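Maximizing the likelihood of model (iii) is a weighted least-squares problem with per-point weights $1/x_i$; a minimal sketch (not part of the original; synthetic data assumed):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0.1, 5.0, size=200)
w0, w1, sigma2 = 0.5, 2.0, 0.25
# Noise variance grows linearly with x, as in model (iii).
y = w0 + w1 * x + rng.normal(scale=np.sqrt(sigma2 * x))

# ML for model (iii) = weighted least squares with weights 1/x_i.
X = np.column_stack([np.ones_like(x), x])
W = np.diag(1.0 / x)
w_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print("weighted LS estimate:", w_hat.round(3))

# Special case from part (e): a line through the origin, w1 = sum(y)/sum(x).
print("w1 through origin:", round(y.sum() / x.sum(), 3))
```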

V. Classification

1. Generative vs. discriminative models

(a) Your billionaire friend needs your help. She needs to classify job applications into good/bad categories, and also to detect job applicants who lie in their applications, using density estimation to detect outliers. To meet these needs, do you recommend using a discriminative or generative classifier? Why?

Generative model, because we need to estimate the density $p(x\mid y)$.

(b) Your billionaire friend also wants to classify software applications to detect bug-prone applications, using features of the source code. This project only has a few applications to be used as training data, though. To create the most accurate classifier, do you recommend using a discriminative or generative classifier? Why?

Discriminative model. When the sample size is small, a discriminative model that classifies directly usually performs better.

(c) Finally, your billionaire friend wants to classify companies to decide which one to acquire. This project has lots of training data based on several decades of research. To create the most accurate classifier, do you recommend using a discriminative or generative classifier? Why?

Generative model. When the sample size is large, the correct generative model can be learned.

2. Logistic regression

[Plot: mean log-probability of labels (y-axis, from -0.4 to 0) versus regularization parameter C (x-axis, from 0 to 4), for the training and test sets.]

Figure 2: Log-probability of labels as a function of regularization parameter C

Problem. In Figure 2, we have plotted the mean log-probability of labels in the training and test sets after having trained the classifier with a quadratic regularization penalty and different values of the regularization parameter C.

1. In training a logistic regression model by maximizing the likelihood of the labels given the inputs, we have multiple locally optimal solutions. (F)

Answer: The log-probability of labels given examples implied by the logistic regression model is a concave (convex down) function with respect to the weights. The (only) locally optimal solution is also globally optimal.

2. A stochastic gradient algorithm for training logistic regression models with a fixed learning rate will find the optimal setting of the weights exactly. (F)

Answer: A fixed learning rate means that we are always taking a finite step towards improving the log-probability of any single training example in the update equation. Unless the examples are somehow "aligned," we will continue jumping from side to side of the optimal solution, and will not be able to get arbitrarily close to it. The learning rate has to approach zero in the course of the updates for the weights to converge.

3. The average log-probability of training labels as in Figure 2 can never increase as we increase C. (T)

Stronger regularization means more constraints on the solution, and thus the (average) log-probability of the training examples can only get worse.

4. Explain why in Figure 2 the test log-probability of labels decreases for large values of C.

As C increases, we give more weight to constraining the predictor, and thus give less flexibility in fitting the training set. The increased regularization guarantees that the test performance gets closer to the training performance, but as we over-constrain our allowed predictors, we are not able to fit the training set at all, and although the test performance is now very close to the training performance, both are low.

5. The log-probability of labels in the test set would decrease for large values of C even if we had a large number of training examples. (T)

The above argument still holds, but the value of C for which we will observe such a decrease will scale up with the number of examples.

6. Adding a quadratic regularization penalty for the parameters when estimating a logistic regression model ensures that some of the parameters (weights associated with the components of the input vectors) vanish. (F)

A regularization penalty for feature selection must have a non-zero derivative at zero. Otherwise, the regularization has no effect at zero, and a weight will tend to be slightly non-zero, even when this does not improve the log-probabilities by much.
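A sketch reproducing the qualitative behavior of Figure 2 (not part of the original; plain numpy gradient ascent on the penalized objective, synthetic data assumed):

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 100, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)
X_te = rng.normal(size=(1000, d))
y_te = (rng.uniform(size=1000) < 1 / (1 + np.exp(-X_te @ w_true))).astype(float)

def fit(C, iters=5000, lr=0.1):
    """Maximize (1/n)[sum_i log p(y_i|x_i,w) - (C/2)||w||^2] by gradient ascent."""
    w = np.zeros(d)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ w))
        w += lr * ((X.T @ (y - p)) - C * w) / n
    return w

def mean_logprob(w, Xs, ys):
    p = 1 / (1 + np.exp(-Xs @ w))
    return np.mean(ys * np.log(p) + (1 - ys) * np.log(1 - p))

for C in (0.0, 1.0, 10.0, 100.0):
    w = fit(C)
    print(f"C={C:6.1f}  train {mean_logprob(w, X, y):.3f}"
          f"  test {mean_logprob(w, X_te, y_te):.3f}")
# The training log-probability can only get worse as C grows.
```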

3. Regularized logistic regression

This problem refers to the binary classification task depicted in Figure 1(a), which we attempt to solve with the simple linear logistic regression model

$P(y=1\mid x, w_1, w_2) = g(w_1x_1 + w_2x_2) = \frac{1}{1+\exp(-w_1x_1 - w_2x_2)}$

(for simplicity we do not use the bias parameter w0). The training data can be separated with zero training error; see line L1 in Figure 1(b), for instance.

Figure 1: (a) The 2-dimensional data set used in this problem. (b) The points can be separated by L1 (solid line). Possible other decision boundaries are shown by L2, L3, L4.

(1) Consider a regularization approach where we try to maximize

$\sum_{i=1}^{n}\log p(y_i\mid x_i, w_1, w_2) - \frac{C}{2}w_2^2$

for large C. Note that only w2 is penalized. We'd like to know which of the four lines in Figure 1(b) could arise as a result of such regularization. For each potential line L2, L3 or L4, determine whether it can result from regularizing w2. If not, explain very briefly why not.

L2: No. When we regularize w2, the resulting boundary can rely less on the value of x2 and therefore becomes more vertical. L2 here seems to be more horizontal as a result of penalizing w2, so it cannot arise.

L3: Yes. Here w2² is small relative to w1² (as evidenced by the high slope), and even though it would assign a rather low log-probability to the observed labels, it could be forced by a large regularization parameter C.

L4: No. For very large C, we get a boundary that is entirely vertical (the line x1 = 0, i.e., the x2 axis). L4 here is reflected across the x2 axis and represents a poorer solution than its counterpart on the other side. For moderate regularization we have to get the best solution that we can construct while keeping w2 small. L4 is not the best and thus cannot come as a result of regularizing w2.

(2) If we change the form of regularization to one-norm (absolute value) and also regularize w1, we get the following penalized log-likelihood

$\sum_{i=1}^{n}\log p(y_i\mid x_i, w_1, w_2) - \frac{C}{2}\bigl(|w_1| + |w_2|\bigr).$

Consider again the problem in Figure 1(a) and the same linear logistic regression model. As we increase the regularization parameter C, which of the following scenarios do you expect to observe? (Choose only one.)

(x) First w1 will become 0, then w2.
( ) w1 and w2 will become zero simultaneously.
( ) First w2 will become 0, then w1.
( ) None of the weights will become exactly zero, only smaller as C increases.

The data can be classified with zero training error, and therefore also with high log-probability, by looking at the value of x2 alone, i.e., by making w1 = 0. Initially we might prefer to have a non-zero value for w1, but it will go to zero rather quickly as we increase regularization. Note that we pay a regularization penalty for a non-zero value of w1, and if it doesn't help classification, why would we pay the penalty? The absolute-value regularization ensures that w1 will indeed go to exactly zero. As C increases further, even w2 will eventually become zero: we pay a higher and higher cost for setting w2 to a non-zero value, and eventually this cost overwhelms the gain from the log-probability of labels that we can achieve with a non-zero w2. Note that when w1 = w2 = 0, the log-probability of labels is the finite value n log(0.5).
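A sketch of this L1 path (not part of the original; sklearn assumed, and note that sklearn's C is the inverse of the penalty strength C used above, so small values mean strong regularization; synthetic data in which x2 alone separates the classes):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 200
x2 = rng.normal(size=n)
y = (x2 > 0).astype(int)                   # labels depend on x2 alone
x1 = x2 + rng.normal(scale=2.0, size=n)    # x1 is a noisier copy of x2
X = np.column_stack([x1, x2])

for C_inv in (10.0, 1.0, 0.1, 0.02):       # decreasing = stronger L1 penalty
    clf = LogisticRegression(penalty="l1", C=C_inv, solver="liblinear",
                             fit_intercept=False)
    clf.fit(X, y)
    print(f"sklearn C={C_inv:5.2f}  w = {clf.coef_.ravel().round(3)}")
# As regularization strengthens, w1 reaches exactly zero first, then w2.
```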

4. SVM

Figure 4: Training set, maximum margin linear separator, and the support vectors (in bold).

(1) What is the leave-one-out cross-validation error estimate for maximum margin separation in Figure 4? (We are asking for a number.) (0)

Based on the figure, we can see that removing any single point would not change the resulting maximum margin separator. Since all the points are initially classified correctly, the leave-one-out error is zero.

(2) We would expect the support vectors to remain the same in general as we move from a linear kernel to higher-order polynomial kernels. (F)

There are no guarantees that the support vectors remain the same. The feature vectors corresponding to polynomial kernels are non-linear functions of the original input vectors, and thus the support points for maximum margin separation in the feature space can be quite different.

(3) Structural risk minimization is guaranteed to find the model (among those considered) with the lowest expected loss. (F)

We are guaranteed to find only the model with the lowest upper bound on the expected loss.

(4) What is the VC-dimension of a mixture of two Gaussians model in the plane with equal covariance matrices? Why?

A mixture of two Gaussians with equal covariance matrices has a linear decision boundary. Linear separators in the plane have VC-dimension exactly 3.

5. SVM

Classify the following data points:

class  x1  x2
  +     1   1
  +     2   2
  +     2   0
  -     0   0
  -     1   0
  -     0   1

(a) Plot these six training points. Are the classes {+, -} linearly separable?

Yes.

(b) Construct the weight vector of the maximum margin hyperplane by inspection and identify the support vectors.

The maximum margin hyperplane should have a slope of -1 and should pass through the point (3/2, 0). Therefore its equation is x1 + x2 = 3/2, and the weight vector is (1, 1)ᵀ.

(c) If you remove one of the support vectors, does the size of the optimal margin decrease, stay the same, or increase?

In this specific dataset the optimal margin increases when we remove the support vectors (1, 0) or (1, 1), and stays the same when we remove the other two.
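The hyperplane and support vectors from (b) and (c) can be verified mechanically (not part of the original; sklearn assumed, with a large C to approximate the hard margin):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 2], [2, 0],    # class +
              [0, 0], [1, 0], [0, 1]])   # class -
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
print("w =", clf.coef_[0])                # proportional to (1, 1)
print("b =", clf.intercept_[0])           # boundary: x1 + x2 = 3/2
print("support vectors:\n", clf.support_vectors_)
```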

(d) (Extra Credit) Is your answer to (c) also true for any dataset? Provide a counterexample or give a short proof.

When we drop one of the constraints in a maximization problem, we get an optimal value which is at least as good as the previous one. This is because the set of candidates satisfying the original (larger, stronger) set of constraints is a subset of the candidates satisfying the new (smaller, weaker) set of constraints. So, for the weaker constraints, the old optimal solution is still available, and there may be additional solutions that are even better. In mathematical form:

$\max_{x\in A} f(x) \le \max_{x\in B} f(x) \quad \text{whenever } A \subseteq B.$

Finally, note that in SVM problems we are maximizing the margin subject to the constraints given by the training points. When we drop any of the constraints, the margin can increase or stay the same, depending on the dataset; both cases occur in problems with realistic datasets.
