1 Give the definitions or your comprehensions of the following terms. (12')
1.1 The inductive learning hypothesis (P17)
1.2 Overfitting (P49)
1.4 Consistent learner (P148)
2 Give brief answers to the following questions. (15')
2.2 If the size of a version space is |VS|, in general what is the smallest number of queries that may be required by a concept learner using the optimal query strategy to perfectly learn the target concept? (P27)
2.3 In general, decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances. What expression does the following decision tree correspond to? (tree figure not reproduced)
3 Give the explanation of inductive bias, and list the inductive bias of the CANDIDATE-ELIMINATION algorithm, decision tree learning (ID3), and the BACKPROPAGATION algorithm. (10')
4 How to solve overfitting in decision trees and neural networks? (10')
Solution:
Decision tree: stop growing the tree early; post-pruning.
Neural network: weight decay; validation set.
5 Prove that the LMS weight update rule w_i <- w_i + η(Vtrain(b) - V̂(b))x_i performs a gradient descent to minimize the squared error. In particular, define the squared error E as in the text. Now calculate the derivative of E with respect to the weight w_i, assuming that V̂(b) is a linear function as defined in the text. Gradient descent is achieved by updating each weight in proportion to -∂E/∂w_i; therefore, you must show that the LMS training rule alters weights in this proportion for each training example it encounters. (E ≡ Σ over training examples <b, Vtrain(b)> of (Vtrain(b) - V̂(b))^2) (8')
Solution:
As Vtrain(b) <- V̂(Successor(b)), we can get E = Σ_b (Vtrain(b) - V̂(b))^2, with V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6.
∂E/∂w_i = Σ_b 2(Vtrain(b) - V̂(b)) · ∂(Vtrain(b) - V̂(b))/∂w_i = -2 Σ_b (Vtrain(b) - V̂(b)) x_i
As mentioned in LMS, the per-example update is Δw_i = η(Vtrain(b) - V̂(b)) x_i, so Δw_i = -(η/2) ∂E_b/∂w_i for each single training example b.
Therefore, gradient descent is achieved by updating each weight in proportion to -∂E/∂w_i; the LMS rule alters weights in this proportion for each training example it encounters.
6 True or false: if decision tree D2 is an elaboration of tree D1, then D1 is more_general_than D2. Assume D1 and D2 are decision trees representing arbitrary boolean functions, and that D2 is
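The LMS derivation in the solution above can be checked numerically. The sketch below is illustrative only: the two feature vectors, the target values, and the learning rate are invented, and V̂ is the linear function from the solution.

```python
# Numerical check of the LMS rule: repeated per-example updates
# w_i <- w_i + eta * (Vtrain(b) - Vhat(b)) * x_i  should drive the
# squared error E = sum_b (Vtrain(b) - Vhat(b))^2 down.

def v_hat(w, x):
    # linear evaluation function: w[0] is the additive constant w0
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def lms_step(w, x, v_train, eta=0.05):
    # one LMS update; the constant term uses x_0 = 1
    err = v_train - v_hat(w, x)
    return [w[0] + eta * err] + [wi + eta * err * xi for wi, xi in zip(w[1:], x)]

def squared_error(w, data):
    return sum((t - v_hat(w, x)) ** 2 for x, t in data)

# two hypothetical board states (6 features each) with target values
data = [([3, 0, 1, 0, 0, 0], 100.0), ([0, 3, 0, 1, 2, 0], -100.0)]
w = [0.0] * 7
before = squared_error(w, data)
for _ in range(50):            # 50 passes of per-example LMS updates
    for x, t in data:
        w = lms_step(w, x, t)
after = squared_error(w, data)
```

Running the loop drives the squared error from its initial value down toward zero, which is what the proportionality to -∂E/∂w_i guarantees for a small enough learning rate.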
an elaboration of D1 if ID3 could extend D1 to D2. If true give a proof; if false, a counterexample. (Definition: Let hj and hk be boolean-valued functions defined over X. Then hj is more_general_than_or_equal_to hk (written hj ≥g hk) if and only if (∀x ∈ X)[(hk(x) = 1) → (hj(x) = 1)]; and hj >g hk if and only if (hj ≥g hk) ∧ ¬(hk ≥g hj).) (10')
The hypothesis is false. One counterexample is A XOR B: if A ≠ B the training examples are all positive, while if A = B the training examples are all negative; then, using ID3 to extend D1, the new tree D2 will be equivalent to D1, i.e., D2 is equal to D1.
7 Design a two-input perceptron that implements
the boolean function A ∧ ¬B. Design a two-layer network of perceptrons that implements A XOR B. (10')
8 Suppose a hypothesis space contains three hypotheses, h1, h2, h3, and the posterior probabilities of these hypotheses given the training data are 0.4, 0.3 and 0.3 respectively. A new instance x is encountered, which is classified positive by h1 but negative by h2 and h3. Give the result and the detailed classification course of the Bayes optimal classifier. (10') (P125)
9 Suppose S is a collection of training-example days described by attributes including Humidity, which can have the values
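For question 8, the Bayes optimal classifier sums the posteriors of the hypotheses voting for each class value; the dictionary encoding below is an illustrative sketch, not part of the question.

```python
# Bayes optimal classification: each class value gets the summed
# posterior P(h|D) of the hypotheses h that predict it.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "+", "h2": "-", "h3": "-"}   # how each h classifies x

votes = {}
for h, p in posteriors.items():
    votes[predictions[h]] = votes.get(predictions[h], 0.0) + p

result = max(votes, key=votes.get)   # "-" wins with 0.6 against 0.4
```

Note that the single MAP hypothesis h1 would classify x positive; the Bayes optimal classifier disagrees because the combined posterior of h2 and h3 is larger.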
High or Normal. Assume S is a collection containing 10 examples, [7+, 3-]. Of these 10 examples, suppose 3 of the positive and 2 of the negative examples have Humidity = High, and the remainder have Humidity = Normal. Please calculate the information gain due to sorting the original 10 examples by the attribute Humidity. (log2 1=0, log2 2=1, log2 3=1.58, log2 4=2, log2 5=2.32, log2 6=2.58, log2 7=2.8, log2 8=3, log2 9=3.16, log2 10=3.32) (5')
Solution:
Here we denote S = [7+, 3-]; then Entropy([7+, 3-]) = -(7/10)log2(7/10) - (3/10)log2(3/10) = 0.886.
Gain(S, Humidity) = Entropy(S) - Σ over v ∈ Values(Humidity) of (|Sv|/|S|)Entropy(Sv)
Values(Humidity) = {High, Normal}; SHigh = {s ∈ S | Humidity(s) = High}, |SHigh| = 5, |SNormal| = 5.
Entropy(SHigh) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.972
Entropy(SNormal) = -(4/5)log2(4/5) - (1/5)log2(1/5) = 0.72
Thus Gain(S, Humidity) = 0.886 - (5/10)(0.972) - (5/10)(0.72) = 0.04.
10 Finish the following algorithm. (10')
(1) GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form <x, t>, where x is the vector of input values and t is the target output value. η is the learning rate (e.g., 0.05).
- Initialize each w_i to some small random value
- Until the termination condition is met, Do
  - Initialize each Δw_i to zero
  - For each <x, t> in training_examples, Do
    - Input the instance x to the unit and compute the output o
    - For each linear unit weight w_i, Do
      Δw_i <- Δw_i + η(t - o)x_i
  - For each linear unit weight w_i, Do
    w_i <- w_i + Δw_i
(2) FIND-S Algorithm
- Initialize h to the most specific hypothesis in H
- For each positive training instance x
  - For each attribute constraint a_i in h
    - If the constraint a_i is satisfied by x, then do nothing
    - Else replace a_i in h by the next more general constraint that is satisfied by x
- Output hypothesis h
1. What is the definition of a learning problem? (5) Use "a checkers learning problem" as an example to state how to design a learning system. (15)
Answer:
A computer program is said to learn from experience E
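The FIND-S pseudocode above can be sketched in a few lines of Python. "0" encoding the empty (most specific) constraint and "?" the fully general one is an implementation choice, not part of the algorithm's statement; the examples are the EnjoySport data used later in this document.

```python
# Sketch of FIND-S: generalize the most specific hypothesis just enough
# to cover each positive example; negatives are ignored.
def find_s(examples, n_attrs):
    h = ["0"] * n_attrs                 # most specific hypothesis in H
    for x, label in examples:
        if label != "+":
            continue                    # FIND-S ignores negative examples
        for i, a in enumerate(x):
            if h[i] == "0":
                h[i] = a                # first positive example: copy values
            elif h[i] != a:
                h[i] = "?"              # minimally generalize the constraint
    return h

# the four EnjoySport training examples used elsewhere in this document
examples = [
    (["sunny", "warm", "normal", "strong", "warm", "same"], "+"),
    (["sunny", "warm", "high", "strong", "warm", "same"], "+"),
    (["rainy", "cold", "high", "strong", "warm", "change"], "-"),
    (["sunny", "warm", "high", "strong", "cool", "change"], "+"),
]
h = find_s(examples, 6)   # ["sunny", "warm", "?", "strong", "?", "?"]
```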
with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Example: A checkers learning problem:
T: play checkers (1)
P: percentage of games won (1)
E: opportunity to play in a tournament against itself (1)
To design a learning system:
Step 1: Choosing the Training Experience
A checkers learning problem:
Task T: playing checkers
Performance measure P: percent of games won in the world tournament
Training experience E: games played against itself
In order to complete the design of the learning system, we must now choose
1. the exact type of knowledge to be learned
2. a representation for this target knowledge
3. a learning mechanism
Step 2: Choosing the Target Function
1. if b is a final board state that is won, then V(b) = 100
2. if b is a final board state that is lost, then V(b) = -100
3. if b is a final board state that
is drawn, then V(b) = 0
4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game (assuming the opponent plays optimally, as well).
Step 3: Choosing a Representation for the Target Function (4)
x1: the number of black pieces on the board
x2: the number of red pieces on the board
x3: the number of black kings on the board
x4: the number of red kings on the board
x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn)
x6: the number of red pieces threatened by black.
Thus, our learning program will represent V̂(b) as a linear function of the form
V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm. Learned values for the weights w1 through w6 will
determine the relative importance of the various board features in determining the value of the board, whereas the weight w0 will provide an additive constant to the board value.
2. Answer: Find-S & Find-G:
Step 1: Initialize S to the most specific hypothesis in H. (1)
S0: <∅, ∅, ∅, ∅, ∅, ∅>
Initialize G to the most general hypothesis in H.
G0: <?, ?, ?, ?, ?, ?>
Step 2: The first example is <Sunny, Warm, Normal, Strong, Warm, Same, +> (3)
S1: <Sunny, Warm, Normal, Strong, Warm, Same>
G1: <?, ?, ?, ?, ?, ?>
Step 3: The second example is <Sunny, Warm, High, Strong, Warm, Same, +> (3)
S2: <Sunny, Warm, ?, Strong, Warm, Same>
G2: <?, ?, ?, ?, ?, ?>
Step 4: The third example is <Rainy, Cold, High, Strong, Warm, Change, ->
S3: <Sunny, Warm, ?, Strong, Warm, Same>
G3: <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>
Step 5: The fourth example is <Sunny, Warm, High, Strong, Cool, Change, +>
S4: <Sunny, Warm, ?, Strong, ?, ?>
G4: <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>
Finally, all the hypotheses are: (2)
<Sunny, Warm, ?, Strong, ?, ?>, <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?>, <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>
3. Answer:
flog(x) = -x*log2(x) - (1-x)*log2(1-x)
STEP 1, choose the root node:
entropy_all = flog(4/10) = 0.971
gain_outlook = entropy_all - 0.3*flog(1/3) - 0.3*flog(1) - 0.4*flog(1/2) = 0.296 (1)
gain_temperature = entropy_all - 0.3*flog(1/3) - 0.3*flog(1/3) - 0.4*flog(1/2) = 0.02 (1)
gain_humidity = entropy_all - 0.5*flog(2/5) - 0.5*flog(1/5) = 0.125 (1)
gain_wind = entropy_all - 0.6*flog(5/6) - 0.4*flog(1/4) = 0.256 (1)
Root node is "outlook". (1)
STEP 2, choose the second node:
for sunny (humidity OR temperature):
entropy_sunny = flog(1/3) = 0.918 (1)
sunny_gain_wind = entropy_sunny - (2/3)*flog(1/2) - (1/3)*flog(1) = 0.252 (1)
sunny_gain_humidity = entropy_sunny - (2/3)*flog(1) - (1/3)*flog(1) = 0.918 (1)
sunny_gain_temperature = entropy_sunny - (2/3)*flog(1) - (1/3)*flog(1) = 0.918 (1)
Choose humidity or temperature. (1)
for rain (wind):
entropy_rain = flog(1/2) = 1 (1)
rain_gain_wind = entropy_rain - (1/2)*flog(1) - (1/2)*flog(1) = 1 (1)
rain_gain_humidity = entropy_rain - (1/2)*flog(1/2) - (1/2)*flog(1/2) = 0 (1)
rain_gain_temperature = entropy_rain - (1/4)*flog(1) - (3/4)*flog(1/3) = 0.311 (1)
Choose wind. (1)
4. Answer:
A: The primitive neural units are: perceptron, linear unit and sigmoid unit. (3)
Perceptron: (2)
A perceptron takes a vector of real-valued inputs, calculates a linear combination of these inputs, then outputs a 1 if the result is greater than some threshold and -1 otherwise. More precisely, given inputs x1 through xn, the output o(x1, ..., xn) computed by the perceptron is 1 if w0 + w1x1 + ... + wnxn > 0 and -1 otherwise. Sometimes we write the perceptron function as o(x) = sgn(w · x).
Linear units: (2)
A linear unit is one for which the output o is given by o = w · x. Thus, a linear unit corresponds to the first stage of a perceptron, without the threshold.
Sigmoid units: (2)
The sigmoid unit is illustrated like the perceptron;
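The three unit types described in part A above can be sketched in a few lines; the weight values below are arbitrary illustrations.

```python
import math

# The three primitive units share the same linear combination;
# w[0] plays the role of the threshold/bias weight w0.

def linear_unit(w, x):
    # unthresholded output o = w0 + w1*x1 + ... + wn*xn
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def perceptron(w, x):
    # 1 if the linear combination exceeds 0, otherwise -1
    return 1 if linear_unit(w, x) > 0 else -1

def sigmoid_unit(w, x):
    # continuous, differentiable squashing of the same net input
    return 1.0 / (1.0 + math.exp(-linear_unit(w, x)))

# e.g. a perceptron computing boolean AND of two 0/1 inputs
w_and = [-1.5, 1.0, 1.0]
```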
the sigmoid unit first computes a linear combination of its inputs, then applies a threshold to the result. In the case of the sigmoid unit, however, the threshold output is a continuous function of its input. More precisely, the sigmoid unit computes its output o as o = σ(net) = 1/(1 + e^(-net)), where net = w · x.
B: (Because of a printing error in the question, either the perceptron training rule or the delta rule is acceptable; the delta rule is given here.)
The derivation process is:
∂E/∂w_i = ∂/∂w_i [(1/2) Σ_d (t_d - o_d)^2]
= (1/2) Σ_d 2(t_d - o_d) ∂(t_d - o_d)/∂w_i
= Σ_d (t_d - o_d) ∂(t_d - w · x_d)/∂w_i
= Σ_d (t_d - o_d)(-x_id)
Hence Δw_i = -η ∂E/∂w_i = η Σ_d (t_d - o_d) x_id.
5. Answer:
P(yes) = 9/14 (1)
P(no) = 5/14 (1)
P(sunny|no) = 3/5 (1)
P(cool|no) = 1/5 (1)
P(high|no) = 4/5 (1)
P(strong|no) = 3/5 (1)
P(no|new instance) ∝ P(no)*P(sunny|no)*P(cool|no)*P(high|no)*P(strong|no) = 5/14 * 3/5 * 1/5 * 4/5 * 3/5 = 0.02057 = 2.057*10^-2 (2)
P(sunny|yes) = 2/9 (1)
P(cool|yes) = 3/9 (1)
P(high|yes) = 3/9 (1)
P(strong|yes) = 3/9 (1)
P(yes|new instance) ∝ P(yes)*P(sunny|yes)*P(cool|yes)*P(high|yes)*P(strong|yes) = 9/14 * 2/9 * 3/9 * 3/9 * 3/9 = 0.005291 = 5.291*10^-3 (2)
ANSWER: NO (2)
6. Answer:
INDUCTIVE BIAS: (8)
Consider a concept learning algorithm L for the set of instances X. Let c be an arbitrary concept defined over X, and let Dc = {<x, c(x)>} be an arbitrary set of training examples of c. Let L(xi, Dc) denote the classification assigned to the instance xi by L after training on the data Dc. The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples Dc:
(∀xi ∈ X)[(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]
The futility of bias-free learning: (7)
A learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances. In fact, the only reason that the learner was able to generalize beyond the observed training examples is that it was biased by the inductive bias. Unfortunately, the only instances that will produce a unanimous vote are the previously observed training examples. For all the other instances, taking a vote will be futile: each unobserved instance will be classified positive by precisely half the hypotheses in the version space and will be classified negative by the other half.
1 In the EnjoySport learning task, every example day is represented by 6 attributes. Given that attribute Sky has three
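The half-and-half voting claim above can be verified exhaustively on a toy instance space; the 8-instance encoding below is invented for illustration.

```python
from itertools import product

# With a completely unbiased hypothesis space (every boolean labeling of
# an 8-instance space, 2^8 hypotheses in all), the version space
# consistent with two observed examples splits exactly in half on any
# unseen instance.
instances = list(range(8))
hypotheses = list(product([0, 1], repeat=8))  # h[i] = label of instance i

training = {0: 1, 1: 0}   # instance 0 observed positive, instance 1 negative
version_space = [h for h in hypotheses
                 if all(h[i] == label for i, label in training.items())]

unseen = 5
positive_votes = sum(h[unseen] for h in version_space)
# exactly half of the version space classifies the unseen instance positive
```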
possible values, and that AirTemp, Humidity, Wind, Water and Forecast each have two possible values. Explain why the size of the hypothesis space is 973. How would the number of possible instances and possible hypotheses increase with the addition of one attribute A that takes on K possible values?
2 Write the algorithm of Candidate-Elimination using version spaces. Assume G is the set of maximally general hypotheses in hypothesis space H, and S is the set of maximally specific hypotheses.
3 Consider the following set of training examples for EnjoySport:

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        sunny  warm     normal    strong  warm   same      yes
2        sunny  warm     high      strong  warm   same      yes
3        rainy  cold     high      strong  warm   change    no
4        sunny  warm     high      strong  cool   change    yes
5        sunny  warm     normal    weak    warm   same      no

(a) What is the entropy of the collection of training examples with respect to the target function classification?
(b) According to the 5 training examples, compute the decision tree learned by ID3, and show the decision tree. (log2 3 = 1.585, log2 5 = 2.322)
4 Give several approaches to avoid overfitting in decision tree learning. How to determine the correct final tree size?
5 Write the BACKPROPAGATION algorithm for a feedforward network containing two layers of sigmoid units.
6 Explain the Maximum a Posteriori (MAP) hypothesis.
7 Using the Naive Bayes Classifier, classify the new instance:
<Outlook=sunny, Temperature=cool, Humidity=high, Wind=strong>
Our task is to predict the target value (yes or no) of
the target concept PlayTennis for this new instance. The table below provides a set of 14 training examples of the target concept.

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

8 Question Eight: the definition of three types of fitness functions in genetic algorithms.
Question one
Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Give an example, such as a navigation device or checkers.)
Question two:
Initialize: G = <?, ?, ?, ?, ?, ?>, S = <∅, ∅, ∅, ∅, ∅, ∅>
Step 1:
G = <?, ?, ?, ?, ?, ?>
S = <sunny, warm, normal, strong, warm, same>
Step 2: coming one positive instance 2
G = <?, ?, ?, ?, ?, ?>
S = <sunny, warm, ?, strong, warm, same>
Step 3: coming one negative instance 3
G = <Sunny, ?, ?, ?, ?, ?>, <?, warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, same>
S = <sunny, warm, ?, strong, warm, same>
Step 4: coming one positive instance 4
S = <sunny, warm, ?, strong, ?, ?>
G = <Sunny, ?, ?, ?, ?, ?>, <?, warm, ?, ?, ?, ?>
Question three
(a) Entropy(S) = -(3/5)log2(3/5) - (2/5)log2(2/5) = 0.971
(b) Gain(S, Sky) = Entropy(S) - (4/5)Entropy(S_sunny) - (1/5)Entropy(S_rainy) = 0.322
Gain(S, AirTemp) = Gain(S, Wind) = Gain(S, Sky) = 0.322
Gain(S, Humidity) = Gain(S, Forecast) = 0.02
Gain(S, Water) = 0.171
Choose any one of AirTemp, Wind and Sky as the top node.
The decision tree is as follows (if Sky is chosen as the top node): Sky = rainy -> No; Sky = sunny -> test Wind: Wind = strong -> Yes, Wind = weak -> No. (The tree figure in the original is not recoverable; the branches above are read off the training examples.)
Question Four
Answer: Inductive bias: some prior assumption for a target concept made by the learner, to have a basis for classifying unseen instances.
Suppose L is a machine learning algorithm and Dc is a set of training examples. L(xi, Dc) denotes the classification assigned to xi by L after training on Dc. Then the inductive bias is a minimal set of assertions B such that, given an arbitrary target concept c and set of training examples Dc: (∀xi)[(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]
C_E: the target concept is contained in the given hypothesis space H.
ID3: a. smaller trees are preferred over larger trees; b. trees that place high-information-gain attributes close to the root are preferred over those that do not.
BP: smooth interpolation between data points.
Question Five
Answer: In naive Bayes classification, we assume that all attributes are conditionally independent given the target value, while a Bayesian belief network specifies a set of conditional independence assumptions along with a set of conditional probability distributions.
Question Six: the stochastic gradient descent algorithm
GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form <x, t>, where x is the vector of input values and t is the target output value. η is the learning rate (e.g., 0.05).
- Initialize each w_i to some small random value
- Until the termination condition is met, Do
  - Initialize each Δw_i to zero
  - For each <x, t> in training_examples, Do
    - Input the instance x to the unit and compute the output o
    - For each linear unit weight w_i, Do
      Δw_i <- Δw_i + η(t - o)x_i
  - For each linear unit weight w_i, Do
    w_i <- w_i + Δw_i
To obtain the stochastic approximation to gradient descent, the accumulator Δw_i is dropped and each weight is updated incrementally, w_i <- w_i + η(t - o)x_i, after every training example.