1 Give the definitions or your comprehensions of the following terms. (12')
1.1 The inductive learning hypothesis (P17)
1.2 Overfitting (P49)
1.4 Consistent learner (P148)
2 Give brief answers to the following questions. (15')
2.2 If the size of a version space is |VS|, in general what is the smallest number of queries that may be required by a concept learner using the optimal query strategy to perfectly learn the target concept? (P27)
2.3 In general, decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances. What expression does the following decision tree correspond to?
3 Give the explanation of inductive bias, and list the inductive bias of the CANDIDATE-ELIMINATION algorithm, decision tree learning (ID3), and the BACKPROPAGATION algorithm. (10')
4 How to solve overfitting in decision tree and neural network? (10')
Solution:
Decision tree: stop growing the tree early (pre-pruning); post-pruning.
Neural network: weight decay; using a validation set.
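The weight-decay remedy can be made concrete in a few lines. The following is a minimal sketch (not part of the original solution) of a delta-rule weight update with an added decay term; the learning rate eta and decay coefficient lam are illustrative values.

```python
# Minimal sketch: gradient-descent update with weight decay (illustrative values).
# Weight decay penalizes large weights, which helps a network avoid overfitting.
def update_weights(w, x, t, o, eta=0.05, lam=0.01):
    """One delta-rule step with an extra weight-decay term.

    w: list of weights, x: input vector, t: target output, o: actual output.
    """
    return [wi + eta * (t - o) * xi - eta * lam * wi  # decay shrinks each weight
            for wi, xi in zip(w, x)]

# Example: a single update on a toy linear unit.
w = [0.2, -0.4, 0.7]
x = [1.0, 0.5, -1.0]
o = sum(wi * xi for wi, xi in zip(w, x))
w = update_weights(w, x, t=1.0, o=o)
print(w)
```

Validation-set early stopping, the other remedy listed above, would instead monitor error on held-out data and stop training (or prune) when that error starts rising.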
5 Prove that the LMS weight update rule wi ← wi + η(Vtrain(b) − V̂(b))xi performs a gradient descent to minimize the squared error. In particular, define the squared error E as in the text. Now calculate the derivative of E with respect to the weight wi, assuming that V̂(b) is a linear function as defined in the text. Gradient descent is achieved by updating each weight in proportion to −∂E/∂wi. Therefore, you must show that the LMS training rule alters weights in this proportion for each training example it encounters. (E ≡ Σ over ⟨b, Vtrain(b)⟩ in the training examples of (Vtrain(b) − V̂(b))²) (8')
Solution:
As Vtrain(b) ← V̂(Successor(b)), we can get E = Σ (Vtrain(b) − V̂(b))², where V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6.
∂E/∂wi = 2(Vtrain(b) − V̂(b)) · ∂(Vtrain(b) − V̂(b))/∂wi = −2(Vtrain(b) − V̂(b))xi
As mentioned in LMS: wi ← wi + η(Vtrain(b) − V̂(b))xi, so Δwi = η(Vtrain(b) − V̂(b))xi = −(η/2) · ∂E/∂wi.
Therefore, gradient descent is achieved by updating each weight in proportion to −∂E/∂wi; the LMS rule alters weights in this proportion for each training example it encounters.
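As an illustration of the rule just derived, here is a minimal Python sketch (not part of the original exam) of the LMS update applied to the linear evaluation function V̂(b) = w0 + w1x1 + ... + w6x6; the feature values, training value, and learning rate are invented.

```python
# Minimal sketch of the LMS weight update for a linear evaluation function.
# The feature vector, target value and learning rate below are illustrative.
def v_hat(w, x):
    """Linear evaluation: w0 + w1*x1 + ... + w6*x6 (x is prefixed with 1 for w0)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def lms_update(w, x, v_train, eta=0.01):
    """One LMS step: wi <- wi + eta * (Vtrain(b) - Vhat(b)) * xi."""
    error = v_train - v_hat(w, x)
    return [wi + eta * error * xi for wi, xi in zip(w, x)]

w = [0.0] * 7                        # w0..w6
x = [1, 3, 3, 1, 0, 0, 0]            # 1 for the bias, then the six board features
v_train = 100.0                      # e.g. the training value of a won board
for _ in range(20):                  # each pass shrinks the squared error
    w = lms_update(w, x, v_train)
print(round(v_hat(w, x), 2))         # approaches 100 as training continues
```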
6 True or false: if decision tree D2 is an elaboration of tree D1, then D1 is more-general-than D2. Assume D1 and D2 are decision trees representing arbitrary boolean functions, and that D2 is an elaboration of D1 if ID3 could extend D1 to D2. If true give a proof; if false, a counterexample. (Definition: Let hj and hk be boolean-valued functions defined over X. Then hj is more_general_than_or_equal_to hk (written hj ≥g hk) if and only if (∀x ∈ X) [(hk(x) = 1) → (hj(x) = 1)]; and hj >g hk if and only if (hj ≥g hk) ∧ ¬(hk ≥g hj).) (10')
The hypothesis is false. One counterexample is A XOR B: if A ≠ B the training examples are all positive, while if A = B the training examples are all negative; then, using ID3 to extend D1, the new tree D2 will be equivalent to D1, i.e., D2 is equal to D1.
7 Design a two-input perceptron that implements the boolean function A ∧ ¬B. Design a two-layer network of perceptrons that implements A XOR B. (10')
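No worked answer is given for this question in the document; the following is a minimal sketch of one possible design, using threshold units with hand-picked weights (the specific weight values are just one choice among many that work).

```python
# Minimal sketch: threshold (perceptron) units with hand-picked weights.
# The particular weights/thresholds below are one of many valid choices.
def perceptron(w0, w1, w2, a, b):
    """Output 1 if w0 + w1*a + w2*b > 0, else 0."""
    return 1 if w0 + w1 * a + w2 * b > 0 else 0

def a_and_not_b(a, b):
    # A AND (NOT B): fires only for A=1, B=0.
    return perceptron(-0.5, 1.0, -1.0, a, b)

def xor(a, b):
    # Two-layer network: hidden units compute A AND (NOT B) and (NOT A) AND B,
    # and the output unit ORs them together.
    h1 = perceptron(-0.5, 1.0, -1.0, a, b)
    h2 = perceptron(-0.5, -1.0, 1.0, a, b)
    return perceptron(-0.5, 1.0, 1.0, h1, h2)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, a_and_not_b(a, b), xor(a, b))
```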
8 Suppose a hypothesis space contains three hypotheses, h1, h2, h3, and the posterior probabilities of these hypotheses given the training data are 0.4, 0.3 and 0.3 respectively. If a new instance x is encountered, which is classified positive by h1 but negative by h2 and h3, then give the result and the detailed classification course of the Bayes optimal classifier. (10') (P125)
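The document only points to P125 here; as a minimal sketch of the computation involved, the Bayes optimal classifier sums the posterior probabilities of the hypotheses voting for each class, using the numbers given in the question.

```python
# Minimal sketch of the Bayes optimal classifier for the setting in question 8.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}   # P(h_i | D) from the question
votes = {"h1": "+", "h2": "-", "h3": "-"}        # how each hypothesis classifies x

p_positive = sum(p for h, p in posteriors.items() if votes[h] == "+")  # 0.4
p_negative = sum(p for h, p in posteriors.items() if votes[h] == "-")  # 0.6
print("P(+|D) =", p_positive, " P(-|D) =", p_negative)
print("Bayes optimal classification:", "+" if p_positive > p_negative else "-")
```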
9 Suppose S is a collection of training-example days described by attributes including Humidity, which can have the values High or Normal. Assume S is a collection containing 10 examples, [7+, 3−]. Of these 10 examples, suppose 3 of the positive and 2 of the negative examples have Humidity = High, and the remainder have Humidity = Normal. Please calculate the information gain due to sorting the original 10 examples by the attribute Humidity. (log2 1=0, log2 2=1, log2 3=1.58, log2 4=2, log2 5=2.32, log2 6=2.58, log2 7=2.8, log2 8=3, log2 9=3.16, log2 10=3.32) (5')
Solution:
Here we denote S = [7+, 3−], then Entropy([7+, 3−]) = −(7/10)log2(7/10) − (3/10)log2(3/10) = 0.886.
Gain(S, Humidity) = Entropy(S) − Σ over v in Values(Humidity) of (|Sv|/|S|)·Entropy(Sv), where Values(Humidity) = {High, Normal} and SHigh = {s ∈ S | Humidity(s) = High}.
Entropy(SHigh) = −(3/5)log2(3/5) − (2/5)log2(2/5) = 0.972, |SHigh| = 5.
Entropy(SNormal) = −(4/5)log2(4/5) − (1/5)log2(1/5) = 0.72, |SNormal| = 5.
Thus Gain(S, Humidity) = 0.886 − (5/10)·0.972 − (5/10)·0.72 = 0.04.
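The same calculation can be checked with a few lines of Python; this is a verification sketch, not part of the original solution. Note that the solution's figures use the rounded log table given in the question, so the exact values differ slightly.

```python
# Verification sketch for question 9: entropy and information gain from counts.
from math import log2

def entropy(pos, neg):
    """Entropy of a collection with `pos` positive and `neg` negative examples."""
    total = pos + neg
    e = 0.0
    for c in (pos, neg):
        if c:
            p = c / total
            e -= p * log2(p)
    return e

e_s = entropy(7, 3)                       # whole collection S = [7+, 3-]
e_high = entropy(3, 2)                    # Humidity = High   -> [3+, 2-]
e_normal = entropy(4, 1)                  # Humidity = Normal -> [4+, 1-]
gain = e_s - (5 / 10) * e_high - (5 / 10) * e_normal
print(round(e_s, 3), round(e_high, 3), round(e_normal, 3), round(gain, 3))
# Exact logs give 0.881, 0.971, 0.722 and gain ~0.035; the solution's 0.886 and
# 0.04 come from the rounded log table supplied with the question.
```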
10 Finish the following algorithm. (10')
(1) GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form ⟨x, t⟩, where x is the vector of input values and t is the target output value. η is the learning rate (e.g., 0.05).
- Initialize each wi to some small random value
- Until the termination condition is met, Do
  - Initialize each Δwi to zero.
  - For each ⟨x, t⟩ in training_examples, Do
    - Input the instance x to the unit and compute the output o
    - For each linear unit weight wi, Do ________
  - For each linear unit weight wi, Do ________
(2) FIND-S Algorithm
- Initialize h to the most specific hypothesis in H
- For each positive training instance x
  - For each attribute constraint ai in h
    - If ________ Then do nothing
    - Else replace ai in h by the next more general constraint that is satisfied by x
- Output hypothesis h
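As a reference for how the completed FIND-S fits together, here is a minimal Python sketch (one possible completion, not the official answer key), using "0" for the maximally specific empty constraint and "?" for the fully general one.

```python
# Minimal FIND-S sketch over attribute-value hypotheses.
# "0" stands for the empty (most specific) constraint, "?" for "any value".
def find_s(examples, n_attributes):
    h = ["0"] * n_attributes                 # most specific hypothesis in H
    for x, label in examples:
        if label != "+":                     # FIND-S ignores negative examples
            continue
        for i, ai in enumerate(h):
            if ai == x[i]:                   # constraint already satisfied by x
                continue                     # ... then do nothing
            # otherwise replace ai by the next more general constraint
            h[i] = x[i] if ai == "0" else "?"
    return h

# EnjoySport-style toy data (illustrative).
examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "+"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), "+"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "-"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "+"),
]
print(find_s(examples, 6))   # -> ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```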
1. What is the definition of a learning problem? (5) Use "a checkers learning problem" as an example to state how to design a learning system. (15)
Answer: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Example: A checkers learning problem: T: play checkers (1); P: percentage of games won (1); E: opportunity to play in a tournament against itself (1).
To design a learning system:
Step 1: Choosing the Training Experience
A checkers learning problem: Task T: playing checkers. Performance measure P: percent of games won in the world tournament. Training experience E: games played against itself.
In order to complete the design of the learning system, we must now choose: 1. the exact type of knowledge to be learned; 2. a representation for this target knowledge; 3. a learning mechanism.
Step 2: Choosing the Target Function
1. if b is a final board state that is won, then V(b) = 100
2. if b is a final board state that is lost, then V(b) = −100
3. if b is a final board state that is drawn, then V(b) = 0
4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game (assuming the opponent plays optimally, as well).
Step 3: Choosing a Representation for the Target Function (4)
x1: the number of black pieces on the board
x2: the number of red pieces on the board
x3: the number of black kings on the board
x4: the number of red kings on the board
x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn)
x6: the number of red pieces threatened by black
Thus, our learning program will represent V̂(b) as a linear function of the form V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6, where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm. Learned values for the weights w1 through w6 will determine the relative importance of the various board features in determining the value of the board, whereas the weight w0 will provide an additive constant to the board value.
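A minimal sketch of this representation in Python (illustrative only; the feature values and weights below are invented):

```python
# Minimal sketch of the linear board-evaluation function V_hat(b).
# The board is summarized by the six features x1..x6 described above.
def v_hat(weights, features):
    """weights = (w0, w1, ..., w6); features = (x1, ..., x6)."""
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, features))

weights = (0.0, 2.0, -2.0, 4.0, -4.0, -1.0, 1.0)   # invented example weights
board_features = (12, 12, 0, 0, 1, 2)              # x1..x6 for some opening board
print(v_hat(weights, board_features))              # -> 1.0
```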
2. Answer: Find-S & Find-G:
Step 1: Initialize S to the most specific hypothesis in H. (1) S0: ⟨∅, ∅, ∅, ∅, ∅, ∅⟩. Initialize G to the most general hypothesis in H. G0: ⟨?, ?, ?, ?, ?, ?⟩.
Step 2: The first example is <Sunny, Warm, Normal, Strong, Warm, Same, +>. (3) S1: ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩; G1: ⟨?, ?, ?, ?, ?, ?⟩.
Step 3: The second example is <Sunny, Warm, High, Strong, Warm, Same, +>. (3) S2: ⟨Sunny, Warm, ?, Strong, Warm, Same⟩; G2: ⟨?, ?, ?, ?, ?, ?⟩.
Step 4: The third example is <Rainy, Cold, High, Strong, Warm, Change, −>. S3: ⟨Sunny, Warm, ?, Strong, Warm, Same⟩; G3: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}.
Step 5: The fourth example is <Sunny, Warm, High, Strong, Cool, Change, +>. S4: ⟨Sunny, Warm, ?, Strong, ?, ?⟩; G4: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}.
Finally, all the hypotheses are: (2) <Sunny, Warm, ?, Strong, ?, ?>, <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?>, <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>.
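The six hypotheses listed above can be checked mechanically; below is a small verification sketch (not from the original answer) that tests each one for consistency with the four training examples.

```python
# Verification sketch: each listed hypothesis is consistent with all four examples.
def matches(h, x):
    """A hypothesis matches an instance if every constraint is '?' or equal."""
    return all(hc == "?" or hc == xc for hc, xc in zip(h, x))

examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]

version_space = [
    ("Sunny", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?",    "?", "Strong", "?", "?"),
    ("Sunny", "Warm", "?", "?",      "?", "?"),
    ("?",     "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?",    "?", "?",      "?", "?"),
    ("?",     "Warm", "?", "?",      "?", "?"),
]

for h in version_space:
    consistent = all(matches(h, x) == label for x, label in examples)
    print(h, consistent)   # every line should print True
```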
3. Answer: Let flog(X) = −X·log2(X) − (1−X)·log2(1−X).
Step 1: choose the root node:
entropy_all = flog(4/10) = 0.971
gain_outlook = entropy_all − 0.3·flog(1/3) − 0.3·flog(1) − 0.4·flog(1/2) = 0.296 (1)
gain_temperature = entropy_all − 0.3·flog(1/3) − 0.3·flog(1/3) − 0.4·flog(1/2) = 0.02 (1)
gain_humidity = entropy_all − 0.5·flog(2/5) − 0.5·flog(1/5) = 0.125 (1)
gain_wind = entropy_all − 0.6·flog(5/6) − 0.4·flog(1/4) = 0.256 (1)
The root node is "Outlook".
Step 2: choose the second node. For the Sunny branch (Humidity or Temperature):
entropy_sunny = flog(1/3) = 0.918 (1)
sunny_gain_wind = entropy_sunny − (2/3)·flog(1/2) − (1/3)·flog(1) = 0.252 (1)
sunny_gain_humidity = entropy_sunny − (2/3)·flog(1) − (1/3)·flog(1) = 0.918 (1)
sunny_gain_temperature = entropy_sunny − (2/3)·flog(1) − (1/3)·flog(1) = 0.918 (1)
Choose Humidity or Temperature. (1)
For the Rain branch (Wind):
entropy_rain = flog(1/2) = 1
rain_gain_wind = entropy_rain − (1/2)·flog(1) − (1/2)·flog(1) = 1 (1)
rain_gain_humidity = entropy_rain − (1/2)·flog(1/2) − (1/2)·flog(1/2) = 0 (1)
rain_gain_temperature = entropy_rain − (1/4)·flog(1) − (3/4)·flog(1/3) = 0.311 (1)
Choose Wind. (1)
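The procedure applied above generalizes to any small attribute-value dataset. The following is a compact ID3 sketch in Python (a generic illustration, not the graded answer; the (attribute-dict, label) data format and the tiny dataset are assumptions for the example).

```python
# Compact ID3 sketch: recursively pick the attribute with the highest
# information gain, split on it, and recurse until a node is pure.
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(examples, attr):
    labels = [y for _, y in examples]
    total = entropy(labels)
    for value in {x[attr] for x, _ in examples}:
        subset = [y for x, y in examples if x[attr] == value]
        total -= (len(subset) / len(examples)) * entropy(subset)
    return total

def id3(examples, attributes):
    labels = [y for _, y in examples]
    if len(set(labels)) == 1:                      # pure node -> leaf
        return labels[0]
    if not attributes:                             # no attributes left -> majority
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: info_gain(examples, a))
    tree = {}
    for value in {x[best] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[best] == value]
        tree[(best, value)] = id3(subset, [a for a in attributes if a != best])
    return tree

# Tiny illustrative dataset (not the one from the exam).
data = [
    ({"Outlook": "Sunny", "Wind": "Weak"},   "No"),
    ({"Outlook": "Sunny", "Wind": "Strong"}, "No"),
    ({"Outlook": "Rain",  "Wind": "Weak"},   "Yes"),
    ({"Outlook": "Rain",  "Wind": "Strong"}, "No"),
]
print(id3(data, ["Outlook", "Wind"]))
```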
4. Answer:
A: The primitive neural units are: perceptron, linear unit and sigmoid unit. (3)
Perceptron: (2) A perceptron takes a vector of real-valued inputs, calculates a linear combination of these inputs, then outputs 1 if the result is greater than some threshold and −1 otherwise. More precisely, given inputs x1 through xn, the output o(x1, ..., xn) computed by the perceptron is 1 if w0 + w1x1 + ... + wnxn > 0 and −1 otherwise. Sometimes we write the perceptron function as o(x) = sgn(w·x).
Linear unit: (2) a linear unit is a unit for which the output o is given simply by o = w·x. Thus, a linear unit corresponds to the first stage of a perceptron, without the threshold.
Sigmoid unit: (2) Like the perceptron, the sigmoid unit first computes a linear combination of its inputs, then applies a threshold to the result. In the case of the sigmoid unit, however, the threshold output is a continuous function of its input. More precisely, the sigmoid unit computes its output o as o = σ(net), where net = w·x and σ(net) = 1/(1 + e^(−net)).
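A minimal sketch comparing the three units on the same input (the weights and input values are illustrative):

```python
# Minimal sketch: perceptron, linear unit, and sigmoid unit on the same input.
from math import exp

def net(w, x):                       # linear combination w0 + w1*x1 + ... + wn*xn
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def perceptron(w, x):                # thresholded output in {-1, +1}
    return 1 if net(w, x) > 0 else -1

def linear_unit(w, x):               # un-thresholded output (first stage only)
    return net(w, x)

def sigmoid_unit(w, x):              # smooth, differentiable squashing of net
    return 1.0 / (1.0 + exp(-net(w, x)))

w = [-0.5, 1.0, 2.0]                 # w0, w1, w2 (illustrative)
x = [0.3, 0.4]
print(perceptron(w, x), linear_unit(w, x), sigmoid_unit(w, x))
```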
B: (The question contains a printing error, so either the perceptron training rule or the delta rule is acceptable; the delta rule is given here.)
The derivation is:
∂E/∂wi = ∂/∂wi [ (1/2) Σd (td − od)² ]
= (1/2) Σd 2(td − od) · ∂(td − od)/∂wi
= Σd (td − od) · ∂(td − w·xd)/∂wi
= Σd (td − od)(−xid)
Hence Δwi = −η·∂E/∂wi = η Σd (td − od)xid.
5. Answer:
P(no) = 5/14 (1); P(sunny|no) = 3/5 (1); P(cool|no) = 1/5 (1); P(high|no) = 4/5 (1); P(strong|no) = 3/5 (1)
P(no|new instance) ∝ P(no)·P(sunny|no)·P(cool|no)·P(high|no)·P(strong|no) = 5/14 · 3/5 · 1/5 · 4/5 · 3/5 = 0.02057 = 2.057×10⁻² (2)
P(yes) = 9/14 (1); P(sunny|yes) = 2/9 (1); P(cool|yes) = 3/9 (1); P(high|yes) = 3/9 (1); P(strong|yes) = 3/9 (1)
P(yes|new instance) ∝ P(yes)·P(sunny|yes)·P(cool|yes)·P(high|yes)·P(strong|yes) = 9/14 · 2/9 · 3/9 · 3/9 · 3/9 = 0.005291 = 5.291×10⁻³ (2)
Since 2.057×10⁻² > 5.291×10⁻³, the ANSWER is NO.
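The arithmetic above can be reproduced directly from the counts in the 14-example PlayTennis table given later in this document; the following is a small verification sketch, not part of the original answer.

```python
# Verification sketch: naive Bayes scores for
# <Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong>.
from fractions import Fraction as F

# Counts taken from the 14-example PlayTennis table (9 yes, 5 no).
priors = {"yes": F(9, 14), "no": F(5, 14)}
conditionals = {
    "yes": {"Sunny": F(2, 9), "Cool": F(3, 9), "High": F(3, 9), "Strong": F(3, 9)},
    "no":  {"Sunny": F(3, 5), "Cool": F(1, 5), "High": F(4, 5), "Strong": F(3, 5)},
}

scores = {}
for label in priors:
    score = priors[label]
    for value in ("Sunny", "Cool", "High", "Strong"):
        score *= conditionals[label][value]
    scores[label] = score

for label, score in scores.items():
    print(label, float(score))                       # no ~ 0.0206, yes ~ 0.0053
print("prediction:", max(scores, key=scores.get))    # -> no
```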
6. Answer:
INDUCTIVE BIAS: (8) Consider a concept learning algorithm L for the set of instances X. Let c be an arbitrary concept defined over X, and let Dc = {⟨x, c(x)⟩} be an arbitrary set of training examples of c. Let L(xi, Dc) denote the classification assigned to the instance xi by L after training on the data Dc. The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples Dc: (∀xi ∈ X) [(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)].
The futility of bias-free learning: (7) A learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances. In fact, the only reason that the learner was able to generalize beyond the observed training examples is that it was biased by the inductive bias. Unfortunately, the only instances that will produce a unanimous vote are the previously observed training examples. For all the other instances, taking a vote will be futile: each unobserved instance will be classified positive by precisely half the hypotheses in the version space and negative by the other half.
1 In the EnjoySport learning task, every example day is represented by 6 attributes. Given that attribute Sky has three possible values, and that AirTemp, Humidity, Wind, Water and Forecast each have two possible values, explain why the size of the hypothesis space is 973. How would the number of possible instances and possible hypotheses increase with the addition of one attribute A that takes on K possible values?
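No answer is given in the document for this question; one way to see the count (my own illustration of the standard argument) is that with Sky taking 3 values and the other five attributes 2 values each, there are 3·2^5 = 96 distinct instances, and 1 + 4·3^5 = 973 semantically distinct hypotheses, since each attribute contributes its values plus "?", and all hypotheses containing ∅ collapse into a single empty hypothesis.

```python
# Counting sketch for the EnjoySport instance and hypothesis spaces.
attribute_values = [3, 2, 2, 2, 2, 2]          # Sky has 3 values, the rest 2

instances = 1
hypotheses = 1
for v in attribute_values:
    instances *= v                              # one choice of value per attribute
    hypotheses *= v + 1                         # attribute values plus the '?' wildcard
hypotheses += 1                                 # plus the single "empty" hypothesis
print(instances, hypotheses)                    # -> 96 973

# Adding one attribute A with K values scales the counts as follows
# (K = 4 is an arbitrary illustration):
K = 4
print(instances * K, (hypotheses - 1) * (K + 1) + 1)
```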
42、lues?2 Write the algorithm of Can didate_Elim in ati on using vers ion space. Assume Gis the set of maximally gen eral hopytheses in hypothesis space H, and S is the set of maximally specific hopytheses.3 Con sider the followi ng set of trai ning examples for EnjoySport:ExampleSkyAirTempHumidityWind
43、WaterForcastEnjoySp ort1sunnywarmno rmalstro ngwarmsameYes2sunnywarmhighstro ngwarmsameyes3rai nycoldhighstro ngwarmchaggeno4sunnywarmhighstro ngcoolchangeyes5sunnywarmno rmalweakwarmsameno(a) Whatis theEntropy of thecollectio ntrai ningexamples withrespect to thetarget function classificati on?(b)
44、According to the 5 traning examples, compute the decision tree that be learned by ID3, and show the decisi on tree.(Iog23=1.585, Iog25=2.322)4 Give several approaches to avoid overfitti ng in decisi on tree lear ning. How todetermin the correct final tree size?5 Write the BackPropagatio n algorithm
45、for feedforward n etwork containing twolayers of sigmoid un its.6 Explai n the Maximum a posteriori(MAP) hypothesis.7 Usi ng Naive Byes Classifier to classify the new in sta nee:<Outlook=s unn y,Temperature=cool,Humidity=high,Wi nd=stro ng>Our task is to predict the target value (yes or no) of
The table below provides a set of 14 training examples of the target concept.

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

8 Question Eight: The definition of three types of fitness functions in genetic algorithms.
Question one
Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Give an example, such as a navigation system or checkers.)
Question two:
Initialize: G = {⟨?, ?, ?, ?, ?, ?⟩}, S = {⟨∅, ∅, ∅, ∅, ∅, ∅⟩}
Step 1: G = {⟨?, ?, ?, ?, ?, ?⟩}; S = {⟨sunny, warm, normal, strong, warm, same⟩}
Step 2: on the positive instance 2: G = {⟨?, ?, ?, ?, ?, ?⟩}; S = {⟨sunny, warm, ?, strong, warm, same⟩}
Step 3: on the negative instance 3: G = {<Sunny, ?, ?, ?, ?, ?>, <?, warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, same>}; S = {⟨sunny, warm, ?, strong, warm, same⟩}
Step 4: on the positive instance 4: S = {⟨sunny, warm, ?, strong, ?, ?⟩}; G = {<Sunny, ?, ?, ?, ?, ?>, <?, warm, ?, ?, ?, ?>}
Question three
(a) Entropy(S) = −(3/5)log2(3/5) − (2/5)log2(2/5) = 0.971
(b) Gain(S, Sky) = Entropy(S) − [(4/5)·Entropy(Ssunny) + (1/5)·Entropy(Srainy)] = 0.322
Gain(S, AirTemp) = Gain(S, Wind) = Gain(S, Sky) = 0.322
Gain(S, Humidity) = Gain(S, Forecast) = 0.02
Gain(S, Water) = 0.171
Choose any one of AirTemp, Wind, or Sky as the top node.
The decision tree is as follows (if Sky is chosen as the top node): [Decision tree figure: Sky at the root; the Rainy branch is a "No" leaf; the Sunny branch tests Wind, with Strong → Yes and Weak → No.]
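These gains can be re-derived from the five-example table in question 3 above; the following is a verification sketch, not part of the original answer.

```python
# Verification sketch: information gains for the 5-example EnjoySport table.
from math import log2
from collections import Counter

examples = [  # Sky, AirTemp, Humidity, Wind, Water, Forecast, EnjoySport
    ("sunny", "warm", "normal", "strong", "warm", "same",   "yes"),
    ("sunny", "warm", "high",   "strong", "warm", "same",   "yes"),
    ("rainy", "cold", "high",   "strong", "warm", "change", "no"),
    ("sunny", "warm", "high",   "strong", "cool", "change", "yes"),
    ("sunny", "warm", "normal", "weak",   "warm", "same",   "no"),
]
attributes = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast"]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

labels = [e[-1] for e in examples]
for i, name in enumerate(attributes):
    gain = entropy(labels)
    for value in {e[i] for e in examples}:
        subset = [e[-1] for e in examples if e[i] == value]
        gain -= (len(subset) / len(examples)) * entropy(subset)
    print(name, round(gain, 3))
# Sky, AirTemp and Wind come out at 0.322; Humidity and Forecast at 0.02; Water at 0.171.
```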
Question Four
Answer: Inductive bias: a prior assumption about the target concept made by the learner, so that it has a basis for classifying unseen instances.
Suppose L is a machine learning algorithm and Dc is a set of training examples. L(xi, Dc) denotes the classification assigned to xi by L after training on Dc. Then the inductive bias is a minimal set of assertions B such that, for an arbitrary target concept c and set of training examples Dc: (∀xi ∈ X) [(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)].
Candidate-Elimination: the target concept c is contained in the given hypothesis space H.
ID3: (a) smaller trees are preferred over larger trees; (b) trees that place high-information-gain attributes close to the root are preferred over those that do not.
BACKPROPAGATION: smooth interpolation between data points.
Question Five
Answer: In naive Bayes classification, we assume that all attributes are independent given the target value, while a Bayesian belief network specifies a set of conditional independence assumptions along with a set of conditional probability distributions.
Question Six: Stochastic gradient descent algorithm
GRADIENT-DESCENT(training_examples, η)
Each training example is a pair of the form ⟨x, t⟩, where x is the vector of input values and t is the target output value. η is the learning rate (e.g., 0.05).
- Initialize each wi to some small random value
- Until the termination condition is met, Do
  - Initialize each Δwi to zero.
  - For each ⟨x, t⟩ in training_examples, Do …
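The document breaks off at this point. As an illustration of the stochastic (incremental) variant named in the heading, here is a minimal sketch in which the weights are updated after every training example rather than once per pass; the toy data and hyperparameters are invented, and this is not the document's own text.

```python
# Minimal sketch: stochastic (incremental) gradient descent for a linear unit.
# Weights are updated after each training example (delta rule), rather than
# accumulating the gradient over the whole training set before updating.
import random

def train_stochastic(training_examples, eta=0.05, epochs=500):
    n = len(training_examples[0][0])
    w = [random.uniform(-0.05, 0.05) for _ in range(n + 1)]   # w0..wn
    for _ in range(epochs):
        for x, t in training_examples:
            o = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            w[0] += eta * (t - o)                              # bias weight
            for i, xi in enumerate(x):
                w[i + 1] += eta * (t - o) * xi                 # wi <- wi + eta*(t-o)*xi
    return w

# Toy data: targets generated by t = 1 + 2*x1 - 3*x2 (illustrative).
data = [((x1, x2), 1 + 2 * x1 - 3 * x2) for x1 in (0.0, 0.5, 1.0) for x2 in (0.0, 1.0)]
print([round(wi, 2) for wi in train_stochastic(data)])         # approximately [1.0, 2.0, -3.0]
```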