Statistical Computing Exam Problems
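Problem 4.2

Epidemiologists are interested in studying the sexual behavior of individuals at risk for HIV infection. Suppose 1500 gay men were surveyed and each was asked how many risky sexual encounters he had in the previous 30 days. Let $n_i$ denote the number of respondents reporting $i$ encounters, for $i = 0, \ldots, 16$. Table 4.2 summarizes the responses.

These data are poorly fitted by a Poisson model. It is more realistic to assume that the respondents comprise three groups. First, there is a group of people who, for whatever reason, report zero risky encounters even if this is not true. Suppose a respondent has probability $\alpha$ of belonging to this group. With probability $\beta$, a respondent belongs to a second group representing typical behavior. Such people respond truthfully, and their numbers of risky encounters are assumed to follow a Poisson($\mu$) distribution. Finally, with probability $1 - \alpha - \beta$, a respondent belongs to a high-risk group. Such people respond truthfully, and their numbers of risky encounters are assumed to follow a Poisson($\lambda$) distribution.

The parameters in the model are $\alpha$, $\beta$, $\mu$, and $\lambda$. At the $t$th iteration of EM, we use $\theta^{(t)} = (\alpha^{(t)}, \beta^{(t)}, \mu^{(t)}, \lambda^{(t)})$ to denote the current parameter values. The likelihood of the observed data is given by

$$L(\theta \mid n_0, \ldots, n_{16}) \propto \prod_{i=0}^{16} \left[ \frac{\pi_i(\theta)}{i!} \right]^{n_i},$$

where

$$\pi_i(\theta) = \alpha 1_{\{i=0\}} + \beta \mu^i \exp\{-\mu\} + (1 - \alpha - \beta) \lambda^i \exp\{-\lambda\}$$

for $i = 0, \ldots, 16$.

The observed data are $n_0, \ldots, n_{16}$. The complete data may be construed to be $n_{z,0}$, $n_{t,0}, \ldots, n_{t,16}$, and $n_{p,0}, \ldots, n_{p,16}$, where $n_{k,i}$ denotes the number of respondents in group $k$ reporting $i$ risky encounters and $k = z$, $t$, and $p$ correspond to the zero, typical, and promiscuous groups, respectively. Thus, $n_0 = n_{z,0} + n_{t,0} + n_{p,0}$ and $n_i = n_{t,i} + n_{p,i}$ for $i = 1, \ldots, 16$. Let $N = \sum_{i=0}^{16} n_i = 1500$.

Define

$$z_0(\theta) = \frac{\alpha}{\pi_0(\theta)}, \qquad t_i(\theta) = \frac{\beta \mu^i \exp\{-\mu\}}{\pi_i(\theta)}, \qquad p_i(\theta) = \frac{(1 - \alpha - \beta) \lambda^i \exp\{-\lambda\}}{\pi_i(\theta)}$$

for $i = 0, \ldots, 16$. These correspond to probabilities that respondents with $i$ risky encounters belong to the various groups.

a. Show that the EM algorithm provides the following updates:

$$\alpha^{(t+1)} = \frac{n_0 z_0(\theta^{(t)})}{N}, \qquad \beta^{(t+1)} = \frac{\sum_{i=0}^{16} n_i t_i(\theta^{(t)})}{N},$$

$$\mu^{(t+1)} = \frac{\sum_{i=0}^{16} i\, n_i t_i(\theta^{(t)})}{\sum_{i=0}^{16} n_i t_i(\theta^{(t)})}, \qquad \lambda^{(t+1)} = \frac{\sum_{i=0}^{16} i\, n_i p_i(\theta^{(t)})}{\sum_{i=0}^{16} n_i p_i(\theta^{(t)})}.$$

b. Estimate the parameters of the model, using the observed data.

c. Estimate the standard errors and pairwise correlations of your parameter estimates, using any available method.

Solution:

(1) Group-membership probabilities. Let $z_{nk} = 1$ if respondent $n$ belongs to group $k$ and $z_{nk} = 0$ otherwise, and let $x_n$ be the number of risky encounters that respondent reports. Write $(w_1, w_2, w_3) = (\alpha, \beta, 1 - \alpha - \beta)$ for the mixing weights and $f_k$ for the probability mass function of group $k$ (a point mass at zero, Poisson($\mu$), and Poisson($\lambda$)). By Bayes' rule,

$$E(z_{nk} \mid x_n, \theta) = 1 \cdot P(z_{nk} = 1 \mid x_n, \theta) + 0 \cdot P(z_{nk} = 0 \mid x_n, \theta) = \frac{P(x_n \mid z_{nk} = 1)\, P(z_{nk} = 1)}{p(x_n)} = \frac{w_k f_k(x_n \mid \theta)}{\sum_{j=1}^{3} w_j f_j(x_n \mid \theta)}.$$

Since $\sum_{j} w_j f_j(i \mid \theta) = \pi_i(\theta) / i!$, the factor $i!$ cancels and the three posterior probabilities for a respondent reporting $i$ encounters are

$$z_0(\theta) = \frac{\alpha}{\pi_0(\theta)}, \qquad t_i(\theta) = \frac{\beta \mu^i \exp\{-\mu\}}{\pi_i(\theta)}, \qquad p_i(\theta) = \frac{(1 - \alpha - \beta) \lambda^i \exp\{-\lambda\}}{\pi_i(\theta)},$$

which is the stated interpretation.

(2) Derivation of the EM updates: the Q function of the E-step. Taking the expectation of the complete-data log-likelihood given the observed data and $\theta^{(t)}$,

$$Q(\theta \mid \theta^{(t)}) = E\left[ \ln p(x, z \mid \theta) \mid x, \theta^{(t)} \right] = \sum_{n=1}^{N} \sum_{k=1}^{3} E(z_{nk} \mid x_n, \theta^{(t)}) \left[ \ln w_k + \ln f_k(x_n \mid \theta) \right].$$

Grouping respondents by the reported count $i$ gives

$$Q(\theta \mid \theta^{(t)}) = n_0 z_0(\theta^{(t)}) \ln \alpha + \sum_{i=0}^{16} n_i t_i(\theta^{(t)}) \ln \beta + \sum_{i=0}^{16} n_i p_i(\theta^{(t)}) \ln(1 - \alpha - \beta) + \sum_{i=0}^{16} n_i t_i(\theta^{(t)}) \ln \frac{\mu^i e^{-\mu}}{i!} + \sum_{i=0}^{16} n_i p_i(\theta^{(t)}) \ln \frac{\lambda^i e^{-\lambda}}{i!}.$$

(3) Maximizing Q in the M-step.

(i) Q must be maximized subject to the constraint $\sum_{k=1}^{3} w_k = 1$. The method of Lagrange multipliers gives

$$w_k^{(t+1)} = \frac{1}{N} \sum_{n=1}^{N} E(z_{nk} \mid x_n, \theta^{(t)}),$$

so that

$$\alpha^{(t+1)} = \frac{n_0 z_0(\theta^{(t)})}{N}, \qquad \beta^{(t+1)} = \frac{\sum_{i=0}^{16} n_i t_i(\theta^{(t)})}{N}.$$

(ii) For $\mu$ and $\lambda$, set the partial derivatives of Q to zero:

$$\frac{\partial Q}{\partial \mu} = \sum_{i=0}^{16} n_i t_i(\theta^{(t)}) \left( \frac{i}{\mu} - 1 \right) = 0, \qquad \frac{\partial Q}{\partial \lambda} = \sum_{i=0}^{16} n_i p_i(\theta^{(t)}) \left( \frac{i}{\lambda} - 1 \right) = 0.$$

Solving gives

$$\mu^{(t+1)} = \frac{\sum_{i=0}^{16} i\, n_i t_i(\theta^{(t)})}{\sum_{i=0}^{16} n_i t_i(\theta^{(t)})}, \qquad \lambda^{(t+1)} = \frac{\sum_{i=0}^{16} i\, n_i p_i(\theta^{(t)})}{\sum_{i=0}^{16} n_i p_i(\theta^{(t)})},$$

which completes the proof.

Algorithm:

(1) Initialize the parameters of the mixture model as $\theta^{(0)} = (\alpha^{(0)}, \beta^{(0)}, \mu^{(0)}, \lambda^{(0)})$.

(2) E-step: given the observed counts $n_0, \ldots, n_{16}$, compute the expectation of the complete-data log-likelihood $\ln p(x, z \mid \theta)$ with respect to the unobserved group labels $z$. This is the function $Q(\theta \mid \theta^{(t)})$ derived in (2) above.

(3) M-step: maximize $Q(\theta \mid \theta^{(t)})$ over $\theta$ to obtain $\theta^{(t+1)}$, using the four updates in part a. Repeat the E-step and M-step until the estimates converge.

Problem 7.2

Simulating from the mixture distribution in Equation (7.6) is straightforward [see part (a) of Problem 7.1]. However, using the Metropolis-Hastings algorithm to simulate realizations from this distribution is useful for exploring the role of the proposal distribution.

a. Implement a Metropolis-Hastings algorithm to simulate from Equation (7.6) with $\delta = 0.7$, using $N(x^{(t)}, 0.01^2)$ as the proposal distribution. For each of three starting values, $x^{(0)} = 0$, 7, and 15, run the chain for 10,000 iterations. Plot the sample path of the output from each chain. If only one of the sample paths was available, what would you conclude about the chain? For each of the simulations, create a histogram of the realizations with the true density superimposed on the histogram. Based on your output from all three chains, what can you say about the behavior of the chain?

b. Now change the proposal distribution to improve the convergence properties of the chain. Using the new proposal distribution, repeat part (a).

Algorithm:

1. Generate 100 simulated observations $y = (y_1, y_2, \ldots, y_n)$ from the two normal populations, drawing from them with probabilities 0.7 and 0.3 respectively.

2. Choose the proposal distribution $g(\cdot) = U(0, 1)$ and draw a candidate value $\delta^*$ from $g(\cdot \mid \delta^{(t)})$.

3. Compute the Metropolis-Hastings ratio $R(\delta^{(t)}, \delta^*)$. In practice, for Bayesian inference with this proposal, the ratio is the likelihood ratio

$$R(\delta^{(t)}, \delta^*) = \frac{L(\delta^* \mid y)}{L(\delta^{(t)} \mid y)} = \frac{\prod_{i=1}^{n} \left[ \delta^* \exp\left\{ -\frac{(y_i - 7)^2}{2 \cdot 0.5^2} \right\} + (1 - \delta^*) \exp\left\{ -\frac{(y_i - 10)^2}{2 \cdot 0.5^2} \right\} \right]}{\prod_{i=1}^{n} \left[ \delta^{(t)} \exp\left\{ -\frac{(y_i - 7)^2}{2 \cdot 0.5^2} \right\} + (1 - \delta^{(t)}) \exp\left\{ -\frac{(y_i - 10)^2}{2 \cdot 0.5^2} \right\} \right]},$$

where the two components are the $N(7, 0.5^2)$ and $N(10, 0.5^2)$ densities of Equation (7.6).

4. Accept $\delta^*$ with probability $\min\{R, 1\}$. If it is accepted, set $\delta^{(t+1)} = \delta^*$; otherwise set $\delta^{(t+1)} = \delta^{(t)}$.

5. Increment $t$ and repeat the steps above until convergence.

Problem 7.5

A clinical trial was conducted to determine whether a hormone treatment benefits women who were treated previously for breast cancer. Each subject entered the clinical trial when she had a recurrence. She was then treated by irradiation and assigned to either a hormone therapy group or a control group. The observation of interest is the time until a second recurrence, which may be assumed to follow an exponential distribution with parameter $\tau\theta$ (hormone therapy group) or $\theta$ (control group). Many of the women did not have a second recurrence before the clinical trial was concluded, so that their recurrence times are censored.

In Table 7.2, a censoring time $M$ means that a woman was observed for $M$ months and did not have a recurrence during that time period, so that her recurrence time is known to exceed $M$ months. For example, 15 women who received the hormone treatment suffered recurrences, and the total of their recurrence times is 280 months.

Let $y_i^H = (x_i^H, \delta_i^H)$ be the data for the $i$th person in the hormone group, where $x_i^H$ is the time and $\delta_i^H$ equals 1 if $x_i^H$ is a recurrence time and 0 if a censored time. The data for the control group can be written similarly.

The likelihood is then

$$L(\theta, \tau \mid y) \propto \theta^{\sum \delta_i^C + \sum \delta_i^H}\, \tau^{\sum \delta_i^H} \exp\left\{ -\theta \sum x_i^C - \tau\theta \sum x_i^H \right\}.$$

You've been hired by the drug company to analyze their data. They want to know if the hormone treatment works, so the task is to find the marginal posterior distribution of $\tau$ using the Gibbs sampler. In a Bayesian analysis of these data, use the conjugate prior

$$f(\theta, \tau) \propto \theta^a \tau^b \exp\{-c\theta - d\theta\tau\}.$$

Physicians who have worked extensively with this hormone treatment have indicated that reasonable values for the hyperparameters are $(a, b, c, d) = (3, 1, 60, 120)$.

a. Summarize and plot the data as appropriate.

b. Derive the conditional distributions necessary to implement the Gibbs sampler.

c. Program and run your Gibbs sampler. Use a suite of convergence diagnostics to evaluate the convergence and mixing of your sampler. Interpret the diagnostics.

d. Compute summary statistics of the estimated joint posterior distribution, including marginal means, standard deviations, and 95% probability intervals for each parameter. Make a table of these results.

e. Create a graph which shows the prior and estimated posterior distribution for $\tau$ superimposed on the same scale.

f. Interpret your results for the drug company. Specifically, what does your estimate of $\tau$ mean for the clinical trial? Are the recurrence times for the hormone group significantly different from those for the control group?

g. A common criticism of Bayesian analyses is that the results are highly dependent on the priors. Investigate this issue by repeating your Gibbs sampler for values of the hyperparameters that are half and double the original hyperparameter values. Provide a table of summary statistics to compare your results. This is called a sensitivity analysis. Based on your results, what recommendations do you have for the drug company regarding the sensitivity of your results to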
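For Problem 4.2, the four EM updates of part (a) can be sketched in Python roughly as follows. This is a minimal illustration and not the author's program: the function names are made up for this sketch, and because Table 4.2 is not reproduced here, the counts are HYPOTHETICAL expected counts computed from invented parameter values. The starting values are arbitrary as well.

```python
import math

N_MAX = 16  # largest reported number of encounters


def group_terms(theta, i):
    """Return the zero, typical and high-risk terms whose sum is pi_i(theta)."""
    alpha, beta, mu, lam = theta
    zero = alpha if i == 0 else 0.0
    typical = beta * mu**i * math.exp(-mu)
    high = (1.0 - alpha - beta) * lam**i * math.exp(-lam)
    return zero, typical, high


def log_likelihood(theta, counts):
    """Observed-data log-likelihood, up to an additive constant."""
    return sum(n * (math.log(sum(group_terms(theta, i))) - math.lgamma(i + 1))
               for i, n in enumerate(counts))


def em_step(theta, counts):
    """One EM iteration: E-step weights folded into the part (a) updates."""
    total = sum(counts)
    z0_num = sum_t = sum_it = sum_p = sum_ip = 0.0
    for i, n in enumerate(counts):
        zero, typical, high = group_terms(theta, i)
        pi_i = zero + typical + high
        z0_num += n * zero / pi_i          # n_0 z_0; only i = 0 contributes
        sum_t += n * typical / pi_i        # sum of n_i t_i
        sum_it += i * n * typical / pi_i   # sum of i n_i t_i
        sum_p += n * high / pi_i           # sum of n_i p_i
        sum_ip += i * n * high / pi_i      # sum of i n_i p_i
    return (z0_num / total, sum_t / total, sum_it / sum_t, sum_ip / sum_p)


# HYPOTHETICAL data: expected counts under invented parameters, not Table 4.2.
true_theta = (0.12, 0.56, 1.5, 6.0)
counts = [1500 * sum(group_terms(true_theta, i)) / math.factorial(i)
          for i in range(N_MAX + 1)]

theta_hat = (0.3, 0.3, 1.0, 4.0)  # arbitrary starting values
loglik_trace = [log_likelihood(theta_hat, counts)]
for _ in range(5000):
    theta_hat = em_step(theta_hat, counts)
    loglik_trace.append(log_likelihood(theta_hat, counts))

print([round(v, 3) for v in theta_hat])
```

Tracking the log-likelihood at every iteration is a cheap sanity check, since EM should never decrease it. To answer part (b), the synthetic `counts` list would be replaced by the 17 frequencies from Table 4.2.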

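For Problem 7.2, a random-walk Metropolis-Hastings chain on $x$ might look like the sketch below. It rests on several assumptions: Equation (7.6) is taken to be the mixture $\delta N(7, 0.5^2) + (1 - \delta) N(10, 0.5^2)$ with $\delta = 0.7$, the function names are invented for this illustration, and the wider proposal standard deviation of 2 used for part (b) is only one hypothetical choice among many. Plotting is left out so the block needs only the standard library.

```python
import math
import random

# Assumed form of Equation (7.6): delta*N(7, 0.5^2) + (1 - delta)*N(10, 0.5^2).
DELTA, MU1, MU2, SIGMA = 0.7, 7.0, 10.0, 0.5


def log_target(x):
    """Log of the assumed mixture density, up to an additive constant."""
    a = math.log(DELTA) - 0.5 * ((x - MU1) / SIGMA) ** 2
    b = math.log(1.0 - DELTA) - 0.5 * ((x - MU2) / SIGMA) ** 2
    m = max(a, b)  # log-sum-exp keeps far-away starting values from underflowing
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def random_walk_mh(x0, proposal_sd, n_iter, rng):
    """Random-walk Metropolis-Hastings; returns (sample path, acceptance rate)."""
    x, path, accepted = x0, [], 0
    for _ in range(n_iter):
        candidate = rng.gauss(x, proposal_sd)
        # The proposal is symmetric, so the MH ratio is a ratio of target densities.
        log_ratio = log_target(candidate) - log_target(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x, accepted = candidate, accepted + 1
        path.append(x)
    return path, accepted / n_iter


rng = random.Random(0)
starts = (0.0, 7.0, 15.0)
narrow = {x0: random_walk_mh(x0, 0.01, 10_000, rng) for x0 in starts}  # part (a)
wide = {x0: random_walk_mh(x0, 2.0, 10_000, rng) for x0 in starts}     # part (b), hypothetical sd

for label, runs in (("sd=0.01", narrow), ("sd=2", wide)):
    for x0, (path, rate) in runs.items():
        below = sum(v < 8.5 for v in path) / len(path)
        print(f"{label} x0={x0}: acceptance={rate:.2f}, share below 8.5={below:.2f}")
```

The share of draws below 8.5 is a crude stand-in for the histogram comparison: under the assumed target it should be close to 0.7 once a chain moves freely between the two modes, which the tiny 0.01 step size makes very unlikely within 10,000 iterations.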
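For Problem 7.5, multiplying the censored exponential likelihood (rate $\tau\theta$ for the hormone group, $\theta$ for the control group) by the prior $f(\theta, \tau) \propto \theta^a \tau^b \exp\{-c\theta - d\theta\tau\}$ gives gamma full conditionals: $\theta \mid \tau, y \sim \text{Gamma}(a + \sum\delta_i^C + \sum\delta_i^H + 1,\ c + \sum x_i^C + \tau(d + \sum x_i^H))$ and $\tau \mid \theta, y \sim \text{Gamma}(b + \sum\delta_i^H + 1,\ \theta(d + \sum x_i^H))$, in shape and rate form. A Gibbs sampler built on them could be sketched as below. Table 7.2 is not reproduced here, so apart from the 15 hormone-group recurrences mentioned in the problem, every summary statistic in the example call is a HYPOTHETICAL placeholder, and the function name and starting values are assumptions of this sketch.

```python
import random
import statistics


def gibbs_sampler(n_rec_c, n_rec_h, sum_x_c, sum_x_h, hyper, n_iter, rng, burn_in=1000):
    """Alternate draws from the two gamma full conditionals; returns (theta, tau) pairs.

    n_rec_c, n_rec_h: numbers of observed recurrences (sums of the delta indicators).
    sum_x_c, sum_x_h: total follow-up time, recurrence and censoring times combined.
    """
    a, b, c, d = hyper
    theta, tau = 0.01, 1.0  # arbitrary starting values
    draws = []
    for it in range(n_iter + burn_in):
        # random.gammavariate takes (shape, scale), so each rate enters as 1 / rate.
        theta = rng.gammavariate(a + n_rec_c + n_rec_h + 1,
                                 1.0 / (c + sum_x_c + tau * (d + sum_x_h)))
        tau = rng.gammavariate(b + n_rec_h + 1,
                               1.0 / (theta * (d + sum_x_h)))
        if it >= burn_in:
            draws.append((theta, tau))
    return draws


rng = random.Random(1)
# HYPOTHETICAL summary statistics: only n_rec_h = 15 comes from the problem text.
draws = gibbs_sampler(n_rec_c=18, n_rec_h=15, sum_x_c=800.0, sum_x_h=1000.0,
                      hyper=(3, 1, 60, 120), n_iter=20_000, rng=rng)

tau_draws = sorted(t for _, t in draws)
mean_tau = statistics.fmean(tau_draws)
interval = (tau_draws[int(0.025 * len(tau_draws))], tau_draws[int(0.975 * len(tau_draws))])
print(f"posterior mean of tau: {mean_tau:.3f}, 95% interval: {interval}")
```

For the sensitivity analysis in part (g), the same function can be called again with `hyper` set to half and to double the original values. Convergence diagnostics for part (c) are not included in this sketch.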