




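1. Statistical Software and the R Language
Have you installed R yet?

2. A widely accepted definition of statistics
Statistics is a set of concepts, principles, and methods for collecting data, analyzing data, and drawing conclusions from data.

3. This definition determines the fate of statistics
Unlike mathematics and music, statistics cannot be appreciated for its own sake; if it does not serve practice, it has no reason to exist.
Statistics must serve every field.
Statistics must deal with data.
Therefore statistics must be combined with computers.

4. Can one do "theoretical statistics" without working with data?
A few decades ago, one could.

5. Without an applied background, nobody wants the paper and nobody funds the work. Some people now invent an applied background if they have to. Does purely theoretical statistics exist?

6. Statistics and computers
Modern life can no longer do without computers, and statistics was among their earliest users. The first computers were built solely for scientific computation, and the first users of mainframes included statisticians. Statistics remains the heaviest user of numerical computation today.

7. Statistics and computers
Computers have long since moved beyond being pure calculating machines and have become part of everyday life. Using one no longer requires learning a programming language; pointing and clicking is enough. Output has likewise moved from bare numbers to polished tables and graphics.

8. Statistical software
The development of statistical software has turned statistics from an insiders' game for statisticians into a game for everyone. Enter the data, click a few times, choose some options, and impressive-looking results appear at once.

9. Statistical software
Can point-and-click statistical software replace a statistics course? Of course not. Organizing and recognizing data, choosing a method, and interpreting computer output are nowhere near as simple and reliable as using a point-and-shoot camera.

10. Problems with statistical software
Legal and medical software carries many warnings and keeps reminding you to consult an expert. That is the shrewdness of lawyers and doctors who look after their livelihood. Statistical software is less responsible: as long as the data format is valid, the method is not self-contradictory, and nothing is divided by zero, it returns a result without any warning. Perhaps statisticians lack business sense.

11. Problems with statistical software
Statistical software also prints too much output. Even for the same method, different packages print different items, and sometimes the same item appears under different names. This is a headache for users, and even statisticians cannot always explain every item. So pay close attention and know what you are doing. Do not congratulate yourself after producing a pile of meaningless garbage.

12. Type a few lines of SAS statements and 5 numbers at random.
    data test;
    input x;
    cards;
    1
    2
    3
    17
    60
    run;
    proc univariate freq normal;
    run;
This produces the output below, more than 50 numbers in all. (How many can you explain? How many do you need?)

13.
    The SAS System          15:33 Friday, September 12, 2003   1
    Univariate Procedure
    Variable=X

    Moments
    N           5            Sum Wgts    5
    Mean        16.6         Sum         83
    Std Dev     25.12568     Variance    631.3
    Skewness    1.899804     Kurtosis    3.563057
    USS         3903         CSS         2525.2
    CV          151.3595     Std Mean    11.23655
    T:Mean=0    1.477322     Pr>|T|      0.2136
    Num ^= 0    5            Num > 0     5
    M(Sign)     2.5          Pr>=|M|     0.0625
    Sgn Rank    7.5          Pr>=|S|     0.0625
    W:Normal    0.726472     Pr [...]

    Quantiles (Def=5)
    100% Max    60           99%         60
    75% Q3      17           95%         60
    50% Med     3            90%         60
    25% Q1      2            10%         1
    0% Min      1            5%          1
                             1%          1
    Range       59
    Q3-Q1       15
    Mode        1

[...]

    x
              [,1]       [,2]       [,3]      [,4]       [,5]
    [1,] 0.7983678 0.04607601 0.04555323 0.8594483 0.73089500
    [2,] 0.6559851 0.79562222 0.02948270 0.1453364 0.79552838
    [3,] 0.6759171 0.56193147 0.48286653 0.2419931 0.56069988
    [4,] 0.1183701 0.80652627 0.49405167 0.6523137 0.08345406
    x=matrix(1:20,4,5); x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    9   13   17
    [2,]    2    6   10   14   18
    [3,]    3    7   11   15   19
    [4,]    4    8   12   16   20
    x=matrix(1:20,4,5,byrow=T); x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    2    3    4    5
    [2,]    6    7    8    9   10
    [3,]   11   12   13   14   15
    [4,]   16   17   18   19   20

61. Some simple functions
max, min, length, mean, median, fivenum, quantile, unique, sd, var, range, rep, diff, sort, order, sum, cumsum, prod, cumprod, rev, print, sample, seq, exp, pi

62. Rows and columns of a matrix (subsets)
    nrow(x); ncol(x); dim(x)    # numbers of rows and columns
    x=matrix(rnorm(24),4,6)
    x[c(2,1),]                  # rows 2 and 1
    x[,c(1,3)]                  # columns 1 and 3
    x[2,1]                      # element (2,1)
    x[x[,1]>0,1]                # elements of column 1 that are greater than 0
    sum(x[,1]>0)                # number of elements of column 1 that are greater than 0
    sum(x[,1] [...]
    [...] unique(x)

64. Matrix transpose and inverse
    x=matrix(runif(9),3,3); x
              [,1]      [,2]      [,3]
    [1,] 0.6747652 0.9954731 0.7524502
    [2,] 0.3090199 0.2390141 0.2472961
    [3,] 0.5102675 0.9515505 0.6082803
    t(x)
              [,1]      [,2]      [,3]
    [1,] 0.6747652 0.3090199 0.5102675
    [2,] 0.9954731 0.2390141 0.9515505
    [3,] 0.7524502 0.2472961 0.6082803
    solve(x)    # solve(a,b) solves the equation ax=b
               [,1]       [,2]       [,3]
    [1,] -12.313293  15.125819   9.082300
    [2,]  -8.459725   3.627898   8.989864
    [3,]  23.563034 -18.363808 -20.037986

65. Warning: what is 0 on a computer?
    x%*%solve(x)
                  [,1]          [,2]          [,3]
    [1,]  1.000000e+00 -9.454243e-17 -3.911801e-16
    [2,]  5.494737e-16  1.000000e+00  3.248270e-16
    [3,] -3.018419e-16  1.804980e-15  1.000000e+00
The off-diagonal entries are rounding error, not exact zeros. Questions such as how many eigenvalues are nonzero must be settled with linear algebra. If v is the vector of eigenvalues, do not count the nonzero ones with something like sum(v!=0)!

66. Matrix
    x
              [,1]      [,2]        [,3]      [,4]      [,5]
    [1,] 0.5474306 0.2362356 0.687007107 0.4036998 0.5255839
    [2,] 0.8234363 0.4922711 0.960554564 0.4704976 0.1327870
    [3,] 0.1861151 0.8461655 0.390523424 0.2202575 0.4057607
    [4,] 0.8117521 0.5375946 0.004505845 0.4821567 0.7644741
    is.matrix(x)
    [1] TRUE
    x[1,2]
    x[1,]
    x[,2]
    dim(x)    # returns the dimensions (4,5)

67. Array
    x=array(runif(24),c(4,3,2))
    is.matrix(x)    # dim(x) returns the dimensions (4,3,2)
    [1] FALSE
    x
    , , 1
              [,1]      [,2]        [,3]
    [1,] 0.3512615 0.7270611 0.009055522
    [2,] 0.1444965 0.2527673 0.697977027
    [3,] 0.6658176 0.6638542 0.773747542
    [4,] 0.4258436 0.4168940 0.634235148
    , , 2
              [,1]      [,2]      [,3]
    [1,] 0.3664152 0.9633497 0.5628006
    [2,] 0.3466645 0.5036830 0.1542986
    [3,] 0.4552553 0.1289775 0.8423017
    [4,] 0.1074899 0.3841463 0.7648297

68. Subsets of an array
    x=array(1:24,c(4,3,2))
    x[c(1,3),,]
    , , 1
         [,1] [,2] [,3]
    [1,]    1    5    9
    [2,]    3    7   11
    , , 2
         [,1] [,2] [,3]
    [1,]   13   17   21
    [2,]   15   19   23

69. Matrix multiplication and row/column operations
    x=matrix(1:30,5,6); y=matrix(rnorm(20),4,5)
    y%*%x
               [,1]         [,2]       [,3]       [,4]        [,5]        [,6]
    [1,]  -3.231808  -8.13791204 -13.044017 -17.950121  -22.856225  -27.762330
    [2,] -14.072030 -39.33640851 -64.600787 -89.865165 -115.129543 -140.393921
    [3,]  -1.750057  -0.02764783   1.694761   3.417170    5.139578    6.861987
    [4,]   5.862412   9.78064218  13.698872  17.617103   21.535333   25.453563
    apply(x,1,mean)
    [1] 13.5 14.5 15.5 16.5 17.5
    apply(x,2,sum)
    [1]  15  40  65  90 115 140
    apply(x,2,prod)
    [1]      120    30240   360360  1860480  6375600 17100720

70. Operations over the dimensions of an array
    x=array(1:24,c(4,3,2))
    apply(x,1,mean)
    [1] 11 12 13 14
    apply(x,1:2,sum)
         [,1] [,2] [,3]
    [1,]   14   22   30
    [2,]   16   24   32
    [3,]   18   26   34
    [4,]   20   28   36
    apply(x,c(1,3),prod)
         [,1] [,2]
    [1,]   45 4641
    [2,]  120 5544
    [3,]  231 6555
    [4,]  384 7680

71. Operations between a matrix and a vector
Here x=matrix(1:30,5,6) as on slide 69.
    sweep(x,1,1:5,"*")
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    1    6   11   16   21   26
    [2,]    4   14   24   34   44   54
    [3,]    9   24   39   54   69   84
    [4,]   16   36   56   76   96  116
    [5,]   25   50   75  100  125  150
    x*1:5    # same result as the sweep call above
    sweep(x,2,1:6,"+")
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    2    8   14   20   26   32
    [2,]    3    9   15   21   27   33
    [3,]    4   10   16   22   28   34
    [4,]    5   11   17   23   29   35
    [5,]    6   12   18   24   30   36

72. Operations between an array and a matrix/vector/array
    z=array(1:24,c(2,3,4))    # note the order in which the values are filled in
    z
    , , 1
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6
    , , 2
         [,1] [,2] [,3]
    [1,]    7    9   11
    [2,]    8   10   12
    , , 3
         [,1] [,2] [,3]
    [1,]   13   15   17
    [2,]   14   16   18
    , , 4
         [,1] [,2] [,3]
    [1,]   19   21   23
    [2,]   20   22   24

73. Operations between an array and a matrix/vector/array
    sweep(z,1,1:2,"-")
    , , 1
         [,1] [,2] [,3]
    [1,]    0    2    4
    [2,]    0    2    4
    , , 2
         [,1] [,2] [,3]
    [1,]    6    8   10
    [2,]    6    8   10
    , , 3
         [,1] [,2] [,3]
    [1,]   12   14   16
    [2,]   12   14   16
    , , 4
         [,1] [,2] [,3]
    [1,]   18   20   22
    [2,]   18   20   22

74. Operations between an array and a matrix/vector/array
    sweep(z,c(1,2),matrix(1:6,2,3),"-")
    , , 1
         [,1] [,2] [,3]
    [1,]    0    0    0
    [2,]    0    0    0
    , , 2
         [,1] [,2] [,3]
    [1,]    6    6    6
    [2,]    6    6    6
    , , 3
         [,1] [,2] [,3]
    [1,]   12   12   12
    [2,]   12   12   12
    , , 4
         [,1] [,2] [,3]
    [1,]   18   18   18
    [2,]   18   18   18

75. Outer product (produces a matrix or an array)
    outer(1:2,rep(1,2))
         [,1] [,2]
    [1,]    1    1
    [2,]    2    2
    outer(1:2,matrix(rep(1,6),3,2))
    , , 1
         [,1] [,2] [,3]
    [1,]    1    1    1
    [2,]    2    2    2
    , , 2
         [,1] [,2] [,3]
    [1,]    1    1    1
    [2,]    2    2    2

76. List (set of objects)
A list can be a collection of any objects (including other lists).
    z=list(1:3,Tom=c(1:2,a=list("R",letters[1:5]),w="hi!"))
    z[[1]]; z[[2]]; z$T; z$T$a2; z$T[[3]]; z$T$w
    attributes(z)    # attributes!
    $names
    [1] ""    "Tom"
    attributes(matrix(1:6,2,3))
    $dim
    [1] 2 3

77. Matrices, arrays, and their dimension names
    x=matrix(1:12,nrow=3,dimnames=list(c("I","II","III"),paste("X",1:4,sep="")))
        X1 X2 X3 X4
    I    1  4  7 10
    II   2  5  8 11
    III  3  6  9 12
    y=array(1:12,c(3,2,2),dimnames=list(c("I","II","III"),paste("X",1:2,sep=""),paste("Y",1:2,sep="")))
    , , Y1
        X1 X2
    I    1  4
    II   2  5
    III  3  6
    , , Y2
        X1 X2
    I    7 10
    II   8 11
    III  9 12

78. data.frame
    x=matrix(1:6,2,3)
    x=as.data.frame(x); x
      V1 V2 V3
    1  1  3  5
    2  2  4  6
    x$V2
    [1] 3 4
    attributes(x)
    $names
    [1] "V1" "V2" "V3"
    $row.names
    [1] "1" "2"
    $class
    [1] "data.frame"

79. data.frame
    names(x)=c("TOYOTA","GM","HUNDA")
    row.names(x)=c("2001","2002")
    x
         TOYOTA GM HUNDA
    2001      1  3     5
    2002      2  4     6
    x$GM
    [1] 3 4

80. data.frame
    attach(x)
    GM
    [1] 3 4
    detach(x)
    GM
    Error: Object "GM" not found

81. Entering and editing data by hand
Type the values in directly:
    x=c(1,2,7,8)
or use
    x=scan()
    1 2 7 8
(press Enter twice to finish), and
    fix(x)
to modify the data in an editor.

82. Categorical data
A survey asks people if they smoke or not. The data is Yes, No, No, Yes, Yes.
    x=c("Yes","No","No","Yes","Yes")
    table(x); x
    factor(x)

83. Barplot
Suppose a group of 25 people are surveyed as to their beer-drinking preference. The categories were (1) Domestic can, (2) Domestic bottle, (3) Microbrew and (4) Import. The raw data is
    3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1
    beer=scan()
    3 4 1 1 3 4 3 3 1 3 2 1 2 1 2 3 2 3 1 1 1 1 4 3 1
    barplot(beer)                        # this isn't correct
    barplot(table(beer))                 # yes, call with summarized data
    barplot(table(beer)/length(beer))    # divide by n for proportions
    table(beer)/length(beer)

84. CEO salaries
Suppose CEO yearly compensations are sampled and the following are found (in millions). (This is before being indicted for cooking the books.)
    12 .4 5 2 50 8 3 1 4 .25
    sals=scan()    # read in with scan
    12 .4 5 2 50 8 3 1 4 .25
    mean(sals); var(sals); sd(sals); median(sals)
    fivenum(sals)  # min, lower hinge, median, upper hinge, max
    summary(sals)
    data=c(10,17,18,25,28,28); summary(data); quantile(data,.25); quantile(data,c(.25,.75))

85.
    sort(sals); fivenum(sals); summary(sals)
    mean(sals,trim=1/10); mean(sals,trim=2/10)
    IQR(sals)
MAD: median|Xi - median(X)| multiplied by 1.4826
    mad(sals)
    median(abs(sals-median(sals)))           # without the 1.4826 factor
    median(abs(sals-median(sals)))*1.4826

86. Stem-and-leaf charts
Suppose you have the box score of a basketball game and the following points per game for players on both teams:
    2 3 16 23 14 12 4 13 2 0 0 0 6 28 31 14 4 8 2 5
    scores=scan()
    2 3 16 23 14 12 4 13 2 0 0 0 6 28 31 14 4 8 2 5
    apropos("stem")
apropos returns a character vector giving the names of all objects in the search list matching what. For example:
    apropos("stem")
    [1] "stem"        "system"      "system.file" "system.time"
See also find("stem").
    stem(scores); stem(scores,scale=2)

87. The salaries could be placed into broad categories of 0-1 million, 1-5 million and over 5 million. To do this using R one uses the cut() function and the table() function. Suppose the salaries are again
    12 .4 5 2 50 8 3 1 4 .25
and we want to break that data into the intervals [0,1], (1,5], (5,50].
    sals=c(12,.4,5,2,50,8,3,1,4,.25)             # enter data
    cats=cut(sals,breaks=c(0,1,5,max(sals)))     # the breaks
    cats                                         # view the values
    table(cats)                                  # organize
    levels(cats)=c("poor","rich","rolling in it")
    table(cats)

88. Histograms
Suppose the top 25 ranked movies made the following gross receipts for a week:
    29.6 28.2 19.6 13.7 13.0 1.9 1.0 0.1
    x=scan()
    29.6 28.2 19.6 13.7 13.0 1.9 1.0 0.1
    hist(x)                       # frequencies
    hist(x,probability=TRUE)      # proportions (or probabilities)
    rug(jitter(x))                # add tick marks
    hist(x,breaks=10)             # 10 breaks, or just hist(x,10)
    hist(x,breaks=c(0,1,2,3,4,5,10,20,max(x)))    # explicit breaks

89. Frequency polygons
    x=c(.314,.289,.282,.279,.275,.267,.266,.265,.256,.250,.249,.211,.161)
    tmp=hist(x)    # store the results
    lines(c(min(tmp$breaks),tmp$mids,max(tmp$breaks)),c(0,tmp$counts,0),type="l")

90.
    data(faithful)
    attach(faithful)                # make eruptions visible
    hist(eruptions,15,prob=T)       # proportions, not frequencies
    lines(density(eruptions))       # lines makes a curve, default bandwidth

91. Handling bivariate categorical data
    Person   Smokes   Amount of studying
    1        Y        less than 5 hours
    2        N        5-10 hours
    3        N        5-10 hours
    4        Y        more than 10 hours
    5        N        more than 10 hours
    6        Y        less than 5 hours
    7        Y        5-10 hours
    8        Y        less than 5 hours
    9        N        more than 10 hours
    10       Y        5-10 hours

92.
    library(MASS)
    quine
    attach(quine)
    table(Age)
    table(Sex,Age); tab=xtabs(~Sex+Age,quine); unclass(tab)
    tapply(Days,Age,mean)
    tapply(Days,list(Sex,Age),mean)
Related functions: apply, sapply, tapply, lapply

93.
    smokes=c("Y","N","N","Y","N","Y","Y","Y","N","Y")
    amount=c(1,2,2,3,3,1,2,1,3,2)
    table(smokes,amount)
    tmp=table(smokes,amount)    # store the table
    options(digits=3)           # only print 3 decimal places
    prop.table(tmp,1)           # the rows sum to 1 now
    prop.table(tmp,2)           # the columns sum to 1 now
                                # really sweep(x,margin,margin.table(x,margin),"/")
    prop.table(tmp)             # all the numbers sum to 1
    options(digits=7)           # restore the number of digits

94. Plotting tabular data
    barplot(table(smokes,amount))
    barplot(table(amount,smokes))
    smokes=factor(smokes)    # for names
    barplot(table(smokes,amount),beside=TRUE,legend.text=T)
    barplot(table(amount,smokes),main="table(amount,smokes)",beside=TRUE,legend.text=c("less than 5","5-10","more than 10"))

95. Categorical vs. numerical
A simple example might be in a drug test, where you have data (in suitable units) for an experimental group and for a control group.
    experimental: 5 5 5 13 7 11 11 9 8 9
    control:      11 8 4 5 9 5 10 5 4 10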
    x=c(5,5,5,13,7,11,11,9,8,9)
    y=c(11,8,4,5,9,5,10,5,4,10)
    boxplot(x,y)
    amount=scan()
    5 5 5 13 7 11 11 9 8 9 11 8 4 5 9 5 10 5 4 10
    category=scan()
    1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
    boxplot(amount~category)    # note the tilde

96. Reading ASCII data from a text file
    x=scan("f:book1.txt")
This reads the data line by line, in the order of the text. If the file holds a 4×5 matrix, follow it with
    x=matrix(x,4,5,byrow=T)
or read and reshape in one step:
    x=matrix(scan("f:book1.txt"),4,5,b=T)
If the file holds a data.frame with names, use
    x=read.table("f:bookww.txt",header=T)
    x
         GM VW HUNDA
    1993  1  2     3
    1994  6  7     8

97. Control statements
    x=NULL; for(i in 1:5) x=cbind(x,i^2)
    i=1; x=NULL; while(i [...]
    [...] 0) y=x else y=-x+10
    i=1; x=rnorm(1); repeat{x=x+rnorm(1); if(x>3) break; i=i+1}
    print(c(i,x))

98. How do I load packages?
Packages are also called libraries. Type library() to see which libraries are installed; the default library is base. To load MASS, for example, type library(MASS).
Every library ships with many data sets. Type data() to list them. For example, data(Titanic) loads the Titanic data.
Note: R is case-sensitive, so library(mass) will not find MASS.

99. How do I download packages from the web?
From within R, go straight to the CRAN home page.

100. Download the zip file to your computer.

101. Install the package from the local zip file.

102. by
    attach(warpbreaks)
    by(warpbreaks[,1:2],tension,summary)
    by(warpbreaks[,1],list(wool=wool,tension=tension),summary)
    # now suppose we want to extract the coefficients by group
    tmp <- by(warpbreaks,tension,function(x) lm(breaks~wool,data=x))
    sapply(tmp,coef)

103. %in% and match
    # The intersection of two sets:
    intersect <- function(x,y) y[match(x,y,nomatch=0)]
    intersect(1:10,7:20)
    1:10 %in% c(1,3,5,9)
    sstr <- c("c","ab","B","bba","c","bla","a","Ba","%")
    sstr[sstr %in% c(letters,LETTERS)]
    "%w/o%" <- function(x,y) x[!x %in% y]    # x without y
    (1:10) %w/o% c(3,7,12)
    attach(warpbreaks)
    warpbreaks[tension %in% c("L","H"),]
    warpbreaks[warpbreaks$tension %in% c("L","H"),]
    warpbreaks[warpbreaks[,3] %in% c("L","H"),]
    unique(tension)

104. ftable: turn array/matrix data (without frequencies) into a contingency table (find the counts)
    # Start with a contingency table.
    ftable(Titanic,row.vars=1:3)
    ftable(Titanic,row.vars=1:2,col.vars="Survived")
    ftable(Titanic,row.vars=2:1,col.vars="Survived")
    # Start with a data frame.
    x [...]

[...]

    is.factor(DF[,3])
    [1] TRUE
    is.factor(DF[,4])
    [1] FALSE
    DF[,4]=as.factor(DF[,4])
    is.factor(DF[,4])
    [1] TRUE
    DF[,4]=as.numeric(DF[,4])
    is.factor(DF[,4])
    [1] FALSE

107. When a categorical variable is recorded with numeric codes, leaving the labels unchanged can lead to errors
    mtcars[1:4,]
                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    lm(mpg~gear+carb,mtcars)    # treats the categorical variables as quantitative
    Call:
    lm(formula = mpg ~ gear + carb, data = mtcars)
    Coefficients:
    (Intercept)         gear         carb
          7.276        5.576       -2.754
    mtcars[,10]=as.factor(mtcars[,10])    # change the labels
    mtcars[,11]=as.factor(mtcars[,11])    # change the labels
    lm(mpg~gear+carb,mtcars)
    Call:
    lm(formula = mpg ~ gear + carb, data = mtcars)
    Coefficients:
    (Intercept)        gear4        gear5        carb2        carb3        carb4
         20.932        7.720        8.349       -3.289       -4.632       -9.064
          carb6        carb8
         -9.581      -14.281

108. Comparing vectors: all
    x=1:12; y=1:12; all(y==x)
    [1] TRUE

109. cat and print
    if(all(x [...]
    cat("a logical or (positive) numeric controlling how the output is
    + broken into successive lines. If FALSE (default), only
    + newlines created explicitly by \n")
    a logical or (positive) numeric controlling how the output is
    broken into successive lines. If FALSE (default), only
    newlines created explicitly by
    if(all(x [...]
    print("a logical or (positive) numeric controlling how the output is
    + broken into successive lines. If FALSE (default), only
    + newlines created explicitly by")
    [1] "a logical or (positive) numeric controlling how the output is\nbroken into successive lines. If FALSE (default), only\nnewlines created explicitly by"

110. Explain what the following statements do

111.
    n=1200; x=runif(n*10)
    x=matrix(x,n,10)
    x1=(x[,1:5]>.4)*1
    p=x1*.8+(1-x1)*.2
    x2=1*(x[,6:10] [...]
    [...] =n) break
    if(is.matrix(z0)==F) {nu=c(nu,1); pattern=rbind(pattern,z0); id=id+1; break}
    id=id+1
    list(pattern=pattern,number=nu,id=id)

113. Graphics in R

114.
    x=0:18
    plot(x,x,pch=x,col=x)
    points(x,18-x,pch=x)
    matplot(x,cbind(x,18-x))

115. Plotting
    spring=data.frame(compression=c(41,39,43,53,42,48,47,46),distance=c(120,114,132,157,122,144,137,141))
    attach(spring)
(Hooke's law: f=.5ks)
    plot(distance~compression)
    plot(compression,distance)

116. Plotting
    par(mfrow=c(2,2))                                                 # set up a 2×2 grid of 4 plots
    plot(compression,distance,main="Hooke's Law")                     # plot with a title only
    plot(compression,distance,main="Hooke's Law",xlab="x",ylab="y")   # title plus x and y labels
    identify(compression,distance)                                    # label the points with their index numbers
    plot(compression,distance,main="Hooke's Law")                     # plot with a title only
    text(46,120,"f=1/2*k*s")                                          # write text at the given position
    plot(compression,distance,main="Hooke's Law")                     # plot with a title only
    text(locator(2),"I am here!")                                     # write text at the two clicked positions

118. Plotting
    library(MASS); data(Animals); attach(Animals)
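
The warning on slide 65 says that sum(v!=0) cannot be trusted to count nonzero eigenvalues, because rounding error turns exact zeros into tiny nonzero numbers. A minimal sketch of a tolerance-based count follows. The rank-one test matrix a, the fixed matrix m, and the tolerance tol are illustrative assumptions and do not come from the slides; other tolerance choices are equally defensible.

```r
# Rank-one symmetric matrix: in exact arithmetic only one eigenvalue is nonzero.
a <- tcrossprod(c(1, 2, 3))
v <- eigen(a, symmetric = TRUE)$values

# Assumed tolerance (an illustrative choice, not a standard):
# sqrt(machine epsilon) scaled by the largest eigenvalue in absolute value.
tol <- sqrt(.Machine$double.eps) * max(abs(v))
n_nonzero <- sum(abs(v) > tol)

# Compare a product with the identity up to rounding instead of exactly.
m <- matrix(c(2, 1, 0, 1, 3, 1, 0, 1, 4), 3, 3)
near_identity <- isTRUE(all.equal(m %*% solve(m), diag(3)))
```

Here n_nonzero counts eigenvalues that are large relative to the biggest one, and all.equal() plays the same role for whole matrices that the tolerance plays for a single vector.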
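
Slide 107 shows that gear and carb in mtcars must be converted to factors before they are used as categorical predictors. The sketch below repeats that comparison on a copy of the data, so the built-in mtcars stays unchanged; the names cars, fit_numeric and fit_factor are introduced here only for illustration.

```r
# Work on a copy so the built-in mtcars data set is left untouched.
cars <- mtcars

# Numeric codes: gear and carb each get a single slope.
fit_numeric <- lm(mpg ~ gear + carb, data = cars)

# Factors: each non-baseline level gets its own dummy coefficient.
cars$gear <- factor(cars$gear)
cars$carb <- factor(cars$carb)
fit_factor <- lm(mpg ~ gear + carb, data = cars)

n_numeric <- length(coef(fit_numeric))
n_factor <- length(coef(fit_factor))
```

The first fit has 3 coefficients and the second has 8, which matches the two coefficient tables printed on the slide.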