UNIVERSITY of LIMERICK
OLLSCOIL LUIMNIGH

COLLEGE OF INFORMATICS AND ELECTRONICS

END OF SEMESTER ASSESSMENT PAPER

MODULE CODE: MA4125
SEMESTER: Autumn 2001
MODULE TITLE: Computer Aided Data Analysis
DURATION: 2 hours
LECTURER: Dr Ailish Hannigan
EXTERNAL EXAMINER: Prof S McClean

INSTRUCTIONS TO CANDIDATES
Answer any 4 questions; each question is worth 20 marks. Calculators may be used. This exam is worth 60% of your final grade.

MA4125 Computer Aided Data Analysis, Autumn 2001, Dr Ailish Hannigan


Q1

(a) Distinguish between observational and experimental studies. What are the phases of the research process? (6 marks)

(b) A researcher wants to find out the percentage of small and medium sized Irish companies that are ready for the Euro changeover. A list of small and medium sized Irish companies was obtained from Business and Finance, and 100 companies were randomly selected from this list using systematic sampling. Of these 100 companies, 50 responded to the researcher's survey. 35 of the 50 companies who responded were ready for the Euro changeover.

  (i) For this example, identify the population, the sampling frame, the sample and the variable measured.
  (ii) Briefly describe how systematic sampling is carried out.
  (iii) What is the parameter of interest in this example?
  (iv) What is the best estimate of this parameter?
  (v) Describe the potential bias in this example.
(10 marks)

(c) Classify the data from the following variables by data type:

  - Level of education (1 = primary, 2 = secondary, 3 = third level)
  - Rating of satisfaction with employer on a scale of 0 to 10
  - Salary (£s)
  - Length of work experience (1 = less than 5 years, 2 = 5 to 10 years, 3 = more than 10 years)
(4 marks)


Q2

A survey of home Internet users was carried out to profile the users, find out the frequency of use of the Internet at home, and also investigate if the home Internet users shopped online. Data on the following variables were collected from a sample of 30 users:

  Freq: frequency of use, i.e. the number of hours per week spent online
  Gender: 1 = male, 0 = female
  Highest level of education: 1 = primary education, 2 = secondary education, 3 = third level education
  Purchase: whether the user shopped online, where 1 = yes and 0 = no

Using the following output from SPSS:

(a) Write a newspaper report suitable for a general audience. (8 marks)
(b) Write a detailed statistical report. (12 marks)

Descriptives: FREQ

                                                  Statistic   Std. Error
  Mean                                            9.5933      .3560
  95% Confidence Interval for Mean  Lower Bound   8.8652
                                    Upper Bound   10.3215
  5% Trimmed Mean                                 9.6185
  Median                                          9.6000
  Variance                                        3.803
  Std. Deviation                                  1.9501
  Minimum                                         5.10
  Maximum                                         13.60
  Range                                           8.50
  Interquartile Range                             2.3250
  Skewness                                        .157        .427
  Kurtosis                                        .348        .833

Tests of Normality

          Kolmogorov-Smirnov(a)            Shapiro-Wilk
          Statistic   df   Sig.            Statistic   df   Sig.
  FREQ    .104        30   .200*           .980        30   .825

  * This is a lower bound of the true significance.
  a. Lilliefors Significance Correction

[Figure: Normal Q-Q Plot of FREQ. x-axis: Observed Value; y-axis: Expected Normal]

[Figure: Boxplot of FREQ, N = 30]

Descriptives: FREQ by GENDER

                                                  female                   male
                                                  Statistic   Std. Error   Statistic   Std. Error
  Mean                                            8.4556      .3343        11.3000     .3674
  95% Confidence Interval for Mean  Lower Bound   7.7502                   10.4913
                                    Upper Bound   9.1609                   12.1087
  5% Trimmed Mean                                 8.5284                   11.2667
  Median                                          8.6000                   11.0500
  Variance                                        2.012                    1.620
  Std. Deviation                                  1.4185                   1.2728
  Minimum                                         5.10                     9.60
  Maximum                                         10.50                    13.60
  Range                                           5.40                     4.00
  Interquartile Range                             1.4250                   1.8000
  Skewness                                        1.098       .536         .432        .637
  Kurtosis                                        .942        1.038        .666        1.232

Tests of Normality

                    Kolmogorov-Smirnov(a)            Shapiro-Wilk
          GENDER    Statistic   df   Sig.            Statistic   df   Sig.
  FREQ    female    .206        18   .041            .907        18   .078
          male      .153        12   .200*           .955        12   .666

  * This is a lower bound of the true significance.
  a. Lilliefors Significance Correction

[Figure: Boxplots of FREQ by GENDER (female N = 18, male N = 12); cases 21 and 14 are labelled on the plot]

GENDER

                    Frequency   Percent   Valid Percent   Cumulative Percent
  Valid   female    18          60.0      60.0            60.0
          male      12          40.0      40.0            100.0
          Total     30          100.0     100.0

EDUC

                       Frequency   Percent   Valid Percent   Cumulative Percent
  Valid   primary      4           13.3      13.3            13.3
          secondary    15          50.0      50.0            63.3
          third level  11          36.7      36.7            100.0
          Total        30          100.0     100.0

PURCHASE

                 Frequency   Percent   Valid Percent   Cumulative Percent
  Valid   no     15          50.0      50.0            50.0
          yes    15          50.0      50.0            100.0
          Total  30          100.0     100.0

EDUC * PURCHASE Crosstabulation

                                       PURCHASE
  EDUC                                 no        yes       Total
  primary       Count                  3         1         4
                % within EDUC          75.0%     25.0%     100.0%
                % within PURCHASE      20.0%     6.7%      13.3%
                % of Total             10.0%     3.3%      13.3%
  secondary     Count                  10        5         15
                % within EDUC          66.7%     33.3%     100.0%
                % within PURCHASE      66.7%     33.3%     50.0%
                % of Total             33.3%     16.7%     50.0%
  third level   Count                  2         9         11
                % within EDUC          18.2%     81.8%     100.0%
                % within PURCHASE      13.3%     60.0%     36.7%
                % of Total             6.7%      30.0%     36.7%
  Total         Count                  15        15        30
                % within EDUC          50.0%     50.0%     100.0%
                % within PURCHASE      100.0%    100.0%    100.0%
                % of Total             50.0%     50.0%     100.0%


Q3

(a) What is the objective of the Data Protection Act 1988? What are the requirements for data under this act? (4 marks)

(b) "Statistical errors can occur in the planning, design, execution and analysis of a study and in the presentation and interpretation of the results." Discuss with reference to at least eight common statistical errors. (8 marks)

(c) Outline briefly the appropriate analysis or hypothesis test for the following examples:

  (i) Examining the relationship between amount spent on research and development (£s) and profit (£s).
  (ii) Investigating if the computing skills have improved for a group of employees, where their skill level was measured before and after a training course using a test with a maximum score of 100.
  (iii) Investigating if there is an association between gender and the rating of importance of salary (not important, relatively important, very important) when deciding to accept a new job.
  (iv) Finding a suitable measure of centrality and variability for a sample of house prices in Ireland.
(8 marks)


Q4

(a) Define a p-value. (2 marks)

(b) A call centre states that the mean amount of time customers spend waiting to be dealt with is 6 minutes. A random sample of 40 waiting times was selected from data stored by the call centre. The waiting times were summarised using SPSS and a hypothesis test was carried out to investigate the call centre's claim. Write a statistical report on the results of the analysis using the output below. (8 marks)

One-Sample Statistics

           N     Mean    Std. Deviation   Std. Error Mean
  TIMES    40    6.105   .950             .150

One-Sample Test (Test Value = 6)

                                                              95% Confidence Interval
                                                              of the Difference
           t      df    Sig. (2-tailed)   Mean Difference     Lower     Upper
  TIMES    .699   39    .489              .105                -.199     .409

(c) Market research staff are considering two different types of packaging for their breakfast cereal product. They pilot the packaging by placing the cereal in the two types of packaging and putting both at eye level on the same shelf in 10 supermarkets. The sales of the cereal in the two packages for each shop are given in the table below.

  Shop     1    2    3    4    5    6    7    8    9    10
  Pack1    50   35   40   45   38   42   48   36   41   30
  Pack2    48   32   31   40   32   45   50   30   38   28

A hypothesis test was carried out to investigate if there was any difference in the sales of the cereal in the two different packages. The following output was obtained from SPSS.

Paired Samples Statistics

                   Mean      N    Std. Deviation   Std. Error Mean
  Pair 1   PACK1   40.5000   10   6.1146           1.9336
           PACK2   37.4000   10   8.0166           2.5351

Paired Samples Test

                           Paired Differences
                                                            95% Confidence Interval
                                                            of the Difference
                           Mean     Std. Dev.   Std. Error  Lower    Upper     t       df   Sig. (2-tailed)
                                                Mean
  Pair 1   PACK1 - PACK2   3.1000   3.6652      1.1590      .4781    5.7219    2.675   9    .025

  (i) Why was a paired samples test carried out for this example?
  (ii) What assumption should have been checked before the analysis was carried out?
  (iii) Assuming that this assumption holds, write a statistical report on the output above.
(10 marks)


Q5

(a) A random sample of employees was surveyed in a non-unionised production facility. Employees were classified according to their type of job (office staff, managerial staff or operator) and also whether they were interested in joining a union if one was set up in the facility. The data collected were analysed using SPSS and a chi-square test was carried out. The output is as follows.

UNION * WORKTYPE Crosstabulation

                                            WORKTYPE
  UNION                                     office    managerial   operator   Total
  interested       Count                    30        15           40         85
                   Expected Count           31.6      24.3         29.1       85.0
                   % within UNION           35.3%     17.6%        47.1%      100.0%
                   % within WORKTYPE        46.2%     30.0%        66.7%      48.6%
                   % of Total               17.1%     8.6%         22.9%      48.6%
  not interested   Count                    35        35           20         90
                   Expected Count           33.4      25.7         30.9       90.0
                   % within UNION           38.9%     38.9%        22.2%      100.0%
                   % within WORKTYPE        53.8%     70.0%        33.3%      51.4%
                   % of Total               20.0%     20.0%        11.4%      51.4%
  Total            Count                    65        50           60         175
                   Expected Count           65.0      50.0         60.0       175.0
                   % within UNION           37.1%     28.6%        34.3%      100.0%
                   % within WORKTYPE        100.0%    100.0%       100.0%     100.0%
                   % of Total               37.1%     28.6%        34.3%      100.0%

Chi-Square Tests

                                  Value       df   Asymp. Sig. (2-sided)
  Pearson Chi-Square              14.921(a)   2    .001
  Likelihood Ratio                15.266      2    .000
  Linear-by-Linear Association    4.924       1    .026
  N of Valid Cases                175

  a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 24.29.

Use the output to answer the following questions:

  (i) What percentage of managerial staff were interested in joining a union?
  (ii) What percentage of those who were interested in joining a union were operators?
  (iii) What is the null hypothesis for the chi-square test in this example?
  (iv) What is your conclusion from the results of the hypothesis test?
  (v) Does the rule of thumb for the chi-square test hold for this example?
  (vi) What is the nature of the relationship between type of work and interest in joining a union?
(8 marks)

(b) A study was carried out to predict the square footage of homes given the size of the household and whether the house is in an urban or rural area. The variables are as follows:

  Footage: size of house in square feet
  people: number o
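
As a cross-check on the SPSS output quoted in Q4(c) and Q5(a), the sketch below recomputes the paired-samples t statistic from the shop sales table and the Pearson chi-square statistic from the crosstabulation counts. It uses only the Python standard library; the function names `paired_t` and `pearson_chi_square` are illustrative and are not part of the paper, and p-values are left out because they need a t or chi-square distribution function that the standard library does not provide.

```python
import math

# Sales per shop, copied from the table in Q4(c).
pack1 = [50, 35, 40, 45, 38, 42, 48, 36, 41, 30]
pack2 = [48, 32, 31, 40, 32, 45, 50, 30, 38, 28]

# Observed counts from the UNION * WORKTYPE crosstabulation in Q5(a).
# Rows: interested, not interested. Columns: office, managerial, operator.
union_by_worktype = [
    [30, 15, 40],
    [35, 35, 20],
]


def paired_t(x, y):
    """Return (mean difference, SD of differences, standard error, t, df)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample standard deviation of the differences (divisor n - 1).
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    se_d = sd_d / math.sqrt(n)
    return mean_d, sd_d, se_d, mean_d / se_d, n - 1


def pearson_chi_square(table):
    """Return (chi-square statistic, df, minimum expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two classifications.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, min_expected


mean_d, sd_d, se_d, t_stat, t_df = paired_t(pack1, pack2)
chi2, chi2_df, min_expected = pearson_chi_square(union_by_worktype)

print(f"Paired test: mean diff = {mean_d:.4f}, SD = {sd_d:.4f}, "
      f"SE = {se_d:.4f}, t = {t_stat:.3f}, df = {t_df}")
print(f"Chi-square: value = {chi2:.3f}, df = {chi2_df}, "
      f"minimum expected count = {min_expected:.2f}")
```

Run as written, the printed figures should agree, to rounding, with the Paired Samples Test row (mean 3.1000, Std. Dev. 3.6652, Std. Error Mean 1.1590, t 2.675, df 9) and with the Pearson Chi-Square row and its footnote (14.921, df 2, minimum expected count 24.29).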
