Lean and Six Sigma Training, Lesson 5: Basic Statistics
Basic Statistics
May 2015, Kenny, Operational Excellence, Panyu / China

Basic Statistics: Outline
- Data Collection Strategy
- Data summary
  - Numerical
    - Central Tendency (Location)
    - Variation (Dispersion)
    - Shape
  - Graphical Presentation
    - Dot plot
    - Boxplot
    - Histogram (and distribution plot)
- Normal Distribution
- Some Other Graphical Plots
  - Time Series chart
  - Scatter Plots
  - Pareto

Data Definition
Data are the facts, either qualitative or quantitative, that are obtained by observing a population, product, process or service.

Continuous data:
- Variable data
- Measured on a continuum or scale
- Can be sub-divided and still have meaning
- Examples include:
  - Cycle time (days, hours, minutes, or seconds)
  - Money (dollars, cents, sales, costs, losses)
  - Weight
  - Temperature
  - Speed

Discrete data:
- Categorical or attribute data
- Measured by counting
- Cannot be sub-divided and still have meaning
- Examples include:
  - Defects (yes/no, approved/disapproved, pass/fail, met customer requirement/did not meet customer requirement)
  - Categories (days of the week, locations, type of customer, type of product, risk: low/medium/high)
  - Satisfaction (poor/fair/good/excellent or dissatisfied/satisfied)
  - Shortages

Data Definition: Data Types
Discrete data
- Binary: only two categories. Example: whether an employee has been with the company for more than 2 years.
- Ordinal categorical: ranking or rating. Example: the score given in an employee satisfaction survey (1-5).
- Count: counted values. Example: the number of employees who have been with the company for more than 5 years.
Continuous data
- The measured quantity can be sub-divided meaningfully without limit, as far as the precision of the measurement system allows. Example: an employee's length of service.

Discrete data | Continuous data
Can show whether a problem exists, but gives no further understanding of it | Shows whether a problem exists and also gives more detail about it
Central tendency and variation of the distribution cannot always be measured | Central tendency and variation of the distribution can always be measured
Cheap to collect; the collection process is simple | More expensive to collect; places requirements on the measuring instrument
Analysis needs a larger sample size | Analysis needs a smaller sample size
Measurement cannot be sub-divided without limit | Measurement can be sub-divided without limit
Measurement cannot reach the desired precision | Measurement can reach the desired precision

Data Definition: Population vs Sample
Population: consists of all the members.
Sample: a portion of the members taken from the population.

"Population Parameters"
- μ = Population mean
- σ = Population standard deviation
"Sample Statistics"
- X̄ = Sample mean
- s = Sample standard deviation

[Figure: Process, Population, Sample]

Data Collection: Sample
Process, Measure, Analysis.
Sample statistics are used to estimate population attributes.

Generating Basic Statistics for Information
Count the number of people in the classroom who are wearing a blue top.

Operational definition of a blue top: A top is any garment worn above the skirt/trousers that covers more than 70% of the body and whose hanging bottom edge is 3 to 7 inches below the upper edge of the skirt/trousers. If no skirt/trousers are worn, the garment is not a top. Under this definition, a top is a blue top if blue (including blue that results from wear) covers more than 50% of its outer surface. Whether a color is blue is judged by an inspector who has passed the U.S.A.F. color-blindness test and is medically certified. The top or the color chart is lit on its own by a G.E. blue fluorescent lamp, and the top is considered blue if its color matches any color in the corresponding area of the color chart.

Tips: Operational Definition
An operational definition is a precise description of the specific criteria used for the measures (the what), the methodology to collect the data (the how), the amount of data to collect (how much), and who has responsibility to collect the data (the who).
- Provides everybody with the same meaning.
- Ensures that consistency and reliability are built in up front.
- Describes the scope of the measure (what is included and what is not included).

Data Description of Sample: Central Tendency
Mean: arithmetic average of a set of values
- Reflects the influence of all values
- Strongly influenced by extreme values
Median: reflects the 50% rank, the center number after a set of numbers has been sorted
- Does not necessarily include all values in the calculation
- Is "robust" to extreme scores
Mode: the most frequently occurring value in a data set

Data Description of Sample: Dispersion
- Q1 = the first or lower quartile: a value that has approximately 25% of the observations below it.
- Q3 = the third or upper quartile: a value that has approximately 75% of the observations below it.
- IQR = interquartile range, Q3 - Q1.
- Sample standard deviation: s = sqrt( Σ (xi - X̄)² / (n - 1) ).

Data Description of Sample: Shape
- Skewness
- Kurtosis: negative (-ve), zero, or positive (+ve)

Exercise: Descriptive Statistics in Minitab
Minitab: Stat > Basic Statistics > Graphical Summary
Using the sales numbers in the Excel file, carry out the numerical analysis.

Graphical Presentation for Data Summary
- Dot Plot
- Box Plot
- Histogram

Example 1
Data are collected on the level of orders for a particular product. The data are collected for 12 weeks, and the supplier is open Monday to Saturday each week. The number of customers ordering from the store is also recorded.
- Summarize the data (overall)
- Summarize the data by day
- Draw conclusions
Worksheet: sales.mtw

Example 1 (Numerical Description)

Variable  day    Mean   StDev  Variance  Median   Range    IQR
sales     1     54.50   21.70    470.80   54.83   79.47  25.51
          2     52.19   31.46    989.62   50.82   98.46  55.43
          3     63.27   27.45    753.32   64.43   99.39  41.79
          4     58.81   19.70    388.13   59.93   68.83  29.25
          5     88.92   25.64    657.55   84.01   93.59  37.43
          6    126.27   25.45    647.60  129.50   87.32  41.14

Variable   Mean   StDev  Variance  Median   Range    IQR
sales     73.99   36.17   1307.92   68.84  177.56  52.90

- How do the days compare?
- What is your conclusion?
- Can you clearly see any difference or trend from the numbers alone?

Dot Plots
Sales worksheet. Select Graph > Dot Plot.
What information is available in the plots?
- Central Tendency
- Variation
- Shape

Box Plot Analysis
- Outlier (*)
- Distribution maximum (upper whisker): the smaller of the highest data point and Q3 + 1.5 (Q3 - Q1)
- 75th percentile (third quartile, or Q3)
- Median (50th percentile)
- Mean (+)
- 25th percentile (first quartile, or Q1)
- Distribution minimum (lower whisker): the larger of the lowest data point and Q1 - 1.5 (Q3 - Q1)

Box Plots
Graph > Boxplots
What information is available in the plots?
- Central Tendency
- Variation
- Shape

Histogram
Graph > Histogram
What information is available in the plots?
- Central Tendency
- Variation
- Shape

Dot plot, Boxplot, Histogram
Minitab: Stat > Basic Statistics > Display Descriptive Statistics
- Is there a difference from day to day?
- Is your answer different when you make the comparison numerically?
- Of these 3 graphs, which is better?

Frequently Used Graphics: Time Series Plots
Graph > Time Series Plot
This produces a simple run chart of the data.
Example: the sales trend for the year 2006. Do you think business will be favorable in 2007 or not? And what is your worst-case forecast of 2007 revenue?

Frequently Used Graphics: Pareto Plot
Stat > Quality Tools > Pareto Chart, then choose "Chart defects table".
A Pareto chart identifies the 80/20 (or 70/30) rule, and so helps a team to focus on those causes that will have the greatest impact if solved.

Defects       Freqs  Week
Air Bubble       93     1
Air Bubble       81     2
Air Bubble       62     3
Air Bubble       57     4
Weight Dev.     120     1
Weight Dev.     132     2
Weight Dev.      91     3
Weight Dev.      88     4
Deformation      18     1

Which defects should we concentrate on first? By solving these defects, how much improvement can we make?

Exercise 4
Q1: Three different types of fuel have been tested for the distance a car can drive on 20 liters of gasoline, with 20 samples. Draw the dot plot, box plot and histogram, and then determine which of the three is the most energy-efficient. (Car distance drive worksheet)
Q2: A professor is trying to select a good diet for chickens. Chickens are classified into 5 groups with different diets. Random samples of 10 are taken from each group, and the weight of each chicken is measured. Draw the appropriate graphics, and then draw a conclusion from the experiment. (Chicken worksheet)
Q3: The coating yield for the first 25 calendar weeks of 2006 was recorded. What is your conclusion about the process trend in the first 25 calendar weeks? (Coating yield worksheet)

The Normal Curve
Definition: A probability distribution where the most frequently occurring value is in the middle and other probabilities tail off symmetrically in both directions.
Characteristics:
- The curve theoretically does not reach zero.
- The curve can be divided in half, with equal pieces falling on either side of the most frequently occurring value.
- A normal curve indicates random or chance variation.
- The peak of the curve represents the center of the process.
- The area under the curve represents virtually 100% of the product the process is capable of producing.

Empirical Rule of Standard Deviation
We have been talking about the Normal Distribution. However, the following rules apply to most distributions you will find in the real world (NC = the value for the normal curve):
- Rule 1: Roughly 60-75% of the data are within a distance of one standard deviation on either side of the mean. (NC = 68.3%)
- Rule 2: Usually 90-98% of the data are within a distance of two standard deviations on either side of the mean. (NC = 95.4%)
- Rule 3: Approximately 99-100% of the data are within a distance of three standard deviations on either side of the mean. (NC = 99.7%)
We use this +/- 3 standard deviation range as representative of natural process variation.

Graphical Meaning of Standard Deviation
Share of a normal distribution within a given distance of the mean:
- +/- 1 standard deviation: 68.27%
- +/- 2 standard deviations: 95.45%
- +/- 3 standard deviations: 99.73%
- +/- 4 standard deviations: 99.9937%
- +/- 5 standard deviations: 99.999943%
- +/- 6 standard deviations: 99.9999998%

Normal Curve & Standard Deviation
[Figure: normal curve of the output, horizontal axis from -4 to 4, with the 95% and 99.73% regions marked]
The Normal Distribution takes different forms.

The Probability under the Normal Curve
Mathematically, the total probability (the total area under the curve) = 1.
[Figure: area A under the normal curve to the left of a point a]
Luckily, we have Minitab.

Example
Given the normal curve with μ = 7 and σ = 1.5, what is the area from -∞ to 6.5?
By Minitab:
Normal with mean = 7 and standard deviation = 1.5
  x    P( X <= x )
6.5       0.369441

Example
Given the normal curve with μ = 7 and σ = 1.5, what is the area from -∞ to 7 and from -∞ to 8?
By Minitab:
  x    P( X <= x )
  7       0.5
  8       0.747507

Example
Given the normal curve with μ = 5 and σ = 1.2, what is the area from -∞ to 4, from -∞ to 6.5, and from 4 to 6.5?
By Minitab:
  x    P( X <= x )
  4       0.202328
6.5       0.894350
And P( 4 < X < 6.5 ) = 0.894350 - 0.202328 (why?) = 0.692022

Exercise 2 (Individual Exercise)
Given the normal curve with μ = 10 and σ = 1.5, what is the area under the curve from
- -∞ to 8.5
- -∞ to 7
- -∞ to 5.5
- -∞ to 11.5
- -∞ to 13
- -∞ to 14.5?
Given the same normal curve (μ = 10 and σ = 1.5), what is the area under the curve from
- 8.5 to 11.5
- 7 to 13
- 5.5 to 14.5?

Normality Test
The estimation of process characteristics is based on modeling with the normal distribution.
Question: how do we know whether a given process is normally distributed?
Answer: a normality test.
Normality tests quantify the discrepancy between the distribution of the data and an ideal normal distribution. Three kinds of normality tests can be performed in Minitab:
- Anderson-Darling (A-D)
- Ryan-Joiner
- Kolmogorov-Smirnov
Stat > Basic Statistics > Graphical Summary, Variables = Normal

Normal Distribution Summary
General guidelines: we can assume that the data are normally distributed if ALL of the following criteria are fulfilled:
- P-value > 0.05
- |Skewness| < 1
- |Kurtosis| < 1

Normal Probability Plots

Exercise 5 (Individual Exercise)
Q1: Samples from a batch of manufactured metal rods were tested for tensile strength. The data are in the (tensile test) datasheet. The manufacturing specification for tensile strength is a minimum of 1980 psi. Is the distribution normal? What is the expected scrap rate of these metal rods?
2035, 2031, 2020, 2008, 1978, 2011, 2022, 1964, 2052, 1974, 1988, 1969, 2028, 2075, 2065, 1986, 1904, 1946, 2034, 1989, 2076, 2047, 1970, 2053, 2043
Q2: In the Olympic rifle event (10 feet), the scores of participant A for the first 10 shots are as below. Is the distribution normal? What is the probability that A shoots 10 points or above? Do you think it is likely that A shoots 10.6 points or above?
9.3, 10, 10, 9.9
Q3: Rivets were received from the supplier with the specification shown in the figure. 20 samples were taken to measure the cap diameter. Please determine the mean and standard deviation of the rivets. From the data, what is the estimated percentage of parts that fall outside the specification? In your opinion, does this supplier give us good parts? (rivet diameter)
Q4: Please confirm the normality of the following distribution: The waiting time of each
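
The areas under the normal curve in the worked examples above are quoted from Minitab. As a cross-check, they can also be computed with a few lines of Python. The sketch below is only an illustration and is not part of the course material: it assumes Python 3.8 or later (for `statistics.NormalDist`), and the helper names `area_below` and `area_between` are made up for this example.

```python
from statistics import NormalDist


def area_below(x, mean, sd):
    """Area under the normal curve from minus infinity to x, i.e. P(X <= x)."""
    return NormalDist(mu=mean, sigma=sd).cdf(x)


def area_between(lo, hi, mean, sd):
    """Area between lo and hi: P(X <= hi) minus P(X <= lo)."""
    return area_below(hi, mean, sd) - area_below(lo, mean, sd)


if __name__ == "__main__":
    # Worked examples from the slides (mean and standard deviation as given there).
    print(f"mean 7, sd 1.5: P(X <= 6.5) = {area_below(6.5, 7, 1.5):.6f}")
    print(f"mean 7, sd 1.5: P(X <= 7)   = {area_below(7, 7, 1.5):.6f}")
    print(f"mean 7, sd 1.5: P(X <= 8)   = {area_below(8, 7, 1.5):.6f}")
    print(f"mean 5, sd 1.2: P(X <= 4)   = {area_below(4, 5, 1.2):.6f}")
    print(f"mean 5, sd 1.2: P(X <= 6.5) = {area_below(6.5, 5, 1.2):.6f}")
    print(f"mean 5, sd 1.2: P(4 < X < 6.5) = {area_between(4, 6.5, 5, 1.2):.6f}")
```

The same two helpers can be applied to Exercise 2 by passing a mean of 10 and a standard deviation of 1.5. The subtraction in `area_between` is the answer to the "(why?)" in the last example: the area up to 6.5 already contains the area up to 4, so removing it leaves only the part between the two points.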
