Extraction and Analysis of Digital Information

Peng Li, Mingchuan Zhou, Di Cui
Northeast Agricultural University
Harbin, China
Advisor:

Summary

Given pictures from scientific research or related experiments, we design algorithms and write programs to extract digital information from them for later analysis.

According to the theory of gel electrophoresis, the effective migration rate and the logarithm of molecular weight are linearly related under the experimental conditions. We first formulate a linear model and use interpolation to obtain the molecular weight of each fragment. Then a grey relational model based on grey system theory is built to judge the relationship between the unknown breed and the first 9 breeds. Finally, by comparing the degrees of association, we conclude that the 10th breed is a new variety.

For the second problem, we build a mathematical model based on Monte Carlo simulation to calculate the mottled area of one seed. Under the assumption that the whole seed can be treated as a spherical (ellipsoidal) surface, the image is the projection of a hemisphere (semi-ellipsoid). We then select as many dots as possible ($p$ of them), uniformly at random, on the hemispherical (semi-ellipsoidal) surface (of area $A$) and map them onto the projection drawing. We obtain the function expression of the motley outline in coordinates and count the number $t$ of dots inside the projection of the motley. After checking the result, we conclude that the mottled area on the hemisphere is $S = A \cdot t / p$. When the number of seeds is large, we use sampling to estimate the mottled area.
The same method gives the mottled area of the seeds in each layer, and the total mottled area is the sum over the layers.

In this way, a linear model, a grey relational model and Monte Carlo simulation together let us translate pictures into digital information.

Introduction

Pictures from scientific research or related experiments often need to be processed and analyzed, for example to extract data. To save time and manpower, we design algorithms and write programs that solve problems based on experimental pictures.

Gel electrophoresis is a procedure for separating a mixture of molecules through a stationary material (gel) in an electrical field. Once the charge effect of the protein molecules has been removed, their migration rate depends entirely on molecular weight. Therefore, the effective migration rate and the logarithm of molecular weight are linearly related. Based on this principle, we formulate a linear model and use interpolation to translate pictures into digital information. A grey relational model is then built on grey system theory to judge the relationship between the first 9 breeds and the unknown breed. By comparing the degrees of association, we conclude that the 10th breed is a new variety.

Seed motleys, which are caused by virus infection, often appear around the hilum. The mottled area ratio of the seeds of a breed is closely related to its disease resistance. To calculate the mottled area of one seed, we build a model based on Monte Carlo simulation. First, we treat the whole seed surface as a spherical (ellipsoidal) surface. The sphere can be divided symmetrically into two parts, because a picture shows only one hemisphere (semi-ellipsoid). Then we pick as many dots as possible, uniformly at random, on the hemispherical surface and map them onto the projection drawing.
With the aid of coordinates, we obtain the function expression of the motley outline and count the number of dots inside the projection. The mottled area on the hemisphere can then be expressed as $S = A \cdot t / p$, where $A$ is the area of the hemispherical surface, $p$ the number of dots on it and $t$ the number of dots inside the projection of the motley. Finally, we work through a simple example to check our conclusion.

Task 1: Problems Based on the Electrophoretogram

We first formulate a linear model and use interpolation to translate the pictures into the digital information we need. Then a grey relational model is built on grey system theory to judge the relationship between the first 9 breeds and the unknown breed, from which we conclude whether the 10th breed is a new one.

To simplify the model, we make the following assumptions about the experimental process:

1. No errors occurred during the experiments.
2. The experiments do not differ in material uniformity or duration.
3. Crops can be divided exactly into different sorts according to the locations of the thirteen fragments in the experiments.
4. The influence of measurement inaccuracy can be neglected.
5. The migration distance of the fragment whose molecular weight is 700 is taken as zero.

Model Setup

Model 1: Programming Theory Model

As described in the background, the effective migration rate and the logarithm of molecular weight are linearly related under the experimental conditions. The molecular weight of a protein can therefore be read from the standard curve given by this linear relationship.

In each picture we measure the migration distance, that is, the distance between the final location of the fragment and the notch. The distance migrated per unit time is defined as the effective migration rate. Since this rate is linear in the logarithm of molecular weight, and all fragments in one picture migrate for the same time, the migration distance is linear in it as well.
Algorithm Design: Interpolation Method

For each picture, let $x_k$ ($k = 1, \dots, 7$) be the independent variables representing the migration distances of the Marker's seven fragments, $y_k$ the dependent variables representing the corresponding logarithms of molecular weight, and $x'_j$ ($j = 1, \dots, 13$) the independent variables representing the migration distances of the thirteen fragments. The data are as follows:

Table 1: Migration Distance of Marker (unit : )

location    700   500   400   300   200   150   100
picture1      0   9.9  22.2    34  46.3    65  78.4
picture2      0  10.6  22.7  35.1  46.6  66.2  79.3
picture3      0   8.7  22.2    34  44.7  61.4  74.4
picture4      0  37.7  58.6  68.2
picture5      0   9.4    24  29.2  40.1    66  73.5
picture6      0   7.8  14.9  27.5  38.2  59.2  68.7
picture7      0  13.4  28.8  38.5  44.5  53.1  71.6
picture8      0  39.4  48.4  66.8
picture9      0   7.8  22.6  34.8  54.9  70.3    74
picture10     0  14.4  28.8  33.6  43.5  53.3  72.9

The rows for pictures 4 and 8 contain fewer entries than there are columns; the entries are listed in the order given, so their column positions are not reliable.

Table 2: Migration Distance of the thirteen fragments (unit : )

fragment      1     2     3     4     5     6     7     8     9    10    11    12    13
picture1   27.7    16  18.2    15  20.4  15.5    16  18.1  16.7  24.3
picture2   22.1  15.9  46.6  15.5  18.1  16.9  27.7  28.5  41.4  32.7
picture3   14.3    14  21.2    14  27.8  51.4  14.3  19.6  14.3  21.2  23.8  43.6  31.2
picture4    8.8  13.6  14.6  17.4    14  32.2  28.4
picture5   14.3  14.8  20.9  14.5  20.2  44.4  15.3  16.3  11.7  24.5  23.5  19.9  29.8
picture6    7.8   7.2    10   8.3    13  38.2  10.9  12.2  13.7  16.7  17.2  33.9  26.5
picture7   21.1  45.8  24.6  19.9  25.9  53.6  19.4  25.1  20.4  28.9  27.9  52.4  39.6
picture8   16.2    17  17.9  23.4  26.5  42.7  36.1
picture9   13.5  16.4  20.2  15.9  19.9  48.6  15.1    19  13.7  20.3  25.9  41.2  30.5
picture10  20.1  20.5  21.6  20.9    28  53.7  19.5  26.3  19.8  27.9  22.5  53.9  44.2

The rows for pictures 1, 2, 4 and 8 contain fewer than thirteen entries; the entries are listed in the order given, so their column positions are not reliable.

We take the Marker as the standard and interpolate on the points $(x_k, y_k)$ to obtain the logarithm of each fragment's molecular weight; exponentiating the result gives the molecular weight itself.
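The interpolation step can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB program, assuming piecewise-linear interpolation of log10(molecular weight) against migration distance between neighbouring Marker bands. The function name `molecular_weight` and the constant names are hypothetical; the numbers are the picture 3 rows of Tables 1 and 2.

```python
import math
from bisect import bisect_right

# Marker molecular weights (Table 1 header) and picture 3's marker distances.
MARKER_WEIGHTS = [700, 500, 400, 300, 200, 150, 100]
PICTURE3_MARKER_DISTANCES = [0, 8.7, 22.2, 34, 44.7, 61.4, 74.4]


def molecular_weight(distance, marker_distances, marker_weights=MARKER_WEIGHTS):
    """Interpolate log10(weight) linearly in distance, then undo the logarithm.

    Assumes marker_distances is strictly increasing. Distances outside the
    marker range are rejected rather than extrapolated.
    """
    if not marker_distances[0] <= distance <= marker_distances[-1]:
        raise ValueError("distance outside the marker range")
    # Index k of the marker interval [x_k, x_{k+1}] that contains the distance.
    k = min(bisect_right(marker_distances, distance) - 1, len(marker_distances) - 2)
    x0, x1 = marker_distances[k], marker_distances[k + 1]
    y0 = math.log10(marker_weights[k])
    y1 = math.log10(marker_weights[k + 1])
    log_weight = y0 + (distance - x0) * (y1 - y0) / (x1 - x0)
    return 10 ** log_weight


# Fragments 1 and 6 of picture 3 (distances 14.3 and 51.4 in Table 2).
weights = [molecular_weight(d, PICTURE3_MARKER_DISTANCES) for d in (14.3, 51.4)]
print([round(w, 2) for w in weights])
```

Under these assumptions the two values come out close to 455.8 and 178.2, which is what the paper reports for picture 3.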
Programming implementation: one-dimensional linear interpolation

Results

Table 3: Molecular Weight of Each Fragment (rows: fragments 1 to 13; columns: pictures 1 to 10)

fragment       1       2       3       4       5       6       7       8       9      10
1         403.66  404.46   455.8  481.23  463.93     500  447.22  446.91  458.83  457.73
2         447.63  453.45  458.07  490.53  460.39  513.11  191.49  441.92   439.2  454.91
3         430.12  419.66  406.68  463.18  419.42   466.6  425.11   405.1  414.75  447.22
4         455.82  460.19  458.07  475.14  462.51   492.2  455.06  453.22  442.53   452.1
5         413.29  413.51  348.96  407.74  423.93  424.62  417.18  404.53  416.63     405
6         201.98     200  178.19  195.91  190.67     200  148.36  126.59   227.1  148.76
7          451.7   456.8   455.8  440.15  456.89  453.59  458.37  453.86  447.89  462.01
8          430.9  435.42  417.58  423.63  449.96  435.43  422.04  417.21  422.32  415.81
9         441.98  445.16   455.8  396.35  482.73  415.39  451.78  436.37  457.45  459.87
10        380.05   356.2  406.68  371.61   389.1  383.91  398.83  403.96  414.13  405.63
11        372.71  349.65  384.71  402.57  403.08  379.55  405.26  366.94  370.07  441.03
12        228.94  240.24  208.51  244.95  425.88  235.39  153.55  179.97  263.67  148.14
13        298.03  317.18   321.2  281.78  293.38  306.93  278.51  240.23  332.02  195.93

Model Strengths

When the picture is stretched, taking picture 1 as an example, the new data are as follows:

Table 4: New Distances of Marker 1 (unit : )

location   700   500   400   300   200   150    100
Marker       0  13.3  29.6  45.9  60.4  85.4  102.3

Table 5: New Distances of Fragments in Picture 1 (unit : )

fragment     1     2     3     4     5     6     7     8     9    10    11    12    13
distance  27.5  19.9  22.3  18.7  25.7  60.1  19.6    22  20.5  31.1    32  54.3  45.6

Using the new data, we apply the interpolation method again and obtain new results. We then carry out the error check:

$e = \dfrac{|M_{\text{new}} - M_{\text{old}}|}{M_{\text{old}}}$

In the formula, $e$ is the error, $M_{\text{new}}$ is the new molecular weight and $M_{\text{old}}$ is the old molecular weight.

Table 6: Result comparison for molecular weight

fragment       Old       New   Error
1         403.6585  411.6786  0.0199
2         447.6255  456.8088  0.0205
3         430.1151  442.0469  0.0277
4         455.8187  464.3737  0.0188
5         413.2897   421.947  0.0209
6         201.9844  201.6815  0.0015
7         451.7035  458.6884  0.0155
8          430.896  443.8657  0.0301
9          441.978  453.0728  0.0251
10        380.0486  389.5626  0.0250
11        372.7076  383.4232  0.0288
12        228.9415   237.196  0.0361
13        298.0337   301.598  0.0120

The figures show that all errors are smaller than 5%, which indicates that the model is reasonably robust.
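A short derivation suggests why resizing should not matter in principle. It is a sketch under the assumption that stretching multiplies every distance in a picture by the same factor $c > 0$. The symbols $x$ (fragment distance), $x_k$ (Marker distances), $y_k$ (logarithms of the Marker weights), $\hat{y}$ (interpolated logarithm) and $c$ are used here for illustration only.

```latex
\[
\hat{y}(x) = y_k + \frac{x - x_k}{x_{k+1} - x_k}\,(y_{k+1} - y_k),
\qquad x_k \le x \le x_{k+1}
\]
\[
\frac{c\,x - c\,x_k}{c\,x_{k+1} - c\,x_k} = \frac{x - x_k}{x_{k+1} - x_k}
\quad\Longrightarrow\quad
\hat{y}_{\text{stretched}}(c\,x) = \hat{y}(x)
\]
```

Under this assumption the interpolated weight is exactly unchanged, so errors of a few percent like those in Table 6 would come from re-measuring the distances rather than from the interpolation itself.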
Therefore, the model is practical and does not depend on the picture size.

Model 2: The Grey Relational Model

We use grey relational analysis to obtain the degree of association between the unknown material and the first nine materials, based on the molecular weight of each fragment. The steps are as follows:

Step 1: Mother Sequence and Subsequence Selection

Mother sequence: we define the molecular weights of the 10th material as the mother sequence $x_0 = \{x_0(k)\}$.

Subsequences: we define the molecular weights of the first 9 materials obtained above as the subsequences $x_i = \{x_i(k)\}$, where $i$ is the serial number of the material, $i = 1, 2, \dots, 9$, and $k$ is the serial number of the fragment, $k = 1, 2, 3, \dots, 13$.

Step 2: Sequence Initialization

We divide every value of each sequence by the first value of that sequence. In this way we get the initialized subsequences and the initialized mother sequence:

$x'_i(k) = \dfrac{x_i(k)}{x_i(1)}, \quad i = 0, 1, \dots, 9$

Step 3: Obtaining the Degree of Association

Define the absolute difference between the mother sequence and subsequence $i$ at point $k$ as

$\Delta_i(k) = |x'_0(k) - x'_i(k)|$,

and take the maximum and minimum over all subsequences and points:

$\Delta_{\max} = \max_i \max_k \Delta_i(k), \quad \Delta_{\min} = \min_i \min_k \Delta_i(k)$.

The correlation coefficient between subsequence $i$ and the mother sequence at point $k$ is

$\xi_i(k) = \dfrac{\Delta_{\min} + \rho\,\Delta_{\max}}{\Delta_i(k) + \rho\,\Delta_{\max}}$,

where $\rho$ is the identification coefficient, whose value lies between 0 and 1. The value chosen usually varies with the situation; here we set $\rho = 0.5$. The degree of association between subsequence $i$ and the mother sequence is then

$r_i = \dfrac{1}{13} \sum_{k=1}^{13} \xi_i(k)$.

After implementing this on a computer, we obtain the degrees of association. The results are as follows:

Table 7: Degree of association

sequence     1     2     3     4     5     6     7     8     9
value     0.44  0.44  0.44  0.44  0.42  0.43  0.47  0.46  0.43

Since the degree of association reflects similarity, the values in Table 7 show that the similarity between the unknown material and each of the other materials is no more than 0.5. Therefore, we conclude that the 10th material is a new variety.
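The three steps can be sketched in a few lines of Python. This is an illustrative version rather than the authors' program: it assumes the global minimum and maximum differences are taken over all subsequences and points, and it uses made-up toy sequences, not the molecular weights of Table 3. The function name `grey_relational_degrees` is hypothetical.

```python
def grey_relational_degrees(mother, subsequences, rho=0.5):
    """Degree of association between a mother sequence and each subsequence.

    mother: list of positive numbers; subsequences: lists of the same length.
    rho is the identification coefficient (0.5, as chosen in the text).
    """
    # Step 2: initialise every sequence by dividing by its first value.
    x0 = [v / mother[0] for v in mother]
    xs = [[v / seq[0] for v in seq] for seq in subsequences]
    # Step 3: absolute differences and their global extremes.
    deltas = [[abs(a - b) for a, b in zip(x0, seq)] for seq in xs]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # Every subsequence coincides with the mother sequence.
        return [1.0 for _ in xs]
    degrees = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))
    return degrees


# Toy data for illustration only (not taken from the paper).
mother = [400.0, 440.0, 420.0, 200.0]
subs = [
    [400.0, 440.0, 420.0, 200.0],  # identical to the mother sequence
    [200.0, 220.0, 210.0, 100.0],  # proportional: same shape after initialisation
    [400.0, 300.0, 500.0, 350.0],  # different shape
]
degrees = grey_relational_degrees(mother, subs)
print([round(r, 3) for r in degrees])
```

In this toy case the first two subsequences reach a degree of 1 because initialisation removes overall scale, while the third falls well below 1.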
The MATLAB program is as follows:

    function r = H(a, b)
    % a: mother sequence (13-by-1); b: subsequences (13-by-9, one per column)
    a = a ./ a(1,1);                  % initialise the mother sequence
    g = zeros(13, 9);
    for i = 1:9
        b(:,i) = b(:,i) ./ b(1,i);    % initialise subsequence i by its first value
        for j = 1:13
            g(j,i) = abs(a(j,1) - b(j,i));   % absolute difference
        end
    end
    mi = min(g(:));                   % smallest difference over all points
    ma = max(g(:));                   % largest difference over all points
    gl = zeros(13, 9);
    for j = 1:13
        for i = 1:9
            gl(j,i) = (mi + 0.5*ma) / (g(j,i) + 0.5*ma);   % correlation coefficient
        end
    end
    r = zeros(1, 9);
    for i = 1:9
        r(1,i) = sum(gl(:,i)) / 13;   % degree of association
    end
    r

Task 2: Seeds Mottled Area Measurement

This problem requires extracting information from pictures. To reduce the three-dimensional problem to a planar one, we make the following assumptions to simplify our model:

1. The whole seed is treated as a sphere (ellipsoid).
2. The influence of shape change during projection can be neglected.
3. Extreme cases are not taken into account.

Model Setup

Model 1: Seeds Mottled Area Calculation

When calculating the mottled area of a seed, the pictures give us the most direct information. Since we simplify the whole seed to a sphere (ellipsoid), the image of the seed in the picture can be seen as the projection of a hemisphere (semi-ellipsoid). Then, using Monte Carlo simulation, we pick as many dots as possible, uniformly at random, on the hemispherical (semi-ellipsoidal) surface and map them onto the projection drawing. With the aid of coordinates, we obtain the function expression of the motley outline and count the number of dots inside the projection. In this way the mottled area on the hemisphere (semi-ellipsoid) can be measured.

The steps are as follows. First we define some variables:

$R$: radius of the hemisphere, which represents half of the seed.
$S$: the seed mottled area on one side.
$A$: area of the hemispherical (semi-ellipsoidal) surface.
$t$: the number of dots inside the projection of the motley.
$p$: the number of dots we select on the hemispherical (semi-ellipsoidal) surface.

Step 1: Selection of Testing Dots

To make sure that the testing dots adequately represent the hemispherical (semi-ellipsoidal) surface, we must select as many dots as possible, uniformly at random. We enclose the hemisphere (semi-ellipsoid) in a rectangular solid that is tangent to it. Then we use the Monte Carlo method to simulate $n$ testing dots in the cuboid, 100000 for example. Next we discard the dots outside the hemisphere (semi-ellipsoid) and project the dots inside it onto the hemispherical (semi-ellipsoidal) surface. This gives $p$ evenly distributed dots on the surface.

Step 2: Simplify the Projection of the Motley

The image of the seed in the picture can be seen as the projection of a hemisphere (semi-ellipsoid). After image processing that fades the seed surface, we get the projection of the motley in coordinates. We connect $m$ dots chosen on the edge of the outline to obtain a polygon that represents the motley, and from the polygon we get the function expression of its boundary.

Step 3: Calculate Seeds Mottled Area

We plot the dots on the hemispherical (semi-ellipsoidal) surface in the same coordinates. Using the function, we count the number of dots $t$ inside the projection of the motley, so the mottled area on the hemisphere (semi-ellipsoid) can be expressed as

$S = A \cdot \dfrac{t}{p}$

Step 4: Simulated Experiment on the Other Side

The simulation process is the same as described above. The total mottled area of the seed is the sum of the two results, $S_{\text{total}} = S_1 + S_2$, where $S_1$ and $S_2$ are the mottled areas on the two sides.

Example: For convenience of calculation, we take a simple case. Suppose the projection of a seed motley is a circle whose radius is 5. The radius of the sphere ($R$) is 10.
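The centre of the motley projection coincides with the centre of the hemisphere's projection.

The following is the MATLAB program:

    function p(n)
    % n is the number of trial dots generated inside the cuboid
    a = 20.0*rand(n,1) - 10.0;   % x coordinates inside the cuboid
    b = 20.0*rand(n,1) - 10.0;   % y coordinates inside the cuboid
    c = 10.0*rand(n,1);          % z coordinates inside the cuboid
    q = 0;                       % number of dots kept on the hemisphere
    for i = 1:n
        if a(i)*a(i) + b(i)*b(i) + c(i)*c(i) <= 100.00
            q = q + 1;
            r = sqrt(a(i)*a(i) + b(i)*b(i) + c(i)*c(i));
            v(q) = 10*a(i)/r;    % project the dot radially onto the surface
            w(q) = 10*b(i)/r;
            z(q) = 10*c(i)/r;
        end
    end
    q
    t = 0;                       % number of dots inside the motley projection
    for j = 1:q
        if v(j)^2 + w(j)^2 <= 25.00
            t = t + 1;
        end
    end
    t
    S = 2*pi*(10^2)*t/q
    plot(v, w, '.')

Result:

Time      R      p     t      S       n
1     10.00  52124  6913  83.33  100000
2     10.00  52240  6980  83.95  100000
3     10.00  52531  7119  85.15  100000
4     10.00  52358  7024  84.2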
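The example can be cross-checked with a short Python sketch of Steps 1 to 3. It is an illustration under stated assumptions, not the authors' program. The helper names (`sample_hemisphere`, `inside_polygon`, `mottled_area`), the fixed random seed and the 36-sided polygon outline are hypothetical choices made here. The closed-form comparison uses the standard spherical cap area $2\pi R\,(R - \sqrt{R^2 - \rho^2})$ for a circular projection of radius $\rho$.

```python
import math
import random


def sample_hemisphere(n, radius, rng):
    """Step 1: draw n trial dots in the bounding cuboid, keep those inside
    the hemisphere and push them radially onto its surface."""
    dots = []
    for _ in range(n):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        z = rng.uniform(0.0, radius)
        r = math.sqrt(x * x + y * y + z * z)
        if 0.0 < r <= radius:
            s = radius / r
            dots.append((x * s, y * s, z * s))
    return dots


def inside_polygon(px, py, vertices):
    """Step 2: ray-casting test for a point against a polygon outline."""
    inside = False
    j = len(vertices) - 1
    for i in range(len(vertices)):
        xi, yi = vertices[i]
        xj, yj = vertices[j]
        if (yi > py) != (yj > py):
            x_cross = xi + (py - yi) * (xj - xi) / (yj - yi)
            if px < x_cross:
                inside = not inside
        j = i
    return inside


def mottled_area(dots, radius, inside):
    """Step 3: S = A * t / p with A = 2*pi*R^2, p = len(dots) and t the
    number of dots whose projection (x, y) satisfies inside(x, y)."""
    t = sum(1 for x, y, _ in dots if inside(x, y))
    return 2.0 * math.pi * radius ** 2 * t / len(dots)


rng = random.Random(0)  # fixed seed so the run is repeatable (an assumption)
R, RHO = 10.0, 5.0      # sphere radius and motley projection radius
dots = sample_hemisphere(100_000, R, rng)

# Circular motley, as in the example.
area_circle = mottled_area(dots, R, lambda x, y: x * x + y * y <= RHO ** 2)

# The same motley outlined by a hypothetical 36-sided polygon.
polygon = [
    (RHO * math.cos(2 * math.pi * k / 36), RHO * math.sin(2 * math.pi * k / 36))
    for k in range(36)
]
area_polygon = mottled_area(dots, R, lambda x, y: inside_polygon(x, y, polygon))

# Closed-form spherical cap area for comparison.
area_exact = 2.0 * math.pi * R * (R - math.sqrt(R * R - RHO * RHO))
print(round(area_circle, 2), round(area_polygon, 2), round(area_exact, 2))
```

The closed-form cap area is about 84.18, which is consistent with the simulated values of roughly 83 to 85 in the result table. The polygon variant shows how the same counting step would apply to an irregular outline.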
