
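Strassen algorithm

Before looking at the Strassen algorithm, first review ordinary matrix multiplication. A C program for matrix multiplication:

    #include <stdio.h>

    #define MAX 100

    /* Three arrays store the matrices A, B and C (static storage, zero-initialized). */
    static float a[MAX][MAX], b[MAX][MAX], c[MAX][MAX];

    int main(void)
    {
        int m1, n1, m2, n2, i, j, k;

        printf("Enter the number of rows m1 and columns n1 of matrix A: ");
        if (scanf("%d%d", &m1, &n1) != 2) return 1;
        printf("Enter the number of rows m2 and columns n2 of matrix B: ");
        if (scanf("%d%d", &m2, &n2) != 2) return 1;
        printf("\n");                       /* separate the results from the input */

        if (n1 != m2) {
            printf("The matrices cannot be multiplied!\n\n");
            return 1;
        }
        if (m1 < 1 || n1 < 1 || n2 < 1 || m1 > MAX || n1 > MAX || n2 > MAX) {
            printf("Too many elements, overflow!\n\n");
            return 1;
        }

        for (i = 0; i < m1; i++)            /* read the entries of A */
            for (j = 0; j < n1; j++) {
                printf("A[%d][%d] = ", i + 1, j + 1);
                if (scanf("%f", &a[i][j]) != 1) return 1;
            }
        for (i = 0; i < m2; i++)            /* read the entries of B */
            for (j = 0; j < n2; j++) {
                printf("B[%d][%d] = ", i + 1, j + 1);
                if (scanf("%f", &b[i][j]) != 1) return 1;
            }

        printf("Matrix A:\n");              /* print A and B for checking */
        for (i = 0; i < m1; i++) {
            for (j = 0; j < n1; j++) printf("%f\t", a[i][j]);
            printf("\n");
        }
        printf("\nMatrix B:\n");
        for (i = 0; i < m2; i++) {
            for (j = 0; j < n2; j++) printf("%f\t", b[i][j]);
            printf("\n");
        }

        /* Definition of the product: the two factors share the summation index k. */
        for (i = 0; i < m1; i++)
            for (j = 0; j < n2; j++) {
                float s = 0;
                for (k = 0; k < n1; k++) s += a[i][k] * b[k][j];
                c[i][j] = s;
            }

        printf("\nMatrix C = A*B:\n");
        for (i = 0; i < m1; i++) {
            for (j = 0; j < n2; j++) printf("%f\t", c[i][j]);
            printf("\n");
        }
        return 0;
    }

Let A, B be two square matrices over a ring R. We want to calculate the matrix product C as

$$C = AB, \qquad A, B, C \in R^{2^n \times 2^n}.$$

If the matrices A, B are not of type $2^n \times 2^n$, we fill the missing rows and columns with zeros.

We partition A, B and C into equally sized block matrices

$$A = \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}, \quad B = \begin{pmatrix} B_{1,1} & B_{1,2} \\ B_{2,1} & B_{2,2} \end{pmatrix}, \quad C = \begin{pmatrix} C_{1,1} & C_{1,2} \\ C_{2,1} & C_{2,2} \end{pmatrix}$$

with

$$A_{i,j}, B_{i,j}, C_{i,j} \in R^{2^{n-1} \times 2^{n-1}}.$$

Then

$$C_{1,1} = A_{1,1} B_{1,1} + A_{1,2} B_{2,1}$$
$$C_{1,2} = A_{1,1} B_{1,2} + A_{1,2} B_{2,2}$$
$$C_{2,1} = A_{2,1} B_{1,1} + A_{2,2} B_{2,1}$$
$$C_{2,2} = A_{2,1} B_{1,2} + A_{2,2} B_{2,2}$$

With this construction we have not reduced the number of multiplications. We still need 8 multiplications to calculate the $C_{i,j}$ matrices, the same number we need when using standard matrix multiplication.

Now comes the important part. We define new matrices

$$M_1 = (A_{1,1} + A_{2,2})(B_{1,1} + B_{2,2})$$
$$M_2 = (A_{2,1} + A_{2,2}) B_{1,1}$$
$$M_3 = A_{1,1} (B_{1,2} - B_{2,2})$$
$$M_4 = A_{2,2} (B_{2,1} - B_{1,1})$$
$$M_5 = (A_{1,1} + A_{1,2}) B_{2,2}$$
$$M_6 = (A_{2,1} - A_{1,1})(B_{1,1} + B_{1,2})$$
$$M_7 = (A_{1,2} - A_{2,2})(B_{2,1} + B_{2,2})$$

only using 7 multiplications (one for each $M_k$) instead of 8. We may now express the $C_{i,j}$ in terms of $M_k$, like this:

$$C_{1,1} = M_1 + M_4 - M_5 + M_7$$
$$C_{1,2} = M_3 + M_5$$
$$C_{2,1} = M_2 + M_4$$
$$C_{2,2} = M_1 - M_2 + M_3 + M_6$$

Figure: The left column represents 2×2 matrix multiplication. Naive matrix multiplication requires one multiplication for each "1" of the left column. Each of the other columns represents a single one of the 7 multiplications in the algorithm, and the sum of the columns gives the full matrix multiplication on the left.

We iterate this division process n times (recursively) until the submatrices degenerate into numbers (elements of the ring R).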
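The recursive division described above can be sketched in C roughly as follows. This is a minimal illustration, not an optimized implementation: it assumes the matrix size is already a power of two, uses the standard seven-product formulas for M1 to M7, and recurses all the way down to 1×1 blocks. The names strassen and add_blocks and the row-stride parameters are hypothetical choices made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: out = x + sign * y for n-by-n blocks with row strides. */
static void add_blocks(int n, const double *x, int sx, const double *y, int sy,
                       double sign, double *out, int so)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            out[i * so + j] = x[i * sx + j] + sign * y[i * sy + j];
}

/* C = A * B for n-by-n row-major matrices, n a power of two.
   sa, sb, sc are the row strides; C must not overlap A or B. */
void strassen(int n, const double *A, int sa, const double *B, int sb,
              double *C, int sc)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    if (n == 1) { C[0] = A[0] * B[0]; return; }

    int h = n / 2;
    const double *A11 = A, *A12 = A + h, *A21 = A + h * sa, *A22 = A21 + h;
    const double *B11 = B, *B12 = B + h, *B21 = B + h * sb, *B22 = B21 + h;

    /* Scratch space: two operand buffers S, T and the seven products. */
    size_t blk = (size_t)h * h;
    double *buf = malloc(9 * blk * sizeof *buf);
    assert(buf != NULL);
    double *S = buf, *T = S + blk;
    double *M1 = T + blk, *M2 = M1 + blk, *M3 = M2 + blk, *M4 = M3 + blk;
    double *M5 = M4 + blk, *M6 = M5 + blk, *M7 = M6 + blk;

    add_blocks(h, A11, sa, A22, sa, +1, S, h);
    add_blocks(h, B11, sb, B22, sb, +1, T, h);
    strassen(h, S, h, T, h, M1, h);              /* M1 = (A11+A22)(B11+B22) */
    add_blocks(h, A21, sa, A22, sa, +1, S, h);
    strassen(h, S, h, B11, sb, M2, h);           /* M2 = (A21+A22) B11      */
    add_blocks(h, B12, sb, B22, sb, -1, T, h);
    strassen(h, A11, sa, T, h, M3, h);           /* M3 = A11 (B12-B22)      */
    add_blocks(h, B21, sb, B11, sb, -1, T, h);
    strassen(h, A22, sa, T, h, M4, h);           /* M4 = A22 (B21-B11)      */
    add_blocks(h, A11, sa, A12, sa, +1, S, h);
    strassen(h, S, h, B22, sb, M5, h);           /* M5 = (A11+A12) B22      */
    add_blocks(h, A21, sa, A11, sa, -1, S, h);
    add_blocks(h, B11, sb, B12, sb, +1, T, h);
    strassen(h, S, h, T, h, M6, h);              /* M6 = (A21-A11)(B11+B12) */
    add_blocks(h, A12, sa, A22, sa, -1, S, h);
    add_blocks(h, B21, sb, B22, sb, +1, T, h);
    strassen(h, S, h, T, h, M7, h);              /* M7 = (A12-A22)(B21+B22) */

    /* Combine the seven products into the four quadrants of C. */
    for (int i = 0; i < h; i++)
        for (int j = 0; j < h; j++) {
            int p = i * h + j;
            C[i * sc + j]           = M1[p] + M4[p] - M5[p] + M7[p];  /* C11 */
            C[i * sc + j + h]       = M3[p] + M5[p];                  /* C12 */
            C[(i + h) * sc + j]     = M2[p] + M4[p];                  /* C21 */
            C[(i + h) * sc + j + h] = M1[p] - M2[p] + M3[p] + M6[p];  /* C22 */
        }
    free(buf);
}
```

For a 4×4 product stored contiguously, a call would look like strassen(4, A, 4, B, 4, C, 4). The strides let the quadrants of A, B and C be addressed in place, so only the operand sums and the seven products need extra storage.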
The resulting product will be padded with zeroes just like A and B, and should be stripped of the corresponding rows and columns.

Practical implementations of Strassen's algorithm switch to standard methods of matrix multiplication for small enough submatrices, for which those algorithms are more efficient. The particular crossover point for which Strassen's algorithm is more efficient depends on the specific implementation and hardware. Earlier authors had estimated that Strassen's algorithm is faster for matrices with widths from 32 to 128 for optimized implementations.[1] However, it has been observed that this crossover point has been increasing in recent years, and a 2010 study found that even a single step of Strassen's algorithm is often not beneficial on current architectures, compared to a highly optimized traditional multiplication, until matrix sizes exceed 1000 or more, and even for matrix sizes of several thousand the benefit is typically marginal at best (around 10% or less).[2]

Asymptotic complexity

The standard matrix multiplication takes approximately $2N^3$ (where $N = 2^n$) arithmetic operations (additions and multiplications); the asymptotic complexity is $O(N^3)$.

The number of additions and multiplications required in the Strassen algorithm can be calculated as follows: let $f(n)$ be the number of operations for a $2^n \times 2^n$ matrix. Then by recursive application of the Strassen algorithm, we see that $f(n) = 7f(n-1) + l4^n$, for some constant $l$ that depends on the number of additions performed at each application of the algorithm. Hence $f(n) = (7 + o(1))^n$, i.e., the asymptotic complexity for multiplying matrices of size $N = 2^n$ using the Strassen algorithm is

$$O\left((7 + o(1))^n\right) = O\left(N^{\log_2 7 + o(1)}\right) \approx O\left(N^{2.8074}\right).$$

The reduction in the number of arithmetic operations however comes at the price of a somewhat reduced numerical stability, and the algorithm also requires significantly more memory compared to the naive algorithm.
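To make the operation counts from the asymptotic-complexity discussion concrete, the recurrence can be evaluated directly. The sketch below is illustrative only and rests on two assumptions that the text does not fix: each level of the standard seven-product formulation uses 18 block additions or subtractions (which corresponds to l = 4.5 in $f(n) = 7f(n-1) + l4^n$), and the recursion runs all the way down to 1×1 blocks. The function names strassen_ops and classical_ops are hypothetical.

```c
#include <assert.h>

/* Arithmetic operations for a fully recursive Strassen product of two
   2^n-by-2^n matrices: f(0) = 1, f(k) = 7 f(k-1) + 18 * 4^(k-1).
   The 18 block additions per level are an assumption of this sketch. */
unsigned long long strassen_ops(int n)
{
    assert(n >= 0 && n <= 20);           /* keeps the counts inside 64 bits */
    unsigned long long f = 1;            /* one scalar multiplication */
    unsigned long long block = 1;        /* entries in a half-size block, 4^(k-1) */
    for (int k = 1; k <= n; k++) {
        f = 7 * f + 18 * block;
        block *= 4;
    }
    return f;
}

/* Arithmetic operations for the classical algorithm on N = 2^n:
   N^3 multiplications plus N^3 - N^2 additions. */
unsigned long long classical_ops(int n)
{
    assert(n >= 0 && n <= 20);
    unsigned long long N = 1ULL << n;
    return 2 * N * N * N - N * N;
}
```

Under these assumptions the Strassen count first drops below the classical count at n = 10 (N = 1024). This compares arithmetic operations only; it says nothing about running time, which also depends on memory traffic and on how well the classical kernel is optimized.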
Both initial matrices must have their dimensions expanded to the next power of 2, which results in storing up to four times as many elements, and the seven auxiliary matrices each contain a quarter of the elements in the expanded ones.

Rank or bilinear complexity

The bilinear complexity or rank of a bilinear map is an important concept in the asymptotic complexity of matrix multiplication. The rank of a bilinear map $\phi: A \times B \to C$ over a field F is defined as (somewhat of an abuse of notation)

$$R(\phi/F) = \min\left\{ r \;\middle|\; \exists f_i \in A^*, g_i \in B^*, w_i \in C, \; \forall a \in A, b \in B: \; \phi(a, b) = \sum_{i=1}^{r} f_i(a) g_i(b) w_i \right\}$$

In other words, the rank of a bilinear map is the length of its shortest bilinear computation.[3] The existence of Strassen's algorithm shows that the rank of 2×2 matrix multiplication is no more than seven. To see this, let us express this algorithm (alongside the standard algorithm) as such a bilinear computation. In the case of matrices, the dual spaces A* and B* consist of maps into the field F induced by a scalar double-dot product (i.e., in this case, the sum of all the entries of a Hadamard product).

It can be shown that the total number of elementary multiplications L required for matrix multiplication is tightly asymptotically bound to the rank R, i.e., $L = \Theta(R)$, or more specifically, since the constants are known, $\tfrac{1}{2} R \le L \le R$. One useful property of the rank is that it is submultiplicative for tensor products, and this enables one to show that $2^n \times 2^n \times 2^n$ matrix multiplication can be accomplished with no more than $7^n$ elementary multiplications for any n. (This n-fold tensor product of the $2 \times 2 \times 2$ m
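As an illustration of the bilinear-computation view described in this section, the sketch below writes the 2×2 Strassen step as seven triples $(f_i, g_i, w_i)$. Each $f_i$ and $g_i$ is stored as a 2×2 coefficient matrix and applied as the sum of the entries of a Hadamard product, and each $w_i$ says which entries of the result the i-th product contributes to. The table names F, G, W and the function names are hypothetical, and the coefficients are taken from the standard seven-product formulas rather than from a table in this text.

```c
/* Entries are ordered (1,1), (1,2), (2,1), (2,2). */
static const int F[7][4] = {          /* linear forms on the entries of a */
    { 1, 0, 0, 1}, { 0, 0, 1, 1}, { 1, 0, 0, 0}, { 0, 0, 0, 1},
    { 1, 1, 0, 0}, {-1, 0, 1, 0}, { 0, 1, 0,-1}
};
static const int G[7][4] = {          /* linear forms on the entries of b */
    { 1, 0, 0, 1}, { 1, 0, 0, 0}, { 0, 1, 0,-1}, {-1, 0, 1, 0},
    { 0, 0, 0, 1}, { 1, 1, 0, 0}, { 0, 0, 1, 1}
};
static const int W[7][4] = {          /* where each product lands in c */
    { 1, 0, 0, 1}, { 0, 0, 1,-1}, { 0, 1, 0, 1}, { 1, 0, 1, 0},
    {-1, 1, 0, 0}, { 0, 0, 0, 1}, { 1, 0, 0, 0}
};

/* Sum of the entries of the Hadamard product of a coefficient matrix and x. */
static double double_dot(const int m[4], const double x[4])
{
    double s = 0;
    for (int k = 0; k < 4; k++) s += m[k] * x[k];
    return s;
}

/* c = a * b for 2x2 matrices, computed as sum_i f_i(a) g_i(b) w_i
   with exactly seven multiplications of the form f_i(a) * g_i(b). */
void bilinear_mul2x2(const double a[4], const double b[4], double c[4])
{
    for (int k = 0; k < 4; k++) c[k] = 0;
    for (int i = 0; i < 7; i++) {
        double p = double_dot(F[i], a) * double_dot(G[i], b);
        for (int k = 0; k < 4; k++) c[k] += W[i][k] * p;
    }
}
```

The multiplications by table entries are only by 0, 1 and -1, so the seven products f_i(a) * g_i(b) are the only elementary multiplications, which is what the bound of seven on the rank refers to.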
