The Parallel Computation of a 3D Steady Conduction Problem with the Gauss-Seidel Method

Cheng MuLin
Mechanical and Engineering Science Department, Peking University

Abstract

In this paper, I use MPICH to implement the parallel computation of a 3D steady conduction problem. Cases with different meshes and numbers of processes are run to test the parallel performance of the program. The tests show a linear speedup, which indicates that the program has high parallel efficiency.

Introduction

History

Single-processor supercomputers achieved unheard-of speeds beyond 100 million instructions per second and pushed hardware technology to the physical limits of chip building. This progress must come to an end, because physical and architectural bounds limit the computational power that a single-processor system can achieve. Meanwhile, computing tasks from scientific fields such as CFD (Computational Fluid Dynamics) and nuclear physics are becoming more and more complex, and they demand huge memory and high computing speed. Parallel computer systems are designed to meet this need. The whole task is split into smaller pieces or steps, and each processor runs one or more of them. Because different pieces are processed at the same time, the whole task can finish more quickly than on a single-processor computer. In most cases, however, the processors in a parallel computation are not independent of each other, so the exchange of data and messages is unavoidable, and it is very slow compared with the CPU speed. This data and message passing is the most important factor limiting the speed of parallel computers.

In recent years, different paradigms of parallelism have been developed to suit different application fields. Table 1 shows a classification which is not complete but includes the major approaches taken by scientists, engineers, and researchers who apply parallel computing in a variety of fields. Vector/array parallelism was the paradigm adopted in the early period of parallel computation research. Now MIMD (Multiple-Instruction-Multiple-Data) is the most general form, while the SIMD (Single-Instruction-Multiple-Data) and SPMD (Single-Program-Multiple-Data) forms of parallelism appear appropriate for scientific problems whose data are regular and whose calculations are uniform and repetitious.

[Table 1: classification of parallelism paradigms]

During this summer holiday, I studied MPI and MPICH and then developed a parallel program with MPICH for a 3D steady conduction problem under the guidance of Prof. Lin. This paper covers most of that work.

Basic Idea of Parallel Computation

MPI and MPICH

Message passing is a paradigm widely used on certain classes of parallel machines, especially those with distributed memory. To reduce the repetitious work of vendors who apply parallel computing, MPI (Message Passing Interface) was defined. It attempts to specify both the syntax and the semantics of a core of library routines that will be useful to a wide range of users and efficiently implementable on a wide range of computers. MPI describes all its functions in a language-independent notation and also provides ANSI C and FORTRAN 77 versions of the same functions. MPICH is a portable implementation of the full MPI specification for a wide variety of parallel and distributed computing environments.

Measure of Performance

For a single-processor computer, MIPS (Million Instructions Per Second) and MFLOPS (Million Floating-Point Operations Per Second) are the traditional measures of performance. For a parallel computer system, speedup is an often-quoted measure of parallel performance, although it is also a controversial one. Speedup is defined as follows:

$$ S(N) = \frac{T_0}{T(N)} \qquad (1) $$

where T0 is the time to compute a certain problem using a serial program on one processor, and T(N) is the time to compute the same problem using a parallel program on N processors. That is to say, speedup is computed by dividing the solution time on one processor by the solution time on N processors. In practice, T(1) is used instead of T0 for simplicity, so the speedup can be computed as follows:

$$ S(N) = \frac{T(1)}{T(N)} \qquad (2) $$

However, we should remember the slight difference between T(1) and T0: T0 is measured with a serial program, whereas T(1) is measured with the parallel program running on one processor.

Parallel Computation

Problem Description

A 3D steady conduction problem is considered in this paper. The problem is shown in Figure 1. The length (L) of the bar is 0.4 m, the width (D) is 0.1 m, and the height (H) is also 0.1 m. Aluminum is selected as the material of the bar, and the material is homogeneous throughout the bar. The parameters used for aluminum are as follows:

[Aluminum parameters: not recoverable from the source]

There is a temperature difference between the two ends of the bar: the left end is heated to 100 K and the right end is kept at 0 K, so heat flows from left to right and the temperature reaches a steady distribution throughout the bar. On the other four faces of the bar, the adiabatic boundary condition is set; that is to say, no heat escapes through these four faces. A heat source S is also considered, where S is a function of the temperature T. S(T) can represent many cases in which the bar gains or loses heat through non-mechanical processes such as radiation or chemical reaction.

Fig. 1 Problem description

Equations

Because this is a conduction problem without fluid motion, the governing equation is a Poisson equation:

$$ k\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2}\right) + S(T) = 0 \qquad (3) $$

where k is the thermal conductivity of the bar. At the two ends, the boundary conditions are:

$$ T\big|_{x=0} = 100\ \mathrm{K} \qquad (4) $$

$$ T\big|_{x=L} = 0\ \mathrm{K} \qquad (5) $$

On the four side faces of the bar, the adiabatic condition can be expressed as follows:

$$ \frac{\partial T}{\partial y}\bigg|_{y=0} = \frac{\partial T}{\partial y}\bigg|_{y=D} = 0 \qquad (6) $$

$$ \frac{\partial T}{\partial z}\bigg|_{z=0} = \frac{\partial T}{\partial z}\bigg|_{z=H} = 0 \qquad (7) $$

Equations (3) to (7) determine the distribution of the temperature T through the bar. Because my focus is the performance of the parallel computation, the boundary conditions are designed so that the problem can be solved analytically when S(T) is set to zero. In that case the solution is linear:

$$ T(x) = 100\left(1 - \frac{x}{L}\right)\ \mathrm{K} \qquad (8) $$

This equation is used for comparison with the numerical result from the parallel computation.

Discretization and Solution Method

A structured mesh is used, as shown in Figure 1, together with the finite-difference method. First, S(T) is linearized as

$$ S = S_C + S_P T \qquad (9) $$

where S_C and S_P are not constants but vary with the temperature T. Second, Eq. (3) is integrated over the control volume around each grid point. Finally, the temperatures at the grid points are substituted into the integrated equation, and the finite-difference equation can be expressed in this form:

$$ a_P T_P = a_E T_E + a_W T_W + a_N T_N + a_S T_S + a_T T_T + a_B T_B + b $$

$$ a_E = a_W = \frac{k\,\Delta y\,\Delta z}{\Delta x}, \quad a_N = a_S = \frac{k\,\Delta x\,\Delta z}{\Delta y}, \quad a_T = a_B = \frac{k\,\Delta x\,\Delta y}{\Delta z} $$

$$ a_P = a_E + a_W + a_N + a_S + a_T + a_B - S_P\,\Delta x\,\Delta y\,\Delta z, \qquad b = S_C\,\Delta x\,\Delta y\,\Delta z $$

where T_P, T_E, T_W, T_N, T_S, T_T and T_B are the values at the center point and at its east, west, north, south, top and bottom neighbors respectively, and Δx, Δy and Δz are the dimensions of the control volume. The boundary conditions need some careful treatment, but the procedure is not fundamentally different from the above.

Although the Gauss-Seidel line-by-line method makes the iteration converge more quickly than the Gauss-Seidel point-by-point method, we still use the point-by-point method because it suits parallel programming better.

For parallel computing, the mesh is split by several planes perpendicular to the x direction into blocks of approximately equal size. Each processor is responsible for the computation on one block, and the values at the grid points on the splitting planes must be passed between processors. The computation is partitioned, but the data are not. That is to say, at the beginning of the parallel computation all processors perform the initialization at the same time, then each computes its own block, and finally each computing node sends its result to node 0. Node 0 collects the results and writes them to a file.

Result

Three computers, each with two CPUs, are used to build the parallel computer. On a mesh with 20*5*5 grid points, the iteration converges to a numerical solution with a residual of 1.0E-6 after 3142 iterations. Figure 2 shows the distribution of temperature through the bar.

Fig. 2 Temperature distribution

The figure shows that the temperature is constant on each plane of constant x and varies linearly along the x direction. This numerical result is consistent with the analytical result (Eq. 8), which confirms the correctness of the parallel program.

To test the parallel performance of my program, more cases with different meshes and numbers of processors were run on the same parallel computer. The iteration count and the solution time on each processor were recorded for every case. The iteration count increases very slightly when the number of processors increases from 1 to 5 (Figure 3).

Fig. 3 Relative increase in iteration count

Comparing the parallel program with the serial one readily reveals the reason for this increase. A plausible explanation is that the values on the splitting planes are exchanged only between sweeps, so each block updates its boundary points with slightly older neighbor values than the serial sweep would use.

The solution time on node 0 is slightly larger than that on the other nodes because of the reduction operation in the last step of the parallel computation, so the solution time on node 0 is taken as the overall solution time. For a fixed residual, the solution time increases with the number of grid points, as shown in Figure 4.

Fig. 4 Computation time

Fig. 5 Speedup

Figure 5 shows the speedup curves. Each curve corresponds to one mesh with a different number of grid points. When the number of grid points is small, such as 20*5*5 in Figure 5, the speedup is less than 1 and decreases as the number of processors increases. The amount of data passed between processors is large relative to the number of grid points, so communication slows the parallel computation until it is slower than the single-processor computation. When the number of grid points is large enough, such as 60*15*15, 90*25*25 or 120*30*30, the speedup exceeds 1 and increases as more processors are added. Moreover, the speedup curve approaches a straight line as the number of grid points grows, as for 120*30*30. A linear speedup curve with a slope of approximately 0.66 shows that the program has good parallel performance.

Because the speedup is a function of both the mesh and the number of processors, Figure 6 presents the speedup against the number of grid points.

Fig. 6 Speedup

Discussion

Another kind of mesh partition was also tried, but it gave less speedup because more data has to be passed between processors. All the results show that the time spent on communication between processors greatly limits the speed of parallel computation. There are three traditional ways to overcome this limitation. The first is to improve the hardware of the parallel computer, but this is always expensive. The second is to change the interconnection network (IN) topology of the parallel computer.
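
The point-by-point Gauss-Seidel iteration described under Discretization and Solution Method can be illustrated with a small serial sketch. It is not the author's MPICH program: it is plain Python, it assumes S(T) = 0, constant conductivity and equal grid spacing in all three directions, and it stops when the largest change in one sweep falls below a tolerance, since the paper does not define its residual. The names `solve_bar` and `exact_temperature` are hypothetical.

```python
def solve_bar(nx, ny, nz, t_left=100.0, t_right=0.0, tol=1.0e-6, max_iter=100000):
    """Point-by-point Gauss-Seidel for steady conduction in a bar with S(T) = 0.

    Planes i = 0 and i = nx - 1 hold the fixed end temperatures; the four
    side faces are adiabatic. Returns (T, sweeps) with T indexed as T[i][j][k].
    Assumes equal spacing in x, y and z, so all neighbour weights are equal.
    """
    T = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for j in range(ny):
        for k in range(nz):
            T[0][j][k] = t_left
            T[nx - 1][j][k] = t_right

    for sweep in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(1, nx - 1):
            for j in range(ny):
                for k in range(nz):
                    # Neighbours in x always exist for the interior planes.
                    total = T[i - 1][j][k] + T[i + 1][j][k]
                    count = 2
                    # An adiabatic face carries no flux, so a missing
                    # neighbour simply drops out of the heat balance.
                    if j > 0:
                        total += T[i][j - 1][k]
                        count += 1
                    if j < ny - 1:
                        total += T[i][j + 1][k]
                        count += 1
                    if k > 0:
                        total += T[i][j][k - 1]
                        count += 1
                    if k < nz - 1:
                        total += T[i][j][k + 1]
                        count += 1
                    new_value = total / count
                    max_change = max(max_change, abs(new_value - T[i][j][k]))
                    # Overwrite in place so later points see the new value;
                    # this is what makes the sweep Gauss-Seidel, not Jacobi.
                    T[i][j][k] = new_value
        if max_change < tol:
            return T, sweep
    return T, max_iter


def exact_temperature(i, nx, t_left=100.0, t_right=0.0):
    """Linear analytical profile (Eq. 8) at plane i of nx equally spaced planes."""
    return t_left + (t_right - t_left) * i / (nx - 1)


if __name__ == "__main__":
    nx, ny, nz = 20, 5, 5
    T, sweeps = solve_bar(nx, ny, nz)
    worst = max(
        abs(T[i][j][k] - exact_temperature(i, nx))
        for i in range(nx) for j in range(ny) for k in range(nz)
    )
    print("sweeps:", sweeps, "largest deviation from the linear profile:", worst)
```

In a parallel version along the lines the paper describes, each process would run the inner loops only over its own range of `i` and exchange the planes on the splitting faces with its neighbours between sweeps.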

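The speedup measure of Eq. (2) is simple enough to compute directly from recorded solution times. The sketch below is only an illustration: the timings are made-up placeholders, not the paper's measurements, and the function names are hypothetical. The efficiency S(N)/N is a common companion measure that the paper itself does not define.

```python
def speedup(times):
    """Speedup S(N) = T(1) / T(N) for each processor count N.

    `times` maps a processor count N to a wall time T(N) and must
    contain an entry for N = 1.
    """
    t1 = times[1]
    return {n: t1 / t for n, t in sorted(times.items())}


def efficiency(times):
    """Parallel efficiency S(N) / N (an assumed, commonly used definition)."""
    return {n: s / n for n, s in speedup(times).items()}


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    timings = {1: 10.0, 2: 5.0, 4: 4.0}
    print(speedup(timings))
    print(efficiency(timings))
```

With these placeholder timings the speedup on 4 processors is 10.0 / 4.0 = 2.5 and the efficiency is 2.5 / 4 = 0.625. A speedup below 1, as the paper reports for the 20*5*5 mesh, means that T(N) is larger than T(1).
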