CS 267: Applications of Parallel Computers
Lecture 23: Solving the Poisson Equation
Kathy Yelick
/cs267

Lecture Schedule
- 11/19: Solving the Poisson Equation
- 11/21: Solving the Poisson Equation
- 11/26: Tree-based computation (Poisson again)
- 11/28: Visit to NERSC Visualization group (need to pick up a “pass” for the bus)
- 12/3: TBD
- 12/5: The Future of Parallel Computing
- 12/12: CS267 Poster Session (1-3pm, Woz)
- 12/14: Final Papers due

Outline
- Review Poisson equation
- Overview of Methods for Poisson Equation
- Jacobi's method
- Red-Black SOR method
- Conjugate Gradients
- FFT
- Multigrid
- Comparison of methods
- Particle methods (next week)

2D Poisson's equation
- Consider the continuous 2D Poisson equation, again:
  $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = b$
- The discrete version is $T x = b$, shown here for a 3 x 3 grid of unknowns:

  $$T = \begin{pmatrix}
   4 & -1 &  0 & -1 &  0 &  0 &  0 &  0 &  0 \\
  -1 &  4 & -1 &  0 & -1 &  0 &  0 &  0 &  0 \\
   0 & -1 &  4 &  0 &  0 & -1 &  0 &  0 &  0 \\
  -1 &  0 &  0 &  4 & -1 &  0 & -1 &  0 &  0 \\
   0 & -1 &  0 & -1 &  4 & -1 &  0 & -1 &  0 \\
   0 &  0 & -1 &  0 & -1 &  4 &  0 &  0 & -1 \\
   0 &  0 &  0 & -1 &  0 &  0 &  4 & -1 &  0 \\
   0 &  0 &  0 &  0 & -1 &  0 & -1 &  4 & -1 \\
   0 &  0 &  0 &  0 &  0 & -1 &  0 & -1 &  4
  \end{pmatrix}$$

- Graph and “stencil”: 4 at the center point, -1 at each of its four neighbors

Details of Discretization
- Approximate $\partial^2 u/\partial x^2$ by differences (grid spacing taken as 1):
  $u_{xx}(x,y) \approx u_x(x+1/2, y) - u_x(x-1/2, y) \approx [u(x+1,y) - u(x,y)] - [u(x,y) - u(x-1,y)] = -2u(x,y) + u(x-1,y) + u(x+1,y)$
- Similarly for $\partial^2 u/\partial y^2$
- So the discrete Poisson operator for a 2D mesh is (with sign change):
  $4u(x,y) - u(x-1,y) - u(x+1,y) - u(x,y-1) - u(x,y+1)$

Algorithms for 2D Poisson with N Unknowns

Algorithm       Serial       PRAM               Memory       #Procs
Dense LU        N^3          N                  N^2          N^2
Band LU         N^2          N                  N^(3/2)      N
Jacobi          N^2          N                  N            N
Explicit Inv.   N^2          log N              N^2          N^2
Conj. Grad.     N^(3/2)      N^(1/2) * log N    N            N
RB SOR          N^(3/2)      N^(1/2)            N            N
Sparse LU       N^(3/2)      N^(1/2)            N * log N    N
FFT             N * log N    log N              N            N
Multigrid       N            log^2 N            N            N
Lower bound     N            log N              N

PRAM is an idealized parallel model with zero-cost communication.

Multigrid Motivation
- Recall that Jacobi, SOR, CG, or any other sparse-matrix-vector-multiply-based algorithm can only move information one grid cell at a time
- Can show that decreasing error by fixed factor c [...] = k, then cost =
    $O(4^{j-k})$      flops, proportional to the number of grid points per processor
  + $O(1)\,\alpha$      send a constant number of messages to neighbors
  + $O(2^{j-k})\,\beta$  number of words sent
- If level j [...] $\beta$

Practicalities
- In practice, we don't go all the way to P(1)
- In sequential code, the coarsest grids are negligibly cheap, but on a parallel machine they are not
  - Consider 1000 points per processor
  - In 2D, the surface to communicate is 4 x sqrt(1000) ≈ 128, or 13%
  - In 3D, the surface is 1000 - 8^3 ≈ 500, or 50%
- See Tuminaro and Womble, SIAM J. Sci. Comp., v14, n5, 1993 for analysis of MG on 1024 nCUBE2
  - On a 64x64 grid of unknowns, only 4 per processor
    - efficiency of 1 V-cycle was .02, and on FMG .008
  - On a 1024x1024 grid
    - efficiencies were .7 (MG V-cycle) and .42 (FMG)
    - although its parallel efficiency is worse, FMG is 2.6 times faster than V-cycles alone
  - nCUBE had fast communication, slow processors

Multigrid on an Adaptive Mesh
- For problems with very large dynamic range, another level of refinement is needed
- Build adaptive mesh and solve multigrid (typically) at each level
- Can't afford to use finest mesh everywhere

Multiblock Applications
- Solve system of equations on a union of rectangles
  - subproblem of AMR
- E.g., [figure]

Adaptive Mesh Refinement
- Data structures in AMR
- Usual parallelism is to deal grids on each level to processors
- Load balancing is a problem

Support for AMR
- Domains in Titanium designed for this problem
- Kelp, Chombo, and AMR+ are libraries for this
- Primitives to help with boundary value updates, etc.

Multigrid on an Unstructured Mesh
- Another approach to variable activity is to use an unstructured mesh that is more refined in areas of interest
- Controversy over adaptive rectangular vs. unstructured
  - Numerics easier on rectangular
  - Supposedly easier to implement (arrays without indirection), but boundary cases tend to dominate

Multigrid on an Unstructured Mesh
- Need to partition graph for parallelism
- What does it mean to do Multigrid anyway?
- Need to be able to coarsen grid (hard problem)
  - Can't just pick “every other grid point” anymore
  - Use “maximal independent sets” again
  - How to make coarse graph approximate fine one
- Need to define R() and In()
  - How do we convert from coarse to fine mesh and back?
- Need to define S()
  - How do we define coarse matrix (no longer formula, like Poisson)
- Dealing with coarse meshes efficiently
  - Should we switch to using fewer processors on coarse meshes?
  - Should we switch to another solver on coarse meshes?
- See, for example, the Prometheus system by Mark Adams
  - Solved up to 39M unknowns on 960 processors with 50% efficiency

Irregular mesh: Tapered Tube (Multigrid)
[figure]

Coarsening Unstructured Meshes
- The Prometheus system uses maximal independent sets for coarsening
  - Same idea used for multilevel spectral partitioning
- A simple greedy algorithm computes a maximal independent set
- Prometheus uses several heuristics for FEM problems:
  - Order vertices for MIS algorithm
    - Corner, edge, surface, interior
  - Modify graph by deleting edges within a class
  - Move nodes to cover “lost” vertices

Interpolation and Restriction Operators
- Prometheus functions (high level)
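
The greedy maximal-independent-set coarsening mentioned in the slides can be sketched as follows. This is only an illustration and not Prometheus code: the names `greedy_mis` and `grid_graph`, the adjacency-dict graph representation, and the optional `order` argument (a hypothetical hook where a vertex ranking such as corner, edge, surface, interior could be supplied) are all assumptions made for the example.

```python
def greedy_mis(adj, order=None):
    """Greedy maximal independent set of an undirected graph.

    adj:   dict mapping each vertex to an iterable of its neighbours.
    order: optional visiting order (hypothetical hook for ranking heuristics);
           defaults to the dict's own iteration order.
    """
    order = list(adj) if order is None else list(order)
    selected = set()   # vertices kept as coarse-grid points
    excluded = set()   # vertices adjacent to an already selected vertex
    for v in order:
        if v in selected or v in excluded:
            continue
        selected.add(v)
        excluded.update(adj[v])  # neighbours can no longer be selected
    return selected


def grid_graph(n):
    """4-neighbour adjacency of an n x n mesh; vertices are (i, j) tuples."""
    adj = {}
    for i in range(n):
        for j in range(n):
            nbrs = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < n:
                    nbrs.append((a, b))
            adj[(i, j)] = nbrs
    return adj


adj = grid_graph(4)
coarse = greedy_mis(adj)
print(sorted(coarse))
```

On a regular mesh visited in row-major order this picks a checkerboard of points. On an unstructured mesh the result depends on the visiting order, which is where ordering heuristics like the ones listed for Prometheus would come in.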
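
As a small illustration of the 5-point stencil from the discretization slides, and of the remark that Jacobi-type methods move information only one grid cell per iteration, here is a pure-Python sketch. The names `jacobi_sweep` and `residual_max`, the 8 x 8 grid of unknowns, the zero boundary values, and the point source are assumptions chosen for the example, not taken from the lecture.

```python
def jacobi_sweep(u, b):
    """One Jacobi sweep for 4*u[i][j] - (its four neighbours) = b[i][j].

    u carries a one-cell frame of fixed boundary values around the unknowns.
    """
    n = len(u)
    new = [row[:] for row in u]  # copy so every update reads only old values
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            new[i][j] = (b[i][j] + u[i - 1][j] + u[i + 1][j]
                         + u[i][j - 1] + u[i][j + 1]) / 4.0
    return new


def residual_max(u, b):
    """Largest |b - T*u| over the interior points."""
    n = len(u)
    return max(
        abs(b[i][j] - (4.0 * u[i][j] - u[i - 1][j] - u[i + 1][j]
                       - u[i][j - 1] - u[i][j + 1]))
        for i in range(1, n - 1) for j in range(1, n - 1)
    )


n = 10                      # 8 x 8 unknowns plus the boundary frame
c = n // 2
b = [[0.0] * n for _ in range(n)]
b[c][c] = 1.0               # point source in the middle of the mesh

# Locality: after 3 sweeps the source has influenced points at distance 2,
# but not yet points at distance 3.
u = [[0.0] * n for _ in range(n)]
for _ in range(3):
    u = jacobi_sweep(u, b)
reached = u[c][c + 2] != 0.0
not_yet = u[c][c + 3] == 0.0

# Convergence: keep sweeping (200 sweeps in total) and check the residual.
for _ in range(197):
    u = jacobi_sweep(u, b)
final_residual = residual_max(u, b)
print(reached, not_yet, final_residual)
```

The locality check is the point of the multigrid motivation: information spreads by one cell per sweep, so the number of sweeps needed grows with the mesh width, which is what coarse grids are meant to avoid.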