## Petros Koumoutsakos: Catalogue data in Autumn Semester 2020

Name | Dr. Petros Koumoutsakos |

Field | Computational Science |

Address | Professur f. Computational Science, ETH Zürich, CLT F 12, Clausiusstrasse 33, 8092 Zürich, SWITZERLAND |

Telephone | +41 44 632 52 58 |

URL | http://www.cse-lab.ethz.ch/index.php?&option=com_content&view=article&id=100&catid=38 |

Department | Mechanical and Process Engineering |

Relationship | Full Professor |

Number | Title | ECTS | Hours | Lecturers |
---|---|---|---|---|
151-0107-20L | High Performance Computing for Science and Engineering (HPCSE) I | 4 credits | 4G | P. Koumoutsakos, S. M. Martin |

Abstract | This course gives an introduction to algorithms and numerical methods for parallel computing on shared and distributed memory architectures. The algorithms and methods are illustrated with problems that appear frequently in science and engineering. |

Objective | With manufacturing processes reaching their limits in terms of transistor density on today’s computing architectures, efficient utilization of computing resources must include parallel execution to maintain scaling. The use of computers in academia, industry and society is a fundamental tool for problem solving today, while the “think parallel” mind-set of developers is still lagging behind. The aim of the course is to introduce the student to the fundamentals of parallel programming using shared and distributed memory programming models. The focus is on learning to apply these techniques with the help of examples frequently found in science and engineering and to deploy them on large-scale high performance computing (HPC) architectures. |

Content |

1. Hardware and Architecture: Moore’s Law, Instruction set architectures (MIPS, RISC, CISC), Instruction pipelines, Caches, Flynn’s taxonomy, Vector instructions (for Intel x86)
2. Shared memory parallelism: Threads, Memory models, Cache coherency, Mutual exclusion, Uniform and Non-Uniform memory access, Open Multi-Processing (OpenMP)
3. Distributed memory parallelism: Message Passing Interface (MPI), Point-to-Point and collective communication, Blocking and non-blocking methods, Parallel file I/O, Hybrid programming models
4. Performance and parallel efficiency analysis: Performance analysis of algorithms, Roofline model, Amdahl’s Law, Strong and weak scaling analysis
5. Applications: HPC Math libraries, Linear Algebra and matrix/vector operations, Singular value decomposition, Neural Networks and linear autoencoders, Solving partial differential equations (PDEs) using grid-based and particle methods
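As an illustration of the performance analysis topics listed above, the following is a minimal sketch of Amdahl’s Law and the parallel efficiency used in a strong scaling analysis. It is not taken from the course material: the function names `amdahl_speedup` and `parallel_efficiency`, the parallel fraction of 0.95 and the worker counts are assumptions chosen for illustration, and Python is used only for brevity (the course exercises use C++).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup predicted by Amdahl's Law for a fixed problem size.

    parallel_fraction: share of the serial runtime that can be parallelized (0..1).
    workers: number of processing units (threads, ranks, cores).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Strong scaling efficiency: achieved speedup divided by the worker count."""
    return speedup / workers


if __name__ == "__main__":
    # Hypothetical code in which 95% of the runtime is parallelizable.
    p = 0.95
    for n in (1, 2, 4, 8, 16, 64, 1024):
        s = amdahl_speedup(p, n)
        print(f"workers={n:5d}  speedup={s:7.2f}  efficiency={parallel_efficiency(s, n):6.3f}")
```

Under this model the speedup is bounded by the inverse of the serial fraction however many workers are added, which is why the efficiency of a strong scaling run falls as the worker count grows.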

Lecture notes | https://www.cse-lab.ethz.ch/teaching/hpcse-i_hs20/ Class notes, handouts |

Literature |

• An Introduction to Parallel Programming, P. Pacheco, Morgan Kaufmann
• Introduction to High Performance Computing for Scientists and Engineers, G. Hager and G. Wellein, CRC Press
• Computer Organization and Design, D.A. Patterson and J.L. Hennessy, Morgan Kaufmann
• Vortex Methods, G.H. Cottet and P. Koumoutsakos, Cambridge University Press
• Lecture notes

Prerequisites / Notice | Students should be familiar with a compiled programming language (C, C++ or Fortran). Exercises and exams will be designed using C++. The course will not teach the basics of programming. Some familiarity with the command line is assumed. Students should also have a basic understanding of diffusion and advection processes, as well as their underlying partial differential equations. |