Introduction

HPC Programming With Python

Omar Padron
Blue Waters Science and Engineering Application Support
NCSA - Illinois, USA

Setup

  • Download and install the Python 2.7 version of miniconda
      cd $HOME
      mkdir conda
      cd conda
      wget $URL_TO_MINICONDA
      sh $MINICONDA_FILE -f -b -p $HOME/conda
  • Add the following to your ~/.bashrc file
      export PYTHONHOME="$HOME/conda"
      export PATH="$HOME/conda/bin:$PATH"
  • Grab the code and materials
      git clone git://github.com/opadron/hpc-python-ihpcss-2015
      cd hpc-python-ihpcss-2015
      git checkout anaconda
  • Install the packages that we'll need
      echo y | conda install numpy cython mpi4py

If all goes well, you should be ready to go!
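
One quick way to confirm the setup is to try importing the three packages from the interpreter that conda installed. The snippet below is a minimal sketch, not part of the workshop materials: `check_packages` is a hypothetical helper name, and the code is written to run under both Python 2.7 and Python 3.

```python
from __future__ import print_function

import importlib


def check_packages(names):
    """Return a dict mapping each module name to True if it imports cleanly."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # The three packages requested in the conda install step.
    results = check_packages(["numpy", "cython", "mpi4py"])
    for name, ok in sorted(results.items()):
        print("%-8s %s" % (name, "ok" if ok else "MISSING"))
```

If any package is reported as missing, check that `$HOME/conda/bin` comes first in your `PATH` before re-running the install step.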

Python for HPC

  • Why Python?
    • Faster development cycle
      • Usually needs only about 1/10th as many lines of code as an equivalent compiled program
    • More flexibility
    • Greater software ecosystem

  • Outstanding technical challenges
    • [Global Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock)
    • Development tools still lacking
      • Need better profilers/debuggers
      • Tools are getting better, but still not "there" yet
    • Parallel import problem
      • When a large number of nodes all try to read from the same file on a shared filesystem
      • Not unique to Python, but particularly bad for Python
  • Can you really use Python productively in HPC?
    • Answer depends on what you consider "Python"
    • Pure, interpreted Python code is too slow to be useful
      • However, Python software has tools to help with this
      • Combines high-level, interpreted code with low-level, compiled code
      • Gives you flexibility where you want it, and speed where you need it
  • Some expectation management:
    • Python is a great tool, but is neither magic nor a silver bullet.
    • A sufficiently well-tuned, compiled code will almost always outperform even the best hand-crafted Python codes.
    • Using Python is consciously choosing to sacrifice some computer power for the sake of "people" power.
    • The point of this session is to show how we can avoid throwing the performance baby out with the development bath water.
  • There's too much to see in Python land to fit in an afternoon.
    • Stuff we won't be covering
      • Python + Fortran (sorry, Fortran programmers)
      • Python + C++
      • GPGPU Programming
      • Visualization
        • there's a far better talk happening in the other room (right now!)

Goals and Agenda

  • Our goal
    • to explore, by example, how to use the tools Python gives you to produce HPC applications in less time and effort than with strictly compiled languages
      • while retaining an appreciable fraction of the performance of a strictly compiled application.

  • Agenda
    • Description of our two sample applications
      • one is embarrassingly parallel
      • the other has more shame :)
    • Walkthrough of the code for each serial implementation
      • Both pure, sequential, sloooooow Python
    • Then, we'll look at some Python tools that can help us to iteratively improve its performance
      • I will demonstrate each on the simpler problem
      • _You_ will work on the more challenging problem (with some help)

Problem Descriptions

  • The Basel problem
    $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$
    • The whole program just computes a big sum
    • Not a practical way of approximating pi
      • Converges very slowly
      • But a great problem for demonstrating Python programming
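
To make the "big sum" concrete, here is a minimal sketch of a pure, sequential Python version. It is an illustration only, not the repository's actual code: the function names `basel_sum` and `approx_pi` are hypothetical, and it relies on the standard result that the sum of 1/k² over all positive integers k equals π²/6.

```python
from __future__ import division, print_function

import math


def basel_sum(n_terms):
    """Partial sum of 1/k**2 for k = 1 .. n_terms (plain Python loop)."""
    total = 0.0
    for k in range(1, n_terms + 1):
        total += 1.0 / (k * k)
    return total


def approx_pi(n_terms):
    """Estimate pi from the partial sum, using sum = pi**2 / 6."""
    return math.sqrt(6.0 * basel_sum(n_terms))


if __name__ == "__main__":
    # The error shrinks only in proportion to 1/n, hence "converges very slowly".
    for n in (10, 1000, 100000):
        estimate = approx_pi(n)
        print("n = %6d  pi ~ %.8f  error = %.2e" % (n, estimate, math.pi - estimate))
```

Each term is independent of the others, which is why this problem splits so easily across workers.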

  • **Solution of [Laplace's Equation](https://en.wikipedia.org/wiki/Laplace%27s_equation) in 2D**
    ![alt](https://upload.wikimedia.org/math/a/8/7/a87b3b4e89ea9a5f4c749f9fb56e3336.png)
    • Jacobi Iterations on a 2D array
      • New value at each point in the array is the average of its four cardinal neighbors
      • Repeat until the values settle
    • Not as simple, but not too complicated, either
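
The Jacobi update can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the workshop's implementation: the names `jacobi_step` and `solve` are hypothetical, boundary values are assumed to be held fixed, and "settled" is assumed to mean that the largest change in one sweep falls below a tolerance `tol`.

```python
from __future__ import division, print_function


def jacobi_step(grid):
    """Return (new_grid, max_change) after one Jacobi sweep.

    Boundary values are copied unchanged (an assumption of this sketch);
    each interior point becomes the average of its four cardinal
    neighbors taken from the OLD grid.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    max_change = 0.0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                grid[i][j - 1] + grid[i][j + 1])
            max_change = max(max_change, abs(new[i][j] - grid[i][j]))
    return new, max_change


def solve(grid, tol=1e-6, max_iters=10000):
    """Sweep until the largest update is below tol or max_iters is reached."""
    iteration = 0
    for iteration in range(1, max_iters + 1):
        grid, change = jacobi_step(grid)
        if change < tol:
            break
    return grid, iteration


if __name__ == "__main__":
    # 5x5 grid: top edge held at 1.0, every other value starts at 0.0.
    n = 5
    start = [[1.0 if i == 0 else 0.0 for _ in range(n)] for i in range(n)]
    result, iterations = solve(start)
    print("settled after %d sweeps; center value = %.4f" % (iterations, result[2][2]))
```

Unlike the Basel sum, each point depends on its neighbors, so a parallel version has to exchange boundary data between workers on every sweep.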

Schedule (Hopefully)

  • Code Walkthrough
  • Lesson 1: Using Numpy as a faster array data type
  • Lesson 2: Using mpi4py for distributed memory parallelism
  • BREAK
  • Lesson 3 (Advanced!): Generating optimized C code with Cython
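
As a small preview of the idea behind Lesson 1, the sketch below replaces a Python-level loop for the Basel sum with NumPy array operations. It is an assumption about what such a step might look like, not the lesson's actual code; `basel_sum_numpy` is a hypothetical name, while `np.arange` and `np.sum` are standard NumPy calls.

```python
from __future__ import division, print_function

import math

import numpy as np


def basel_sum_numpy(n_terms):
    """Partial sum of 1/k**2 for k = 1 .. n_terms using whole-array operations."""
    k = np.arange(1, n_terms + 1, dtype=np.float64)
    # The division and the sum both run in compiled code inside NumPy,
    # instead of one interpreted loop iteration per term.
    return float(np.sum(1.0 / (k * k)))


if __name__ == "__main__":
    n = 1000000
    estimate = math.sqrt(6.0 * basel_sum_numpy(n))
    print("n = %d  pi ~ %.8f" % (n, estimate))
```

The trade-off is memory: the array version holds all `n_terms` values at once, whereas the plain loop needs only a running total.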