
Course Web Site

Meeting Time:
Monday and Thursday, 10:00 a.m.–12:00 noon

Course starts on September 12th

Professor:
Jordi Torres
Office: Mòdul C6-217 (second floor)

Office Hours:
By appointment

Additional documentation:
Class handouts and materials associated with this class can be found on the Racó-FIB web server at

Course Description:
Supercomputers represent the leading edge of high-performance computing technology. This course describes all the elements in the system architecture of a supercomputer, from the shared-memory multiprocessor in the compute node to the interconnection network and distributed-memory cluster, including the infrastructures that host them. We will also discuss their building blocks and the system software stack, including their parallel programming models, since exploiting parallelism is central to achieving greater computational power. We will discuss the continuous development of supercomputing systems, which enables the convergence of advanced analytic algorithms and big data technologies and drives new insights based on the massive amounts of available data. The course includes several labs that will use supercomputing facilities from the Barcelona Supercomputing Center (BSC-CNS).

6.0 ECTS


Knowledge of C programming and Linux basics is expected. Prior exposure to parallel programming constructs, Python, linear algebra/matrices, or machine learning will be very helpful.

Course workload: important warning

Students should be aware that the 2016 edition of SA-MIRI is a 6.0 ECTS course that requires an effort equivalent to 150 hours. This means more than 10 hours per week (4 hours in class + 6 hours outside class on average) for 14 weeks. Taking this course is not recommended if you have other commitments during this quarter that would prevent you from dedicating this many hours to it. You can wait for the next edition of the course.
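The arithmetic behind this warning can be checked with a short sketch. All figures come from the paragraph above; the variable names are only illustrative.

```python
# Figures taken from the workload paragraph above.
TOTAL_HOURS = 150        # effort equivalent for the 6.0 ECTS course
WEEKS = 14               # duration of the course
IN_CLASS_PER_WEEK = 4    # hours in class per week
OUTSIDE_PER_WEEK = 6     # average hours outside class per week

# Weekly effort needed to reach 150 hours in 14 weeks.
hours_per_week = TOTAL_HOURS / WEEKS

# Hours covered by the stated 4 + 6 weekly pattern.
stated_total = (IN_CLASS_PER_WEEK + OUTSIDE_PER_WEEK) * WEEKS

print(f"Required weekly effort: {hours_per_week:.1f} h")
print(f"4 + 6 h/week over {WEEKS} weeks: {stated_total} h")
print(f"Hours still to be covered: {TOTAL_HOURS - stated_total} h")
```

The 4 + 6 pattern covers 140 of the 150 hours, which is why the weekly figure is stated as "more than 10 hours".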

Course Activities:


Class attendance and participation

Regular and consistent attendance is expected, and students should be able to discuss the concepts covered in class.

Homework Assignments

Homework will be assigned weekly. It includes reading documentation that expands on the concepts introduced during lectures, and periodically it will include reading research papers related to the lecture of the week and preparing short presentations (with slides).


There may be some pop quizzes during the course.

Student presentation

Randomly chosen students will present their homework presentations.

Final Project

A final project will be assigned to each student or group of students (TBD). This assignment typically consists of (a) studying recent research literature on a hot topic related to the course and/or (b) studying a contemporary HPC system in a TOP10 position (*).

Lab activities

Hands-on sessions will be conducted during lab sessions using supercomputing facilities. Each hands-on session requires a lab report with all the results, to be delivered one week later.


Grading Procedure

The evaluation of this course will take into account different items:

  • Attendance (minimum 80% required) & participation in class will account for 20% of the grade.
  • Homework, paper reading, paper presentations, and assessments will account for 20% of the grade.
  • Final Project deliverable/presentation will account for 10% of the grade (*).
  • Lab sessions (+ Lab reports) will account for 50% of the grade.
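
As an illustration of how these weights combine, here is a minimal sketch. The weights and the 80% attendance minimum come from the list above. The 0-10 score scale, the function name `final_grade`, and the choice to return `None` when attendance is below the minimum are assumptions made for this example, since the syllabus does not specify them.

```python
# Weights from the grading list above (percent of the final grade).
WEIGHTS = {
    "attendance_participation": 20,
    "homework": 20,
    "final_project": 10,
    "labs": 50,
}
MIN_ATTENDANCE = 0.80  # minimum attendance required by the syllabus


def final_grade(scores, attendance_rate):
    """Weighted average of component scores (assumed 0-10 scale).

    Returns None when attendance is below the required minimum,
    because the syllabus does not say how that case is graded.
    """
    if attendance_rate < MIN_ATTENDANCE:
        return None
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student scores, for illustration only.
example = {
    "attendance_participation": 9,
    "homework": 7,
    "final_project": 8,
    "labs": 6,
}
print(final_grade(example, attendance_rate=0.9))
```

With half of the weight on labs, the lab reports dominate the result in this sketch.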


Tentative course content (**):

  1. Supercomputing Basics
  2. HPC Building Blocks
  3. Parallel Computer Architecture
  4. Parallel Programming Models
  5. Parallel Performance Metrics and Measurements
  6. Benchmarking in Supercomputers
  7. Coprocessors and Programming Models
  8. Powering Machine Learning with Supercomputing

(**) The varied backgrounds of the students are a major difficulty in teaching SA-MIRI. A “SA-MIRI Entry Survey” is designed to help assess students' backgrounds (PA-MIRI, CPDS-MIRI, MA-MIRI, CHPC-MIRI, PD-MIRI, SCA-MIRI, PPTM-MIRI, APA-MIRI), expectations, and preferences in order to better customize the course.


Tentative Labs:
In this course the students will use the Marenostrum III and MinoTauro supercomputers from the Barcelona Supercomputing Center. The Marenostrum supercomputer has 3,056 nodes, each with 2 Intel SandyBridge processors with 8 cores each. MinoTauro is a heterogeneous cluster with 61 Bull B505 blades (2 Intel processors with 6 cores each + 2 NVIDIA M2090 GPUs per blade) and 39 bullx R421-E4 servers (2 Intel Xeon processors with 8 cores each + 2 NVIDIA K80 GPUs per server).
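
To give a sense of scale, the following sketch derives aggregate core and GPU counts from the figures in the paragraph above. It assumes that the quoted core counts are per processor and that the GPU counts are cards per blade or server. The helper `cpu_cores` is a name introduced only for this example.

```python
def cpu_cores(nodes, processors_per_node, cores_per_processor):
    """Total CPU cores, assuming core counts are quoted per processor."""
    return nodes * processors_per_node * cores_per_processor


# Marenostrum III: 3,056 nodes, 2 processors per node, 8 cores each.
marenostrum_cores = cpu_cores(3056, 2, 8)

# MinoTauro: two partitions with different hardware.
b505_cores = cpu_cores(61, 2, 6)    # Bull B505 blades
r421_cores = cpu_cores(39, 2, 8)    # bullx R421-E4 servers
minotauro_cores = b505_cores + r421_cores

# GPU cards: 2 per blade (M2090) and 2 per server (K80).
m2090_cards = 61 * 2
k80_cards = 39 * 2
minotauro_gpu_cards = m2090_cards + k80_cards

print(f"Marenostrum CPU cores: {marenostrum_cores}")
print(f"MinoTauro CPU cores:   {minotauro_cores}")
print(f"MinoTauro GPU cards:   {minotauro_gpu_cards}")
```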

(Tentative grading in brackets)

PART I: Marenostrum (70%)

  1. Supercomputing Building Blocks: Marenostrum visit (5%)
  2. Getting Started with Supercomputing (10%)
  3. Getting Started with Parallel Programming Models (15%)
  4. Getting Started with Parallel Performance Metrics (15%)
  5. Getting Started with Parallel Performance Models (15%)
  6. Getting Started with Performance Analysis Tools (10%)

PART II: MinoTauro (30%)

  1. Getting Started with GPU-based Supercomputing (10%)
  2. Getting Started with the CUDA programming model (10%)
  3. GPU-based Supercomputer and Machine Learning (10%)
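
The tentative lab percentages can be sanity-checked with a short sketch. It assumes the bracketed figures are shares of the lab component, which the grading procedure sets at 50% of the final grade. The dictionary keys and the helper `share_of_final` are names introduced only for this example.

```python
LAB_SHARE_OF_FINAL = 50  # percent of the final grade, per the grading procedure

# Tentative lab weights (percent of the lab component), from the lists above.
PART_I = {
    "building_blocks_visit": 5,
    "supercomputing": 10,
    "programming_models": 15,
    "performance_metrics": 15,
    "performance_models": 15,
    "analysis_tools": 10,
}
PART_II = {
    "gpu_supercomputing": 10,
    "cuda": 10,
    "gpu_machine_learning": 10,
}


def share_of_final(lab_percent):
    """Contribution of one lab to the final grade, in percent (assumed)."""
    return lab_percent * LAB_SHARE_OF_FINAL / 100


print(sum(PART_I.values()), sum(PART_II.values()))
print(share_of_final(PART_I["programming_models"]))
```

Under this reading, a lab weighted at 15% contributes 7.5% of the final grade.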



Tentative Schedule: download the latest PDF version

This syllabus may be revised or updated until the start of the course (last modified 10/sep/2016).