


Master's degree course in COMPUTER SCIENCE

Course year 1
Teaching units Unit High Performance Computing
Information Technology (lesson)
  • TAF: Compulsory subjects, characteristic of the class SSD: ING-INF/05 CFU: 9
Teachers: Andrea MARONGIU
Exam type oral
Evaluation final vote
Teaching language Italian




The aims of this course are:

- to illustrate the main characteristics and architectures of high-performance computing systems, from both the embedded and high-end domains: multi-/many-core, GPU, FPGA.

- to introduce the main challenges of parallel programming and the methodologies for program decomposition.

- to introduce the main advanced parallel programming techniques using OpenMP and CUDA.

- to introduce the key concepts of the High-Level Synthesis design methodology for FPGA-based systems.

- to introduce the key concepts of program compilation and compiler optimization for modern heterogeneous systems.

Admission requirements

To follow this course effectively, students are advised to have taken the following classes first:
- Computer architecture
- Computer programming I and II
- (optional) Parallel computing

Course contents

Parallel architectures:
- Evolution of computer systems into heterogeneous and parallel architectures;
- Taxonomy of multicores. Shared-memory vs. distributed-memory systems. Homogeneous vs. heterogeneous systems;
- Novel challenges: Coherency, synchronization and consistency in shared-memory systems;
- The architecture of modern multicore CPUs, General-Purpose Graphics Processing Units (GP-GPU) and Field-Programmable Gate Arrays (FPGA);

Design of Parallel code:
- Performance in multicores: coverage, granularity, locality;
- Parallel design patterns: architecting parallel software;
- Software analysis and profiling;
- Introducing programming models for massively parallel heterogeneous systems;

Parallel programming models:
- Shared memory systems programming with OpenMP;
- Programming GPU-based heterogeneous systems: CUDA;
- High-Level Synthesis for FPGA acceleration;

Compilation for parallel heterogeneous systems:
- Structure of a modern compiler. Intermediate representations;
- Examples of code analysis and optimization;
- The OpenMP accelerator model: a case study;

Teaching methods

Theory lectures are mostly slide-based; laboratory exercises provide hands-on practice with the various programming models. Remote access to the classes and materials will be provided. Traditional in-person classes will be delivered if the evolution of the COVID-19 pandemic allows.

Assessment methods

There are six examination sessions per year. The examination consists of a written test and an oral part. The written test, approximately one and a half hours long, consists of closed-ended questions with one or more correct answers, which assess knowledge of the theory, plus exercises or open-ended questions, which evaluate understanding of the more practical topics of the course. During the written test, the use of any type of teaching material, books or similar is forbidden. The oral part covers all the theoretical and practical concepts, and its depth depends on the outcome of the written test (typically, a barely sufficient score in the written test requires an in-depth oral examination). The final score is the average of the scores achieved in the written and oral parts. The tests might be conducted remotely or in person, depending on the evolution of the COVID-19 pandemic.

Learning outcomes

- Knowledge and understanding:
the student will be able to develop parallel software suitable for solving scientific computing problems on multiprocessor architectures.
- Ability to apply knowledge and understanding:
the student will have sufficient knowledge to solve, on parallel systems, scientific computing problems arising from complex computer applications.
- Making judgments:
the student will be able to identify the architectures, programming environments and algorithms appropriate for the parallel solution of a specific scientific computing problem.
- Communication skills:
the student will be able to clearly explain the main characteristics of the parallel algorithms studied and discuss their applicability in practical contexts.
- Learning skills:
the student will be able to study in depth the main aspects of the topics proposed in the course.


The teaching materials consist mainly of the slides presented in class, which include links for further study.