
Distributed LLM Fine-Tuning & Inference on HPC Systems

The Norwegian Research Infrastructure Services (NRIS) is hosting an in-person, hands-on course in Bergen. Over two days, you will gain practical experience with single-GPU fine-tuning, multi-GPU scaling, and optimized LLM inference, building applied skills in working with large language models on a high-performance computing (HPC) system.


Content: In this course, you will learn to:

  • Implement parameter-efficient fine-tuning using LoRA and QLoRA
  • Configure and launch distributed training workloads across multiple GPUs
  • Perform distributed LLM inference
  • Monitor and analyze GPU utilization and profile GPU memory
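The course itself will cover these topics hands-on; as a taste of the first item, here is a minimal, dependency-free sketch of the core idea behind LoRA (the sizes and function names below are illustrative, not taken from the course materials): instead of updating a full d_out × d_in weight matrix, you train two small low-rank factors B (d_out × r) and A (r × d_in) and add their scaled product to the frozen base weights.

```python
def lora_param_counts(d_in, d_out, r):
    """Compare trainable parameters: full fine-tuning vs. LoRA adapters."""
    full = d_in * d_out        # every weight in the layer is trainable
    lora = r * (d_in + d_out)  # only the two low-rank factors are trainable
    return full, lora

def merge_lora(W, A, B, alpha, r):
    """Merged weight W' = W + (alpha / r) * B @ A, in plain Python lists."""
    scale = alpha / r
    d_out, d_in = len(W), len(W[0])
    merged = [row[:] for row in W]  # copy the frozen base weights
    for i in range(d_out):
        for j in range(d_in):
            # (B @ A)[i][j], computed element-wise for clarity
            delta = sum(B[i][k] * A[k][j] for k in range(r))
            merged[i][j] += scale * delta
    return merged

# For a hypothetical 4096 x 4096 layer with rank r = 8, LoRA trains
# roughly 0.4% of the parameters that full fine-tuning would update.
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)  # → 16777216 65536
```

In practice, libraries such as Hugging Face PEFT handle this wiring for you; the point of the sketch is only to show why LoRA (and its quantized variant QLoRA) makes fine-tuning tractable on a limited GPU budget.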

HPC System: Olivia Supercomputer

Target audience: This course is ideal for researchers, developers, and students with Python experience who want hands-on skills in scalable LLM training and inference on an HPC system.

Prerequisites:

  • Familiarity with machine learning (ML) frameworks (e.g. PyTorch)
  • Basic understanding of large language models (LLMs)

Registration: Register here

Practical information: The course is free of charge and has a maximum capacity of 30 participants. Light pastries and coffee/tea will be served on both days.

Instructor: Hicham Agueny

Coordinator: Eirik Skjerve