Genome-scale algorithms

Postgraduate course

Course description

Objectives and Content

The course provides an introduction to the technologies, methods, and algorithms used in genomics. High-throughput sequencing technologies have revolutionized the field of genomics, allowing the reconstruction of genomes, epigenomes, and transcriptomes across entire populations and at the level of individual cells. These "omics" technologies provide unique insights into the core program of life, but analyzing the resulting data poses significant bioinformatics challenges.

The course is divided into two parts. The first part gives an overview of modern long- and short-read sequencing technologies and their applications, including capturing global biodiversity, metagenomics, single-cell sequencing, epigenetics, and genomic medicine.

The second part of the course focuses on state-of-the-art algorithms and data structures for processing and analyzing high-throughput sequencing data, including the use of machine learning methods for integrating and obtaining biological insights from large-scale omics data. These techniques include de Bruijn graphs and genome assemblers, the Burrows-Wheeler transform, suffix arrays for indexing the genome and detecting repeats, and methods for clustering and network reconstruction from time series, perturbational, and population-based omics data. The course also introduces the computational complexity of the presented algorithms and compares them.
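
As a rough illustration of two of the index structures named above, the following Python sketch builds a suffix array by naive sorting and derives the Burrows-Wheeler transform from it. It is not taken from the course material: the function names, the `$` sentinel, and the example string are assumptions chosen for illustration, and the naive construction shown here does not scale to real genomes.

```python
def suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction: sorts the suffixes directly, which is simple
    but far too slow and memory-hungry for genome-sized inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler transform of text.

    A sentinel that sorts before every other character is appended
    (an assumption of this sketch), and the transform is read off as
    the character preceding each suffix in suffix-array order.
    """
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    terminated = text + sentinel
    # Index i - 1 wraps to the sentinel when i == 0.
    return "".join(terminated[i - 1] for i in suffix_array(terminated))


if __name__ == "__main__":
    print(suffix_array("GATTACA$"))  # [7, 6, 4, 1, 5, 0, 3, 2]
    print(bwt("GATTACA"))            # ACTGA$TA
```

Sorting whole suffixes keeps the sketch short. Tools that work at genome scale rely on more efficient suffix array construction and on compressed indexes.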

Learning Outcomes

On completion of the course the student should have the following learning outcomes defined in terms of knowledge, skills and general competence:

Knowledge:

The student can:

  • explain different sequencing technologies, their potential and limitations, and their applications,
  • choose and design approaches to process, analyze, and interpret genome, epigenome, and transcriptome sequencing data,
  • choose and design approaches to collect, integrate, analyze, and interpret large datasets of high-throughput sequencing experiments, including single-cell data and multi-omics data from the same samples,
  • choose, integrate, and apply appropriate tools from diverse bioinformatics and machine learning libraries.

Skills:

The student is able to:

  • implement algorithms for the processing and analysis of high-throughput sequencing data, e.g. de novo assembly,
  • implement algorithms for the integration and analysis of large datasets with samples from multiple high-throughput sequencing experiments, including clustering and network reconstruction from time series, perturbational and population-based omics data,
  • apply bioinformatics tools on the Linux command line,
  • efficiently query sequence databases,
  • implement their own scripts and programs using existing bioinformatics and machine learning libraries,
  • interpret the results of bioinformatics and machine learning pipelines using functional annotation, pathway, and interaction databases,
  • argue for the choice of specific algorithms and detect causes of failure.

General competence:

The student is able to:

  • work on a high-throughput sequencing data analysis task on their own and in a small group,
  • communicate their analysis and its biological interpretation to an interdisciplinary audience,
  • select and present relevant new topics in the field of genomics from the published literature.

Level of Study

Master

Semester of Instruction

Spring.
Required Previous Knowledge
Recommended Previous Knowledge
Students should be able to implement basic algorithms in a programming language of their own choice (preferably Python, Java, R, or Perl). A basic understanding of algorithms and efficiency, as well as statistics, is required. A good background in algorithms is recommended, at least corresponding to INF102. In addition, a good background in bioinformatics is recommended, corresponding to BINF200 and BINF201.
Access to the Course
Access to the course requires admission to a programme of study at The Faculty of Mathematics and Natural Sciences.
Teaching and learning methods

The course is given as lectures and mandatory exercises:

Lectures, 4 hours per week

Exercises, 2 hours per week

Compulsory Assignments and Attendance
Compulsory assignments are valid for one subsequent semester.
Forms of Assessment

The forms of assessment are:

  • Mandatory exercises, 50% of the total grade.
  • Written examination (3 hours), 50% of the total grade.

All compulsory assignments must be approved before the examination.

Grading Scale
The grading scale used is A to F. Grade A is the highest passing grade; grade F is a fail.
Assessment Semester
Examinations are held in both the spring and autumn semesters. In semesters without teaching, the examination will be arranged at the beginning of the semester.
Reading List
The reading list will be available by June 1st for the autumn semester and by December 1st for the spring semester.
Course Evaluation
The course will be evaluated by the students in accordance with the quality assurance system at UiB and the department.
Examination Support Material
Non-programmable calculator, in accordance with the faculty regulations.
Programme Committee
The Programme Committee is responsible for the content, structure and quality of the study programme and courses.
Course Coordinator
The course coordinator and administrative contact person can be found on Mitt UiB. Alternatively, contact the student adviser.
Course Administrator
The Faculty of Mathematics and Natural Sciences, represented by the Department of Informatics, is the course administrator for the course and study programme.