Machine learning seminar series
Main content
In this seminar series, we discuss diverse topics related to machine learning research. In addition, members of the machine learning group and visitors regularly present their work. Each seminar lasts for roughly 45 minutes (including questions) and is directly followed by a quick lunch offered to all attendees to stimulate more informal discussions.
Anyone is welcome: master's students, industry participants, researchers, and others are all invited. No need to register (not even for the lunch), and the location is accessible without an access card.
If you wish to be included in the seminar mailing list to be informed of next seminars, please contact Willem Theodorus Schooltink (willem.schooltink@uib.no) or Victor Lacerda Botelho (victor.botelho@uib.no).
If you are interested in presenting, please contact Willem Theodorus Schooltink (willem.schooltink@uib.no) or Victor Lacerda Botelho (victor.botelho@uib.no).
Date and location
For the fall semester 2023, we aim to have seminars every other Wednesday at 11.15am. Please check the entries below for more detailed information about specific seminars.
The seminars take place at the University of Bergen, Department of Informatics (Thormøhlens gate 55), whenever possible in the Blåbær seminar room. The room will be confirmed for each seminar; please check below.
Seminars
(01.10.24) Allah Bux and Vadim Kimmelman: Applications of Machine Learning and Deep Learning for Sign Language Analysis
Speakers: Allah Bux and Vadim Kimmelman
Abstract: Sign languages utilize the hands and other nonmanual articulators such as the head, mouth, eyebrows, eyes, and eyelids to convey lexical, grammatical, and prosodic information. This linguistic phenomenon is known as “nonmanuals” (https://www.uib.no/en/nonmanual). Understanding these nonmanual elements is crucial for a comprehensive analysis of sign languages. This seminar will explore how computer vision and machine learning techniques are applied to analyze these elements, focusing specifically on head pose estimation (HPE). HPE plays an essential role in sign language research by providing insights into head movements that carry linguistic significance. Traditional optoelectronic motion capture systems, while accurate, face challenges in terms of cost, accessibility, and the need for facial markers. To address this, we evaluated three advanced HPE algorithms, MediaPipe, OpenFace, and 6DRepNet, using RGB video data from Finnish Sign Language recordings, comparing their accuracy to a motion capture system as the gold standard. In this seminar, we will share our key findings, discuss the strengths and limitations of these algorithms, and highlight future directions for enhancing nonmanual analysis in sign language research.
Date & Time: Tuesday, 01.10.24, 11.15am
Location: Blåbær
(12.09.24) Fabio Massimo Zennaro: Multi-level decision making with causal bandits
Speaker: Fabio Massimo Zennaro
Abstract: Multi-armed bandits are a standard formalism for representing simple yet realistic decision-making problems in which a policy-maker has to find an optimal balance between choosing well-known options and exploring new alternatives. Traditionally, such decision-making problems are encoded using a single model; in reality, however, a decision-maker may have multiple related models of the same problem at different levels of resolution, each one providing information about the value and the effects of the available choices. In this talk we will recall the standard multi-armed bandit framework, extend it to a causal setting, and explain how multiple models can be related via causal abstractions. Finally, we will discuss a few theoretical results about transporting information across the models via abstraction using basic algorithms inspired by reinforcement learning.
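The explore/exploit balance mentioned in the abstract can be illustrated with a minimal sketch of the classic epsilon-greedy strategy on a toy bandit; the arm reward probabilities below are hypothetical, and this is the plain (non-causal) setting the talk starts from.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = [0.2, 0.5, 0.8]   # assumed Bernoulli reward probability per arm

def epsilon_greedy(n_rounds=5000, epsilon=0.1):
    counts = np.zeros(len(true_means))
    values = np.zeros(len(true_means))  # running average reward per arm
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = int(rng.integers(len(true_means)))  # explore a random arm
        else:
            arm = int(np.argmax(values))              # exploit the best estimate
        reward = float(rng.random() < true_means[arm])
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return values, counts

values, counts = epsilon_greedy()
```

With enough rounds, the arm with the highest true mean ends up being pulled most often, which is exactly the trade-off a causal extension must also respect.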
Date & Time: Thursday, 12.09.24, 11.15am
Location: Large auditorium
(18.06.24) Roland Netz: Modelling of equilibrium and non-equilibrium time-series data: from protein folding to weather forecasting
Speaker: Roland Netz
Abstract: Most systems of scientific interest are interacting many-body systems. One typically describes their kinetics in terms of a low-dimensional reaction coordinate, which in general is influenced by the entire system. The dynamics of such a reaction coordinate is governed by the generalized Langevin equation (GLE), an integro-differential stochastic equation, and involves a memory function [1]. I discuss a few examples where the GLE can be used to interpret and model data in different fields of science.
Protein-folding kinetics is typically described as Markovian (i.e., memoryless) diffusion in a one-dimensional free energy landscape. By analysis of large-scale molecular-dynamics simulation trajectories of fast-folding proteins from the Shaw group using the special-purpose computer ANTON, I demonstrate that the friction characterizing protein folding exhibits significant memory with a decay time that is of the same order as the folding and unfolding times [2,3]. Memory friction effects lead to anomalous and drastically modified protein kinetics. For the set of proteins for which simulations are available, it is shown that the folding and unfolding times are not dominated by the free-energy barrier but rather by the non-Markovian friction.
Memory effects are also present in non-equilibrium systems. Using an appropriate non-equilibrium formulation of the GLE, it is demonstrated that the motion of living organisms is characterized by memory friction, which makes it possible to characterize internal feedback loops of such organisms and to classify and sort individual organisms [4]. The GLE can even be used to predict complex phenomena such as weather data.
[1] C. Ayaz, L. Scalfi, B. A. Dalton, and R. R. Netz. Generalized Langevin equation with a nonlinear potential of mean force and nonlinear memory friction from a hybrid projection scheme. Physical Review E 105, 054138 (2022).
[2] C. Ayaz, L. Tepper, F. N. Brünig, J. Kappler, J. O. Daldrop, and R. R. Netz. Non-Markovian modeling of protein folding. Proc. Natl. Acad. Sci. 118, e2023856118 (2021).
[3] B. A. Dalton, C. Ayaz, L. Tepper, and R. R. Netz. Fast protein folding is governed by memory-dependent friction. Proc. Natl. Acad. Sci. 120, e2220068120 (2023). DOI: 10.1073/pnas.2220068120
[4] A. Klimek, D. Mondal, S. Block, P. Sharma, and R. R. Netz. Data-driven classification of individual cells by their non-Markovian motion. Biophysical Journal 123, 1–11 (2024). https://doi.org/10.1016/j.bpj.2024.03.023
Date & Time: Tuesday, 18.06.24, 11.15am
Location: Blåbær
(11.06.24) Pekka Parviainen: A structural perspective on learning probabilistic graphical models
Speaker: Pekka Parviainen
Abstract: Probabilistic graphical models are representations of multivariate probability distributions in which conditional independencies between variables are expressed with a graph structure. The structure can be, for example, a directed acyclic graph (DAG), as in the case of Bayesian networks, or an undirected graph, as in the case of Markov networks. The structure can be learned from data. In this talk, I concentrate on structure learning using the constraint-based approach, where one conducts statistical conditional independence tests and constructs a structure that expresses the same set of independencies. I take a structural perspective and study how the structure of the distribution that we are trying to learn affects the complexity of learning. I will present some recent results on the complexity of learning Markov networks and contrast these with Bayesian networks. Based on joint work with Fedor Fomin and Tuukka Korhonen.
Date & Time: Tuesday, 11.06.24, 11.15am
Location: Blåbær
(31.05.24) Kenneth Langedal: Graph Neural Networks as Ordering Heuristics for Parallel Graph Coloring
Speaker: Kenneth Langedal
Abstract: The graph coloring problem asks for an assignment of the minimum number of distinct colors to vertices in an undirected graph with the constraint that no pair of adjacent vertices share the same color. The problem is a thoroughly studied NP-hard combinatorial problem with several real-world applications. As such, a number of greedy heuristics have been suggested that strike a good balance between coloring quality, execution time, and parallel scalability.
In this work, we introduce a graph neural network (GNN) based ordering heuristic and demonstrate that it outperforms existing greedy ordering heuristics in both quality and performance. Previous results have demonstrated that GNNs can produce high-quality colorings, but at the expense of excessive running time. Ours is the first work to bring the execution time down to compete with existing greedy heuristics. Our GNN model is trained using both supervised and unsupervised techniques. The experimental results show that a 2-layer GNN model can achieve execution times between the largest degree first (LF) and smallest degree last (SL) ordering heuristics while outperforming both on coloring quality. Increasing the number of layers improves the coloring quality further, and it is only at four layers that SL becomes faster than the GNN. Finally, our GNN-based coloring heuristic achieves superior scaling in the parallel setting compared to both SL and LF.
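To make the role of an ordering heuristic concrete, here is a minimal sketch of greedy coloring under the largest-degree-first (LF) ordering that the GNN heuristic is compared against; the small example graph is hypothetical, and the GNN itself is not reproduced here.

```python
def greedy_coloring(adj, order):
    """Color vertices in the given order; each takes the smallest free color."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Toy undirected graph (adjacency lists): a 5-cycle with one chord (0-2),
# so it contains the triangle 0-1-2 and needs at least 3 colors.
adj = {0: [1, 4, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 4], 4: [3, 0]}

# Largest degree first: order vertices by decreasing degree (stable sort).
lf_order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
coloring = greedy_coloring(adj, lf_order)
```

A learned GNN ordering would simply replace `lf_order` with a model-predicted vertex ranking; the greedy sweep itself stays unchanged.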
Date & Time: Friday, 31.05.24, 11.15am
Location: Blåbær
(24.05.24) Tom Michoel: Causal inference for dynamical systems
Speaker: Tom Michoel
Abstract: Causal inference combines models and data to identify causations from correlations. Structural causal models (SCMs) are data-generating models that represent causal interactions using directed acyclic graphs and equations that express how the distribution of a variable depends on its causal parents. There is a growing realization that, to model feedback cycles, an extension of SCMs to dynamical systems is essential. I will review basic facts about SCMs, stochastic differential equations (SDEs), and recent literature on trying to connect these two types of models. I will show how SCMs can be viewed as equilibrium states of SDEs and highlight some interesting open problems in the field. To conclude, I will present potential application areas for causal inference for dynamical systems in biology.
Date & Time: Friday, 24.05.24, 11.15am
Location: Blåbær
(10.05.24) Yushu Li: Sparse Bayesian Learning and Time series forecasting
Speaker: Yushu Li
Abstract: During the seminar, I will first present a new sparse Bayesian learning model, named the Bayesian Lasso Sparse (BLS) model. The BLS model takes the hierarchical model formulation of the Bayesian Lasso (Park & Casella, 2008) and can provide sparse estimates of the regression parameters. We compare the BLS model with the well-known Relevance Vector Machine, the Fast Laplace, the Bayesian Lasso, and the Lasso, on both simulated and real data. Our results show that the BLS is sparse and precise, especially when dealing with noisy and irregular datasets. I will then briefly present some research topics I have worked on, including nonstationary time series forecasting, wavelet methods in time series analysis, and support vector regression in volatility forecasting.
Reference: Park, T., & Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association, 103(482), 681–686.
Date & Time: Friday, 10.05.24, 11.15am
Location: Blåbær
(06.12.23) Julien Brajard: Super-resolution of satellite observations of sea ice thickness using diffusion models and physical modeling
Speaker: Julien Brajard
Abstract: I will present a simulator of high-resolution sea ice thickness in the Arctic based on diffusion models, which is a type of artificial intelligence (AI) generative model. The high-resolution product obtained will contribute to address critical questions related to Arctic sea ice predictability at seasonal timescales and its role in the Earth's climate system.
Current satellite-based observations of sea ice thickness provide valuable data but are limited by their coarse spatial resolution. High-resolution information is crucial for useful predictions and understanding small-scale features such as leads and thin ice, which significantly impact seasonal forecasting and heat flux calculations.
To overcome these limitations, we propose a multi-step approach. First, a physically based sea ice model, neXtSIM, is employed to generate a synthetic but realistic high-resolution sea ice thickness (SIT) dataset. This synthetic dataset is then filtered to mimic the resolution of present satellite products. An AI-based diffusion model is then trained to enhance the low-resolution SIT data. Finally, the AI-based model is applied to real Earth observation (EO) data.
This project is a work in progress. In the seminar, I will describe the methodology and focus on the first results and the validation of the high-resolution fields produced by our simulator. I will discuss potential applications, as well as the sources of uncertainty, the metrics used for validation, and the margin for improvement.
This work is carried out in the framework of the SuperIce project, funded by ESA.
Date & Time: 06.12.23, 11.15am
Location: Lille auditorium
(22.11.23) Fabio Massimo Zennaro: Learning Causal Abstractions
Speaker: Fabio Massimo Zennaro
Abstract: In this presentation we review the definition of structural causal models and we introduce the problem of relating these models via an abstraction map. We formalize the problem of learning such a causal abstraction map as a minimizer of an abstraction error expressed in terms of interventional consistency, and we discuss some of the challenges involved in this optimization problem. We then present an approach based on a relaxation and parametrization of the problem, leading to a solution based on differentiable programming. The solution approach is evaluated both on synthetic and real-world data.
Date & Time: 22.11.23, 11.00am
Location: Lille auditorium
(01.11.23) Nello Blaser: An introduction to Geometric Deep Learning
Speaker: Nello Blaser
Abstract: Geometric Deep Learning aims to take advantage of structural properties of learning tasks by providing a principled way of designing neural network architectures. This seminar is based on a special topics course taught in spring 2023. It explains the basic building blocks of geometric deep learning (groups, equivariance, non-linearity, local pooling, and global pooling), together with examples of some of the most frequently used domains (sets, graphs, grids, groups).
Date & Time: 01.11.23, 11.15am
Location: Blåbær
(18.10.23) Changkyu Choi: DIB-X: Formulating Explainability Principles for a Self-explainable Model through Information Theoretic Learning
Speaker: Changkyu Choi
Abstract: The recent development of self-explainable deep learning approaches has focused on integrating well-defined explainability principles into the learning process, with the goal of achieving these principles through optimization. In this work, we propose DIB-X, a self-explainable deep learning approach for image data, which adheres to the principles of minimal, sufficient, and interactive explanations. The minimality and sufficiency principles are rooted in the trade-off relationship within the information bottleneck framework. Distinctly, DIB-X directly quantifies the minimality principle using the recently proposed matrix-based Rényi's α-order entropy functional, circumventing the need for variational approximation and distributional assumptions. The interactivity principle is realized by incorporating existing domain knowledge as prior explanations, fostering explanations that align with established domain understanding. Empirical results on MNIST and two marine environment monitoring datasets with different modalities reveal that our approach primarily provides improved explainability, with the added advantage of enhanced classification performance.
Date & Time: 18.10.23, 11.15am
Location: Blåbær
(16.05.23) Samia Touileb: Transformers and Pretrained Language Models
Speaker: Samia Touileb
Abstract: In this talk I will introduce the currently most common architecture for language modeling, called the transformer. The transformer architecture has revolutionized the field of Natural Language Processing (NLP). Transformers rely on two key mechanisms, self-attention and positional encoding, which support temporal representation and allow the model to focus on relationships between words even over long distances. I will show how these types of models can be applied to the task of language modeling. I will also discuss the notion of pretraining, the process of learning how to represent words or sentences from very large amounts of text. We can pretrain a language model and refer to the resulting model as a pretrained language model.
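The self-attention mechanism described above can be sketched in a few lines of NumPy; the dimensions and random weight matrices here are purely illustrative, not taken from any real model.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # pairwise similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # softmax over positions
    return weights @ V, weights                       # mix values by attention

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                    # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Each output row is a weighted mixture of all value vectors, which is what lets the model relate words regardless of how far apart they sit in the sequence.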
Date & Time: 16.05.23, 1.15pm
Location: Lille auditorium
(11.05.23) Juha Harviainen: Revisiting Bayesian Network Learning with Small Vertex Cover
Speaker: Juha Harviainen
Abstract: The structure learning problem of Bayesian networks asks for a directed acyclic graph (DAG) maximizing a given scoring function. Since the problem is NP-hard, parameterized algorithms have been of interest recently with the goal of obtaining polynomial-time algorithms by focusing on restricted classes of DAGs. In this talk, we seek to initiate investigation of two questions: Is there room for significant improvements in the known polynomial algorithms, and can we obtain similar complexity results for weighted sampling and counting of DAGs? We revisit the class of DAGs with a bounded vertex cover number and answer both questions in the affirmative for that class.
Date & Time: 11.05.23, 11am
Location: Lille auditorium
(24.04.23) Fabio Massimo Zennaro: Introduction to Causality: Structural Causal Modelling
Speaker: Fabio Massimo Zennaro
Abstract: In this talk we will introduce one of the most important formalisms to represent causal systems in computer science. We will start with a brief review of causality, highlighting the meaning of causal queries and the limitations of standard statistics and machine learning in answering them. To address these shortcomings, we will present the formalism of structural causal models (SCMs). We will then show how these models can be used to rigorously answer different types of causal questions, including observational, interventional and counterfactual questions. Finally, we will conclude by discussing how this formalization gives rise to a rich theory of causality, and how the ideas underlying causality have strong and promising intersections with artificial intelligence and machine learning.
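The gap between observational and interventional questions mentioned in the abstract can be shown with a toy SCM; the equations and coefficients below are hypothetical, chosen only to exhibit confounding.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Toy SCM with a confounder Z:  Z -> X,  Z -> Y,  X -> Y (effect size 2).
Z = rng.normal(size=n)
X = Z + rng.normal(size=n)
Y = 2.0 * X + 3.0 * Z + rng.normal(size=n)

# Observational query E[Y | X ~= 1]: biased upward, because conditioning
# on X also carries information about the confounder Z.
obs_mean = Y[np.abs(X - 1.0) < 0.05].mean()

# Interventional query E[Y | do(X = 1)]: replacing X's equation severs
# the Z -> X edge, so only the true causal effect of X remains.
X_do = np.full(n, 1.0)
Y_do = 2.0 * X_do + 3.0 * Z + rng.normal(size=n)
int_mean = Y_do.mean()
```

Here the observed conditional mean is roughly 3.5 while the interventional mean is roughly 2, the true causal effect: exactly the distinction between seeing and doing that the SCM formalism makes precise.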
Date & Time: 24.04.23, 1.15pm
Location: Rips
(19.04.23) Tung Kieu: Deep Learning for Spatio-Temporal Sequence Forecasting
(14.04.23) Zheng Zhao: Introduction to Gaussian processes
Speaker: Zheng Zhao
Abstract: Gaussian processes are a class of prior distributions over functions widely used in machine learning. The merit of Gaussian processes is that we can handily encode prior knowledge into the model at hand while simultaneously quantifying uncertainty. In this lecture, we introduce the fundamentals of Gaussian processes by explaining their definition and the motivations for using them. Then, we show how to draw samples from Gaussian processes and solve regression problems with real examples. The lecture is partially based on and simplified from https://github.com/spdes/gp.
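Drawing samples from a Gaussian process prior, as covered in the lecture, can be sketched with an RBF covariance and a Cholesky factorization; the lengthscale, grid, and jitter value are illustrative choices, not taken from the linked material.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=0.5, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / lengthscale**2)

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 5.0, 50)

# Build the covariance matrix; a small "jitter" on the diagonal keeps the
# numerically near-singular RBF matrix positive definite.
K = rbf_kernel(xs, xs) + 1e-6 * np.eye(len(xs))
L = np.linalg.cholesky(K)

# Three function draws from the GP prior: f = L @ standard_normal.
samples = L @ rng.normal(size=(len(xs), 3))
```

Each column of `samples` is one random smooth function evaluated on the grid; plotting them against `xs` is the usual way to visualize the prior.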
Date & Time: 14.04.23, 1.15pm
Location: Store auditorium
(30.03.23) Ricardo Guimarães: An Introduction to Knowledge Graph Embeddings
Speaker: Ricardo Guimarães
Abstract: Knowledge Graphs (KGs) are data structures that represent entities as nodes and the different relations between them as edges, usually directed and labelled. KGs have become vital tools in applications such as information retrieval, knowledge management, data integration, disambiguation, and recommendation systems. However, most important knowledge graphs are highly incomplete. The need for deriving more information from KGs despite the missing data motivated the development of different KG Embedding models in Machine Learning to predict the missing links in a KG. In this talk, we introduce the fundamentals of KG embeddings and give an overview of the existing models and current challenges.
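The link-prediction idea behind KG embeddings can be illustrated with the translational scoring used by models in the TransE family: a triple (head, relation, tail) is plausible when head + relation ≈ tail in the embedding space. The entities, relation, and "trained" state below are fabricated for illustration; real models learn such embeddings from data.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

# Hypothetical toy KG vocabulary with random initial embeddings.
entities = {"Bergen": rng.normal(size=dim), "Norway": rng.normal(size=dim)}
relations = {"located_in": rng.normal(size=dim)}

def transe_score(h, r, t):
    """Translational distance: lower means the triple is more plausible."""
    return float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

# Fake a perfectly trained state by construction, so that
# Bergen + located_in == Norway holds exactly.
entities["Norway"] = entities["Bergen"] + relations["located_in"]
true_score = transe_score("Bergen", "located_in", "Norway")
```

Link prediction then amounts to ranking candidate tails by this score; training pushes true triples toward low scores and corrupted ones toward high scores.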
Date & Time: 30.03.23, 11am
Location: Blåbær
(16.03.23) Pekka Parviainen: Bayesian networks in modern times
Speaker: Pekka Parviainen
Abstract: Bayesian networks are probabilistic graphical models that are used to represent joint probability distributions of several variables. They can be found under the hood in many machine learning methods. In this talk, I will give a short introduction to Bayesian networks and discuss some recent research directions in the field.
Date & Time: 16.03.23, 11am
Location: Lille auditorium
(02.03.23) Ketil Malde: Deep metric learning
Speaker: Ketil Malde
Abstract: Metric learning aims to learn a distance measure between data points. This distance can in turn be used for verification (do data points represent the same object?), recognition (does a data point represent an object in a database of known objects?) and for clustering and other types of data analysis. Here we visit contemporary methods for deep metric learning (including contrastive, non-contrastive, self-supervised, and variational) and look at some of their applications.
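The contrastive objective mentioned in the abstract can be sketched for a single pair of embeddings; the vectors and margin below are hypothetical, and a real system would compute them with a deep network over many pairs.

```python
import numpy as np

def contrastive_loss(a, b, same, margin=1.0):
    """Pair loss: pull same-object pairs together, push others past a margin."""
    d = np.linalg.norm(a - b)
    if same:
        return 0.5 * d**2                        # attract positive pairs
    return 0.5 * max(0.0, margin - d) ** 2       # repel negatives within margin

x1 = np.array([0.10, 0.20])
x2 = np.array([0.15, 0.25])   # embedding of the same object, nearby
x3 = np.array([2.00, 2.00])   # embedding of a different object, far away

pos_loss = contrastive_loss(x1, x2, same=True)
neg_loss = contrastive_loss(x1, x3, same=False)
```

A well-separated negative pair contributes zero loss, so the learned distance directly supports the verification question "do these data points represent the same object?".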
Date & Time: 02.03.23, 11am
Location: Lille auditorium
(07.02.23) Vinay Chakravarthi Gogineni: Personalized Federated Learning
Speaker: Vinay Chakravarthi Gogineni
Abstract: Federated learning (FL) is a distributed learning paradigm that enables geographically dispersed edge devices, often called clients, to learn a global shared model on their locally stored data without revealing it. Due to its ability to handle system and statistical heterogeneity, FL has received enormous attention. Despite its success, traditional FL is not well suited to many practical applications, such as those involving the internet of things (IoT) or cyber-physical systems (CPS), where edge devices are semi-independent with device-specific dynamic behavior. A single universal model for device-specific tasks is neither reasonable nor realistic. Therefore, it is necessary to allow each device to learn and use a local, personalized model. In this talk, a brief overview of traditional federated learning, personalized federated learning, and the challenges associated with them will be presented.
(02.02.23) Ming-Chang Lee: Real-time lightweight anomaly detection approaches for open-ended time series
Speaker: Ming-Chang Lee
(26.10.22) Roman Khotyachuk: Dimensionality Reduction Methods for Numerical Partial Differential Equations (PDEs)
Speaker: Roman Khotyachuk
Abstract: In this talk, we will consider how to deal with high-dimensional data from numerical PDEs. First, I will review some approaches to Dimensionality Reduction (DR) when solving PDEs numerically. The second part will be devoted to general DR methods with applications to PDEs. Finally, I will present some examples of DR from my PhD project.
(05.10.22) Nello Blaser: Open Problems in Topological Machine Learning
Speaker: Nello Blaser
Abstract: Topological methods have recently gained traction in the machine learning community and are actively applied and developed. In this talk I will first give a short primer on topological machine learning and showcase how topological methods can be used in machine learning. The main part of the talk will then be devoted to giving my perspective on important open problems that need to be addressed to allow for more widespread use of topological methods in machine learning.
(21.09.22) Ramin Hasibi: Geometric Machine Learning and Applications in Biology and Computer Vision
Speaker: Ramin Hasibi
Abstract: In my talk, I will discuss machine learning, and specifically deep learning, methods applicable to geometric datasets. In geometric deep learning for datasets such as graphs, sets, 3D shapes, or point clouds, the underlying structure of the dataset is exploited to improve the performance of the machine learning framework. Furthermore, graphs do not obey a fixed structural pattern or size constraints; therefore, the methods investigated in this work should be invariant to the size, structure, and order of the elements in the dataset. This presentation is based on our recent work in the field of biology as well as our collaboration with colleagues at Aalto University of Technology for applications in computer vision and robotics.