
Available Master's thesis topics in machine learning


Learning and inference with large Bayesian networks

Most learning and inference tasks with Bayesian networks are NP-hard. Therefore, one often resorts to heuristics that give no quality guarantees.

Task: Evaluate the quality of large-scale learning or inference algorithms empirically.

Advisor: Pekka Parviainen

Sum-product networks

Traditionally, probabilistic graphical models use a graph structure to represent dependencies and independencies between random variables. Sum-product networks are a relatively new type of graphical model in which the graph structure models computations rather than the relationships between variables. The benefit of this representation is that inference (computing conditional probabilities) can be done in time linear in the size of the network.
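
To make the linear-time claim concrete, here is a minimal sketch (our illustration, not from any particular paper) of bottom-up evaluation of a toy sum-product network; the structure and weights are made up for the example:

    def evaluate(node, assignment):
        kind = node[0]
        if kind == "leaf":                        # ("leaf", variable, value)
            return 1.0 if assignment[node[1]] == node[2] else 0.0
        if kind == "prod":                        # ("prod", [children])
            result = 1.0
            for child in node[1]:
                result *= evaluate(child, assignment)
            return result
        if kind == "sum":                         # ("sum", [(weight, child), ...])
            return sum(w * evaluate(c, assignment) for w, c in node[1])
        raise ValueError(kind)

    # P(X1 = 1, X2 = 0) in a toy network over two binary variables
    spn = ("sum", [
        (0.7, ("prod", [("leaf", "X1", 1), ("leaf", "X2", 0)])),
        (0.3, ("prod", [("leaf", "X1", 0), ("leaf", "X2", 1)])),
    ])
    print(evaluate(spn, {"X1": 1, "X2": 0}))      # 0.7

Each node is visited exactly once, so the cost of a query grows linearly with the number of edges in the network.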

Potential thesis topics in this area: a) Compare inference speed of sum-product networks and Bayesian networks, and characterize situations in which one model is better than the other. b) Learning sum-product networks is done using heuristic algorithms; what is the effect of this approximation in practice?

Advisor: Pekka Parviainen

Bayesian Bayesian networks

The naming of Bayesian networks is somewhat misleading, because there is nothing Bayesian in them per se; a Bayesian network is just a representation of a joint probability distribution. One can, of course, use a Bayesian network while doing Bayesian inference. One can also learn Bayesian networks in a Bayesian way, that is, instead of finding a single optimal network one computes the posterior distribution over networks.

Task: Develop algorithms for Bayesian learning of Bayesian networks (e.g., MCMC, variational inference, EM).
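
As a hedged illustration of the MCMC route, the sketch below samples DAG structures with a Metropolis step over single-edge changes; the BIC-style score for linear-Gaussian data is a simple stand-in for whichever marginal likelihood the thesis would actually use:

    import numpy as np
    import networkx as nx

    def bic_score(data, graph):
        # BIC of a linear-Gaussian network: a toy stand-in for a proper
        # (Bayesian) marginal likelihood such as BDeu
        n, d = data.shape
        score = 0.0
        for node in range(d):
            parents = sorted(graph.predecessors(node))
            X = np.column_stack([data[:, parents], np.ones(n)]) if parents else np.ones((n, 1))
            beta, *_ = np.linalg.lstsq(X, data[:, node], rcond=None)
            rss = np.sum((data[:, node] - X @ beta) ** 2)
            score += -0.5 * n * np.log(rss / n) - 0.5 * X.shape[1] * np.log(n)
        return score

    def structure_mcmc(data, n_iter=5000, seed=0):
        # Metropolis sampler over DAGs: propose toggling one directed edge,
        # reject any proposal that creates a cycle
        rng = np.random.default_rng(seed)
        d = data.shape[1]
        g = nx.DiGraph()
        g.add_nodes_from(range(d))
        score = bic_score(data, g)
        samples = []
        for _ in range(n_iter):
            u, v = rng.choice(d, size=2, replace=False)
            h = g.copy()
            if h.has_edge(u, v):
                h.remove_edge(u, v)
            else:
                h.add_edge(u, v)
            if nx.is_directed_acyclic_graph(h):
                new_score = bic_score(data, h)
                if np.log(rng.random()) < new_score - score:  # Metropolis acceptance
                    g, score = h, new_score
            samples.append(frozenset(g.edges()))
        return samples  # posterior edge probabilities ~ edge frequencies in samples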

Advisor: Pekka Parviainen

Large-scale (probabilistic) matrix factorization

The idea behind matrix factorization is to represent a large data matrix as a product of two or more smaller matrices. Such factorizations are often used in, for example, dimensionality reduction and recommendation systems. Probabilistic matrix factorization methods can be used to quantify the uncertainty in recommendations. However, large-scale (probabilistic) matrix factorization is computationally challenging.

Potential thesis topics in this area: a) Develop scalable methods for large-scale matrix factorization (non-probabilistic or probabilistic). b) Develop probabilistic methods for implicit feedback (e.g., a recommendation engine when there are no rankings, only knowledge of whether a customer has bought an item).
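
For orientation, here is a deliberately simplified MAP version of probabilistic matrix factorization, where Gaussian priors on the factors turn into L2 penalties; a fully Bayesian treatment would place distributions over U and V instead of point estimates:

    import numpy as np

    def pmf(R, mask, rank=10, steps=200, lr=0.01, lam=0.1, seed=0):
        # MAP estimate of R ~ U @ V.T with Gaussian noise on observed entries
        # (mask = 1 where R is observed) and Gaussian priors on the factors
        rng = np.random.default_rng(seed)
        n, m = R.shape
        U = 0.1 * rng.standard_normal((n, rank))
        V = 0.1 * rng.standard_normal((m, rank))
        for _ in range(steps):
            E = mask * (R - U @ V.T)         # errors on observed entries only
            U += lr * (E @ V - lam * U)      # gradient ascent on the log posterior
            V += lr * (E.T @ U - lam * V)
        return U, V

Scaling this up (mini-batching, distributing the factor updates, or replacing MAP with proper posterior inference) is exactly where the thesis work would start.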

Advisor: Pekka Parviainen

Bayesian deep learning

Standard deep neural networks do not quantify uncertainty in their predictions. Bayesian methods, on the other hand, provide a principled way to handle uncertainty. Combining the two approaches leads to Bayesian neural networks. The challenge is that Bayesian neural networks can be cumbersome to use and difficult to learn.

The task is to analyze Bayesian neural networks and different inference algorithms in some simple setting.
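
One concrete inference algorithm such an analysis could include is MC dropout, which approximates the posterior predictive by keeping dropout stochastic at test time; a short PyTorch-flavoured sketch, where the model is assumed to be any network containing dropout layers:

    import torch

    def mc_dropout_predict(model, x, n_samples=100):
        # Keep dropout active at prediction time (MC dropout) and summarize
        # the resulting distribution of outputs. Note: train() also affects
        # e.g. batch-norm layers; a careful implementation would switch only
        # the dropout modules into training mode.
        model.train()
        with torch.no_grad():
            preds = torch.stack([model(x) for _ in range(n_samples)])
        return preds.mean(dim=0), preds.std(dim=0)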

Advisor: Pekka Parviainen

Deep learning for combinatorial problems

Deep learning is usually applied in regression or classification problems. However, there has been some recent work on using deep learning to develop heuristics for combinatorial optimization problems; see, e.g., [1] and [2].

Task: Choose a combinatorial problem (or several related problems) and develop deep learning methods to solve them.

References: [1] Vinyals, Fortunato and Jaitly: Pointer networks. NIPS 2015. [2] Dai, Khalil, Zhang, Dilkina and Song: Learning Combinatorial Optimization Algorithms over Graphs. NIPS 2017.

Advisors: Pekka Parviainen, Ahmad Hemmati

Estimating the number of modes of an unknown function

Mode seeking considers estimating the number of local maxima (modes) of a function f. Sometimes one can find modes by, e.g., looking for points where the derivative of the function is zero. Often, however, the function is unknown and we only have access to some (possibly noisy) values of it.

In topological data analysis, we can analyze topological structures using persistent homology. For 1-dimensional signals, this can translate into looking at the birth/death persistence diagram, i.e., the birth and death of connected topological components as we expand the space around each point where we have observed our function. These observations turn out to be closely related to the modes (local maxima) of the function. A recent paper [1] proposed an efficient method for mode seeking.
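
One common formulation of this idea, sketched below as our own illustration (not the algorithm of [1]): sweep the superlevel sets of the sampled signal from the top down and track when connected components are born and when they merge.

    import numpy as np

    def mode_persistences(y):
        # Persistence (birth level minus death level) of each local maximum
        # of a sampled 1-D signal, via a top-down sweep of superlevel sets.
        order = np.argsort(y)[::-1]
        root = {}                               # sample index -> component root
        birth = {}                              # root -> level at which it appeared
        persistences = []
        for i in order:
            left, right = root.get(i - 1), root.get(i + 1)
            if left is None and right is None:  # a new local maximum is born
                root[i], birth[i] = i, y[i]
            elif left is not None and right is not None:
                old, young = sorted({left, right}, key=birth.get, reverse=True)
                persistences.append(birth[young] - y[i])  # younger component dies
                for j, r in root.items():       # merge: relabel the young component
                    if r == young:
                        root[j] = old
                root[i] = old
            else:
                root[i] = left if left is not None else right
        persistences.append(np.inf)             # the global maximum never dies
        return sorted(persistences, reverse=True)

    # point estimate of the number of modes, given a noise threshold t:
    # sum(p > t for p in mode_persistences(y))

The thesis would replace the fixed threshold by a probabilistic model of f, for example a Gaussian process posterior, to obtain a probabilistic estimate of the number of modes.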

In this project, the task is to extend the ideas from [1] to get a probabilistic estimate on the number of modes. To this end, one has to use probabilistic methods such as Gaussian processes.

[1] U. Bauer, A. Munk, H. Sieling, and M. Wardetzky. Persistence barcodes versus Kolmogorov signatures: Detecting modes of one-dimensional signals. Foundations of Computational Mathematics 17:1-33, 2017.

Advisors: Pekka Parviainen, Nello Blaser

Enhancing weather forecasts using deep learning methods

For most weather prediction applications, state-of-the-art machine learning methods are still outperformed by weather forecasts produced using atmospheric model approaches [1]. Although usually more accurate, these more classical atmospheric model-based methods have some disadvantages: their output is hard to analyse, and their accuracy may drop in specific cases such as exceptional weather events, which are precisely the cases that meteorologists would like to be able to analyse properly. So, instead of completely replacing model-driven methods by purely data-driven methods such as deep neural networks, one could combine currently produced weather forecasts with deep learning techniques in order to mitigate their flaws [2].

In this project, you will design and train deep neural networks using actual weather prediction data, which are essentially ensembles of time series. The project is applied, but no a priori knowledge about weather prediction or physics is required.

[1] Schultz, Martin G., et al. "Can deep learning beat numerical weather prediction?" Philosophical Transactions of the Royal Society A 379.2194 (2021): 20200097. https://doi.org/10.1098/rsta.2020.0097

[2] Ren, Xiaoli, et al. "Deep learning-based weather prediction: a survey." Big Data Research 23 (2021): 100178. https://doi.org/10.1016/j.bdr.2020.100178

Advisors: Natacha Galmiche, Nello Blaser

Automatic hyperparameter selection for isomap

Isomap is a non-linear dimensionality reduction method with two free hyperparameters (the number of nearest neighbors and the neighborhood radius). Different hyperparameters result in dramatically different embeddings. Previous methods for selecting hyperparameters have focused on choosing a single optimal value. In this project, you will explore the use of persistent homology to find parameter ranges that result in stable embeddings. The project has theoretical and computational aspects.
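
A minimal computational starting point, assuming scikit-learn and a toy manifold: sweep one hyperparameter and quantify how much the embedding moves between neighboring values. Procrustes disparity is just one of several reasonable stability measures here.

    import numpy as np
    from scipy.spatial import procrustes
    from sklearn.datasets import make_swiss_roll
    from sklearn.manifold import Isomap

    X, _ = make_swiss_roll(n_samples=800, random_state=0)

    # embed with a range of neighborhood sizes
    ks = list(range(5, 50, 5))
    embeddings = {k: Isomap(n_neighbors=k, n_components=2).fit_transform(X) for k in ks}

    # disparity between consecutive k values; plateaus suggest stable ranges
    for k0, k1 in zip(ks, ks[1:]):
        _, _, disparity = procrustes(embeddings[k0], embeddings[k1])
        print(k0, k1, disparity)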

Advisor: Nello Blaser

Validate persistent homology

Persistent homology is a generalization of hierarchical clustering that finds more structure than just the clusters. Traditionally, hierarchical clustering has been evaluated using resampling methods that assess stability properties. In this project, you will generalize these resampling methods to develop novel stability properties that can be used to assess persistent homology. The project has theoretical and computational aspects.
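
For the classical case that is to be generalized, a bootstrap stability check for flat clustering might look like the sketch below; the thesis would replace cluster labels by persistence diagrams and the adjusted Rand index by, e.g., bottleneck distances.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import adjusted_rand_score

    def stability(X, n_clusters, n_boot=50, seed=0):
        # refit on bootstrap resamples and compare the induced labelings
        # of the full data set; high agreement = stable clustering
        rng = np.random.default_rng(seed)
        n = len(X)
        scores = []
        for _ in range(n_boot):
            a = rng.choice(n, size=n, replace=True)
            b = rng.choice(n, size=n, replace=True)
            la = KMeans(n_clusters, n_init=10, random_state=0).fit(X[a]).predict(X)
            lb = KMeans(n_clusters, n_init=10, random_state=0).fit(X[b]).predict(X)
            scores.append(adjusted_rand_score(la, lb))
        return np.mean(scores)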

Advisor: Nello Blaser

Topological Anscombe's quartet

This topic is based on the classical Anscombe's quartet and on families of point sets with identical 1D persistence (https://arxiv.org/abs/2202.00577). The goal is to generate more interesting datasets using the simulated annealing methods presented in (http://library.usc.edu.ph/ACM/CHI%202017/1proc/p1290.pdf). This project is mostly computational.

Advisor: Nello Blaser

Persistent homology benchmarks

Persistent homology is becoming a standard method for analyzing data. In this project, you will generate benchmark data sets for testing different aspects of the persistence pipeline. You will generate benchmarks for different objectives, such as data with a known persistence diagram (where, for example, the bottleneck distance can be minimized) and data with classification and regression targets. Data sets will be sampled from a manifold, with or without noise, or from a general probability distribution. This project is mostly computational.

Advisor: Nello Blaser

Divisive covers

Divisive covers are a divisive technique for generating filtered simplicial complexes. They originally used a naive way of dividing data into a cover. In this project, you will explore different methods of dividing space based on principal component analysis, support vector machines, and k-means clustering. In addition, you will explore methods of using divisive covers for classification. This project will be mostly computational.

Advisor: Nello Blaser

Binarized Neural Networks

Binarized neural networks (BNNs) have recently attracted a lot of attention in the AI research community as a memory-efficient alternative to classical deep neural network models. In 2018, Narodytska et al. proposed an exact translation of BNNs into propositional logic. Using this translation, various properties such as robustness against adversarial attacks can be proved. The main tasks in this project are to study BNNs and the translation into propositional logic, implement an optimised version of the translation, and perform experiments verifying its correctness.
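
The key observation behind the translation is that a binarized unit is a threshold function over literals, which has a direct propositional (cardinality-constraint) encoding. A small numpy sketch of one binarized layer, as an illustration of the structure being encoded:

    import numpy as np

    def binarized_layer(x, W, b):
        # inputs, weights and activations all lie in {-1, +1},
        # so each unit is a thresholded "majority vote" over literals
        pre = W @ x + b                       # integer-valued pre-activation
        return np.where(pre >= 0, 1, -1)      # sign activation

    # unit i fires iff sum_j W_ij * x_j >= -b_i; with k of the n products
    # W_ij * x_j equal to +1, the sum is 2k - n, so the unit fires iff
    # k >= ceil((n - b_i) / 2): a cardinality constraint with a standard
    # CNF encoding (cf. Narodytska et al., AAAI-18)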

References:

Binarized neural networks by Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio (NeurIPS-16)

Verifying Properties of Binarized Deep Neural Networks by Nina Narodytska, Shiva Prasad Kasiviswanathan, Leonid Ryzhyk, Mooly Sagiv, Toby Walsh (AAAI-18)

Advisor: Ana Ozaki

Quantum Neural Networks

Quantum computers can solve certain types of problems exponentially faster than classical computers; this is so-called quantum supremacy. However, it is still mostly unclear how far quantum supremacy goes, i.e., for what types of problems quantum computing outperforms classical computing. As quantum computers become larger (more qubits) and more reliable (lower error rates), we approach the point where they may become relevant for machine learning applications.

One of the proposed methods in this field is so-called quantum neural networks (QNN). Where classical neural networks (CNN) use real-valued weights, activation functions, and input and output data, in a QNN all of these are represented by complex quantum states and quantum operations. This allows for a much denser encoding of information, so that a small QNN may be functionally equivalent to a much larger CNN. For larger QNNs, the equivalent CNN would have to be so enormously large that it is completely infeasible.

This leads to the central objective of this project: Under which conditions can a QNN achieve quantum supremacy? How do QNNs and CNNs compare in terms of learning speed, accuracy, etc. for different classes of problems, and how does their performance scale with size?

In the foreseeable future, quantum computers will be relatively noisy, that is, they will have high error rates. This poses an additional problem: How does noise affect the performance of a QNN? Are there limits to how much noise a QNN can tolerate? How does the effect of noise scale with the size of the QNN?

You can approach this project in two ways:

(a) As a theoretical thesis based on mathematical models and learning theory

(b) As a practical thesis based on coding and benchmarking prototypes

… or any combination of (a) and (b). If you are interested, please contact Philip Turk or Ana Ozaki.

References

[1] Beer, K., Bondarenko, D., Farrelly, T. et al. Training deep quantum neural networks. Nat Commun 11, 808 (2020). https://doi.org/10.1038/s41467-020-14454-2

[2] Schuld, M. and Petruccione, F. Supervised Learning with Quantum Computers, Springer, 2018.

Neural Network Verification

Neural networks have been applied in many areas. However, any method based on generalization may fail, and this is by design. The question is how to deal with such failures. To limit them, one can define rules that a neural network should follow and devise strategies to verify whether the rules are obeyed. The main tasks of this project are to study an algorithm for learning rules formulated in propositional Horn logic, implement the algorithm, and apply it to verify neural networks.

References:

Queries and Concept Learning by Angluin (Machine Learning 1988)

Exact Learning: On the Boundary between Horn and CNF by Hermo and Ozaki (ACM TOCT 2020).

Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples by Weiss, Goldberg, Yahav (ICML 2018)

Advisor: Ana Ozaki

Knowledge Graph Embeddings

Knowledge graphs can be understood as labelled graphs whose nodes and edges are enriched with meta-knowledge, such as temporal validity, geographic coordinates, and provenance. Recent research in machine learning attempts to complete (or predict) facts in a knowledge graph by embedding entities and relations in low-dimensional vector spaces. The main tasks of this project are to study knowledge graph embeddings, study ways of integrating temporal validity into the geometric model of a knowledge graph, and implement and test an embedding that represents the temporal evolution of entities through their vector representations.
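
As a reference point, the translational model of Bordes et al. (see the first reference below) scores a triple by how well the relation vector translates the head entity onto the tail; everything beyond this scoring rule (margin loss, negative sampling, and any temporal extension) is design space for the thesis:

    import numpy as np

    def transe_score(h, r, t):
        # TransE: a triple (head, relation, tail) is plausible when the
        # tail embedding is close to head + relation
        return -np.linalg.norm(h + r - t)

    # training ranks true triples above corrupted ones with a margin loss;
    # a temporal extension could, e.g., make r a function of a validity
    # interval (an assumption to explore, not a fixed recipe)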

References:

Translating Embeddings for Modeling Multi-relational Data by Bordes, Usunier, Garcia-Durán (NeurIPS 2013)

Temporally Attributed Description Logics by Ozaki, Krötzsch, Rudolph (Book chapter: Description Logic, Theory Combination, and All That 2019)

Attributed Description Logics: Reasoning on Knowledge Graphs by Krötzsch, Marx, Ozaki, Thost (ISWC 2017)

Advisor: Ana Ozaki

Knowledge Graph Repair

While Knowledge Graphs are becoming increasingly popular, one persistent issue concerns the quality of the data. Sometimes the information described is not only incomplete but also incorrect. One can rely on ontological approaches or on machine learning techniques using knowledge graph embeddings to fix incorrect information in such graphs. This project's primary research goal is to investigate combinations of methods from the two approaches. Embeddings that can relate to the taxonomical rules in the Knowledge Graphs are particularly promising.

References:

  • Improved knowledge graph embedding using background taxonomic information by Fatemi, Ravanbakhsh, Poole. (AAAI 2019).
  • Debugging incoherent terminologies by Schlobach, Huang, Cornet, van Harmelen. (JAIR v39 - 2007).
  • KGClean: An Embedding Powered Knowledge Graph Cleaning Framework by Ge, Gao, Weng, Zhang, Miao, Zheng. (arXiv 2020).

Advisor: Ricardo Guimarães

Decidability and Complexity of Learning 

Gödel showed in 1931 that, essentially, there is no consistent and complete set of axioms that is capable of modelling traditional arithmetic operations. Recently, Ben-David et al. defined a general learning model and showed that learnability in this model may not be provable using the standard axioms of mathematics. The main tasks of this project are to study Gödel's incompleteness theorems, the connection between these theorems and the theory of machine learning, and to investigate learnability and complexity classes in the PAC and the exact learning models.

References:

Learnability can be undecidable by Ben-David, Hrubeš, Moran, Shpilka, Yehudayoff (Nature 2019)

On the Complexity of Learning Description Logic Ontologies by Ozaki (RW 2020)

Advisor: Ana Ozaki

Machine Ethics

Autonomous systems, such as self-driving cars, need to behave according to the environment in which they are embedded. However, ethical and moral behaviour is not universal: the underlying behavioural norms often differ among countries or groups of countries, and a compromise among such differences needs to be considered.

The Moral Machine experiment (https://www.moralmachine.net/) exposed people to a series of moral dilemmas and asked what an autonomous vehicle should do in each of the given situations. Researchers then tried to find similarities between the answers from the same region.

The main tasks of this project are to study the Moral Machine experiment and to study and implement an algorithm for building compromises among different regions (or even people). We have developed a compromise-building algorithm that works on behavioural norms represented as Horn clauses. Assume that each choice example from the Moral Machine experiment is a behavioural norm represented as a Horn clause. The compromise algorithm is applied to these choices obtained from different people during the experiment. One of the goals of this project would be to determine how to (efficiently) compute compromises for groups of countries (e.g., the Nordic countries and Scandinavia).


References:

The Moral Machine experiment by Edmond Awad, Sohan Dsouza, Richard Kim, Jonathan Schulz, Joseph Henrich, Azim Shariff, Jean-François Bonnefon, and Iyad Rahwan (Nature 2018)


Advisors: Ana Ozaki, Marija Slavkovik

Reinforcement learning for sparsification

Reinforcement learning has recently become a way to heuristically solve optimization problems. In this project, you will set up the problem of finding a sparse approximation for persistent homology in the reinforcement learning framework. You will train a neural network to find approximations of simplicial complexes that can be smaller and more precise than those from traditional approximation techniques. The setup of the reinforcement learning problem requires a deep theoretical understanding, and the problem also has a computational aspect.

Advisor: Nello Blaser

Multimodality in Bayesian neural network ensembles

One method to assess uncertainty in neural network predictions is to use dropout or noise generators at prediction time and run every prediction many times. This leads to a distribution of predictions. Informatively summarizing such probability distributions is a non-trivial task and the commonly used means and standard deviations result in the loss of crucial information, especially in the case of multimodal distributions with distinct likely outcomes. In this project, you will analyze such multimodal distributions with mixture models and develop ways to exploit such multimodality to improve training. This project can have theoretical, computational and applied aspects.
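
A small sketch of the summarization step, with synthetic stand-in data in place of real stochastic forward passes: fit a mixture model to the prediction samples and report components instead of a single mean and standard deviation.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # stand-in for predictions collected over many stochastic forward passes
    np.random.seed(0)
    preds = np.concatenate([np.random.normal(0, 0.3, 600),
                            np.random.normal(3, 0.4, 400)])

    gmm = GaussianMixture(n_components=2).fit(preds.reshape(-1, 1))
    print(gmm.means_.ravel(), gmm.weights_)   # two distinct likely outcomes
    # a single mean/std would report roughly 1.2 +/- 1.5 and hide both modes;
    # model selection (e.g. BIC over n_components) would detect multimodality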

Advisor: Nello Blaser

Multitask variational autoencoders

Autoencoders are a type of artificial neural network that learns a data representation, typically for dimensionality reduction. Variational autoencoders are generative models that combine the autoencoder architecture with probabilistic graphical modeling. They may be used to restore damaged data by conditioning the decoder on the remaining data. In this project, you will explore whether joint training of a traditional variational autoencoder and restoring variational autoencoders can make the embedding more stable. The project will be mostly computational, but may have some theoretical aspects.
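
For reference, a minimal variational autoencoder in PyTorch (a generic sketch; the "restoring" variants would additionally condition the decoder on the undamaged part of the input):

    import torch
    from torch import nn

    class VAE(nn.Module):
        # encoder maps x to a Gaussian over latent z; decoder reconstructs x from z
        def __init__(self, d_in=784, d_lat=16):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU())
            self.mu = nn.Linear(256, d_lat)
            self.logvar = nn.Linear(256, d_lat)
            self.dec = nn.Sequential(nn.Linear(d_lat, 256), nn.ReLU(),
                                     nn.Linear(256, d_in), nn.Sigmoid())

        def forward(self, x):
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
            return self.dec(z), mu, logvar

    def vae_loss(x, recon, mu, logvar):
        # negative evidence lower bound: reconstruction error plus KL term
        rec = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
        kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kld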

Advisor: Nello Blaser

Detecting small clusters

Standard clustering methods are good at detecting clusters of a certain size and density. Detecting small clusters is difficult because they lie in low-density regions. In this project, you will use methods from anomaly detection coupled with clustering techniques to overcome this challenge. In addition, you will test the new techniques on real-world mass cytometry data. This project will be computational and applied.
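
One possible coupling of the two ideas, sketched with scikit-learn components (the specific detectors and thresholds are placeholders, not a recommendation):

    from sklearn.cluster import DBSCAN
    from sklearn.ensemble import IsolationForest

    def small_cluster_candidates(X):
        # flag low-density points as "anomalies", then cluster only those:
        # a tight group of anomalies is a candidate small cluster,
        # while scattered anomalies remain labeled as noise (-1)
        flags = IsolationForest(random_state=0).fit_predict(X)   # -1 = anomaly
        anomalies = X[flags == -1]
        labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(anomalies)
        return anomalies, labels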

Advisor: Nello Blaser

Cytometry analysis

We are looking for 2-3 students to join an interdisciplinary project where you will work together with medical doctors to analyse mass cytometry data. This is data on single cells and we are considering both suspension and image data. Potential projects range from applied data analysis to the development of new specialized methods to solve problems that arise in mass cytometry.

Advisors: Nello Blaser, Sonia Gavasso

Simulated underwater environment and deep learning

Using data from the Mareano surveys or the LoVe underwater observatory, create a simulator for underwater benthic (i.e. sea bed) scenes by placing objects randomly (but credibly) on a background. Using the simulated data, train deep learning neural networks to:

a) recognize the presence of specific objects, b) locate specific objects, and c) segment specific objects.

Test the systems on real data and evaluate the results.
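
A minimal sketch of the simulator core, using Pillow and assuming object images with transparent backgrounds that are smaller than the background image (paths and counts are placeholders); the returned bounding boxes double as labels for detection networks:

    import random
    from PIL import Image

    def compose_scene(background_path, object_paths, n_objects=10, seed=0):
        # paste randomly chosen objects at random positions onto a benthic
        # background; return the image plus bounding boxes as labels
        random.seed(seed)
        scene = Image.open(background_path).convert("RGBA")
        boxes = []
        for _ in range(n_objects):
            obj = Image.open(random.choice(object_paths)).convert("RGBA")
            x = random.randint(0, scene.width - obj.width)
            y = random.randint(0, scene.height - obj.height)
            scene.paste(obj, (x, y), mask=obj)   # alpha mask keeps transparency
            boxes.append((x, y, x + obj.width, y + obj.height))
        return scene.convert("RGB"), boxes

Making the placement "credible" (depth-dependent scaling, lighting, occlusion) is where most of the project work would lie.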

Advisor: Ketil Malde

Evaluating the effects and interaction of hyperparameters in convolutional neural networks

Neural networks have many hyperparameters, including the choice of activation function, regularization and normalization, gradient descent method, early stopping, cost function, and so on. While best practices exist, the interactions between the different choices can be hard to predict. To study this, train networks on suitable benchmark data using randomized choices of hyperparameters, and observe quantities like rate of convergence, over- and underfitting, magnitude of the gradient, and final accuracy.
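
A sketch of the experimental loop: sample configurations at random and record training statistics. Here train_and_evaluate is a hypothetical stand-in for your actual training code, and the search space is only an example:

    import random

    def train_and_evaluate(config):
        # hypothetical helper: train a network with this config on benchmark
        # data and return e.g. (epochs_to_converge, train_acc - val_acc, val_acc)
        raise NotImplementedError

    search_space = {
        "activation":   ["relu", "tanh", "elu"],
        "optimizer":    ["sgd", "adam", "rmsprop"],
        "batch_norm":   [True, False],
        "weight_decay": [0.0, 1e-4, 1e-2],
        "lr":           [1e-1, 1e-2, 1e-3, 1e-4],
        "early_stop":   [True, False],
    }

    random.seed(0)
    trials = []
    for _ in range(100):
        config = {k: random.choice(v) for k, v in search_space.items()}
        trials.append((config, train_and_evaluate(config)))
    # regressing the recorded metrics on the configuration factors (or an
    # ANOVA) exposes main effects and interactions between the choices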

Advisor: Ketil Malde

Online learning in real-time systems

Build a model for the drilling process by using the Virtual simulator OpenLab (https://openlab.app/) for real-time data generation and online learning techniques. The student will also do a short survey of existing online learning techniques and learn how to cope with errors and delays in the data.

Advisor: Rodica Mihai

Building a finite state automaton for the drilling process by using queries and counterexamples

Datasets will be generated by using the virtual simulator OpenLab (https://openlab.app/). The student will study the datasets and decide upon a good setting for extracting a finite state automaton for the drilling process. The student will also do a short survey of existing techniques for extracting finite state automata from process data. A natural starting point is the approach of Weiss et al. (ICML 2018), which uses exact learning and abstraction to extract a deterministic finite automaton describing the state dynamics of a trained RNN, using Angluin's L* algorithm as a learner and the trained RNN as an oracle; the technique extracts accurate automata even when the state vectors are large and require fine differentiation.

Advisor: Rodica Mihai

Machine learning approaches toward personalized treatment of leukemia

As new data on multiple omics levels reveal more information on leukemia and the effects of drugs, there are new opportunities to tailor treatment to each individual patient. In an ongoing European project, we study leukemia using data both from individual patients and from cell line and mouse model systems to improve the understanding of genomic clonality and signaling pathway status, aiming to generate data that enable machine learning approaches to predict prognosis and treatment response. The focus of the project will be on setting up an appropriate software system enabling the evaluation of alternative feature selection methods and classification approaches. There is an opportunity to work closely with bioinformatics, systems biology, and cancer researchers in the above-mentioned European project, including partners in Germany and the Netherlands, and also with the Centre of Excellence CCBIO (Center for Cancer Biomarkers) in Bergen.

Advisor: Inge Jonassen

Applications of causal inference methods to omics data

Many hard problems in machine learning are directly linked to causality [1]. The graphical causal inference framework developed by Judea Pearl can be traced back to pioneering work by Sewall Wright on path analysis in genetics and has inspired research in artificial intelligence (AI) [1].

The Michoel group has developed the open-source tool Findr [2] which provides efficient implementations of mediation and instrumental variable methods for applications to large sets of omics data (genomics, transcriptomics, etc.). Findr works well on a recent data set for yeast [3].

We encourage students to explore promising connections between the fields of causal inference and machine learning. Feel free to contact us to discuss projects related to causal inference. Possible topics include: a) improving methods based on structural causal models, b) evaluating causal inference methods on data for model organisms, c) comparing methods based on causal models with neural network approaches.

References:

1. Schölkopf B, Causality for Machine Learning, arXiv (2019): https://arxiv.org/abs/1911.10500

2. Wang L and Michoel T. Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data. PLoS Computational Biology 13:e1005703 (2017). https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005703

3. Ludl A and Michoel T. Comparison between instrumental variable and mediation-based methods for reconstructing causal gene networks in yeast. arXiv:2010.07417. https://arxiv.org/abs/2010.07417

Advisors: Adriaan Ludl, Tom Michoel

Space-Time Linkage of Fish Distribution to Environmental Conditions

Background

Conditions in the marine environment, such as temperature and currents, influence the spatial distribution and migration patterns of marine species. Hence, understanding the link between environmental factors and fish behavior is crucial in predicting, e.g., how fish populations may respond to climate change. Deriving this link is challenging because it requires the analysis of two types of datasets: (i) large environmental datasets (currents, temperature) that vary in space and time, and (ii) sparse and sporadic spatial observations of fish populations.

Project goal   

The primary goal of the project is to develop a methodology that helps predict how the spatial distributions of two fish stocks (capelin and mackerel) change in response to variability in the physical marine environment (ocean currents and temperature). The information can also be used to optimize data collection by minimizing the time spent on spatial sampling of the populations.

Approach

The project will focus on the use of machine learning and/or causal inference algorithms. As a first step, we use synthetic (fish and environmental) data from analytic models that couple the two data sources. Because the 'truth' is known, we can judge the efficiency and error margins of the methodologies. We then apply the methodologies to real-world (empirical) observations.

Advisors: Tom Michoel, Sam Subbey

Towards precision medicine for cancer patient stratification

On average, a drug or a treatment is effective in only about half of the patients who take it. This means patients need to try several treatments until they find one that is effective, at the cost of the side effects associated with each of them. The ultimate goal of precision medicine is to provide the treatment best suited for every individual. Sequencing technologies have now made genomics data available in abundance to be used towards this goal.

In this project we will specifically focus on cancer. Most cancer patients get a particular treatment based on the cancer type and stage, yet different individuals react differently to a treatment. It is now well established that genetic mutations cause cancer growth and spreading, and importantly, these mutations differ between individual patients. The aim of this project is to use genomic data for better stratification of cancer patients and to predict the treatment most likely to work. Specifically, the project will use machine learning approaches to integrate genomic data and build a classifier for the stratification of cancer patients.

Advisor: Anagha Joshi

Unraveling gene regulation from single cell data

Multi-cellularity is achieved by precise control of gene expression during development and differentiation, and aberrations of this process lead to disease. A key regulatory step in gene regulation is at the transcriptional level, where epigenetic and transcriptional regulators control the spatial and temporal expression of target genes in response to environmental, developmental, and physiological cues obtained from a signalling cascade. The rapid advances in sequencing technology have now made it feasible to study this process by characterizing the genome-wide patterns of diverse epigenetic and transcription factors, including at the single-cell level.

Single-cell RNA sequencing is highly important, particularly in cancer, as it allows the exploration of heterogeneous tumor samples; this heterogeneity obstructs therapeutic targeting and leads to poor survival. Despite huge clinical relevance and potential, the analysis of single-cell RNA-seq data is challenging. In this project, we will develop strategies to infer gene regulatory networks using network inference approaches (both supervised and unsupervised). These will be tested primarily on single-cell datasets in the context of cancer.

Advisor: Anagha Joshi

Developing a Stress Granule Classifier

To carry out the multitude of functions 'expected' from a human cell, the cell employs a strategy of division of labour, whereby sub-cellular organelles carry out distinct functions. Thus we traditionally understand organelles as distinct units, defined both functionally and physically, with a distinct shape and size range. More recently, a new class of organelles has been discovered that are assembled and dissolved on demand and are composed of liquid droplets or 'granules'. Granules show many properties characteristic of liquids, such as flow and wetting, but they can also assume many shapes and indeed also fluctuate in shape. One such liquid organelle is a stress granule (SG).

Stress granules are pro-survival organelles that assemble in response to cellular stress and are important in cancer and neurodegenerative diseases like Alzheimer's. They are liquid or gel-like and can assume varying sizes and shapes depending on their cellular composition.

In a given experiment we are able to image the entire cell over a time series of 1000 frames, from which we extract a rough estimation of the size and shape of each granule. Our current method is susceptible to noise, and a granule may be falsely rejected if the boundary is drawn poorly in a small majority of frames. Ideally, we would also like to identify potentially interesting features, such as voids, in the accepted granules.

We are interested in applying a machine learning approach to develop a descriptor for a 'classic' granule and, furthermore, to classify granules into different functional groups based on the disease status of the cell. This method would be applied across thousands of granules imaged from control and disease cells. We are a multi-disciplinary group consisting of biologists, computational scientists, and physicists.

Advisors: Sushma Grellscheid, Carl Jones

Machine Learning based Hyperheuristic algorithm

Develop a machine learning based hyper-heuristic algorithm to solve a pickup and delivery problem. A hyper-heuristic is a heuristic that chooses heuristics automatically; it seeks to automate the process of selecting, combining, generating, or adapting several simpler heuristics to efficiently solve computational search problems [Handbook of Metaheuristics]. There may be multiple heuristics for solving a problem, each with its own strengths and weaknesses. In this project, we want to use machine learning techniques to learn the strengths and weaknesses of each heuristic while using them in an iterative search for high-quality solutions, and then use them intelligently for the rest of the search. As new information is gathered during the search, the hyper-heuristic algorithm automatically adjusts the heuristics.
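
One simple learning mechanism for the selection step is a bandit-style rule: keep a running average reward per heuristic and choose mostly greedily. This sketch is only a baseline; the heuristics and the cost function are problem-specific inputs supplied by the caller:

    import random

    def hyper_heuristic_search(initial, heuristics, cost, n_iter=10000, eps=0.2):
        # epsilon-greedy selection: mostly pick the heuristic with the best
        # average observed reward, sometimes explore a random one
        reward = {h: 0.0 for h in heuristics}
        count = {h: 1 for h in heuristics}
        best = initial
        for _ in range(n_iter):
            if random.random() < eps:
                h = random.choice(heuristics)
            else:
                h = max(heuristics, key=lambda k: reward[k] / count[k])
            candidate = h(best)                  # heuristics map solution -> solution
            gain = cost(best) - cost(candidate)  # improvement in objective value
            reward[h] += max(gain, 0.0)
            count[h] += 1
            if gain > 0:
                best = candidate
        return best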

Advisor: Ahmad Hemmati

Machine learning for solving satisfiability problems and applications in cryptanalysis

Advisor: Igor Semaev

Hybrid modeling approaches for well drilling with Sintef

Several topics are available.


Background

"Flow models" are first-principles models simulating the flow, temperature and pressure in a well being drilled. Our project is exploring "hybrid approaches" where these models are combined with machine learning models that either learn from time series data from flow model runs or from real-world measurements during drilling. The goal is to better detect drilling problems such as hole cleaning, make more accurate predictions and correctly learn from and interpret real-word data.

The "surrogate model" refers to an ML model which learns to mimic the flow model by learning from the model inputs and outputs. Use cases for surrogate models include model predictions where speed is favoured over accuracy, and exploration of the parameter space.


Surrogate models with active Learning

While it is possible to produce a nearly unlimited amount of training data by running the flow model, the surrogate model may still perform poorly if it lacks training data in the part of the parameter space it operates in or if it "forgets" areas of the parameter space by being fed too much data from a narrow range of parameters.

The goal of this thesis is to build a surrogate model (with any architecture) for some restricted parameter range and implement an active learning approach where the ML requests more model runs from the flow model in the parts of the parameter space where it is needed the most. The end result should be a surrogate model that is quick and performs acceptably well over the whole defined parameter range.
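
A hedged sketch of one active-learning round, using disagreement across a random-forest ensemble as the uncertainty signal; run_flow_model is a hypothetical stand-in for an actual (expensive) flow-model invocation:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def active_learning_round(X, y, candidates, run_flow_model, n_query=10):
        # train an ensemble surrogate, then query the flow model where the
        # trees disagree the most (uncertainty sampling)
        surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
        per_tree = np.stack([t.predict(candidates) for t in surrogate.estimators_])
        uncertainty = per_tree.std(axis=0)
        query = np.argsort(uncertainty)[-n_query:]     # most uncertain inputs
        X_new = candidates[query]
        y_new = run_flow_model(X_new)                  # hypothetical oracle call
        return np.vstack([X, X_new]), np.concatenate([y, y_new]), surrogate

Repeating such rounds over the defined parameter range should keep the surrogate accurate everywhere it is expected to operate.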


Surrogate models trained via adversarial learning

How best to train surrogate models from runs of the flow model is an open question. This master's thesis would use an adversarial learning approach to build a surrogate model whose output becomes indistinguishable, to its "adversary", from the output of an actual flow model run.


GPU-based Surrogate models for parameter search

While CPU speed, in terms of the working frequency of single cores, largely stalled 20 years ago, multi-core CPUs and especially GPUs took off and delivered increases in computational power by parallelizing computations.

Modern machine learning, such as deep learning, takes advantage of this boom in computing power by running on GPUs.

The SINTEF flow models, in contrast, are software programs that run on a CPU and do not utilize multi-core CPU functionality. The model runs advance time step by time step, and each time step relies on the results of the previous one. The flow models are therefore fundamentally sequential and not well suited to massive parallelization.

It is however of interest to run different model runs in parallel, to explore parameter spaces. The use cases for this includes model calibration, problem detection and hypothesis generation and testing.

The task of this thesis is to implement an ML-based surrogate model in such a way that many surrogate model outputs can be produced at the same time on a single GPU. This will likely entail some trade-off with model size and perhaps some coding tricks.


Uncertainty estimates of hybrid predictions

When using predictions from an ML model trained on time series data, it is useful to know whether they are accurate and should be trusted. The student is challenged to develop hybrid approaches that incorporate estimates of uncertainty. Components could include reporting the variance of ML ensembles trained on a diversity of time series data, implementing conformal predictions, analysing training data parameter ranges versus the current input, etc. The output should be a "traffic light signal" roughly indicating the accuracy of the predictions.
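
Of the listed components, conformal prediction is easy to sketch: residuals on a held-out calibration set yield a distribution-free interval whose width can drive the traffic-light signal. Exchangeable data is assumed here, which drifting time series can violate; the model is assumed to expose a scikit-learn-style predict method:

    import numpy as np

    def conformal_interval(model, X_cal, y_cal, x_new, alpha=0.1):
        # split conformal prediction around a trained point-prediction model
        residuals = np.abs(y_cal - model.predict(X_cal))
        n = len(residuals)
        level = min(np.ceil((1 - alpha) * (n + 1)) / n, 1.0)  # finite-sample correction
        q = np.quantile(residuals, level)
        pred = model.predict(x_new)
        return pred - q, pred + q   # wide interval -> red light, narrow -> green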


Transfer learning approaches

We assume an ML model is to be used for time series prediction.

It is possible to train an ML model on a wide range of scenarios in the flow models, but we expect that, to perform well, the model also needs to see model runs representative of the type of well and drilling operation it will be used in. In this thesis, the student implements a transfer learning approach, where the model is trained on general model runs and fine-tuned on a representative data set.

(Bonus 1: implementing one-shot learning; Bonus 2: using real-world data in the fine-tuning stage)
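
In PyTorch terms, the fine-tuning stage could look like the sketch below; model.head is an assumed name for the final block, and freezing everything but the head is only one of several reasonable policies:

    import torch

    def fine_tune(model, representative_loader, lr=1e-4, epochs=5):
        # freeze the layers learned on general flow-model runs,
        # retrain only the head on the representative data set
        for p in model.parameters():
            p.requires_grad = False
        for p in model.head.parameters():        # 'head' is an assumed attribute
            p.requires_grad = True
        opt = torch.optim.Adam(model.head.parameters(), lr=lr)
        loss_fn = torch.nn.MSELoss()
        for _ in range(epochs):
            for x, y in representative_loader:
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                opt.step()
        return model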


ML capable of reframing situations

When a human oversees an operation like well drilling, she has a mental model of the situation and new data such as pressure readings from the well is interpreted in light of this model. This is referred to as "framing" and is the normal mode of work. However, when a problem occurs, it becomes harder to reconcile the data with the mental model. The human then goes into "reframing", building a new mental model that includes the ongoing problem. This can be seen as a process of hypothesis generation and testing.

A computer model, however, lacks reframing. A flow model will keep making predictions under the assumption of no problems, and a separate alarm system will use the deviation between the model predictions and reality to raise an alarm. This is in a sense how all alarm systems work, but it means that the human must discard the computer model as a tool at the same time as she is handling a crisis.

The student is given access to a flow model and a surrogate model that can learn from model runs both with and without hole cleaning problems, and is challenged to develop a hybrid approach where the ML + flow model combination continuously performs hypothesis generation and testing and is able to "switch" into predicting a hole cleaning problem and different remediations of it.


Advisor: Philippe Nivlet at SINTEF, together with an advisor from UiB


Explainable AI at Equinor

In the project Machine Teaching for XAI (see https://xai.w.uib.no), a master thesis is offered in collaboration between UiB and Equinor.

Advisor: One of Pekka Parviainen/Jan Arne Telle/Emmanuel Arrighi + Bjarte Johansen from Equinor.

Explainable AI at Eviny

In the project Machine Teaching for XAI (see https://xai.w.uib.no), a master thesis is offered in collaboration between UiB and Eviny.

Advisor: One of Pekka Parviainen/Jan Arne Telle/Emmanuel Arrighi + Kristian Flikka from Eviny.

Own topic combining logic and learning

If you want to suggest your own topic combining logic and learning, please contact Ana Ozaki.

Own topic

If you want to suggest your own topic, please contact Pekka Parviainen.