Understanding multi-modalities in weather and climate predictions
This PhD project aims to enhance our understanding of weather prediction in the presence of multi-modality by combining machine learning techniques with topological data analysis approaches and appropriate visualization tools.
To produce weather forecasts, an atmospheric model is run multiple times with slightly perturbed initial conditions and/or varied parameters, resulting in an ensemble dataset. The communicated forecast is then based on an analysis of this ensemble, usually taking the ensemble mean as the expected value and the standard deviation as a measure of uncertainty. In most cases this analysis works well, but when the ensemble is multi-modal, i.e. when it contains several distinct likely outcomes, it yields misleading results and discards crucial information. Our objective is to account for multi-modality in weather prediction in order to obtain a more accurate yet still simplified representation of the ensemble.
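To make the problem concrete, here is a minimal sketch on synthetic data, not data from this project. It assumes a hypothetical 50-member ensemble at a single lead time, with half of the members near 2 and half near 10 (say, temperature in degrees C). The function names and all numbers are illustrative assumptions.

```python
import random
import statistics


def make_bimodal_ensemble(n_members=50, seed=0):
    """Synthetic forecast values at one lead time, drawn from two scenarios."""
    rng = random.Random(seed)
    half = n_members // 2
    cold = [rng.gauss(2.0, 0.5) for _ in range(half)]                # scenario A
    warm = [rng.gauss(10.0, 0.5) for _ in range(n_members - half)]   # scenario B
    return cold + warm


def members_near(values, center, radius):
    """Count the members whose value lies within `radius` of `center`."""
    return sum(1 for v in values if abs(v - center) <= radius)


ensemble = make_bimodal_ensemble()
mean = statistics.fmean(ensemble)    # classical "expected value"
std = statistics.pstdev(ensemble)    # classical "uncertainty"
near_mean = members_near(ensemble, mean, 1.0)

print(f"ensemble mean = {mean:.2f}, standard deviation = {std:.2f}")
print(f"members within 1.0 of the mean: {near_mean} of {len(ensemble)}")
```

In this toy case the mean falls between the two scenarios, where no member actually lies, and the large standard deviation reflects the distance between the scenarios rather than the spread within either of them.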
This project is a close collaboration at the University of Bergen between the Meteorological group from the Geophysical Institute, the Machine Learning group and the Visualization group from the Department of Informatics, and the Topological Data Analysis group from the Department of Mathematics. This collaboration allowed us to efficiently identify the challenges related to ensemble weather prediction and to design novel tools that are well adapted to the application.
As a first step we limited our scope to meteograms, i.e. ensembles of univariate time series. Meteograms are a common visualization in the meteorological domain, giving an overview of the possible evolution of a given variable of interest at a fixed geographic point. Such datasets contain a wealth of information and, consequently, analysing them raises multiple challenges that current topological data analysis (TDA) methods and machine learning algorithms cannot overcome. Therefore, we adapted some ideas from TDA to our data, providing fewer theoretical guarantees than classical TDA approaches but producing encouraging practical results. This led to the design of a graph-based method using machine learning clustering methods, as well as the implementation of novel visualization techniques to display such graphs. Our solution is fully automated, yet it provides quantitative and qualitative information that allows meteorologists to understand the automated result and to re-adjust it if necessary. Indeed, we believe that a single 'correct' interpretation does not always exist and that only meteorologists can make the final decision, especially in threatening situations. Thus, the data exploration process as well as the automated analysis and interpretation should remain transparent.
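To give a rough idea of what a graph built from clustered ensemble members can look like, here is a deliberately simplified sketch. It is not the method developed in this project. It assumes a crude stand-in for clustering (splitting the members in two at the largest gap between sorted values, if that gap exceeds a threshold) and links clusters at consecutive time steps by the number of members they share. The names `split_at_largest_gap`, `build_cluster_graph` and `min_gap`, and the toy data, are hypothetical.

```python
def split_at_largest_gap(values, min_gap):
    """Label each member 0 or 1 by cutting at the largest gap between sorted
    values. If no gap reaches `min_gap`, all members stay in cluster 0."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_gap, cut = 0.0, None
    for pos in range(len(order) - 1):
        gap = values[order[pos + 1]] - values[order[pos]]
        if gap > best_gap:
            best_gap, cut = gap, pos
    labels = [0] * len(values)
    if cut is not None and best_gap >= min_gap:
        for pos in range(cut + 1, len(order)):
            labels[order[pos]] = 1
    return labels


def build_cluster_graph(ensemble, min_gap):
    """ensemble[m][t] is the value of member m at time step t.

    Vertices are (time step, cluster label) pairs. The weight of an edge is
    the number of members going from one vertex to the next."""
    n_steps = len(ensemble[0])
    labels = [
        split_at_largest_gap([member[t] for member in ensemble], min_gap)
        for t in range(n_steps)
    ]
    edges = {}
    for t in range(n_steps - 1):
        for m in range(len(ensemble)):
            key = ((t, labels[t][m]), (t + 1, labels[t + 1][m]))
            edges[key] = edges.get(key, 0) + 1
    return labels, edges


# Toy meteogram: four members agree at first, then split into two scenarios.
toy = [
    [1.0, 1.1, 1.0],
    [1.1, 1.2, 1.2],
    [1.0, 1.3, 5.0],
    [1.2, 1.0, 5.2],
]
labels, edges = build_cluster_graph(toy, min_gap=2.0)
for (source, target), weight in sorted(edges.items()):
    print(source, "->", target, "members:", weight)
```

On this toy input the graph has a single branch for the first two time steps and then forks into two branches of two members each. A plot of the mean and standard deviation alone would not show that fork.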
This study resulted in the second poster presentation of this PhD project, "Exploring multi-modalities in weather prediction using a univariate graph based on machine learning techniques", this time at the EGU General Assembly 2021 (April), a conference that brings together geoscientists from all over the world in one meeting covering all disciplines of the Earth, planetary, and space sciences. This presentation was a great opportunity to meet potential future users of our method and to get valuable feedback from them.
Shortly after, our first paper was submitted in May and then published at the EuroVis 2021 conference (June). To present this paper, entitled "Revealing Multimodality in Ensemble Weather Prediction", I gave a 20-minute talk during the machine learning session of this visualization conference. In our paper we propose our novel method and apply it to three historical extreme weather events (Hurricane Sandy in New York in 2012, Storm Lothar in Paris in 1999, and the European heatwave in Bergen in 2019). These historical applications illustrate how our method can aid the understanding of ensemble weather prediction and help identify potentially threatening weather outcomes that might not be discernible with a classical unimodal approach.
In the meantime, I was invited to give a talk at the first CEDAS conference in June (a hybrid event that I could attend *physically* in Bergen!). This event featured presentations related to data science in general, and it was a great opportunity to share my experience and our work with a new type of audience. Again, we received positive and valuable feedback, strengthening our motivation to further develop our method.
The next steps of my PhD will focus on extending our method to multivariate meteorological data and on developing new interactive tools to enhance the user experience with our graph-based solution.