Research project

Mainstreaming Sensitivity Analysis and Uncertainty Auditing

This project was financed by the Peder Sather Grant and was a cooperation between the Centre for the Study of the Sciences and the Humanities, NTNU and UC Berkeley.

The purpose of the Peder Sather Center for Advanced Study is to strengthen existing research collaborations and to promote the development of new collaborations between the University of California, Berkeley (UC Berkeley) and the Norwegian partner universities.

Photo: Peder Sather Center


The aim of the project was to develop a serious collaboration regarding sensitivity analysis and uncertainty quantification, with an emphasis on training the next generation of scientists and engineers to use the tools routinely.

Information about the project from the final report:

We expect to continue to collaborate on the intersection of sensitivity analysis and statistics. In particular, we are examining the use of sensitivity analysis for model selection and variable selection. The core idea is to wrap the entire statistical modeling enterprise in a generalization of SA. This requires new measures of “variable importance” or “feature importance” to compare SA to more standard statistical methods, and careful thought about the target of sensitivity.

For instance, the loss function might be some function of the predictions and the data, e.g., prediction mean-squared error, either in-sample (i.e., on the training data) or out-of-sample (i.e., on held-out test data), or a function that measures the extent to which the procedure identifies the “correct” variables.
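As an illustration of taking prediction mean-squared error as the target of sensitivity, the sketch below scrambles one column at a time and records how much the error grows, both in-sample and out-of-sample. It is a minimal example under stated assumptions, not the project's method: the simulated data, the linear model, and the helper names (`fit_ols`, `scramble_column`, `permutation_importance`) are all hypothetical and chosen only to keep the example self-contained.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in X]

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def scramble_column(X, j, rng):
    """Return a copy of X with column j randomly permuted across rows."""
    col = [x[j] for x in X]
    rng.shuffle(col)
    return [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]

def permutation_importance(beta, X, y, rng, repeats=20):
    """Mean increase in MSE when each column is scrambled (model held fixed)."""
    base = mse(y, predict(beta, X))
    importances = []
    for j in range(len(X[0])):
        increases = [mse(y, predict(beta, scramble_column(X, j, rng))) - base
                     for _ in range(repeats)]
        importances.append(sum(increases) / repeats)
    return importances

rng = random.Random(0)

def simulate(n):
    # Assumed data-generating process: column 0 matters most, column 2 not at all.
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [2.0 * x[0] + 0.5 * x[1] + rng.gauss(0, 0.5) for x in X]
    return X, y

X_train, y_train = simulate(200)
X_test, y_test = simulate(200)
beta = fit_ols(X_train, y_train)

imp_in = permutation_importance(beta, X_train, y_train, rng)   # in-sample target
imp_out = permutation_importance(beta, X_test, y_test, rng)    # out-of-sample target
print("in-sample:", [round(v, 3) for v in imp_in])
print("out-of-sample:", [round(v, 3) for v in imp_out])
```

Swapping `mse` for a different loss, such as one that scores whether the “correct” variables were selected, changes the target of sensitivity without changing the scrambling procedure.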

More generally, the SA framework makes it possible to study the stability of results under “natural variation” (noise, the particular sample or cases included, etc.) and “researcher variation” (choices including data cleaning, pre-transformations, the functional form of the model, smoothing, variable selection, etc.).
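One way to make “researcher variation” concrete is to rerun the same analysis under every combination of a few defensible choices and look at the spread of the result. The sketch below does this for a two-group comparison; the simulated data, the two choices considered (an outlier-trimming rule and mean versus median), and the helper names `trim` and `effect` are assumptions made for illustration only.

```python
import itertools
import random
import statistics

def trim(values, k):
    """Drop points more than k standard deviations from the mean (k=None keeps all)."""
    if k is None:
        return list(values)
    m, s = statistics.fmean(values), statistics.stdev(values)
    return [v for v in values if abs(v - m) <= k * s]

def effect(treated, control, trim_k, center):
    """Estimated group difference under one combination of analysis choices."""
    return center(trim(treated, trim_k)) - center(trim(control, trim_k))

rng = random.Random(2)
control = [rng.gauss(0.0, 1.0) for _ in range(100)]
# Assumed data: a modest true shift plus a handful of gross outliers.
treated = [rng.gauss(0.5, 1.0) for _ in range(95)] + [8.0, 9.0, 10.0, 11.0, 12.0]

trim_options = [None, 3.0, 2.0]                      # data-cleaning choice
center_options = {"mean": statistics.fmean,          # estimator choice
                  "median": statistics.median}

results = {}
for trim_k, (name, center) in itertools.product(trim_options, center_options.items()):
    results[(trim_k, name)] = effect(treated, control, trim_k, center)

# The range of estimates across choices is a crude measure of researcher variation.
spread = max(results.values()) - min(results.values())
for key, value in sorted(results.items(), key=lambda kv: kv[1]):
    print(key, round(value, 3))
print("spread across choices:", round(spread, 3))
```

Natural variation could be layered on top by repeating the loop over resampled or re-simulated data and comparing the two sources of spread.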

This framing involves a shift of perspective: sampling “rows” (cases) with or without replacement, in addition to sampling/scrambling “columns” (points at which observations are made), rather than sampling/scrambling columns alone. This could lead to a version of SA where the observed cases are thought of as a sample from some larger real or hypothetical population of cases, that is, a “bootstrap” SA, which could provide upper bounds on the generalizability of results.
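A rough sketch of what resampling rows might look like in practice: the cases are drawn with replacement, a sensitivity measure is recomputed on each resample, and the spread across resamples indicates how stable the ranking of inputs is. Everything here is an assumption for illustration. The data are simulated, the helper names are hypothetical, and the squared correlation between each column and the output is used as a simple stand-in for a sensitivity index; a column-scrambling measure could be substituted inside the loop.

```python
import random

def squared_correlation(xs, ys):
    """Squared Pearson correlation, used here as a simple sensitivity index."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy * sxy / (sxx * syy)

def sensitivity_indices(X, y):
    """One index per column of X."""
    return [squared_correlation([row[j] for row in X], y) for j in range(len(X[0]))]

def bootstrap_sa(X, y, rng, replicates=500):
    """Recompute the indices on rows resampled with replacement."""
    n = len(X)
    draws = []
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(sensitivity_indices([X[i] for i in idx], [y[i] for i in idx]))
    return draws

def percentile_interval(values, lo=0.025, hi=0.975):
    """Simple percentile interval from the sorted bootstrap draws."""
    s = sorted(values)
    return s[int(lo * (len(s) - 1))], s[int(hi * (len(s) - 1))]

rng = random.Random(1)
# Assumed data-generating process: column 0 dominates, column 2 is inert.
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(150)]
y = [2.0 * r[0] + 0.5 * r[1] + rng.gauss(0, 0.5) for r in X]

draws = bootstrap_sa(X, y, rng)
intervals = [percentile_interval([d[j] for d in draws]) for j in range(3)]
# Share of resamples in which column 0 is ranked most influential.
top_share = sum(d.index(max(d)) == 0 for d in draws) / len(draws)

for j, (low, high) in enumerate(intervals):
    print(f"column {j}: index in [{low:.3f}, {high:.3f}]")
print("column 0 ranked first in", top_share, "of resamples")
```

Sampling without replacement (subsampling) would only require replacing the index draw with `rng.sample(range(n), m)` for some subsample size `m` smaller than `n`.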