**19/1, Anna Dreber Almenberg, Stockholm School of Economics: (Predicting) replication outcome**

Abstract: Why are there so many false positive results in the published scientific literature? And what is the actual share of results that do not replicate in different literatures in the experimental social sciences? I will discuss several large replication projects on direct and conceptual replications, as well as our studies on "wisdom-of-crowds" mechanisms such as prediction markets and forecasting surveys, in which researchers attempt to predict replication outcomes and new outcomes.

**2/2, Claudia Redenbach, Technische Universität Kaiserslautern: Using stochastic models for segmentation and characterization of spatial microstructures**

Abstract: The performance of engineering materials such as foams, fibre composites or concrete is heavily influenced by the microstructure geometry. Quantitative analysis of 3D images, provided for instance by micro computed tomography (µCT), allows for a characterization of material samples. In this talk, we will illustrate how models from stochastic geometry may support the segmentation of image data and the statistical analysis of the microstructures. Our first example deals with the estimation of the fibre length distribution from µCT images of glass fibre reinforced composites. As examples of segmentation tasks we present the reconstruction of the solid component of a porous medium from focused ion beam scanning electron microscopy (FIB-SEM) image data and the segmentation of cracks in µCT images of concrete.

**16/2, Fredrik Johansson, Chalmers: Making the most of observational data in causal estimation with machine learning**

Abstract: Decision making is central to all aspects of society, private and public. Consequently, using data and statistics to improve decision-making has a rich history, perhaps best exemplified by the randomized experiment. In practice, however, experiments carry significant risk. For example, making an online recommendation system worse could result in millions in lost profits; selecting an inappropriate treatment for a patient could have devastating consequences. Luckily, organizations like hospitals and companies that serve recommendations routinely collect vast troves of observational data on decisions and outcomes. In this talk, I discuss how to make the best use of such data to improve policy, starting with an example of what can go wrong if we're not careful. Then, I present two pieces of research on how to avoid such perils if we are willing to say more about less.
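As a self-contained illustration of the kind of peril the abstract alludes to, here is a toy simulation (all numbers hypothetical, not from the talk) where a naive treated-vs-control comparison is badly biased by confounding, while inverse propensity weighting, one standard correction, recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(8)

# Confounded observational data: sicker patients (high z) are treated more
# often AND have worse outcomes. The true treatment effect is +1.
n = 20000
z = rng.normal(size=n)                       # confounder (e.g. severity)
p_treat = 1.0 / (1.0 + np.exp(-z))           # treatment propensity
a = rng.binomial(1, p_treat)                 # treatment received
y = 1.0 * a - 2.0 * z + rng.normal(size=n)   # observed outcome

# Naive treated-vs-control comparison: badly biased by confounding.
naive = y[a == 1].mean() - y[a == 0].mean()

# Inverse propensity weighting with the (here known) propensity score
# recovers the causal effect; in practice the score must be estimated.
ipw = np.mean(a * y / p_treat) - np.mean((1 - a) * y / (1 - p_treat))

print("naive estimate:", naive)
print("IPW estimate  :", ipw)
```

In this toy setting the naive estimate even has the wrong sign, which is exactly the failure mode that motivates careful use of observational data.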

**2/3, Andrea De Gaetano, IRIB CNR: Modelling haemorrhagic shock and statistical challenges for parameter estimation**

Abstract: In the ongoing development of ways to mitigate the consequences of penetrating trauma in humans, particularly in the area of civil defence and military operations, possible strategies aimed at identifying the victim's physiological state and its likely evolution depend on mechanistic, quantitative understanding of the compensation mechanisms at play. In this presentation, time-honored and recent mathematical models of the dynamical response to hemorrhage are briefly discussed and their applicability to real-life situations is examined. Conclusions are drawn as to the necessary formalization of this problem, which however poses methodological challenges for parameter estimation.

**16/3, Fredrik Lindsten, Linköping University: Monte Carlo for Approximate Bayesian Inference**

Abstract: Sequential Monte Carlo (SMC) is a powerful class of methods for approximate Bayesian inference. While originally used mainly for signal processing and inference in dynamical systems, these methods are in fact much more general and can be used to solve many challenging problems in Bayesian statistics and machine learning, even if they lack apparent sequential structure. In this talk I will first discuss the foundations of SMC from a machine learning perspective. We will see that there are two main design choices of SMC: the proposal distribution and the so-called intermediate target distributions, where the latter is often overlooked in practice. Focusing on graphical model inference, I will then show how deterministic approximations, such as variational inference and expectation propagation, can be used to approximate the optimal intermediate target distributions. The resulting algorithm can be viewed as a post-correction of the biases associated with these deterministic approximations. Numerical results show improvements over the baseline deterministic methods as well as over "plain" SMC.

The first part of the talk is an introduction to SMC inspired by our recent Foundations and Trends tutorial.
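To make the two design choices concrete, here is a minimal tempered SMC sampler for a toy Gaussian posterior; this is a generic textbook sketch, not the talk's method. The intermediate targets are the tempered posteriors pi_t ∝ prior × likelihood^beta_t, and the proposal is a random-walk Metropolis move:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inference problem: posterior of a Gaussian mean with a wide prior.
data = rng.normal(2.0, 1.0, size=50)

def log_prior(mu):
    return -0.5 * mu**2 / 100.0            # N(0, 10^2) prior

def log_lik(mu):
    return -0.5 * np.sum((data[None, :] - mu[:, None])**2, axis=1)

# Design choice 1 (intermediate targets): tempered posteriors
# pi_t ∝ prior * likelihood^beta_t on a linear schedule.
# Design choice 2 (proposal): a random-walk Metropolis move per step.
betas = np.linspace(0.0, 1.0, 20)
N = 1000
particles = rng.normal(0.0, 10.0, size=N)  # start from the prior
logw = np.zeros(N)

for b_prev, b in zip(betas[:-1], betas[1:]):
    # Reweight by the incremental likelihood factor.
    logw += (b - b_prev) * log_lik(particles)
    # Resample when the effective sample size degenerates.
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w**2) < N / 2:
        particles = particles[rng.choice(N, size=N, p=w)]
        logw = np.zeros(N)
    # Move each particle with one Metropolis step targeting pi_t.
    prop = particles + rng.normal(0.0, 0.5, size=N)
    log_acc = (log_prior(prop) + b * log_lik(prop)
               - log_prior(particles) - b * log_lik(particles))
    accept = np.log(rng.uniform(size=N)) < log_acc
    particles = np.where(accept, prop, particles)

w = np.exp(logw - logw.max()); w /= w.sum()
print("SMC posterior mean:", np.sum(w * particles))
```

The tempering schedule here is the "often overlooked" choice: the talk's point is that better intermediate targets (e.g. built from deterministic approximations) can replace this naive linear schedule.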

**30/3, Manuela Zucknick, University of Oslo: Bayesian modelling of treatment response in ex vivo drug screens for precision cancer medicine**

Abstract: Large-scale cancer pharmacogenomic screening experiments profile cancer cell lines or patient-derived cells versus hundreds of drug compounds. The aim of these in vitro studies is to use the genomic profiles of the cell lines together with information about the drugs to predict the response to a particular combination therapy, in particular to identify combinations of drugs that act synergistically. The field is developing rapidly, with sophisticated miniaturised high-throughput platforms enabling large-scale screens, but the development of statistical methods for the analysis of the resulting data is lagging behind. I will discuss typical challenges for estimation and prediction of response to combination therapies, from large technical variation and experimental biases to modelling challenges for prediction of drug response using genomic data. I will present two Bayesian models that we have recently developed to address diverse problems relating to the estimation and prediction tasks, and show how they can improve the identification of promising drug combinations over standard non-statistical approaches.
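As background for what "acting synergistically" means operationally, Bliss independence is one widely used non-statistical baseline in combination screens (whether it is the specific baseline of the talk is not stated; the numbers below are purely illustrative, and the talk's Bayesian models go well beyond this):

```python
import numpy as np

# Bliss independence: if drug A alone kills fraction f_a and drug B alone
# kills f_b, independent action predicts a combined kill f_a + f_b - f_a*f_b.
# Observed kill above this prediction suggests synergy.
def bliss_excess(f_a, f_b, f_ab):
    expected = f_a + f_b - f_a * f_b
    return f_ab - expected          # > 0 suggests synergy

# Toy 3x3 dose-response surface of killed fractions (rows: doses of A,
# columns: doses of B); values are illustrative, not real screen data.
f_a = np.array([0.1, 0.3, 0.6])
f_b = np.array([0.2, 0.4, 0.7])
f_ab = np.array([[0.35, 0.55, 0.80],
                 [0.55, 0.75, 0.90],
                 [0.75, 0.88, 0.97]])

excess = bliss_excess(f_a[:, None], f_b[None, :], f_ab)
print("Bliss excess matrix:\n", np.round(excess, 3))
```

The statistical challenge the talk addresses is that such raw excess scores are computed from noisy, bias-prone viability measurements, which is what motivates a model-based treatment.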

**6/4, Prashant Singh, Uppsala University: Likelihood-free parameter inference of stochastic time series models: exploring neural networks to enhance scalability, efficiency and performance**

Abstract: Parameter inference of stochastic time series models, such as gene regulatory networks, in the likelihood-free setting is a challenging task, particularly when the number of parameters to be inferred is large. Recently, data-driven machine learning models (neural networks in particular) have delivered encouraging results towards addressing the scalability, efficiency and parameter inference quality of the likelihood-free parameter inference pipeline. In particular, this talk will present a detailed discussion on neural networks as trainable, expressive and scalable summary statistics of high-dimensional time series for parameter inference tasks.

Preprint reference: Åkesson, M., Singh, P., Wrede, F., & Hellander, A. (2020). Convolutional neural networks as summary statistics for approximate Bayesian computation. arXiv preprint arXiv:2001.11760
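The pipeline in the referenced preprint, train a network on simulated data to act as a summary statistic and then run likelihood-free inference with it, can be sketched in miniature. In the toy below a linear least-squares fit on a hand-crafted feature stands in for the CNN, and plain rejection ABC for the inference scheme, so this is an assumption-laden sketch of the idea, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stochastic time-series simulator: an AR(1) with unknown coefficient theta.
def simulate(theta, T=100):
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = theta * x[t - 1] + rng.normal()
    return x

# Step 1: learn a summary statistic s(x) ~ E[theta | x] by regressing
# parameters on simulated series. The paper trains a CNN on the raw
# series; a least-squares fit on a lag-1 feature stands in here.
thetas = rng.uniform(-0.9, 0.9, size=2000)
X = np.stack([simulate(th) for th in thetas])
f_train = np.sum(X[:, 1:] * X[:, :-1], axis=1) / np.sum(X[:, :-1]**2, axis=1)
A = np.column_stack([f_train, np.ones_like(f_train)])
coef, *_ = np.linalg.lstsq(A, thetas, rcond=None)

def summary(x):
    f = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1]**2)
    return coef[0] * f + coef[1]

# Step 2: plain rejection ABC in the learned summary space.
theta_true = 0.6
x_obs = simulate(theta_true)
s_obs = summary(x_obs)
draws = rng.uniform(-0.9, 0.9, size=5000)
dists = np.array([abs(summary(simulate(th)) - s_obs) for th in draws])
posterior = draws[dists < np.quantile(dists, 0.02)]
print("ABC posterior mean:", posterior.mean())
```

The point of the learned summary is that it replaces hand-picked statistics with a trainable, scalable mapping, which is what makes the approach attractive for high-dimensional time series.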

**11/5, Ilaria Prosdocimi, University of Venice: Statistical models for the detection of changes in peak river flow in the UK**

Abstract: Several parts of the United Kingdom have experienced highly damaging flooding events in recent decades, raising doubts about whether the methods used to assess flood risk, and therefore to design flood defences, are "fit for purpose". It has also been hypothesized that the high number of recent extreme events might be one of the impacts of (anthropogenic) changes in the climate. Indeed, with increasing evidence of a changing climate, there is much interest in investigating the potential impacts of these changes on the risks linked to natural hazards such as intense rainfall, extreme waves and flooding. This has resulted in several studies investigating changes in natural hazard extremes, including peak river flow extremes in the UK. This talk will review a selection of these studies, discussing some of the pitfalls of the statistical models typically employed to assess whether any change can be detected in peak river flow extremes. Solutions to these pitfalls are outlined and discussed. In particular, the talk discusses how the functional forms assumed to describe change in extremes affect our ability to describe changes in the risk profiles of natural hazards.
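A minimal example of the kind of change-detection model at issue, hedged heavily: synthetic annual maxima with a linear trend in the location parameter of a Gumbel distribution (a GEV with shape fixed at zero for simplicity), tested against a stationary null with a likelihood ratio. The data and the chosen functional form are illustrative, not from the talk:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)

# Synthetic annual maximum flows with a linear trend in location.
years = np.arange(60)
x = rng.gumbel(loc=100.0 + 1.0 * years, scale=20.0)

# Gumbel negative log-likelihood with location mu(t) = a + b * t.
# Testing b = 0 against b != 0 is the simplest functional form for
# change; trends in the scale or shape of a full GEV follow the same
# pattern, and the choice among such forms is exactly the kind of
# assumption the talk scrutinises.
def nll(par, trend=True):
    a, b, log_s = par if trend else (par[0], 0.0, par[1])
    s = np.exp(log_s)
    zz = (x - (a + b * years)) / s
    return np.sum(np.log(s) + zz + np.exp(-zz))

fit1 = minimize(nll, x0=[100.0, 0.0, 3.0], args=(True,), method="Nelder-Mead")
fit0 = minimize(lambda p: nll(p, trend=False), x0=[100.0, 3.0],
                method="Nelder-Mead")
lr = 2.0 * (fit0.fun - fit1.fun)   # ~ chi2(1) under the no-trend null

print("trend estimate: %.2f per year, LR statistic: %.1f" % (fit1.x[1], lr))
```

With real records the picture is far murkier: short series, serial dependence and spatial pooling all complicate such tests, which is part of the talk's message.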

**25/5, Matteo Fasiolo, University of Bristol: Generalized additive models for ensemble electricity demand forecasting**

Abstract: Future grid management systems will coordinate distributed production and storage resources to manage, in a cost-effective fashion, the increased load and variability brought by the electrification of transportation and by a higher share of weather-dependent production.

Electricity demand forecasts at a low level of aggregation will be key inputs for such systems. In this talk, I'll focus on forecasting demand at the individual household level, which is more challenging than forecasting aggregate demand, due to the lower signal-to-noise ratio and to the heterogeneity of consumption patterns across households.

I'll describe a new ensemble method for probabilistic forecasting, which borrows strength across the households while accommodating their individual idiosyncrasies.

The first step consists of designing a set of models or 'experts' which capture different demand dynamics and fitting each of them to the data from each household.

Then the idea is to construct an aggregation of experts where the ensemble weights are estimated on the whole data set, the main innovation being that we let the weights vary with the covariates by adopting an additive model structure. In particular, the proposed aggregation method is an extension of regression stacking (Breiman, 1996) where the mixture weights are modelled using linear combinations of parametric, smooth or random effects.

The methods for building and fitting additive stacking models are implemented by the gamFactory R package, available at https://github.com/mfasiolo/gamFactory
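The covariate-dependent weights can be sketched in miniature: a two-expert toy where a single sigmoid weight in a covariate z is fitted by maximum likelihood. This illustrates only the core idea (classical stacking would force the weight to be constant); it is not the gamFactory implementation, which handles general additive weight models:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Two 'experts' give Gaussian predictive densities for demand y; which
# expert is better depends on a covariate z (say, temperature).
n = 500
z = rng.uniform(-1.0, 1.0, size=n)
y = np.where(z > 0, rng.normal(1.0, 0.5, n), rng.normal(-1.0, 0.5, n))

def dens(y, mu, sd):
    return np.exp(-0.5 * ((y - mu) / sd)**2) / (sd * np.sqrt(2 * np.pi))

p1 = dens(y, 1.0, 0.5)     # expert 1's predictive density
p2 = dens(y, -1.0, 0.5)    # expert 2's predictive density

# Stacking with a covariate-dependent weight w1(z) = sigmoid(a + b * z).
# Classical stacking (Breiman, 1996) would force b = 0; letting the
# weight vary with z is the additive-stacking idea in miniature.
def nll(par):
    a, b = par
    eta = np.clip(a + b * z, -30.0, 30.0)   # clip for numerical stability
    w1 = 1.0 / (1.0 + np.exp(-eta))
    return -np.sum(np.log(w1 * p1 + (1.0 - w1) * p2 + 1e-300))

res = minimize(nll, x0=[0.0, 0.0])
print("fitted weight model: sigmoid(%.2f + %.2f z)" % (res.x[0], res.x[1]))
```

The fitted slope on z should be strongly positive, i.e. the ensemble learns to trust expert 1 when z is high, which is the behaviour a constant-weight stack cannot express.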

**8/6, Seyed Morteza Najibi, Lund University: Functional Singular Spectrum Analysis with application to remote sensing data**

Abstract: A popular approach to analysing time series is decomposition: the observed series is partitioned into an informative trend plus potential seasonal (cyclical) and noise (irregular) components. Aligned with this principle, Singular Spectrum Analysis (SSA) is a model-free procedure that is commonly used as a nonparametric technique for analysing time series. SSA does not require restrictive assumptions such as stationarity, linearity, or normality. It can be used for a wide range of purposes such as trend and periodic component detection and extraction, smoothing, forecasting, change-point detection, gap filling, causality analysis, and so on.

In this talk, I will briefly overview SSA methodology and introduce a new extension called functional SSA to analyze functional time series. This is developed by integrating ideas from functional data analysis and univariate SSA. I will demonstrate this approach for tracking changes in vegetation over time by analysing the kernel density functions of Normalized Difference Vegetation Index (NDVI) images. At the end of the talk, I will also illustrate a simulated example in the interactive Shiny web application implemented in the Rfssa package.
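The four basic steps of univariate SSA (embedding, singular value decomposition, grouping, diagonal averaging) can be sketched in plain NumPy; this is a generic textbook illustration, not the functional extension of the talk or the Rfssa implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy series: linear trend + seasonal cycle + noise, the components
# SSA aims to separate.
t = np.arange(200)
x = 0.02 * t + np.sin(2 * np.pi * t / 12) + 0.3 * rng.normal(size=200)

# 1) Embedding: build the L x K trajectory (Hankel) matrix.
L = 48
K = len(x) - L + 1
X = np.column_stack([x[k:k + L] for k in range(K)])

# 2) Decomposition: singular value decomposition of the trajectory matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# 3) Grouping and 4) diagonal averaging: reconstruct a component from a
# chosen set of singular triples by averaging over the anti-diagonals.
def reconstruct(idx):
    Xi = (U[:, idx] * s[idx]) @ Vt[idx, :]
    out = np.zeros(len(x))
    cnt = np.zeros(len(x))
    for k in range(K):
        out[k:k + L] += Xi[:, k]
        cnt[k:k + L] += 1
    return out / cnt

# Grouping is normally guided by inspecting the singular spectrum; here
# the leading triples should capture the trend and the seasonal pair.
smooth = reconstruct([0, 1, 2, 3])
print("residual std:", np.std(x - smooth))
```

The functional SSA of the talk applies this machinery to series of curves (e.g. daily NDVI density functions) rather than scalars.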

**25/8, Jonas Wallin, Lund University: Locally scale invariant proper scoring rules**

Abstract: Averages of proper scoring rules are often used to rank probabilistic forecasts. In many cases, the variance of the individual observations and their predictive distributions varies within these averages. We show that some of the most popular proper scoring rules, such as the continuous ranked probability score (CRPS), the go-to score for ensemble forecasts of continuous observations, up-weight observations with large uncertainty, which can lead to unintuitive rankings.

To formalise this issue, we define the concept of local scale invariance for scoring rules. A new class of generalized proper kernel scoring rules is derived, and as a member of this class, we propose the scaled CRPS (SCRPS). This new proper scoring rule is locally scale invariant and therefore works in the case of varying uncertainty. Like the CRPS, it can be computed for output from ensemble forecasts and does not require the ability to evaluate the density of the forecast. The theoretical findings are illustrated in a few different applications, where we in particular focus on models in spatial statistics.
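The up-weighting effect is easy to demonstrate numerically with the standard ensemble estimator of the CRPS. In the sketch below (illustrative numbers, not from the talk) both forecasters are ideal, yet the high-variance observations dominate any average of scores:

```python
import numpy as np

rng = np.random.default_rng(4)

# Ensemble estimator of the CRPS for members x_1..x_m and observation y:
# CRPS = mean_i |x_i - y| - 0.5 * mean_{i,j} |x_i - x_j|.
def crps_ensemble(ens, y):
    return (np.mean(np.abs(ens - y)) -
            0.5 * np.mean(np.abs(ens[:, None] - ens[None, :])))

# Two perfectly calibrated forecasters: one predicts a low-variance
# quantity, the other a high-variance quantity.
m, reps = 200, 100
crps_lo = [crps_ensemble(rng.normal(0, 0.1, m), rng.normal(0, 0.1))
           for _ in range(reps)]
crps_hi = [crps_ensemble(rng.normal(0, 10.0, m), rng.normal(0, 10.0))
           for _ in range(reps)]

# Both forecasts are ideal, yet the high-variance observations dominate
# any average: the CRPS scales linearly with the scale of the problem.
print("mean CRPS, low variance :", np.mean(crps_lo))
print("mean CRPS, high variance:", np.mean(crps_hi))
```

A forecaster who is slightly better on the high-variance cases will therefore win the average regardless of performance elsewhere, which is the unintuitive ranking behaviour the SCRPS is designed to avoid.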

**14/9, Moritz Schauer, Chalmers/GU: The sticky Zig-Zag sampler: an event chain Monte Carlo (PDMP-) sampler for Bayesian variable selection**

Abstract: During the talk, I will present the sticky event chain Monte Carlo (piecewise deterministic Monte Carlo) samplers [1]. This is a new class of efficient Monte Carlo methods based on continuous-time piecewise deterministic Markov processes (PDMPs), suitable for inference in high-dimensional sparse models, i.e. models for which there is prior knowledge that many coordinates are likely to be exactly 0. This is achieved with the fairly simple idea of endowing existing PDMP samplers with sticky coordinate axes, coordinate planes, etc. Upon hitting one of these subspaces, an event is triggered during which the process sticks to the subspace, thereby spending some time in a sub-model. This introduces non-reversible jumps between different (sub-)models. During the talk, I will touch upon computational aspects of the algorithm and illustrate the method on a number of statistical models where both the sample size N and the dimensionality d of the parameter space are large.

[1] J. Bierkens, S. Grazzi, F. van der Meulen, and M. Schauer. Sticky PDMP samplers for sparse and local inference problems. arXiv: 2103.08478, 2021.

Joris Bierkens, Delft University of Technology, joris.bierkens@tudelft.nl

Sebastiano Grazzi, Delft University of Technology, s.grazzi@tudelft.nl

Frank van der Meulen, Delft University of Technology, f.h.vandermeulen@tudelft.nl

Moritz Schauer, Chalmers University of Technology, University of Gothenburg, smoritz@chalmers.se
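For intuition, here is a plain (non-sticky) one-dimensional Zig-Zag sampler for a standard normal target; the sticky variant of the talk would additionally freeze the coordinate for a random time whenever the trajectory hits zero. This is a generic sketch, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(5)

# Plain 1D Zig-Zag sampler for a standard normal target, U(x) = x^2 / 2.
# The particle moves with velocity theta = +/-1 and flips the velocity at
# event times of a Poisson process with rate max(0, theta * x). The sticky
# version would additionally freeze the coordinate for an exponential time
# whenever the trajectory hits 0, spending time in the sub-model x = 0.
x, theta, t = 1.0, 1.0, 0.0
T_end = 5000.0
first_moment = 0.0
second_moment = 0.0

while t < T_end:
    # Along the ray the rate is max(0, theta*x + s), so the integrated
    # rate can be inverted exactly to draw the next event time.
    a = theta * x
    tau = -a + np.sqrt(max(a, 0.0)**2 + 2.0 * rng.exponential())
    tau = min(tau, T_end - t)
    # Accumulate exact time integrals of x and x^2 along the segment.
    first_moment += x * tau + theta * tau**2 / 2.0
    second_moment += x**2 * tau + x * theta * tau**2 + tau**3 / 3.0
    x += theta * tau
    t += tau
    theta = -theta                      # velocity flip at the event

print("E[x]   ~", first_moment / T_end)
print("E[x^2] ~", second_moment / T_end)
```

Time averages along the piecewise linear trajectory replace the usual sample averages; here they should recover the moments of N(0, 1).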

**21/9, Johan Larsson, Lund University: The Hessian Screening Rule and Adaptive Paths for the Lasso**

Abstract: Predictor screening rules, which discard predictors from the design matrix before fitting the model, have had sizable impacts on the speed at which sparse regression models, such as the lasso, can be solved in the high-dimensional regime. Current state-of-the-art methods, however, face difficulties when dealing with highly correlated predictors, often becoming too conservative.

In this talk, we introduce a new screening rule that deals with this issue: the Hessian Screening Rule, which offers considerable improvements in computational performance when fitting the lasso. These benefits result both from the screening rule itself and from much-improved warm starts.

The Hessian Screening Rule also presents a welcome improvement to the construction of the lasso path: the set of lasso models produced by varying the strength of the penalization. The default approach, to construct a log-spaced penalty grid a priori, often fails to approximate the true (exact) lasso path. Leaning on the information already used when computing the Hessian Screening Rule, however, we can improve upon the construction of this grid by adaptively picking penalty parameters along the path.
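The default path strategy that the talk improves on can be sketched with plain coordinate descent: solve the lasso along a log-spaced penalty grid, warm-starting each solve at the previous solution. This is a generic illustration of the baseline, not the Hessian Screening Rule itself:

```python
import numpy as np

rng = np.random.default_rng(6)

# Sparse regression problem with three true signals.
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.normal(size=n)

def lasso_cd(X, y, lam, beta, n_sweeps=50):
    """Cyclic coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1."""
    col_sq = np.sum(X**2, axis=0)
    r = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            r += X[:, j] * beta[j]              # add back j's contribution
            rho = X[:, j] @ r                   # partial correlation
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * beta[j]
    return beta

# Log-spaced penalty grid from lam_max (all-zero solution) downwards;
# each solution warm-starts the next solve. Screening rules would, in
# addition, discard predictors before each solve; the adaptive path of
# the talk would also choose the grid points on the fly.
lam_max = np.max(np.abs(X.T @ y))
beta = np.zeros(p)
for lam in np.geomspace(lam_max, 0.01 * lam_max, 20):
    beta = lasso_cd(X, y, lam, beta)

print("nonzeros at smallest penalty:", int(np.sum(np.abs(beta) > 1e-6)))
```

At the smallest penalty the three true coefficients are recovered closely; the cost of sweeping over all p coordinates at every grid point is exactly what screening rules attack.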