Raphaël Huser, A. C. Davison
Max-stable processes are the natural analogues of the generalized extreme-value distribution for the modelling of extreme events in space and time. Under suitable conditions, these processes are asymptotically justified models for maxima of independent replications of random fields, and they are also suitable for the modelling of joint individual extreme measurements over high thresholds. This paper extends a model of Schlather (2001) to the space-time framework, and shows how a pairwise censored likelihood can be used for consistent estimation under mild mixing conditions. Estimator efficiency is also assessed and the choice of pairs to be included in the pairwise likelihood is discussed based on computations for simple time series models. The ideas are illustrated by an application to hourly precipitation data over Switzerland.
Gregory P. Bopp, Benjamin A. Shaby, Raphaël Huser
Understanding the spatial extent of extreme precipitation is necessary for determining flood risk and adequately designing infrastructure (e.g., stormwater pipes) to withstand such hazards. While environmental phenomena typically exhibit weakening spatial dependence at increasingly extreme levels, limiting max-stable process models for block maxima have a rigid dependence structure that does not capture this type of behavior. We propose a flexible Bayesian model from a broader family of (conditionally) max-infinitely divisible processes that allows for weakening spatial dependence at increasingly extreme levels, and due to a hierarchical representation of the likelihood in terms of random effects, our inference approach scales to large datasets. Therefore, our model not only has a flexible dependence structure, but it also allows for fast, fully Bayesian inference, prediction and conditional simulation in high dimensions. The proposed model is constructed using flexible random basis functions that are estimated from the data, allowing for straightforward inspection of the predominant spatial patterns of extremes. In addition, the described process possesses (conditional) max-stability as a special case, making inference on the tail dependence class possible. We apply our model to extreme precipitation in North-Eastern America, and show that the proposed model adequately captures the extremal behavior of the data. Interestingly, we find that the principal modes of spatial variation estimated from our model resemble observed patterns in extreme precipitation events occurring along the coast (e.g., with localized tropical cyclones and convective storms) and mountain range borders. Our model, which can easily be adapted to other types of environmental datasets, is therefore useful to identify extreme weather patterns and regions at risk.
Rishikesh Yadav, Raphaël Huser, Thomas Opitz, Luigi Lombardo
To accurately quantify landslide hazard in a region of Turkey, we develop new marked point process models within a Bayesian hierarchical framework for the joint prediction of landslide counts and sizes. To accommodate the dominant role of the few largest landslides in aggregated sizes, we leverage mark distributions with strong justification from extreme-value theory, thus bridging the two broad areas of statistics of extremes and marked point patterns. At the data level, we assume a Poisson distribution for landslide counts, while we compare different "sub-asymptotic" distributions for landslide sizes to flexibly model their upper and lower tails. At the latent level, Poisson intensities and the median of the size distribution vary spatially in terms of fixed and random effects, with shared spatial components capturing cross-correlation between landslide counts and sizes. We robustly model spatial dependence using intrinsic conditional autoregressive priors. Our novel models are fitted efficiently using a customized adaptive Markov chain Monte Carlo algorithm. We show that, for our dataset, sub-asymptotic mark distributions provide improved predictions of large landslide sizes compared to more traditional choices. To showcase the benefits of joint occurrence-size models and illustrate their usefulness for risk assessment, we map landslide hazard along major roads.
Birgir Hrafnkelsson, Stefan Siegert, Raphaël Huser, Haakon Bakka, Árni V. Jóhannesson
With modern high-dimensional data, complex statistical models are necessary, requiring computationally feasible inference schemes. We introduce Max-and-Smooth, an approximate Bayesian inference scheme for a flexible class of latent Gaussian models (LGMs) where one or more of the likelihood parameters are modeled by latent additive Gaussian processes. Max-and-Smooth consists of two steps. In the first step (Max), the likelihood function is approximated by a Gaussian density with mean and covariance equal to either (a) the maximum likelihood estimate and the inverse observed information, respectively, or (b) the mean and covariance of the normalized likelihood function. In the second step (Smooth), the latent parameters and hyperparameters are inferred and smoothed with the approximated likelihood function. The proposed method ensures that the uncertainty from the first step is correctly propagated to the second step. Since the approximated likelihood function is Gaussian, the approximate posterior density of the latent parameters of the LGM (conditional on the hyperparameters) is also Gaussian, thus facilitating efficient posterior inference in high dimensions. Furthermore, the approximate marginal posterior distribution of the hyperparameters is tractable, and as a result, the hyperparameters can be sampled independently of the latent parameters. In the case of a large number of independent data replicates, sparse precision matrices, and high-dimensional latent vectors, the speedup is substantial in comparison to an MCMC scheme that infers the posterior density from the exact likelihood function. The proposed inference scheme is demonstrated on one spatially referenced real dataset and on simulated data mimicking spatial, temporal, and spatio-temporal inference problems. Our results show that Max-and-Smooth is accurate and fast.
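The two-step idea can be illustrated with a deliberately small scalar sketch. It is not the authors' implementation: it assumes a single latent parameter, a fixed Gaussian prior with no hyperparameters, and a toy Gaussian-mean likelihood with known unit variance. The function name `gaussian_smooth` and the data values are hypothetical.

```python
def gaussian_smooth(mle, mle_var, prior_mean, prior_var):
    """Combine a Gaussian-approximated likelihood N(mle, mle_var)
    with a Gaussian prior N(prior_mean, prior_var).

    Returns the mean and variance of the resulting Gaussian posterior.
    """
    post_prec = 1.0 / mle_var + 1.0 / prior_var   # precisions add
    post_var = 1.0 / post_prec
    post_mean = post_var * (mle / mle_var + prior_mean / prior_var)
    return post_mean, post_var


# "Max" step (toy assumption): Gaussian observations with known unit variance,
# so the MLE of the mean is the sample average and the inverse observed
# information is 1 / n.
data = [1.2, 0.8, 1.0, 1.4, 0.6]   # hypothetical observations
n = len(data)
mle = sum(data) / n
mle_var = 1.0 / n

# "Smooth" step (toy assumption): a fixed N(0, 1) prior on the latent parameter.
post_mean, post_var = gaussian_smooth(mle, mle_var, prior_mean=0.0, prior_var=1.0)
print(post_mean, post_var)
```

Because both ingredients are Gaussian, the posterior is available in closed form. In this scalar toy case that is the same property the abstract relies on for the latent parameters conditional on the hyperparameters.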
Zhongwei Zhang, David Bolin, Sebastian Engelke, Raphaël Huser
Moving average processes driven by exponential-tailed Lévy noise are important extensions of their Gaussian counterparts in order to capture deviations from Gaussianity, more flexible dependence structures, and sample paths with jumps. Popular examples include non-Gaussian Ornstein--Uhlenbeck processes and type G Matérn stochastic partial differential equation random fields. This paper is concerned with the open problem of determining their extremal dependence structure. We leverage the fact that such processes admit approximations on grids or triangulations that are used in practice for efficient simulations and inference. These approximations can be expressed as special cases of a class of linear transformations of independent, exponential-tailed random variables that bridge asymptotic dependence and independence in a novel, tractable way. This result is of independent interest since models that can capture both extremal dependence regimes are scarce and the construction of such flexible models is an active area of research. This new fundamental result allows us to show that the integral approximation of general moving average processes with exponential-tailed Lévy noise is asymptotically independent when the mesh is fine enough. Under mild assumptions on the kernel function we also derive the limiting residual tail dependence function. For the popular exponential-tailed Ornstein--Uhlenbeck process we prove that it is asymptotically independent, but with a different residual tail dependence function than its Gaussian counterpart. Our results are illustrated through simulation studies.
Stefano Castruccio, Raphaël Huser, Marc Genton
In multivariate or spatial extremes, inference for max-stable processes observed at a large collection of locations is among the most challenging problems in computational statistics, and current approaches typically rely on less expensive composite likelihoods constructed from small subsets of data. In this work, we explore the limits of modern state-of-the-art computational facilities to perform full likelihood inference and to efficiently evaluate high-order composite likelihoods. With extensive simulations, we assess the loss of information of composite likelihood estimators with respect to a full likelihood approach for some widely-used multivariate or spatial extreme models, we discuss how to choose composite likelihood truncation to improve the efficiency, and we also provide recommendations for practitioners.
Raphaël Huser, Anthony C. Davison, Marc G. Genton
The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.
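For readers unfamiliar with the logistic model mentioned above, the following minimal sketch evaluates its bivariate distribution function, assuming unit Fréchet margins and a dependence parameter `alpha` in (0, 1]. The function names are illustrative only and are not taken from the paper.

```python
import math


def logistic_cdf(z1, z2, alpha):
    """Bivariate logistic extreme-value CDF with unit Frechet margins.

    G(z1, z2) = exp{-(z1^(-1/alpha) + z2^(-1/alpha))^alpha}, 0 < alpha <= 1.
    alpha = 1 gives independence; alpha -> 0 approaches complete dependence.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if z1 <= 0.0 or z2 <= 0.0:
        raise ValueError("arguments must be positive")
    exponent = (z1 ** (-1.0 / alpha) + z2 ** (-1.0 / alpha)) ** alpha
    return math.exp(-exponent)


def extremal_coefficient(alpha):
    """Extremal coefficient of the bivariate logistic model: 2^alpha."""
    return 2.0 ** alpha


# At alpha = 1 the joint CDF factorises into the two unit Frechet margins.
print(logistic_cdf(2.0, 3.0, 1.0), math.exp(-1.0 / 2.0) * math.exp(-1.0 / 3.0))
# The extremal coefficient ranges from 1 (complete dependence) to 2 (independence).
print(extremal_coefficient(0.5))
```

On the diagonal, `logistic_cdf(z, z, alpha)` equals `exp(-extremal_coefficient(alpha) / z)`, which is why the extremal coefficient is a convenient one-number summary of the dependence strength.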
Luigi Lombardo, Thomas Opitz, Raphael Huser
We develop a stochastic modeling approach based on spatial point processes of log-Gaussian Cox type for a collection of around 5000 landslide events provoked by a precipitation trigger in Sicily, Italy. Through the embedding into a hierarchical Bayesian estimation framework, we can use the Integrated Nested Laplace Approximation methodology to make inference and obtain the posterior estimates. Several mapping units are useful to partition a given study area in landslide prediction studies. These units hierarchically subdivide the geographic space from the highest grid-based resolution to the stronger morphodynamic-oriented slope units. Here we integrate both mapping units into a single hierarchical model, by treating the landslide triggering locations as a random point pattern. This approach diverges fundamentally from the unanimously used presence-absence structure for areal units since we focus on modeling the expected landslide count jointly within the two mapping units. Predicting this landslide intensity provides more detailed and complete information as compared to the classically used susceptibility mapping approach based on relative probabilities. To illustrate the model's versatility, we compute absolute probability maps of landslide occurrences and check its predictive power over space. While the landslide community typically produces spatial predictive models for landslides only in the sense that covariates are spatially distributed, no actual spatial dependence has been explicitly integrated so far for landslide susceptibility. Our novel approach features a spatial latent effect defined at the slope unit level, allowing us to assess the spatial influence that remains unexplained by the covariates in the model.
Daniela Cisneros, Jordan Richards, Ashok Dahal, Luigi Lombardo, Raphaël Huser
Recent wildfires in Australia have led to considerable economic loss and property destruction, and there is increasing concern that climate change may exacerbate their intensity, duration, and frequency. Hazard quantification for extreme wildfires is an important component of wildfire management, as it facilitates efficient resource distribution, adverse effect mitigation, and recovery efforts. However, although extreme wildfires are typically the most impactful, both small and moderate fires can still be devastating to local communities and ecosystems. Therefore, it is imperative to develop robust statistical methods to reliably model the full distribution of wildfire spread. We do so for a novel dataset of Australian wildfires from 1999 to 2019, and analyse monthly spread over areas approximately corresponding to Statistical Areas Level~1 and~2 (SA1/SA2) regions. Given the complex nature of wildfire ignition and spread, we exploit recent advances in statistical deep learning and extreme value theory to construct a parametric regression model using graph convolutional neural networks and the extended generalized Pareto distribution, which allows us to model wildfire spread observed on an irregular spatial domain. We highlight the efficacy of our newly proposed model and perform a wildfire hazard assessment for Australia and population-dense communities, namely Tasmania, Sydney, Melbourne, and Perth.
Andrew Zammit-Mangion, Matthew Sainsbury-Dale, Raphaël Huser
Simulation-based methods for statistical inference have evolved dramatically over the past 50 years, keeping pace with technological advancements. The field is undergoing a new revolution as it embraces the representational capacity of neural networks, optimization libraries and graphics processing units for learning complex mappings between data and inferential targets. The resulting tools are amortized, in the sense that, after an initial setup cost, they allow rapid inference through fast feed-forward operations. In this article we review recent progress in the context of point estimation, approximate Bayesian inference, summary-statistic construction, and likelihood approximation. We also cover software, and include a simple illustration to showcase the wide array of tools available for amortized inference and the benefits they offer over Markov chain Monte Carlo methods. The article concludes with an overview of relevant topics and an outlook on future research directions.
Jordan Richards, Raphaël Huser, Emanuele Bevacqua, Jakob Zscheischler
Extreme wildfires are a significant cause of human death and biodiversity destruction within countries that encompass the Mediterranean Basin. Recent worrying trends in wildfire activity (i.e., occurrence and spread) suggest that wildfires are likely to be highly impacted by climate change. In order to facilitate appropriate risk mitigation, we must identify the main drivers of extreme wildfires and assess their spatio-temporal trends, with a view to understanding the impacts of global warming on fire activity. We analyse the monthly burnt area due to wildfires over a region encompassing most of Europe and the Mediterranean Basin from 2001 to 2020, and identify high fire activity during this period in Algeria, Italy and Portugal. We build an extreme quantile regression model with a high-dimensional predictor set describing meteorological conditions, land cover usage, and orography. To model the complex relationships between the predictor variables and wildfires, we use a hybrid statistical deep-learning framework that can disentangle the effects of vapour-pressure deficit (VPD), air temperature, and drought on wildfire activity. Our results highlight that whilst VPD, air temperature, and drought significantly affect wildfire occurrence, only VPD affects wildfire spread. To gain insights into the effect of climate trends on wildfires in the near future, we focus on August 2001 and perturb temperature according to its observed trends (median over Europe: +0.04K per year). We find that, on average over Europe, these trends lead to a relative increase of 17.1\% and 1.6\% in the expected frequency and severity, respectively, of wildfires in August 2001, with spatially non-uniform changes in both aspects.
Lambert De Monte, Raphaël Huser, Ioannis Papastathopoulos, Jordan Richards
Leveraging the recently emerging geometric approach to multivariate extremes and the flexibility of normalising flows on the hypersphere, we propose a principled deep-learning-based methodology that enables accurate joint tail extrapolation in all directions. We exploit theoretical links between intrinsic model parameters defined as functions on hyperspheres to construct models ranging from high flexibility to parsimony, thereby enabling the efficient modelling of multivariate extremes displaying complex dependence structures in higher dimensions with reasonable sample sizes. We use the generative feature of normalising flows to perform fast probability estimation for arbitrary Borel risk regions via an efficient Monte Carlo integration scheme. The good properties of our estimators are demonstrated via a simulation study in up to ten dimensions. We apply our methodology to the analysis of low and high extremes of wind speeds. In particular, we find that our methodology enables probability estimation for non-trivial extreme events in relation to electricity production via wind turbines and reveals interesting structure in the underlying data.
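The Monte Carlo step described above can be sketched generically: draw from a generative model and count the draws that fall in the risk region. In the sketch below a pair of independent standard exponential variables stands in for the trained normalising flow, which is purely an assumption for illustration. The names `mc_probability`, `toy_sampler` and `risk_region` are hypothetical.

```python
import math
import random


def mc_probability(sampler, in_region, n, rng):
    """Estimate P(X in region) from n draws of a generative sampler.

    Returns the Monte Carlo estimate and its binomial standard error.
    """
    hits = sum(1 for _ in range(n) if in_region(sampler(rng)))
    p_hat = hits / n
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, std_err


def toy_sampler(rng):
    # Stand-in generative model (assumption): two independent Exp(1) variables.
    return (rng.expovariate(1.0), rng.expovariate(1.0))


def risk_region(x):
    # Example risk region: the sum of the two components exceeds 3.
    return x[0] + x[1] > 3.0


rng = random.Random(0)
p_hat, std_err = mc_probability(toy_sampler, risk_region, 100_000, rng)

# For this toy sampler the sum is Gamma(2, 1), so the exact answer is known.
exact = 4.0 * math.exp(-3.0)
print(p_hat, std_err, exact)
```

The region is passed as an arbitrary indicator function, so the same estimator applies to any region for which membership can be checked. For very rare events a plain frequency count like this needs many draws.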
Mara Sherlin D. Talento, Jordan Richards, Marco Pinto-Orellana, Raphael Huser, Hernando C. Ombao
Coherence analysis plays a vital role in the study of functional brain connectivity. However, coherence captures only linear spectral associations, and thus can produce misleading findings when ignoring variations of connectivity in the tails of the distribution. This limitation becomes important when investigating extreme neural events that are characterized by large signal amplitudes. The focus of this paper is to examine connectivity in the tails of the distribution, as this reveals salient information that may be overlooked by standard methods. We develop a novel notion of spectral tail association of periodograms to study connectivity in the network of electroencephalogram (EEG) signals of seizure-prone neonates. We further develop a novel non-stationary extremal dependence model for multivariate time series that captures differences in extremal dependence during different brain phases, namely burst-suppression and non-burst-suppression. One advantage of our proposed approach is its ability to identify tail connectivity at key frequency bands that could be associated with outbursts of energy which may lead to seizures. We discuss these novel scientific findings alongside a comparison of the extremal behavior of brain signals for epileptic and non-epileptic patients.
Julia Walchessen, Andrew Zammit-Mangion, Raphaël Huser, Mikael Kuusela
A key objective in spatial statistics is to simulate from the distribution of a spatial process at a selection of unobserved locations conditional on observations (i.e., a predictive distribution) to enable spatial prediction and uncertainty quantification. However, exact conditional simulation from this predictive distribution is intractable or inefficient for many spatial process models. In this paper, we propose neural conditional simulation (NCS), a general method for spatial conditional simulation that is based on neural diffusion models. Specifically, using spatial masks, we implement a conditional score-based diffusion model that evolves Gaussian noise into samples from a predictive distribution when given a partially observed spatial field and spatial process parameters as inputs. The diffusion model relies on a neural network that only requires unconditional samples from the spatial process for training. Once trained, the diffusion model is amortized with respect to the observations in the partially observed field, the number and locations of those observations, and the spatial process parameters, and can therefore be used to conditionally simulate from a broad class of predictive distributions without retraining the neural network. We assess the NCS-generated simulations against simulations from the true conditional distribution of a Gaussian process model, and against Markov chain Monte Carlo (MCMC) simulations from a Brown--Resnick process model for spatial extremes. In the latter case, we show that it is more efficient and accurate to conditionally simulate using NCS than classical MCMC techniques implemented in standard software. We conclude that NCS enables efficient and accurate conditional simulation from spatial predictive distributions that are challenging to sample from using traditional methods.
Qihao Duan, Alexandre B. Simas, David Bolin, Raphaël Huser
The Dynamic Nelson--Siegel (DNS) model is a widely used framework for term structure forecasting. We propose a novel extension that models DNS residuals as a Gaussian random field, capturing dependence across both time and maturity. The residual field is represented via a stochastic partial differential equation (SPDE), enabling flexible covariance structures and scalable Bayesian inference through sparse precision matrices. We consider a range of SPDE specifications, including stationary, non-stationary, anisotropic, and nonseparable models. The SPDE--DNS model is estimated in a Bayesian framework using the integrated nested Laplace approximation (INLA), jointly inferring latent DNS factors and the residual field. Empirical results show that the SPDE-based extensions improve both point and probabilistic forecasts relative to standard benchmarks. When applied in a mean--variance bond portfolio framework, the forecasts generate economically meaningful utility gains, measured as performance fees relative to a Bayesian DNS benchmark under monthly rebalancing. Importantly, incorporating the structured SPDE residual substantially reduces cross-maturity and intertemporal dependence in the remaining measurement error, bringing it closer to white noise. These findings highlight the advantages of combining DNS with SPDE-driven residual modeling for flexible, interpretable, and computationally efficient yield curve forecasting.
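As background for the DNS framework, here is a minimal sketch of the standard Nelson--Siegel factor loadings that map level, slope and curvature factors to a yield at maturity `tau`. It covers only the deterministic curve, not the SPDE residual field proposed in the paper. The factor values and the decay parameter `lam` used below are illustrative assumptions.

```python
import math


def nelson_siegel_yield(tau, level, slope, curvature, lam):
    """Nelson--Siegel yield at maturity tau > 0 with decay parameter lam > 0.

    y(tau) = level
             + slope * (1 - exp(-lam*tau)) / (lam*tau)
             + curvature * ((1 - exp(-lam*tau)) / (lam*tau) - exp(-lam*tau))
    """
    if tau <= 0.0 or lam <= 0.0:
        raise ValueError("tau and lam must be positive")
    x = lam * tau
    slope_loading = -math.expm1(-x) / x          # stable for small x
    curvature_loading = slope_loading - math.exp(-x)
    return level + slope * slope_loading + curvature * curvature_loading


# Illustrative factor values (assumptions), with maturities in months.
for tau in (3, 12, 60, 120):
    print(tau, nelson_siegel_yield(tau, level=4.0, slope=-1.5, curvature=0.8, lam=0.0609))
```

As maturity shrinks the yield approaches `level + slope`, and as maturity grows it approaches `level`, which is the usual interpretation of the three factors.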
Zipei Geng, Jordan Richards, Raphael Huser, Marc G. Genton
Wildfires pose an increasingly severe threat to air quality, yet quantifying their causal impact remains challenging due to unmeasured meteorological and geographic confounders. Moreover, wildfire impacts on air quality may exhibit heterogeneous effects across pollution levels, which conventional mean-based causal methods fail to capture. To address these challenges, we develop a Quantile-based Latent Spatial Confounder Model (QLSCM) that substitutes conditional expectations with conditional quantiles, enabling causal analysis across the entire outcome distribution. We establish the causal interpretation of QLSCM theoretically, prove the identifiability of causal effects, and demonstrate estimator consistency under mild conditions. Simulations confirm the bias correction capability and the advantage of quantile-based inference over mean-based approaches. Applying our method to contiguous US wildfire and air quality data, we uncover important heterogeneous effects: fire radiative power exerts significant positive causal effects on aerosol optical depth at high quantiles in Western states like California and Oregon, while insignificant at lower quantiles. This indicates that wildfire impacts on air quality primarily manifest during extreme pollution events. Regional analyses reveal that Western and Northwestern regions experience the strongest causal effects during such extremes. These findings provide critical insights for environmental policy by identifying where and when mitigation efforts would be most effective.
Luigi Lombardo, Thomas Opitz, Francesca Ardizzone, Fausto Guzzetti, Raphaël Huser
Landslides are nearly ubiquitous phenomena and pose severe threats to people, properties, and the environment. Investigators have long attempted to estimate landslide hazard to determine where, when, and how destructive landslides are expected to be in an area. This information is useful to design landslide mitigation strategies, and to reduce landslide risk and societal and economic losses. In the geomorphology literature, most attempts at predicting the occurrence of populations of landslides rely on the observation that landslides are the result of multiple interacting, conditioning and triggering factors. Here, we propose a novel Bayesian modelling framework for the prediction of space-time landslide occurrences of the slide type caused by weather triggers. We consider log-Gaussian Cox processes, assuming that individual landslides stem from a point process described by an unknown intensity function. We tested our prediction framework in the Collazzone area, Umbria, Central Italy, for which a detailed multi-temporal landslide inventory spanning 1941-2014 is available together with lithological and bedding data. We tested five models of increasing complexity. Our most complex model includes fixed effects and latent spatio-temporal effects, thus largely fulfilling the common definition of landslide hazard in the literature. We quantified the spatio-temporal predictive skill of our model and found that it performed better than simpler alternatives. We then developed a novel classification strategy and prepared an intensity-susceptibility landslide map, providing more information than traditional susceptibility zonations for land planning and management. We expect our novel approach to lead to better projections of future landslides, and to improve our collective understanding of the evolution of landscapes dominated by mass-wasting processes under geophysical and weather triggers.
Arnab Hazra, Raphaël Huser
In this work, we estimate extreme sea surface temperature (SST) hotspots, i.e., high threshold exceedance regions, for the Red Sea, a vital region of high biodiversity. We analyze high-resolution satellite-derived SST data comprising daily measurements at 16703 grid cells across the Red Sea over the period 1985-2015. We propose a semiparametric Bayesian spatial mixed-effects linear model with a flexible mean structure to capture spatially-varying trend and seasonality, while the residual spatial variability is modeled through a Dirichlet process mixture (DPM) of low-rank spatial Student-$t$ processes (LTPs). By specifying cluster-specific parameters for each LTP mixture component, the bulk of the SST residuals influence tail inference and hotspot estimation only moderately. Our proposed model has a nonstationary mean, covariance and tail dependence, and posterior inference can be drawn efficiently through Gibbs sampling. In our application, we show that the proposed method outperforms some natural parametric and semiparametric alternatives. Moreover, we show how hotspots can be identified and we estimate extreme SST hotspots for the whole Red Sea, projected until the year 2100, based on the Representative Concentration Pathways 4.5 and 8.5. The estimated 95\% credible region for joint high threshold exceedances includes large areas covering major endangered coral reefs in the southern Red Sea.
Silius M. Vandeskog, Sara Martino, Raphaël Huser
A successful model for high-dimensional spatial extremes should, in principle, be able to describe both weakening extremal dependence at increasing levels and changes in the type of extremal dependence class as a function of the distance between locations. Furthermore, the model should allow for computationally tractable inference using inference methods that efficiently extract information from data and that are robust to model misspecification. In this paper, we demonstrate how to fulfil all these requirements by developing a comprehensive methodological workflow for efficient Bayesian modelling of high-dimensional spatial extremes using the spatial conditional extremes model while performing fast inference with R-INLA. We then propose a post hoc adjustment method that results in more robust inference by properly accounting for possible model misspecification. The developed methodology is applied for modelling extreme hourly precipitation from high-resolution radar data in Norway. Inference is computationally efficient, and the resulting model fit successfully captures the main trends in the extremal dependence structure of the data. Robustifying the model fit by adjusting for possible misspecification further improves model performance.
Peng Zhong, Manuela Brunner, Thomas Opitz, Raphaël Huser
Extreme precipitation events with large spatial extents may have more severe impacts than localized events as they can lead to widespread flooding. It is debated how climate change may affect the spatial extent of precipitation extremes, whose investigation often directly relies on simulations from climate models. Here, we use a different strategy to investigate how future changes in spatial extents of precipitation extremes differ across climate zones and seasons in two river basins (Danube and Mississippi). We rely on observed precipitation extremes while exploiting a physics-based mean temperature covariate, which enables us to project future precipitation extents. We include the covariate into newly developed time-varying $r$-Pareto processes using a suitably chosen spatial aggregation functional $r$. This model captures temporal non-stationarity in the spatial dependence structure of precipitation extremes by linking it to the temperature covariate, which we derive from observations for model calibration and from debiased climate simulations (CMIP6) for projections. For both river basins, our results show negative correlation between the spatial extent and the temperature covariate for most of the rain season and an increasing trend in the margins, indicating a decrease in spatial precipitation extent in a warming climate during rain seasons as precipitation intensity increases locally.