Fabrizio Falasca
A central challenge in climate science and applied mathematics is developing data-driven models of multiscale systems that capture both stationary statistics and responses to external perturbations. Current neural climate emulators aim to resolve the atmosphere-ocean system in all its complexity but often struggle to reproduce forced responses, limiting their use in causal studies such as Green's function experiments. To explore the origin of these limitations, we first examine a simplified dynamical system that retains key features of climate variability. We interpret the results through linear response theory, providing a rigorous framework to evaluate neural models beyond stationary statistics and to probe causal mechanisms. We argue that the ability of emulators of multiscale systems to reproduce perturbed statistics depends critically on (i) the choice of an appropriate coarse-grained representation and (ii) careful parameterizations of unresolved processes. These insights highlight reduced-order models, tailored to specific goals, processes, and scales, as valuable alternatives to general-purpose emulators. We next consider a real-world application by developing a neural model to investigate the joint variability of the surface temperature field and radiative fluxes. The model infers a multiplicative noise process directly from data, largely reproduces the system's probability distribution, and enables causal studies through forced responses. We discuss its limitations and outline directions for future work. Overall, these results expose key challenges in data-driven modeling of multiscale physical systems and underscore the value of coarse-grained, stochastic approaches, with response theory providing a principled framework to guide model design and enhance causal understanding.
Fabrizio Falasca, Andrew Brettin, Laure Zanna, Stephen M. Griffies, Jianjun Yin, Ming Zhao
Studies agree on a significant global mean sea level rise in the 20th century and its recent 21st century acceleration in the satellite record. At regional scale, the evolution of sea level probability distributions is often assumed to be dominated by changes in the mean. However, a quantification of changes in distributional shapes in a changing climate is currently missing. To this end, we propose a novel framework quantifying significant changes in probability distributions from time series data. The framework first quantifies linear trends in quantiles through quantile regression. Quantile slopes are then projected onto a set of four orthogonal polynomials quantifying how such changes can be explained by independent shifts in the first four statistical moments. The framework proposed is theoretically founded, general and can be applied to any climate observable with close-to-linear changes in distributions. We focus on observations and a coupled climate model (GFDL-CM4). In the historical period, trends in coastal daily sea level have been driven mainly by changes in the mean and can therefore be explained by a shift of the distribution with no change in shape. In the modeled world, robust changes in higher order moments emerge with increasing CO2 concentration. Such changes are driven in part by ocean circulation alone and get amplified by sea level pressure fluctuations, with possible consequences for sea level extremes attribution studies.
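The first step described above, linear trends in quantiles, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `quantile_trend`, the iteratively reweighted least-squares solver, and the synthetic series are all assumptions made for illustration. The synthetic noise amplitude grows in time, so quantile slopes differ across quantiles; a pure shift in the mean would give equal slopes at every quantile.

```python
import numpy as np

def quantile_trend(t, y, q, n_iter=200, eps=1e-6):
    """Fit y ~ a + b*t for the q-th quantile; returns (slope b, intercept a).

    Hypothetical helper: minimizes the pinball loss by iteratively
    reweighted least squares (an MM scheme), not a library routine.
    """
    X = np.column_stack([np.ones_like(t), t])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    for _ in range(n_iter):
        r = y - X @ beta
        # Pinball loss rewritten as weighted squares: weight q above the
        # fitted line, (1 - q) below it, each divided by |residual|.
        w = np.where(r > 0, q, 1.0 - q) / np.maximum(np.abs(r), eps)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta[1], beta[0]

# Synthetic series (assumption): mean trend 0.5 per unit time, with a noise
# standard deviation that grows as (1 + 0.1 t).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 5000)
y = 0.5 * t + (1.0 + 0.1 * t) * rng.standard_normal(t.size)

slopes = {q: quantile_trend(t, y, q)[0] for q in (0.05, 0.5, 0.95)}
for q, b in slopes.items():
    print(f"quantile {q:.2f}: slope {b:.3f}")
```

By construction the true slope of the q-th quantile here is 0.5 + 0.1 z_q, with z_q the standard normal quantile, so upper quantiles rise faster than lower ones. The projection of such slopes onto orthogonal polynomials is specific to the paper and is not reproduced.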
Fabrizio Falasca, Laure Zanna
We present a framework for constructing physics and causally constrained neural models of turbulent dynamical systems from data. We first formulate a finite-time flow map with strict energy-preserving nonlinearities for stable modeling of temporally discrete trajectories. We then impose causal constraints to suppress spurious interactions across degrees of freedom. The resulting neural models accurately capture stationary statistics and responses to both small and large external forcings. We demonstrate the framework on the stochastic Charney-DeVore equations and on a symmetry-broken Lorenz-96 system. The framework is broadly applicable to reduced-order modeling of turbulent dynamical systems from observational data.
Fabrizio Falasca, Annalisa Bracco
The threat of global warming and the demand for reliable climate predictions pose a formidable challenge, as the climate system is multiscale, high-dimensional and nonlinear. Spatiotemporal recurrences of the system hint at the presence of a low-dimensional manifold containing the high-dimensional climate trajectory that could make the problem more tractable. Here we argue that reproducing the geometrical and topological properties of the low-dimensional attractor should be a key target for models used in climate projections. In doing so, we propose a general data-driven framework to characterize the climate attractor and showcase it in the tropical Pacific Ocean using a reanalysis as observational proxy and two state-of-the-art models. The analysis spans four variables simultaneously over the periods 1979-2019 and 2060-2100. At each time t, the system can be uniquely described by a state space vector parameterized by N variables and their spatial variability. The dynamics are confined to a manifold with dimension lower than the full state space that we characterize through manifold learning algorithms, both linear and nonlinear. The local geometry and local stability of the high-dimensional, multi-variable climate attractor are quantified through the local dimension and persistence metrics. Model biases that hamper climate predictability are identified and found to be similar in the multivariate attractor of the two models during the historical period while diverging under the warming scenario considered. Finally, the relationships between different sub-spaces (univariate fields), and therefore among climate variables, are evaluated. The proposed framework provides a comprehensive, physically based test for assessing climate feedbacks and opens new avenues for improving their model representation.
Fabrizio Falasca, Pavel Perezhogin, Laure Zanna
We propose a data-driven framework to simplify the description of spatiotemporal climate variability into a few entities and their causal linkages. Given a high-dimensional climate field, the methodology first reduces its dimensionality into a set of regionally constrained patterns. Time-dependent causal links are then inferred in the interventional sense through the fluctuation-response formalism, as shown in Baldovin et al. (2020). These two steps make it possible to explore how regional climate variability can influence remote locations. To distinguish between true and spurious responses, we propose a novel analytical null model for the fluctuation-dissipation relation, which allows for uncertainty estimation at a given confidence level. Finally, we select a set of metrics to summarize the results, offering a useful and simplified approach to explore climate dynamics. We showcase the methodology on the monthly sea surface temperature field at global scale. We demonstrate the usefulness of the proposed framework by studying a few individual links as well as "link maps", visualizing the cumulative degree of causation between a given region and the whole system. Finally, each pattern is ranked in terms of its "causal strength", quantifying its relative ability to influence the system's dynamics. We argue that the methodology allows one to explore and characterize causal relationships in high-dimensional spatiotemporal fields in a rigorous and interpretable way.
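To make the fluctuation-response step concrete, here is a minimal sketch under the quasi-Gaussian approximation, in which the response matrix at lag tau is estimated as R(tau) = C(tau) C(0)^{-1} from lagged covariances. The abstract does not state which estimator is used, so this choice, the function name `response_matrix`, and the two-variable test system are assumptions. The paper's null model and confidence bounds are not reproduced.

```python
import numpy as np

def response_matrix(x, lag):
    """Quasi-Gaussian response estimate R(lag) = C(lag) C(0)^{-1}.

    x has shape (time, variables). Entry R[i, j] estimates the response of
    variable i at time t + lag to a small impulse on variable j at time t.
    """
    x = x - x.mean(axis=0)
    n = x.shape[0] - lag
    c0 = x[:n].T @ x[:n] / n                  # C(0)
    c_lag = x[lag:lag + n].T @ x[:n] / n      # C_ij(lag) = <x_i(t+lag) x_j(t)>
    return c_lag @ np.linalg.inv(c0)

# Hypothetical linear test system: variable 0 drives variable 1, not vice versa.
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.0],
              [0.5, 0.5]])
n_steps = 50_000
x = np.zeros((n_steps, 2))
for k in range(1, n_steps):
    x[k] = A @ x[k - 1] + rng.standard_normal(2)

R1 = response_matrix(x, lag=1)
R3 = response_matrix(x, lag=3)
print(np.round(R1, 2))
```

For this linear system the exact response at lag tau is the matrix power of A, so R1 should approximate A and R3 should approximate A cubed. The two variables are correlated in both directions, yet the estimated response of variable 0 to variable 1 stays near zero, which is the distinction between correlation and interventional causation that the formalism is meant to capture.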
Fabrizio Falasca, Aurora Basinski-Ferris, Laure Zanna, Ming Zhao
Radiative forcing drives warming in the Earth system, leading to changes in sea surface temperatures (SSTs) and associated radiative feedbacks. The link between changes in the top-of-the-atmosphere (TOA) net radiative flux and SST patterns, known as the "pattern effect", is typically diagnosed by studying the response of atmosphere-only models to SST perturbations. In this work, we diagnose the pattern effect through response theory, by performing idealized warming perturbation experiments from unperturbed data alone. First, by studying the response at short time scales, where the response is dominated by atmospheric variability, we recover results that agree with the literature. Second, by extending the framework to longer time scales, we capture coupled interactions between the slow ocean component and the atmosphere, yielding a novel "sensitivity map" quantifying the response of the net radiative flux to SST perturbations in the coupled system. Here, feedbacks are captured by a spatiotemporal response operator, rather than time-independent maps as in traditional studies. Both formulations skillfully reconstruct changes in externally forced simulations and provide practical strategies for climate studies. The key distinction lies in their perspectives on climate feedbacks. The first formulation, closely aligned with prediction tasks, follows the traditional view in which slow variables, such as SSTs, exert a one-way influence on fast variables. The second formulation broadens this perspective by incorporating spatiotemporal interactions across state variables. This alternative approach explores how localized SST perturbations can alter the coupled dynamics, leading to temperature changes in remote areas and further impacting the radiative fluxes at later times.
Valentina Zantedeschi, Fabrizio Falasca, Alyson Douglas, Richard Strange, Matt J. Kusner, Duncan Watson-Parris
One of the greatest sources of uncertainty in future climate projections comes from limitations in modelling clouds and in understanding how different cloud types interact with the climate system. A key first step in reducing this uncertainty is to accurately classify cloud types at high spatial and temporal resolution. In this paper, we introduce Cumulo, a benchmark dataset for training and evaluating global cloud classification models. It consists of one year of 1km resolution MODIS hyperspectral imagery merged with pixel-width 'tracks' of CloudSat cloud labels. Bringing these complementary datasets together is a crucial first step, enabling the Machine-Learning community to develop innovative new techniques which could greatly benefit the Climate community. To showcase Cumulo, we provide baseline performance analysis using an invertible flow generative model (IResNet), which further allows us to discover new sub-classes for a given cloud class by exploring the latent space. To compare methods, we introduce a set of evaluation criteria to identify models that are not only accurate, but also physically realistic. Cumulo can be downloaded from https://www.dropbox.com/sh/i3s9q2v2jjyk2it/AACxXnXfMF5wuIqLXqH4NJOra?dl=0 .
Ludovico T. Giorgini, Fabrizio Falasca, Andre N. Souza
We present a novel and flexible data-driven framework for estimating the response of higher-order moments of nonlinear stochastic systems to small external perturbations. The classical Generalized Fluctuation-Dissipation Theorem (GFDT) links the unperturbed steady-state distribution to the system's linear response. While standard implementations relying on Gaussian approximations can predict the mean response, they often fail to capture changes in higher-order moments. To overcome this, we combine GFDT with score-based generative modeling to estimate the system's score function directly from data. We demonstrate the framework's versatility by employing two complementary score estimation techniques tailored to the system's characteristics: (i) a clustering-based algorithm (KGMM) for systems with low-dimensional effective dynamics, and (ii) a denoising score matching method implemented with a U-Net architecture for high-dimensional, spatially-extended systems where reduced-order modeling is not feasible. Our method is validated on several stochastic models relevant to climate dynamics: three reduced-order models of increasing complexity and a 2D Navier-Stokes model representing a turbulent flow with a localized perturbation. In all cases, the approach accurately captures strongly nonlinear and non-Gaussian features of the system's response, significantly outperforming traditional Gaussian approximations.
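For orientation, the standard textbook form of the GFDT for an impulse perturbation can be written as follows. The notation (observable A, steady-state density \rho_S, score s, covariance C) is chosen here for illustration and may differ from the paper's.

```latex
% Linear response of an observable A to a small impulse \delta x_j applied at t = 0
\delta \langle A(\mathbf{x}(t)) \rangle \simeq \sum_j R_{A,j}(t)\, \delta x_j(0),
\qquad
R_{A,j}(t) = -\left\langle A(\mathbf{x}(t))\, s_j(\mathbf{x}(0)) \right\rangle,
\qquad
s_j(\mathbf{x}) = \frac{\partial \ln \rho_S(\mathbf{x})}{\partial x_j}.

% Gaussian special case: \rho_S = \mathcal{N}(\boldsymbol{\mu}, \Sigma) gives
% s(\mathbf{x}) = -\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu}), so for A = x_i
R(t) = C(t)\, C(0)^{-1},
\qquad
C(t) = \left\langle (\mathbf{x}(t) - \boldsymbol{\mu})(\mathbf{x}(0) - \boldsymbol{\mu})^{\top} \right\rangle .
```

The average is taken over the unperturbed steady state. Taking A to be a higher power of a state variable gives the response of the corresponding moment, which is where a score estimated from data can differ from the Gaussian closure.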
Laure Zanna, William Gregory, Pavel Perezhogin, Aakash Sane, Cheng Zhang, Alistair Adcroft, Mitch Bushuk, Carlos Fernandez-Granda, Brandon Reichl, Dhruv Balwada, Julius Busecke, William Chapman, Alex Connolly, Danni Du, Kelsey Everard, Fabrizio Falasca, Renaud Falga, David Kamm, Etienne Meunier, Qi Liu, Antoine Nasser, Matthew Pudig, Andrew Shao, Julia L. Simpson, Linus Vogt, Jiarong Wu
Climate simulations, at all grid resolutions, rely on approximations that encapsulate the forcing due to unresolved processes on resolved variables, known as parameterizations. Parameterizations often lead to inaccuracies in climate models, with significant biases in the physics of key climate phenomena. Advances in artificial intelligence (AI) are now directly enabling the learning of unresolved processes from data to improve the physics of climate simulations. Here, we introduce a flexible framework for developing and implementing physics- and scale-aware machine learning parameterizations within climate models. We focus on the ocean and sea-ice components of a state-of-the-art climate model by implementing a spectrum of data-driven parameterizations, ranging from complex deep learning models to more interpretable equation-based models. Our results showcase the viability of AI-driven parameterizations in operational models, advancing the capabilities of a new generation of hybrid simulations, and include prototypes of fully coupled atmosphere-ocean-sea-ice hybrid simulations. The tools developed are open source, accessible, and available to all.
Leif Fredericks, Maria Rugenstein, David W. J. Thompson, Senne Van Loon, Fabrizio Falasca, Rory Basinski-Ferris, Paulo Ceppi, Quran Wu, Jonah Bloch-Johnson, Marc Alessi, Sarah M. Kang
Over the past decade, it has become clear that the radiative response to surface temperature change depends on the spatially varying structure in the temperature field, a phenomenon known as the "pattern effect". The pattern effect is commonly estimated from dedicated climate model simulations forced with local surface temperature patches (Green's function experiments). Green's function experiments capture causal influences from temperature perturbations, but are computationally expensive to run. Recently, however, several methods have been proposed that estimate the pattern effect through statistical means. These methods can accurately predict the radiative response to temperature variations in climate model simulations. The goal of this paper is to compare methods used to quantify the pattern effect. We apply each method to the same prediction task and discuss its advantages and disadvantages. Most methods indicate large negative feedbacks over the western Pacific. Over other regions, the methods frequently disagree on feedback sign and spatial homogeneity. While all methods yield similar predictions of the global radiative response to surface temperature variations driven by internal variability, they produce very different predictions from the patterns of surface temperature change in simulations forced with increasing CO2 concentrations. We discuss reasons for the discrepancies between methods and recommend paths towards using them in the future to enhance physical understanding of the pattern effect.