Laure Zanna, William Gregory, Pavel Perezhogin, Aakash Sane, Cheng Zhang, Alistair Adcroft, Mitch Bushuk, Carlos Fernandez-Granda, Brandon Reichl, Dhruv Balwada, Julius Busecke, William Chapman, Alex Connolly, Danni Du, Kelsey Everard, Fabrizio Falasca, Renaud Falga, David Kamm, Etienne Meunier, Qi Liu, Antoine Nasser, Matthew Pudig, Andrew Shao, Julia L. Simpson, Linus Vogt, Jiarong Wu
Climate simulations, at all grid resolutions, rely on approximations that encapsulate the forcing due to unresolved processes on resolved variables, known as parameterizations. Parameterizations often lead to inaccuracies in climate models, with significant biases in the physics of key climate phenomena. Advances in artificial intelligence (AI) are now directly enabling the learning of unresolved processes from data to improve the physics of climate simulations. Here, we introduce a flexible framework for developing and implementing physics- and scale-aware machine learning parameterizations within climate models. We focus on the ocean and sea-ice components of a state-of-the-art climate model by implementing a spectrum of data-driven parameterizations, ranging from complex deep learning models to more interpretable equation-based models. Our results showcase the viability of AI-driven parameterizations in operational models, advancing the capabilities of a new generation of hybrid simulations, and include prototypes of fully coupled atmosphere-ocean-sea-ice hybrid simulations. The tools developed are open source, accessible, and available to all.
William Gregory, Mitchell Bushuk, Yongfei Zhang, Alistair Adcroft, Laure Zanna
In this study we perform online sea ice bias correction within a GFDL global ice-ocean model. For this, we use a convolutional neural network (CNN) developed in a previous study (Gregory et al., 2023) to predict sea ice concentration (SIC) data assimilation (DA) increments. An initial implementation of the CNN shows systematic improvements in SIC biases relative to the free-running model; however, large summertime errors remain. We show that these residual errors can be significantly reduced with a data augmentation approach, in which sequential CNN and DA corrections are applied to a new simulation over the training period. This provides a new training data set with which to refine the weights of the initial network. We propose that this machine-learned correction scheme could be utilized for generating improved initial conditions, and also for real-time sea ice bias correction within seasonal-to-subseasonal sea ice forecasts.
Surya Dheeshjith, Adam Subel, Shubham Gupta, Alistair Adcroft, Carlos Fernandez-Granda, Julius Busecke, Laure Zanna
As machine learning (ML) finds ever-greater success in climate applications, emulators have begun to show promise not only for weather but also for multi-year time scales in the atmosphere. Similar work for the ocean remains nascent, with the state of the art limited to short time scales or regional domains. In this work, we demonstrate high-skill global emulation of surface ocean fields over 5-8 years of model rollout, accurately representing modes of variability for two different ML architectures (ConvNext and Transformers). In addition, we address the outstanding question of generalization, an essential consideration if the end use of emulation is to model warming scenarios outside of the model training data. We show that 1) generalization is not an intrinsic feature of a data-driven emulator, 2) fine-tuning the emulator on only small amounts of additional data from a distribution similar to the test set can enable the emulator to perform well in a warmed climate, and 3) the forced emulators are robust to noise in the forcing.
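Point (2) above can be illustrated with a toy sketch: a linear "emulator" is fitted on a base climate and then fine-tuned with a few gradient steps on a small sample from a shifted distribution. Everything below (the dynamics, the data sizes, the learning rate) is an invented stand-in, not the paper's ConvNext or Transformer setup.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4
W_true_base = np.eye(d) * 0.9          # "dynamics" of the base climate
W_true_warm = np.eye(d) * 0.7          # dynamics shift under warming

def make_data(W, n):
    x = rng.normal(size=(n, d))
    return x, x @ W.T

def mse(W, x, y):
    return np.mean((x @ W.T - y) ** 2)

# Train the toy emulator on the base climate
xb, yb = make_data(W_true_base, 2000)
W = np.linalg.lstsq(xb, yb, rcond=None)[0].T

# Only a small sample from the warmed climate is available for fine-tuning
xw, yw = make_data(W_true_warm, 50)

err_before = mse(W, *make_data(W_true_warm, 1000))

# A few gradient steps on the small fine-tuning set
lr = 0.05
for _ in range(200):
    grad = 2 * (xw @ W.T - yw).T @ xw / len(xw)
    W -= lr * grad

err_after = mse(W, *make_data(W_true_warm, 1000))
print(err_before, err_after)  # fine-tuning reduces the warmed-climate error
```

The design point is that the fine-tuning set (50 samples) is far smaller than the original training set, mirroring the abstract's claim that only small amounts of warmed-climate data are needed.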
William Gregory, Mitchell Bushuk, Yong-Fei Zhang, Alistair Adcroft, Laure Zanna, Colleen McHugh, Liwei Jia
We showcase a hybrid modeling framework which embeds machine learning (ML) inference into the GFDL SPEAR climate model, for online sea ice bias correction during a set of global fully-coupled 1-year retrospective forecasts. We compare two hybrid versions of SPEAR to understand the importance of exposing ML models to coupled ice-atmosphere-ocean feedbacks before implementation into fully-coupled simulations: Hybrid_CPL (with feedbacks) and Hybrid_IO (without feedbacks). Relative to SPEAR, Hybrid_CPL systematically reduces seasonal forecast errors in the Arctic and significantly reduces Antarctic errors for target months May-December, with >2x error reduction in 4-6-month lead forecasts of Antarctic winter sea ice extent. Meanwhile, Hybrid_IO suffers from out-of-sample behavior which can trigger a chain of Southern Ocean feedbacks, leading to ice-free Antarctic summers. Our results demonstrate that ML can significantly improve numerical sea ice prediction capabilities and that exposing ML models to coupled ice-atmosphere-ocean processes is essential for generalization in fully-coupled simulations.
Pavel Perezhogin, Alistair Adcroft, Laure Zanna
Data-driven methods have become popular for parameterizing the effects of mesoscale eddies in ocean models. However, they perform poorly in generalization tasks and may require retuning if the grid resolution or ocean configuration changes. We address the generalization problem by enforcing physics constraints on a neural network parameterization of mesoscale eddy fluxes. We find that local scaling of input and output features helps the network generalize to unseen grid resolutions and depths offline in the global ocean. The scaling is based on dimensional analysis and incorporates grid spacing as a length scale. We formulate our findings as a general algorithm for equipping data-driven parameterizations with dimensional scaling. The new parameterization improves the representation of kinetic and potential energy in online simulations with idealized and global ocean models. Comparisons to baseline parameterizations and impacts on global ocean biases are discussed.
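The dimensional-scaling idea can be sketched as follows: the network only ever sees nondimensionalized inputs, and its output is re-dimensionalized using a local scale and the grid spacing, so the same weights can be reused across resolutions. The scaling exponents and the `toy_net` stand-in below are hypothetical choices for illustration, not the paper's actual closure.

```python
import numpy as np

# Inputs: resolved velocity gradients G [1/s]; grid spacing dx [m].

def nondimensionalize(G, eps=1e-12):
    scale = np.sqrt(np.mean(G ** 2)) + eps   # local gradient magnitude [1/s]
    return G / scale, scale

def redimensionalize(S_hat, scale, dx):
    # Assumed closure dimension (illustrative): forcing ~ scale^2 * dx  [m/s^2]
    return S_hat * scale ** 2 * dx

def toy_net(G_hat):
    # stand-in for a trained network acting on dimensionless input
    return -0.1 * G_hat

dx_coarse, dx_fine = 1e4, 5e3
G = np.random.default_rng(1).normal(scale=1e-5, size=(8, 8))

for dx in (dx_coarse, dx_fine):
    G_hat, s = nondimensionalize(G)
    S = redimensionalize(toy_net(G_hat), s, dx)
    print(dx, np.abs(S).max())
```

Because the dimensionless input has unit magnitude regardless of the physical amplitude of `G`, the network never has to extrapolate in amplitude when the resolution changes; only the re-dimensionalization step carries the grid-spacing dependence.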
James P. C. Duncan, Elynn Wu, Surya Dheeshjith, Adam Subel, Troy Arcomano, Spencer K. Clark, Brian Henn, Anna Kwa, Jeremy McGibbon, W. Andre Perkins, William Gregory, Carlos Fernandez-Granda, Julius Busecke, Oliver Watt-Meyer, William J. Hurlin, Alistair Adcroft, Laure Zanna, Christopher Bretherton
Traditional numerical global climate models simulate the full Earth system by exchanging boundary conditions between separate simulators of the atmosphere, ocean, sea ice, land surface, and other geophysical processes. This paradigm allows for distributed development of individual components within a common framework, unified by a coupler that handles translation between realms via spatial or temporal alignment and flux exchange. Following a similar approach adapted for machine learning-based emulators, we present SamudrACE: a coupled global climate model emulator which produces centuries-long simulations at 1-degree horizontal, 6-hourly atmospheric, and 5-daily oceanic resolution, with 145 2D fields spanning 8 atmospheric and 19 oceanic vertical levels, plus sea ice, surface, and top-of-atmosphere variables. SamudrACE is highly stable and has low climate biases comparable to those of its components with prescribed boundary forcing, with realistic variability in coupled climate phenomena such as ENSO that is not possible to simulate in uncoupled mode.
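The coupling cadence described above (6-hourly atmosphere, 5-daily ocean) can be sketched with toy components. `atm_step`, `ocn_step`, and the flux formula below are invented placeholders; only the exchange timing reflects the setup described in the abstract.

```python
import numpy as np

ATM_DT_H, OCN_DT_H = 6, 120                  # hours per component step
steps_per_exchange = OCN_DT_H // ATM_DT_H    # 20 atmosphere steps per ocean step

def atm_step(t_atm, sst):
    return 0.9 * t_atm + 0.1 * sst           # relax toward the ocean surface

def ocn_step(sst, mean_flux):
    return sst + 0.05 * mean_flux            # forced by time-averaged flux

t_atm, sst = 0.0, 1.0
n_exchanges = 0
for _ in range(30 * 24 // OCN_DT_H):         # one 30-day month
    fluxes = []
    for _ in range(steps_per_exchange):
        t_atm = atm_step(t_atm, sst)
        fluxes.append(sst - t_atm)           # toy air-sea flux
    sst = ocn_step(sst, np.mean(fluxes))     # coupler hands over the mean flux
    n_exchanges += 1

print(n_exchanges)  # 6 ocean exchanges in 30 days
```

The key design choice mirrored here is that the slow component sees a time average of the fast component's fluxes accumulated between exchanges, rather than an instantaneous value.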
Christian Pedersen, Laure Zanna, Joan Bruna, Pavel Perezhogin
Integration of machine learning (ML) models of unresolved dynamics into numerical simulations of fluid dynamics has been demonstrated to improve the accuracy of coarse resolution simulations. However, when trained in a purely offline mode, integrating ML models into the numerical scheme can lead to instabilities. In the context of a 2D, quasi-geostrophic turbulent system, we demonstrate that including an additional network in the loss function, which emulates the state of the system into the future, produces offline-trained ML models that capture important subgrid processes, with improved stability properties.
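A minimal scalar version of the idea, assuming the "emulator" advances the corrected state exactly: the loss combines the instantaneous (offline) mismatch with the mismatch after a k-step rollout. The dynamics and coefficients are toy values, not the quasi-geostrophic system used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
a, s_true, k = 0.9, -0.05, 5          # resolved step coef, true subgrid coef
x = rng.normal(size=1000)

def rollout(coef, x0, k):
    # emulated evolution: resolved step plus candidate subgrid term
    y = x0
    for _ in range(k):
        y = a * y + coef * y
    return y

target = rollout(s_true, x, k)        # "true" future state

def loss(theta, alpha=0.5):
    offline = np.mean((theta * x - s_true * x) ** 2)
    future = np.mean((rollout(theta, x, k) - target) ** 2)
    return (1 - alpha) * offline + alpha * future

# crude 1-D minimization over candidate subgrid coefficients
thetas = np.linspace(-0.2, 0.1, 601)
best = thetas[np.argmin([loss(t) for t in thetas])]
print(best)  # close to s_true = -0.05
```

In this linear toy both terms share a minimum, so the example only illustrates the mechanics of the combined objective; the paper's point is that in nonlinear systems the rollout term changes which offline-trained models remain stable online.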
Cheng Zhang, Pavel Perezhogin, Cem Gultekin, Alistair Adcroft, Carlos Fernandez-Granda, Laure Zanna
We address the question of how to use a machine-learned parameterization in a general circulation model, and assess its performance both computationally and physically. We take one particular machine-learned parameterization (Guillaumin and Zanna, 2021) and evaluate its online performance in a different model from the one in which it was previously tested. This parameterization is a deep convolutional network that predicts parameters for a stochastic model of subgrid momentum forcing by mesoscale eddies. We treat the parameterization as we would a conventional parameterization once implemented in the numerical model. This includes trying the parameterization in a different flow regime from that in which it was trained, at different spatial resolutions, and with other differences, all to test generalization. We assess whether tuning is possible, which is a common practice in general circulation model development. We find the parameterization, without modification or special treatment, to be stable, and the action of the parameterization to diminish as spatial resolution is refined. We also find some limitations of the machine learning model in implementation: 1) tuning of the outputs from the parameterization at various depths is necessary; 2) the forcing near boundaries is not predicted as well as in the open ocean; 3) the cost of the parameterization is prohibitively high on CPUs. We discuss these limitations, present solutions to some of the problems, and conclude that this particular ML parameterization does inject energy and improve backscatter as intended, but may need further refinement before it can be used in production mode in contemporary climate models.
William Gregory, Mitchell Bushuk, Alistair Adcroft, Yongfei Zhang, Laure Zanna
Data assimilation is often viewed as a framework for correcting short-term error growth in dynamical climate model forecasts. When viewed on climate time scales, however, these short-term corrections, or analysis increments, can closely mirror the systematic bias patterns of the dynamical model. In this study, we use convolutional neural networks (CNNs) to learn a mapping from model state variables to analysis increments, in order to showcase the feasibility of a data-driven model parameterization that can predict state-dependent model errors. We undertake this problem using an ice-ocean data assimilation system within the Seamless system for Prediction and EArth system Research (SPEAR) model, developed at the Geophysical Fluid Dynamics Laboratory, which assimilates satellite observations of sea ice concentration every 5 days over the period 1982-2017. The CNN takes inputs of data assimilation forecast states and tendencies, and makes predictions of the corresponding sea ice concentration increments. Specifically, the inputs are states and tendencies of sea ice concentration, sea-surface temperature, ice velocities, ice thickness, net shortwave radiation, ice-surface skin temperature, and sea-surface salinity, as well as a land-sea mask. We find the CNN is able to make skillful predictions of the increments in both the Arctic and Antarctic and across all seasons, with skill that consistently exceeds that of a climatological increment prediction. This suggests that the CNN could be used to reduce sea ice biases in free-running SPEAR simulations, either as a sea ice parameterization or as an online bias-correction tool for numerical sea ice forecasts.
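The skill comparison against a climatological-increment baseline can be sketched on synthetic data. The linear "CNN" stand-in and the increment model below are hypothetical; the point is only the form of the skill score.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
state = rng.normal(size=n)                                 # e.g. a SIC anomaly
true_inc = -0.3 * state + 0.1 + 0.05 * rng.normal(size=n)  # state-dependent increment

clim_pred = np.full(n, true_inc.mean())   # climatological-increment baseline
cnn_pred = -0.3 * state + 0.1             # stand-in for a skillful CNN

mse = lambda p: np.mean((p - true_inc) ** 2)
skill = 1.0 - mse(cnn_pred) / mse(clim_pred)
print(round(skill, 2))  # positive: beats climatology
```

A skill score above zero means the state-dependent prediction explains variance that the climatological mean cannot; this is the sense in which the abstract reports skill "consistently exceeding" climatology.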
Cem Gultekin, Adam Subel, Cheng Zhang, Matan Leibovich, Pavel Perezhogin, Alistair Adcroft, Carlos Fernandez-Granda, Laure Zanna
Due to computational constraints, climate simulations cannot resolve a range of small-scale physical processes, which have a significant impact on the large-scale evolution of the climate system. Parameterization is an approach to capture the effect of these processes without resolving them explicitly. In recent years, data-driven parameterizations based on convolutional neural networks have obtained promising results. In this work, we provide an in-depth analysis of these parameterizations, developed using data from ocean simulations. The parameterizations account for the effect of mesoscale eddies, toward improving simulations of momentum, heat, and mass exchange in the ocean. Our results provide several insights into the properties of data-driven parameterizations based on neural networks. First, their performance can be substantially improved by increasing the geographic extent of the training data. Second, they learn nonlinear structure, since they are able to outperform a linear baseline. Third, they generalize robustly across different CO2 forcings, but not necessarily across different ocean depths. Fourth, they exploit a relatively small region of their input to generate their output. Our results will guide the further development of ocean mesoscale eddy parameterizations, and multiscale modeling more generally.
J. Nathan Kutz, Peter Battaglia, Michael Brenner, Kevin Carlberg, Aric Hagberg, Shirley Ho, Stephan Hoyer, Henning Lange, Hod Lipson, Michael W. Mahoney, Frank Noe, Max Welling, Laure Zanna, Francis Zhu, Steven L. Brunton
Machine learning (ML) and artificial intelligence (AI) algorithms are transforming and empowering the characterization and control of dynamic systems in the engineering, physical, and biological sciences. These emerging modeling paradigms require comparative metrics to evaluate a diverse set of scientific objectives, including forecasting, state reconstruction, generalization, and control, while also considering limited data scenarios and noisy measurements. We introduce a common task framework (CTF) for science and engineering, which features a growing collection of challenge data sets with a diverse set of practical and common objectives. The CTF is a critically enabling technology that has contributed to the rapid advance of ML/AI algorithms in traditional applications such as speech recognition, language processing, and computer vision. There is a critical need for the objective metrics of a CTF to compare the diverse algorithms being rapidly developed and deployed in practice today across science and engineering.
Pavel Perezhogin, Alistair Adcroft, Laure Zanna
Global ocean models exhibit biases in the mean state and variability, particularly at coarse resolution, where mesoscale eddies are unresolved. To address these biases, parameterization coefficients are typically tuned ad hoc. Here, we formulate parameter tuning as a calibration problem using Ensemble Kalman Inversion (EKI). We optimize parameters of a neural network parameterization of mesoscale eddies in two idealized ocean models at coarse resolution. The calibrated parameterization reduces errors in the time-averaged fluid interfaces and their variability by approximately a factor of two compared to the unparameterized model or the offline-trained parameterization. The EKI method is robust to noise in time-averaged statistics arising from chaotic ocean dynamics. Furthermore, we propose an efficient calibration protocol that bypasses integration to statistical equilibrium by carefully choosing an initial condition. These results demonstrate that systematic calibration can substantially improve coarse-resolution ocean simulations and provide a practical pathway for reducing biases in global ocean models.
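A minimal Ensemble Kalman Inversion loop can make the calibration step concrete. Here a linear two-parameter forward map stands in for running the ocean model and computing time-averaged diagnostics; the map `G`, the ensemble size, and all values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def G(theta):
    # hypothetical model diagnostics as a function of parameters
    return np.array([theta[0] + theta[1], 2.0 * theta[0] - theta[1]])

theta_true = np.array([1.0, 0.5])
y = G(theta_true)                      # "observed" statistics
Gamma = 1e-4 * np.eye(2)               # observation-noise covariance

ens = rng.normal(size=(50, 2))         # initial parameter ensemble
for _ in range(20):
    Ge = np.array([G(t) for t in ens])
    tm, gm = ens.mean(0), Ge.mean(0)
    Ctg = (ens - tm).T @ (Ge - gm) / (len(ens) - 1)   # C_{theta,G}
    Cgg = (Ge - gm).T @ (Ge - gm) / (len(ens) - 1)    # C_{G,G}
    K = Ctg @ np.linalg.inv(Cgg + Gamma)
    noise = rng.multivariate_normal(np.zeros(2), Gamma, size=len(ens))
    ens = ens + (y + noise - Ge) @ K.T                # Kalman-style update

print(ens.mean(0))  # close to theta_true
```

Note that EKI only needs evaluations of `G`, never its gradient, which is why it suits calibration targets like chaotic time-averaged ocean statistics.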
Sheng Liu, Aakash Kaku, Weicheng Zhu, Matan Leibovich, Sreyas Mohan, Boyang Yu, Haoxiang Huang, Laure Zanna, Narges Razavian, Jonathan Niles-Weed, Carlos Fernandez-Granda
Reliable probability estimation is of crucial importance in many real-world applications where there is inherent (aleatoric) uncertainty. Probability-estimation models are trained on observed outcomes (e.g. whether it has rained or not, or whether a patient has died or not), because the ground-truth probabilities of the events of interest are typically unknown. The problem is therefore analogous to binary classification, with the difference that the objective is to estimate probabilities rather than predicting the specific outcome. This work investigates probability estimation from high-dimensional data using deep neural networks. There exist several methods to improve the probabilities generated by these models but they mostly focus on model (epistemic) uncertainty. For problems with inherent uncertainty, it is challenging to evaluate performance without access to ground-truth probabilities. To address this, we build a synthetic dataset to study and compare different computable metrics. We evaluate existing methods on the synthetic data as well as on three real-world probability estimation tasks, all of which involve inherent uncertainty: precipitation forecasting from radar images, predicting cancer patient survival from histopathology images, and predicting car crashes from dashcam videos. We also give a theoretical analysis of a model for high-dimensional probability estimation which reproduces several of the phenomena evinced in our experiments. Finally, we propose a new method for probability estimation using neural networks, which modifies the training process to promote output probabilities that are consistent with empirical probabilities computed from the data. The method outperforms existing approaches on most metrics on the simulated as well as real-world data.
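One way to compare probability estimates with empirical probabilities computed from binary outcomes is a binned reliability check (expected calibration error). The synthetic estimator below is a hypothetical illustration, not one of the paper's methods or metrics.

```python
import numpy as np

rng = np.random.default_rng(0)
p_true = rng.uniform(0, 1, size=50_000)              # unknown in practice
outcome = rng.uniform(size=p_true.size) < p_true     # observed binary labels

# hypothetical probability estimator: a noisy version of the truth
p_hat = np.clip(p_true + 0.1 * rng.normal(size=p_true.size), 0, 1)

# expected calibration error over 10 equal-width bins
bins = np.linspace(0, 1, 11)
idx = np.clip(np.digitize(p_hat, bins) - 1, 0, 9)
ece = 0.0
for b in range(10):
    m = idx == b
    if m.any():
        # gap between predicted probability and empirical frequency, weighted by bin mass
        ece += m.mean() * abs(outcome[m].mean() - p_hat[m].mean())

print(round(ece, 3))
```

As the abstract notes, such computable metrics only use observed outcomes; evaluating them against known ground-truth probabilities requires synthetic data of exactly this kind.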
Gustau Camps-Valls, Andreas Gerhardus, Urmi Ninad, Gherardo Varando, Georg Martius, Emili Balaguer-Ballester, Ricardo Vinuesa, Emiliano Diaz, Laure Zanna, Jakob Runge
Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and the use of data-driven methods, causal and equation discovery fields have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of Physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised with the efficient exploitation of observational data, modern machine learning algorithms and the interaction with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.
Fabrizio Falasca, Pavel Perezhogin, Laure Zanna
We propose a data-driven framework to simplify the description of spatiotemporal climate variability into a few entities and their causal linkages. Given a high-dimensional climate field, the methodology first reduces its dimensionality into a set of regionally constrained patterns. Time-dependent causal links are then inferred in the interventional sense through the fluctuation-response formalism, as shown in Baldovin et al. (2020). These two steps allow us to explore how regional climate variability can influence remote locations. To distinguish between true and spurious responses, we propose a novel analytical null model for the fluctuation-dissipation relation, thereby allowing for uncertainty estimation at a given confidence level. We then select a set of metrics to summarize the results, offering a useful and simplified approach to explore climate dynamics. We showcase the methodology on the monthly sea surface temperature field at global scale. We demonstrate the usefulness of the proposed framework by studying a few individual links as well as "link maps", visualizing the cumulative degree of causation between a given region and the whole system. Finally, each pattern is ranked in terms of its "causal strength", quantifying its relative ability to influence the system's dynamics. We argue that the methodology allows us to explore and characterize causal relationships in high-dimensional spatiotemporal fields in a rigorous and interpretable way.
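The fluctuation-response step can be illustrated with the standard quasi-Gaussian estimator R(tau) ~ C(tau) C(0)^-1, applied to a linear stochastic system where the true response is known. The dynamics below are invented for illustration, and this textbook estimator is a simplification of, not necessarily identical to, the formalism of Baldovin et al. (2020).

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[-1.0, 0.4], [0.0, -0.8]])   # linear dynamics dx = A x dt + noise
dt, n, lag = 0.1, 200_000, 10
Phi = np.eye(2) + dt * A                   # one-step propagator (true response basis)

x = np.zeros((n, 2))
for i in range(1, n):
    x[i] = Phi @ x[i - 1] + np.sqrt(dt) * rng.normal(size=2)

X0, Xl = x[:-lag], x[lag:]
C0 = X0.T @ X0 / len(X0)                   # lag-0 covariance
Cl = Xl.T @ X0 / len(X0)                   # lag-10 covariance
R = Cl @ np.linalg.inv(C0)                 # estimated response at tau = lag * dt

print(np.round(R, 2))                      # compare with Phi matrix-powered lag times
```

For this Gaussian linear system the estimator is exact in expectation (the true response is the lag-10 propagator), which is what makes it a useful sanity check before applying the formalism to observed fields.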
Pavel Perezhogin, Laure Zanna, Carlos Fernandez-Granda
Subgrid parameterizations of mesoscale eddies continue to be in demand for climate simulations. These subgrid parameterizations can be designed using physics- and/or data-driven methods, with uncertainty quantification. For example, Guillaumin and Zanna (2021) proposed a machine learning (ML) model that predicts subgrid forcing and its local uncertainty. The major assumption, and potential drawback, of this model is the statistical independence of stochastic residuals between grid points. Here, we aim to improve the simulation of stochastic forcing with generative ML models, such as generative adversarial networks (GANs) and variational autoencoders (VAEs). Generative models learn the distribution of subgrid forcing conditioned on the resolved flow directly from data, and they can produce new samples from this distribution. Generative models can potentially capture not only the spatial correlation but any statistically significant property of subgrid forcing. We test the proposed stochastic parameterizations offline and online in an idealized ocean model. We show that generative models are able to predict subgrid forcing and its uncertainty with spatially correlated stochastic forcing. Online simulations across a range of resolutions demonstrate that generative models are superior to the baseline ML model at the coarsest resolution.
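A toy illustration of why spatial correlation matters: residuals sampled independently per grid point are spatially white, whereas samples decoded from a low-dimensional latent, as in a VAE or GAN decoder, inherit spatial structure. The fixed sinusoidal "decoder" below is a stand-in for illustration, not a trained generative model.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nz, ns = 64, 4, 2000
grid = np.linspace(0, 2 * np.pi, nx, endpoint=False)
# smooth fixed basis standing in for a trained decoder: (nx, nz)
decoder = np.stack([np.sin((k + 1) * grid) for k in range(nz)], axis=1)

white = rng.normal(size=(ns, nx))        # independent residuals per grid point
latent = rng.normal(size=(ns, nz))
generated = latent @ decoder.T           # samples decoded from the latent space

def neighbor_corr(f):
    # correlation between adjacent grid points, pooled over samples
    a, b = f[:, :-1].ravel(), f[:, 1:].ravel()
    return np.corrcoef(a, b)[0, 1]

print(round(neighbor_corr(white), 3), round(neighbor_corr(generated), 3))
```

The decoded samples show strong neighbor correlation simply because every grid point is a smooth function of a few shared latent variables; this is the structural property the grid-point-independent baseline cannot represent.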
Abigail Bodner, Dhruv Balwada, Laure Zanna
Parameterizations of O(1-10) km submesoscale flows in General Circulation Models (GCMs) represent the effects of unresolved vertical buoyancy fluxes in the ocean mixed layer. These submesoscale flows interact nonlinearly with mesoscale and boundary layer turbulence, and it is challenging to account for all the relevant processes in physics-based parameterizations. In this work, we present a data-driven approach to submesoscale parameterization that relies on a Convolutional Neural Network (CNN) trained to predict mixed layer vertical buoyancy fluxes as a function of relevant large-scale variables. The training data come from 12 regions sampled from the global high-resolution MITgcm-LLC4320 simulation. When compared with the baseline of a physics-based submesoscale parameterization, the CNN demonstrates high offline skill across all regions, seasons, and filter scales tested in this study. During seasons when submesoscales are most active, generally the winter and spring months, we find that the CNN prediction skill tends to be lower than in summer months. The CNN exhibits strong dependency on the mixed layer depth and on the large-scale strain field, a variable closely related to frontogenesis that is currently missing from submesoscale parameterizations in GCMs.
Fabrizio Falasca, Aurora Basinski-Ferris, Laure Zanna, Ming Zhao
Radiative forcing drives warming in the Earth system, leading to changes in sea surface temperatures (SSTs) and associated radiative feedbacks. The link between changes in the top-of-the-atmosphere (TOA) net radiative flux and SST patterns, known as the "pattern effect", is typically diagnosed by studying the response of atmosphere-only models to SST perturbations. In this work, we diagnose the pattern effect through response theory, by performing idealized warming perturbation experiments from unperturbed data alone. First, by studying the response at short time scales, where the response is dominated by atmospheric variability, we recover results that agree with the literature. Second, by extending the framework to longer time scales, we capture coupled interactions between the slow ocean component and the atmosphere, yielding a novel "sensitivity map" quantifying the response of the net radiative flux to SST perturbations in the coupled system. Here, feedbacks are captured by a spatiotemporal response operator, rather than time-independent maps as in traditional studies. Both formulations skillfully reconstruct changes in externally forced simulations and provide practical strategies for climate studies. The key distinction lies in their perspectives on climate feedbacks. The first formulation, closely aligned with prediction tasks, follows the traditional view in which slow variables, such as SSTs, exert a one-way influence on fast variables. The second formulation broadens this perspective by incorporating spatiotemporal interactions across state variables. This alternative approach explores how localized SST perturbations can alter the coupled dynamics, leading to temperature changes in remote areas and further impacting the radiative fluxes at later times.
Chris Pedersen, Laure Zanna, Joan Bruna
Autoregressive surrogate models (or "emulators") of spatiotemporal systems provide an avenue for fast, approximate predictions, with broad applications across science and engineering. At inference time, however, these models are generally unable to provide predictions over long time rollouts due to accumulation of errors leading to diverging trajectories. In essence, emulators operate out of distribution, and controlling the online distribution quickly becomes intractable in large-scale settings. To address this fundamental issue, and focusing on time-stationary systems admitting an invariant measure, we leverage diffusion models to obtain an implicit estimator of the score of this invariant measure. We show that this model of the score function can be used to stabilize autoregressive emulator rollouts by applying on-the-fly denoising during inference, a process we call "thermalization". Thermalizing an emulator rollout is shown to extend the time horizon of stable predictions by an order of magnitude in complex systems exhibiting turbulent and chaotic behavior, opening up a novel application of diffusion models in the context of neural emulation.
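The thermalization idea can be sketched in one dimension, assuming the invariant measure is a standard normal so the score is known in closed form; in practice the score would come from a trained diffusion model, and the "emulator" below is an invented, slightly unstable toy.

```python
import numpy as np

rng = np.random.default_rng(0)

def emulator(x):
    return 1.02 * x + 0.1 * rng.normal(size=x.shape)   # imperfect: slowly drifts off

score = lambda x: -x       # grad log p for p = N(0, 1), the invariant measure

def rollout(x0, steps, thermalize=False, eps=0.05):
    x = x0.copy()
    for _ in range(steps):
        x = emulator(x)
        if thermalize:
            x = x + eps * score(x)     # small on-the-fly denoising step
    return x

x0 = rng.normal(size=100)
free = rollout(x0, 300)                # diverges: variance blows up
stab = rollout(x0, 300, thermalize=True)
print(np.std(free), np.std(stab))
```

The denoising nudge pulls each state back toward high-probability regions of the invariant measure, converting an unstable rollout into a bounded one without changing the emulator itself.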
Andrew Brettin, Laure Zanna, Elizabeth A. Barnes
Reliable dynamic sea level forecasts are hindered by numerous sources of uncertainty on daily-to-seasonal timescales (1-180 days), due to atmospheric boundary conditions and internal ocean variability. Studies have demonstrated that certain initial states can extend predictability horizons; thus, identifying these initial conditions may help improve forecast skill. Here, we identify sources of dynamic sea level predictability on daily-to-seasonal timescales using neural networks trained on CESM2 large ensemble data to forecast dynamic sea level. The forecasts yield not only a point estimate for sea level but also a standard deviation to quantify forecast uncertainty based on the initial conditions. Forecasted uncertainties can be leveraged to identify state-dependent sources of predictability at most locations and forecast leads. Network forecasts, particularly in the low-latitude Indo-Pacific, exhibit skillful deterministic predictions and skillfully forecast exceedance probabilities relative to local linear baselines. For networks trained at Guam and in the western Indian Ocean, a transition from local to remote sources of predictability is revealed by the deteriorating utility of initial-condition information for predicting exceedance events. Propagating Rossby waves are identified as a potential source of predictability for dynamic sea level at Guam. In the Indian Ocean, persistence of thermosteric sea level anomalies from the Indian Ocean Dipole may be a source of predictability on subseasonal timescales, while El Niño drives predictability on seasonal timescales. This work shows how uncertainty-quantifying machine learning can help identify changes in sources of state-dependent predictability over a range of forecast leads.
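The uncertainty-quantifying setup, in which a network outputs both a mean and a standard deviation trained with the Gaussian negative log-likelihood, can be sketched on a toy heteroscedastic problem. The grid search below stands in for network training, and the noise model sigma = c0 + c1|x| is a hypothetical choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=4000)
# state-dependent noise: uncertainty grows away from x = 0
y = 2 * x + (0.1 + 0.4 * np.abs(x)) * rng.normal(size=x.size)

def gauss_nll(mu, sigma):
    # Gaussian negative log-likelihood (up to a constant)
    return np.mean(np.log(sigma) + 0.5 * ((y - mu) / sigma) ** 2)

# Assume the mean (2x) is known; fit the noise model sigma = c0 + c1|x| by grid search
best, best_nll = None, np.inf
for c0 in np.linspace(0.05, 0.2, 16):
    for c1 in np.linspace(0.0, 0.8, 17):
        v = gauss_nll(2 * x, c0 + c1 * np.abs(x))
        if v < best_nll:
            best, best_nll = (c0, c1), v

print(best)  # near the true (0.1, 0.4)
```

Minimizing the NLL rather than the mean-squared error is what allows the predicted standard deviation to track the state-dependent spread, which is the quantity the abstract uses to locate sources of predictability.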