Griffin Mooers, Jens Tuyls, Stephan Mandt, Michael Pritchard, Tom Beucler
While cloud-resolving models can explicitly simulate the details of small-scale storm formation and morphology, these details are often ignored by climate models for lack of computational resources. Here, we explore the potential of generative modeling to cheaply recreate small-scale storms by designing and implementing a Variational Autoencoder (VAE) that performs structural replication, dimensionality reduction, and clustering of high-resolution vertical velocity fields. Trained on ~6×10^6 samples spanning the globe, the VAE successfully reconstructs the spatial structure of convection, performs unsupervised clustering of convective organization regimes, and identifies anomalous storm activity, confirming the potential of generative modeling to power stochastic parameterizations of convection in climate models.
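As a minimal sketch of the generative approach described above (not the authors' code: the field shape, latent width, and loss weighting below are illustrative assumptions), a VAE compresses each vertical velocity field into a low-dimensional latent vector, trading reconstruction fidelity against a regularized latent space:

```python
# Minimal VAE sketch for 2-D vertical-velocity fields (illustrative shapes).
import torch
import torch.nn as nn

H, W, LATENT = 30, 128, 16          # hypothetical field shape and latent width

class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(H * W, 512), nn.ReLU())
        self.mu = nn.Linear(512, LATENT)
        self.logvar = nn.Linear(512, LATENT)
        self.dec = nn.Sequential(nn.Linear(LATENT, 512), nn.ReLU(),
                                 nn.Linear(512, H * W))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z).view(-1, H, W), mu, logvar

def vae_loss(x, x_hat, mu, logvar, beta=1.0):
    recon = ((x - x_hat) ** 2).mean()                            # structural replication
    kl = -0.5 * torch.mean(1 + logvar - mu ** 2 - logvar.exp())  # latent regularizer
    return recon + beta * kl

x = torch.randn(8, H, W)            # stand-in for w-field samples
model = VAE()
x_hat, mu, logvar = model(x)
print(vae_loss(x, x_hat, mu, logvar))
```

Clustering and anomaly detection can then operate on the learned latent means, consistent with the unsupervised regime identification described above.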
Yevgeniy Vorobeychik, Michael Pritchard
We propose a framework for cyber risk assessment and mitigation which models attackers as formal planners and defenders as interdicting such plans. We illustrate the value of plan interdiction problems by first modeling network cyber risk through the use of formal planning, and subsequently formalizing an important question of prioritizing vulnerabilities for patching in the plan interdiction framework. In particular, we show that selectively patching relatively few vulnerabilities allows a network administrator to significantly reduce exposure to cyber risk. More broadly, we have developed a number of scalable approaches for plan interdiction problems, making especially significant advances when attack plans involve uncertainty about system dynamics. However, important open problems remain, including how to effectively capture information asymmetry between the attacker and defender, how to best model dynamics in the attacker-defender interaction, and how to develop scalable algorithms for solving associated plan interdiction games.
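To make the interdiction idea concrete, here is a toy sketch (our illustration, not the paper's formal planning model): the attacker's plan is abstracted as a cheapest path through a hypothetical attack graph, and the defender selects a few vulnerabilities to patch so as to maximize the cost of the attacker's best remaining plan:

```python
# Toy plan-interdiction sketch (illustrative only; the paper uses formal
# planning models, not this shortest-path abstraction).
import heapq
from itertools import combinations

# Hypothetical attack graph: edges are exploitable vulnerabilities with costs.
EDGES = {("entry", "web"): 1, ("web", "db"): 2, ("entry", "mail"): 2,
         ("mail", "db"): 1, ("db", "crown"): 1, ("web", "crown"): 5}

def attack_cost(blocked):
    """Cheapest attacker plan (Dijkstra) after patching `blocked` edges."""
    dist, heap = {"entry": 0}, [(0, "entry")]
    while heap:
        d, u = heapq.heappop(heap)
        if u == "crown":
            return d
        for (a, b), c in EDGES.items():
            if a == u and (a, b) not in blocked and d + c < dist.get(b, 1e9):
                dist[b] = d + c
                heapq.heappush(heap, (d + c, b))
    return float("inf")  # attacker has no remaining plan

# Defender: pick k patches that maximize the attacker's cheapest plan cost.
# With k=2, cutting both edges into "crown" fully interdicts the attack.
k = 2
best = max(combinations(EDGES, k), key=lambda s: attack_cost(set(s)))
print(best, attack_cost(set(best)))
```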
Oliver Watt-Meyer, Gideon Dresdner, Jeremy McGibbon, Spencer K. Clark, Brian Henn, James Duncan, Noah D. Brenowitz, Karthik Kashinath, Michael S. Pritchard, Boris Bonev, Matthew E. Peters, Christopher S. Bretherton
Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter, autoregressive machine learning emulator of an existing comprehensive 100-km resolution global atmospheric model. The formulation of ACE allows evaluation of physical laws such as the conservation of mass and moisture. The emulator is stable for 100 years, nearly conserves column moisture without explicit constraints, and faithfully reproduces the reference model's climate, outperforming a challenging baseline on over 90% of tracked variables. ACE requires nearly 100x less wall-clock time and is 100x more energy efficient than the reference model using typically available resources. Without fine-tuning, ACE can stably generalize to a previously unseen historical sea surface temperature dataset.
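A minimal sketch of the kind of physical-law evaluation that ACE's formulation enables (the variable names, grid, and budget form below are our assumptions, not the ACE API): check how closely emulated column moisture obeys its global budget.

```python
# Column-moisture budget diagnostic for an emulator step (illustrative).
import numpy as np

def moisture_residual(pw0, pw1, precip, evap, dt, area_w):
    """Global budget residual: d(PW)/dt - (E - P).
    pw*: precipitable water [kg/m^2]; precip/evap: fluxes [kg/m^2/s];
    area_w: normalized grid-cell area weights."""
    lhs = (pw1 - pw0) / dt
    rhs = evap - precip
    return float(np.sum(area_w * (lhs - rhs)))  # ~0 if moisture is conserved

ny, nx = 180, 360
w = np.random.rand(ny, nx); w /= w.sum()    # stand-in area weights
pw0 = np.random.rand(ny, nx) * 50           # kg/m^2, stand-in field
evap = np.random.rand(ny, nx) * 1e-5
precip = np.random.rand(ny, nx) * 1e-5
pw1 = pw0 + (evap - precip) * 21600         # fabricate a step that conserves exactly
print(moisture_residual(pw0, pw1, precip, evap, 21600, w))  # ≈ 0
```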
Griffin Mooers, Tom Beucler, Mike Pritchard, Stephan Mandt
Despite the importance of quantifying how the spatial patterns of extreme precipitation will change with warming, we lack tools to objectively analyze the storm-scale outputs of modern climate models. To address this gap, we develop an unsupervised machine learning framework to quantify how storm dynamics affect changes in precipitation extremes, without sacrificing spatial information. For the upper precipitation quantiles (above the 80th percentile), we find that the spatial patterns of extreme precipitation changes are dominated by spatial shifts in storm dynamical regimes rather than changes in how these storm regimes produce precipitation. Our study shows how unsupervised machine learning, paired with domain knowledge, may allow us to better understand the physics of the atmosphere and anticipate the changes associated with a warming world.
Liran Peng, Peter N. Blossey, Walter M. Hannah, Christopher S. Bretherton, Christopher R. Terai, Andrea M. Jenney, Michael Pritchard
High-Resolution Multi-scale Modeling Frameworks (HR) -- global climate models that embed separate, convection-resolving models with high enough resolution to resolve boundary layer eddies -- have exciting potential for investigating low cloud feedback dynamics due to reduced parameterization and the ability for multidecadal throughput on modern computing hardware. However, low clouds in past HR have suffered from a stubborn over-entrainment problem: an uncontrolled source of mixing across the marine subtropical inversion that manifests as stratocumulus dim biases in the present-day climate, limiting their scientific utility. We report new results showing that this over-entrainment can be partly offset by using hyperviscosity and cloud droplet sedimentation. Hyperviscosity damps small-scale momentum fluctuations associated with the formulation of the momentum solver of the embedded LES. Adding a sedimentation process alongside the default one-moment microphysics in HR removes condensed-phase particles from the entrainment zone, which further reduces entrainment efficiency. The result is an HR that produces more low clouds with a higher liquid water path and a reduced stratocumulus dim bias. Associated improvements in the explicitly simulated sub-cloud eddy spectrum are observed. We report these sensitivities in multi-week tests and then explore their operational potential alongside microphysical retuning in decadal simulations at an operational 1.5-degree exterior resolution. The result is a new HR with the desired improvements in the baseline present-day low cloud climatology, and a reduced global-mean bias and root-mean-squared error of absorbed shortwave radiation. We suggest it should be promising for examining low cloud feedbacks with minimal approximation.
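To illustrate the scale selectivity of hyperviscosity (a 1-D toy with illustrative coefficients, not the embedded LES solver): a fourth-derivative damping term removes grid-scale momentum noise roughly as the fourth power of wavenumber, leaving well-resolved scales nearly untouched.

```python
# Scale-selective hyperviscosity sketch (1-D, periodic; values illustrative).
import numpy as np

def hyperviscous_tendency(u, dx, nu4):
    """-nu4 * d4u/dx4: damps grid-scale noise ~k^4 more strongly than
    well-resolved scales."""
    d4 = (np.roll(u, -2) - 4 * np.roll(u, -1) + 6 * u
          - 4 * np.roll(u, 1) + np.roll(u, 2)) / dx**4
    return -nu4 * d4

n, dx = 256, 100.0
x = np.arange(n) * dx
u = np.sin(2 * np.pi * x / (n * dx)) + 0.1 * np.random.randn(n)  # smooth + noise
u1 = u + 1.0 * hyperviscous_tendency(u, dx, nu4=5e6)  # one stable Euler step
print(np.std(np.diff(u)), np.std(np.diff(u1)))        # grid-scale noise reduced
```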
Fernando Iglesias-Suarez, Pierre Gentine, Breixo Solino-Fernandez, Tom Beucler, Michael Pritchard, Jakob Runge, Veronika Eyring
Climate models are essential for understanding and projecting climate change, yet long-standing biases and uncertainties remain in their projections. These are largely associated with the representation of subgrid-scale processes, particularly clouds and convection. Deep learning can learn these subgrid-scale processes from computationally expensive storm-resolving models while retaining many of their features at a fraction of the computational cost. Yet climate simulations with embedded neural network parameterizations remain challenging and depend strongly on the deep learning solution, likely because the neural networks learn spurious, non-physical correlations given the complexity of the physical dynamical system. Here, we show that combining causality with deep learning helps remove spurious correlations and optimize the neural network algorithm. We apply a causal discovery method to unveil causal drivers in the set of input predictors of atmospheric subgrid-scale processes of a superparameterized climate model in which deep convection is explicitly resolved. The resulting causally-informed neural networks are coupled to the climate model, replacing the superparameterization and radiation scheme. We show that climate simulations with causally-informed neural network parameterizations retain many convection-related properties and accurately generate the climate of the original high-resolution climate model, while retaining generalization capabilities to unseen climates similar to those of the non-causal approach. The combination of causal discovery and deep learning is a new and promising approach that leads to stable and more trustworthy climate simulations and paves the way towards more physically-based causal deep learning approaches in other scientific disciplines as well.
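As a simplified stand-in for the causal discovery step (the paper uses a formal method; this partial-correlation screen merely illustrates the idea of pruning spurious predictors from the input set):

```python
# Toy causal input screen: keep predictors with nonzero partial correlation
# to the target after conditioning on the remaining inputs (illustrative).
import numpy as np

def partial_corr(x, y, Z):
    """Correlation of x and y after regressing out conditioning set Z."""
    def resid(v):
        beta, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ beta
    rx, ry = resid(x), resid(y)
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(0)
n = 5000
driver = rng.standard_normal(n)                    # true causal driver
confound = driver + 0.1 * rng.standard_normal(n)   # spurious, correlated input
target = 2 * driver + rng.standard_normal(n)

X = np.column_stack([driver, confound, rng.standard_normal(n)])
for i, name in enumerate(["driver", "confound", "noise"]):
    Z = np.delete(X, i, axis=1)
    r = partial_corr(X[:, i], target, Z)
    print(f"{name:8s} partial corr = {r:+.2f}")    # only 'driver' survives
```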
Gunnar Behrens, Tom Beucler, Pierre Gentine, Fernando Iglesias-Suarez, Michael Pritchard, Veronika Eyring
Deep learning can accurately represent sub-grid-scale convective processes in climate models, learning from high-resolution simulations. However, deep learning methods usually lack interpretability due to their large internal dimensionality, reducing trust in these methods. Here, we use Variational Encoder Decoder structures (VED), a non-linear dimensionality reduction technique, to learn and understand convective processes in an aquaplanet superparameterized climate model simulation, where deep convective processes are simulated explicitly. We show that, similar to previous deep learning studies based on feed-forward neural nets, the VED is capable of learning and accurately reproducing convective processes. In contrast to past work, we show this can be achieved by compressing the original information into only five latent nodes. As a result, the VED can be used to understand convective processes and delineate modes of convection through the exploration of its latent dimensions. A close investigation of the latent space enables the identification of different convective regimes: a) stable conditions are clearly distinguished from deep convection with low outgoing longwave radiation and strong precipitation; b) high optically thin cirrus-like clouds are separated from low optically thick cumulus clouds; and c) shallow convective processes are associated with large-scale moisture content and surface diabatic heating. Our results demonstrate that VEDs can accurately represent convective processes in climate models, while enabling interpretability and a better understanding of sub-grid-scale physical processes, paving the way to increasingly interpretable machine learning parameterizations with promising generative properties.
Tom Beucler, Tristan Abbott, Timothy Cronin, Michael Pritchard
Idealized convection-permitting simulations of radiative-convective equilibrium (RCE) have become a popular tool for understanding the physical processes leading to horizontal variability of tropical water vapor and rainfall. However, the applicability of idealized simulations to nature is still unclear given that important processes are typically neglected, such as lateral vapor advection by extratropical intrusions, or interactive ocean coupling. Here, we exploit spectral analysis to compactly summarize the multi-scale processes supporting convective aggregation. By applying this framework to high-resolution reanalysis data and satellite observations in addition to idealized simulations, we compare convective-aggregation processes across horizontal scales and data sets. The results affirm the validity of the RCE simulations as an analogy to the real world. Column moist static energy tendencies share similar signs and scale-selectivity in convection-permitting models and observations: Radiation increases variance at wavelengths above 1,000 km, while advection damps variance across wavelengths, and surface fluxes mostly reduce variance between 1,000 km and 10,000 km.
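A toy version of the spectral budget diagnostic (1-D and with fabricated fields; the study applies it to column moist static energy across data sets): each tendency's contribution to the variance budget at wavenumber k follows from d|h_k|^2/dt = 2 Re(conj(h_k) (dh/dt)_k).

```python
# Spectral variance-budget sketch (illustrative 1-D domain and fields).
import numpy as np

def spectral_variance_source(h, dhdt):
    """Per-wavenumber contribution of a tendency to the variance budget:
    d/dt |h_k|^2 = 2 Re( conj(h_k) * (dh/dt)_k )."""
    hk, fk = np.fft.rfft(h), np.fft.rfft(dhdt)
    return 2.0 * np.real(np.conj(hk) * fk)

n, L = 512, 40_000e3                      # hypothetical 40,000 km domain
x = np.arange(n) * (L / n)
h = np.sin(2 * np.pi * 3 * x / L)         # MSE anomaly at one large scale
dhdt = 0.5 * h                            # tendency that amplifies h (like radiation)
src = spectral_variance_source(h, dhdt)
wavelength = L / np.maximum(np.arange(len(src)), 1)
print(wavelength[src.argmax()] / 1e3, "km gains variance")   # ~13,333 km
```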
Tom Beucler, Michael Pritchard, Stephan Rasp, Jordan Ott, Pierre Baldi, Pierre Gentine
Neural networks can emulate nonlinear physical systems with high accuracy, yet they may produce physically inconsistent results that violate fundamental constraints. Here, we introduce a systematic way of enforcing nonlinear analytic constraints in neural networks via constraints in the architecture or the loss function. Applied to convective processes for climate modeling, architectural constraints enforce conservation laws to within machine precision without degrading performance. Enforcing constraints also reduces errors in the subsets of the outputs most impacted by the constraints.
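A minimal sketch of the architectural route (illustrative network and constraint, not the paper's model): a linear budget such as sum(y) = 0 can be enforced exactly by predicting all but one output and closing the budget residually.

```python
# Architecture-level linear constraint: sum(y) = 0 holds by construction.
import torch
import torch.nn as nn

class ConservingNet(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_in, 64), nn.ReLU(),
                                  nn.Linear(64, n_out - 1))  # all but one output

    def forward(self, x):
        y_free = self.body(x)
        y_last = -y_free.sum(dim=1, keepdim=True)   # residual closes the budget
        return torch.cat([y_free, y_last], dim=1)

net = ConservingNet(10, 5)
y = net(torch.randn(4, 10))
print(y.sum(dim=1))   # zeros to machine precision
```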
Stathi Fotiadis, Noah Brenowitz, Tomas Geffner, Yair Cohen, Michael Pritchard, Arash Vahdat, Morteza Mardani
Conditioning diffusion and flow models has proven effective for super-resolving small-scale details in natural images. However, in physical sciences such as weather, super-resolving small-scale details poses significant challenges due to: (i) misalignment between input and output distributions (i.e., solutions to distinct partial differential equations (PDEs) follow different trajectories), (ii) multi-scale dynamics that are deterministic at large scales but stochastic at small scales, and (iii) limited data, increasing the risk of overfitting. To address these challenges, we propose encoding the inputs to a latent base distribution that is closer to the target distribution, followed by flow matching to generate small-scale physics. The encoder captures the deterministic components, while flow matching adds stochastic small-scale details. To account for uncertainty in the deterministic part, we inject noise into the encoder output using an adaptive noise scaling mechanism, which is dynamically adjusted based on maximum-likelihood estimates of the encoder predictions. We conduct extensive experiments on both the real-world CWA weather dataset and the PDE-based Kolmogorov dataset, with the CWA task involving super-resolving the weather variables for the region of Taiwan from 25 km to 2 km scales. Our results show that the proposed stochastic flow matching (SFM) framework significantly outperforms existing methods such as conditional diffusion and flows.
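A sketch of the corresponding training step (shapes, networks, and the fixed noise scale below are illustrative assumptions; in the paper the noise injection is adaptive): the encoder provides the latent base, injected noise represents its uncertainty, and the flow-matching loss regresses the velocity along the interpolation path.

```python
# Stochastic-flow-matching training step sketch (toy shapes and networks).
import torch
import torch.nn as nn

D = 64                                   # flattened field dimension (toy)
encoder = nn.Linear(D, D)                # deterministic coarse->fine map
velocity = nn.Sequential(nn.Linear(D + 1, 128), nn.ReLU(), nn.Linear(128, D))

def sfm_loss(x_coarse, x_fine, sigma=0.1):
    x0 = encoder(x_coarse) + sigma * torch.randn_like(x_coarse)  # latent base
    t = torch.rand(x0.shape[0], 1)
    xt = (1 - t) * x0 + t * x_fine       # linear interpolation path
    target = x_fine - x0                 # conditional flow-matching target
    v = velocity(torch.cat([xt, t], dim=1))
    return ((v - target) ** 2).mean()

loss = sfm_loss(torch.randn(16, D), torch.randn(16, D))
loss.backward()
print(float(loss))
```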
Kushagra Pandey, Jaideep Pathak, Yilun Xu, Stephan Mandt, Michael Pritchard, Arash Vahdat, Morteza Mardani
Diffusion models achieve state-of-the-art generation quality across many applications, but their ability to capture rare or extreme events in heavy-tailed distributions remains unclear. In this work, we show that traditional diffusion and flow-matching models with standard Gaussian priors fail to capture heavy-tailed behavior. We address this by repurposing the diffusion framework for heavy-tail estimation using multivariate Student-t distributions. We develop a tailored perturbation kernel and derive the denoising posterior based on the conditional Student-t distribution for the backward process. Inspired by $\gamma$-divergence for heavy-tailed distributions, we derive a training objective for heavy-tailed denoisers. The resulting framework introduces controllable tail generation using only a single scalar hyperparameter, making it easily tunable for diverse real-world distributions. As specific instantiations of our framework, we introduce t-EDM and t-Flow, extensions of existing diffusion and flow models that employ a Student-t prior. Remarkably, our approach is readily compatible with standard Gaussian diffusion models and requires only minimal code changes. Empirically, we show that our t-EDM and t-Flow outperform standard diffusion models in heavy-tail estimation on high-resolution weather datasets in which generating rare and extreme events is crucial.
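The Student-t prior at the heart of this framework can be sketched as a Gaussian draw rescaled by an inverse chi-square factor, with the single scalar nu controlling tail heaviness (values below are illustrative, not the paper's settings):

```python
# Multivariate Student-t noise via a Gaussian scale mixture (illustrative).
import numpy as np

def student_t_prior(shape, nu, rng):
    """z ~ N(0, I), kappa ~ chi2(nu)/nu, x = z / sqrt(kappa)."""
    z = rng.standard_normal(shape)
    kappa = rng.chisquare(nu, size=(shape[0], 1)) / nu
    return z / np.sqrt(kappa)

rng = np.random.default_rng(0)
for nu in (3, 10, 1e6):                  # nu -> infinity recovers the Gaussian
    x = student_t_prior((100_000, 1), nu, rng)
    print(f"nu={nu:>9}: P(|x|>4) = {np.mean(np.abs(x) > 4):.4f}")  # heavier tails
```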
Noah D. Brenowitz, Tom Beucler, Michael Pritchard, Christopher S. Bretherton
Neural networks are a promising technique for parameterizing sub-grid-scale physics (e.g. moist atmospheric convection) in coarse-resolution climate models, but their lack of interpretability and reliability prevents widespread adoption. For instance, it is not fully understood why neural network parameterizations often cause dramatic instability when coupled to atmospheric fluid dynamics. This paper introduces tools for interpreting their behavior that are customized to the parameterization task. First, we assess the nonlinear sensitivity of a neural network to lower-tropospheric stability and mid-tropospheric moisture, two widely-studied controls of moist convection. Second, we couple the linearized response functions of these neural networks to simplified gravity-wave dynamics, and analytically diagnose the corresponding phase speeds, growth rates, wavelengths, and spatial structures. To demonstrate their versatility, these techniques are tested on two sets of neural networks, one trained with a super-parameterized version of the Community Atmosphere Model (SPCAM) and the second with a near-global cloud-resolving model (GCRM). Even though the SPCAM simulation has a warmer climate than the cloud-resolving model, both neural networks predict stronger heating/drying in moist and unstable environments, which is consistent with observations. Moreover, the spectral analysis predicts that instability occurs when GCMs are coupled to networks that support unstable gravity waves with phase speeds larger than 5 m/s. In contrast, standing unstable modes do not cause catastrophic instability. Using these tools, differences between the SPCAM- and GCRM-trained neural networks are analyzed, and strategies to incrementally improve the coupled online performance of both are unveiled.
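A sketch of the second tool's starting point, the linearized response function (toy sizes; the paper's full analysis couples the Jacobian to gravity-wave dynamics before diagnosing phase speeds and growth rates):

```python
# Linear response function of a toy parameterization around a base state.
import torch
import torch.nn as nn

n = 8                                            # toy state vector (e.g. T, q columns)
param = nn.Sequential(nn.Linear(n, 32), nn.Tanh(), nn.Linear(32, n))

x0 = torch.zeros(n)                              # base state
J = torch.autograd.functional.jacobian(param, x0)  # linearized response matrix
eig = torch.linalg.eigvals(J)
print("max growth rate (Re lambda):", eig.real.max().item())
```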
Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis OBrien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard
In Part I, we created an ensemble based on Spherical Fourier Neural Operators. As initial condition perturbations, we used bred vectors, and as model perturbations, we used multiple checkpoints trained independently from scratch. Based on diagnostics that assess the ensemble's physical fidelity, our ensemble has comparable performance to operational weather forecasting systems. However, it requires orders of magnitude fewer computational resources. Here in Part II, we generate a huge ensemble (HENS), with 7,424 members initialized each day of summer 2023. We enumerate the technical requirements for running huge ensembles at this scale. HENS precisely samples the tails of the forecast distribution and presents a detailed sampling of internal variability. HENS has two primary applications: (1) as a large dataset with which to study the statistics and drivers of extreme weather and (2) as a weather forecasting system. For extreme climate statistics, HENS samples events $4\sigma$ away from the ensemble mean. At each grid cell, HENS increases the skill of the most accurate ensemble member and enhances coverage of possible future trajectories. As a weather forecasting model, HENS issues extreme weather forecasts with better uncertainty quantification. It also reduces the probability of outlier events, in which the verification value lies outside the ensemble forecast distribution.
Teng Wang, Sarp Oral, Michael Pritchard, Kevin Vasko, Weikuan Yu
Modern parallel filesystems such as Lustre are designed to provide high, scalable I/O bandwidth in response to growing I/O requirements; however, the bursty I/O characteristics of many data-intensive scientific applications make it difficult for back-end parallel filesystems to efficiently handle I/O requests. A burst buffer system, through which data can be temporarily buffered via high-performance storage media, allows for gradual flushing of data to back-end filesystems. In this paper, we explore issues surrounding the development of a burst buffer system for data-intensive scientific applications. Our initial results demonstrate that utilizing a burst buffer system on top of the Lustre filesystem shows promise for dealing with the intense I/O traffic generated by application checkpointing.
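A toy sketch of the burst-buffer pattern in user-space Python (paths and sizes are illustrative; real burst buffers sit below the application I/O stack): bursty checkpoint writes land on fast local storage while a background thread gradually drains them to the back-end filesystem.

```python
# Burst-buffer pattern: absorb bursty writes locally, flush in the background.
import os
import queue
import shutil
import tempfile
import threading

burst_dir = tempfile.mkdtemp(prefix="bb_")       # stand-in for node-local SSD
pfs_dir = tempfile.mkdtemp(prefix="lustre_")     # stand-in for Lustre

pending = queue.Queue()

def drain():                                     # gradual flush to the back end
    while True:
        name = pending.get()
        if name is None:
            break
        shutil.move(os.path.join(burst_dir, name), os.path.join(pfs_dir, name))

flusher = threading.Thread(target=drain, daemon=True)
flusher.start()

for step in range(3):                            # bursty application checkpoints
    name = f"ckpt_{step}.bin"
    with open(os.path.join(burst_dir, name), "wb") as f:
        f.write(os.urandom(1 << 16))             # fast local write; app continues
    pending.put(name)

pending.put(None)
flusher.join()
print(sorted(os.listdir(pfs_dir)))
```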
Tom Beucler, Michael Pritchard, Pierre Gentine, Stephan Rasp
Data-driven algorithms, in particular neural networks, can emulate the effect of sub-grid scale processes in coarse-resolution climate models if trained on high-resolution climate simulations. However, they may violate key physical constraints and lack the ability to generalize outside of their training set. Here, we show that physical constraints can be enforced in neural networks, either approximately by adapting the loss function or to within machine precision by adapting the architecture. As these physical constraints are insufficient to guarantee generalizability, we additionally propose to physically rescale the training and validation data to improve the ability of neural networks to generalize to unseen climates.
Tom Beucler, Pierre Gentine, Janni Yuval, Ankitesh Gupta, Liran Peng, Jerry Lin, Sungduk Yu, Stephan Rasp, Fiaz Ahmed, Paul A. O'Gorman, J. David Neelin, Nicholas J. Lutsko, Michael Pritchard
Projecting climate change is a generalization problem: we extrapolate the recent past using physical models across past, present, and future climates. Current climate models require representations of processes that occur at scales smaller than model grid size, which have been the main source of model projection uncertainty. Recent machine learning (ML) algorithms hold promise to improve such process representations, but tend to extrapolate poorly to climate regimes they were not trained on. To get the best of the physical and statistical worlds, we propose a new framework - termed "climate-invariant" ML - incorporating knowledge of climate processes into ML algorithms, and show that it can maintain high offline accuracy across a wide range of climate conditions and configurations in three distinct atmospheric models. Our results suggest that explicitly incorporating physical knowledge into data-driven models of Earth system processes can improve their consistency, data efficiency, and generalizability across climate regimes.
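One example of a climate-invariant transformation (our illustration; the saturation formula is a standard Tetens-style approximation): mapping specific humidity to relative humidity, whose distribution overlaps far better across cold and warm climates.

```python
# Climate-invariant input transformation sketch: q -> relative humidity.
import numpy as np

def relative_humidity(q, T, p):
    """q: specific humidity [kg/kg], T: temperature [K], p: pressure [Pa]."""
    es = 611.2 * np.exp(17.67 * (T - 273.15) / (T - 29.65))  # sat. vapor pressure
    qs = 0.622 * es / (p - 0.378 * es)                       # sat. specific humidity
    return q / qs

T_cold, T_warm = 280.0, 288.0             # +8 K warmer climate
p = 90_000.0
q_cold, q_warm = 0.006, 0.010             # raw humidity shifts strongly...
print(relative_humidity(q_cold, T_cold, p), relative_humidity(q_warm, T_warm, p))
# ...while relative humidity stays comparable across the two climates.
```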
Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Joshua Elms, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis OBrien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard
Studying low-likelihood high-impact extreme weather events in a warming world is a significant and challenging task for current ensemble forecasting systems. While these systems presently use up to 100 members, larger ensembles could enrich the sampling of internal variability. They may capture the long tails associated with climate hazards better than traditional ensemble sizes. Due to computational constraints, it is infeasible to generate huge ensembles (comprised of 1,000-10,000 members) with traditional, physics-based numerical models. In this two-part paper, we replace traditional numerical simulations with machine learning (ML) to generate hindcasts of huge ensembles. In Part I, we construct an ensemble weather forecasting system based on Spherical Fourier Neural Operators (SFNO), and we discuss important design decisions for constructing such an ensemble. The ensemble represents model uncertainty through perturbed-parameter techniques, and it represents initial condition uncertainty through bred vectors, which sample the fastest growing modes of the forecast. Using the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (IFS) as a baseline, we develop an evaluation pipeline composed of mean, spectral, and extreme diagnostics. Using large-scale, distributed SFNOs with 1.1 billion learned parameters, we achieve calibrated probabilistic forecasts. As the trajectories of the individual members diverge, the ML ensemble mean spectra degrade with lead time, consistent with physical expectations. However, the individual ensemble members' spectra stay constant with lead time. Therefore, these members simulate realistic weather states, and the ML ensemble thus passes a crucial spectral test in the literature. The IFS and ML ensembles have similar Extreme Forecast Indices, and we show that the ML extreme weather forecasts are reliable and discriminating.
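A sketch of the bred-vector cycle used for the initial-condition perturbations (the model below is a toy nonlinear map standing in for the SFNO forecast step):

```python
# Bred-vector cycle sketch: rescaled forecast differences align with the
# fastest-growing error directions (toy dynamics, illustrative amplitudes).
import numpy as np

def model(x):                             # stand-in forecast step
    return x + 0.1 * np.sin(x) + 0.05 * np.roll(x, 1)

def breed(x0, delta0, n_cycles, amp):
    """Run control and perturbed forecasts; rescale the difference each cycle."""
    x, xp = x0.copy(), x0 + delta0
    for _ in range(n_cycles):
        x, xp = model(x), model(xp)
        d = xp - x
        d *= amp / np.linalg.norm(d)      # rescale to fixed amplitude
        xp = x + d
    return d                              # bred vector: perturbs ensemble ICs

rng = np.random.default_rng(0)
x0 = rng.standard_normal(100)
bv = breed(x0, 0.01 * rng.standard_normal(100), n_cycles=20, amp=0.01)
print(np.linalg.norm(bv))
```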
Scott A. Martin, Noah Brenowitz, Dale Durran, Michael Pritchard
Accurate long-range weather forecasting remains a major challenge for AI models, both because errors accumulate over autoregressive rollouts and because reanalysis datasets used for training offer a limited sample of the slow modes of climate variability underpinning predictability. Most AI weather models are autoregressive, producing short lead forecasts that must be repeatedly applied to reach subseasonal-to-seasonal (S2S) or seasonal lead times, often resulting in instability and calibration issues. Long-timestep probabilistic models that generate long-range forecasts in a single step offer an attractive alternative, but training on the 40-year reanalysis record leads to overfitting, suggesting orders of magnitude more training data are required. We introduce long-range distillation, a method that trains a long-timestep probabilistic "student" model to forecast directly at long-range using a huge synthetic training dataset generated by a short-timestep autoregressive "teacher" model. Using the Deep Learning Earth System Model (DLESyM) as the teacher, we generate over 10,000 years of simulated climate to train distilled student models for forecasting across a range of timescales. In perfect-model experiments, the distilled models outperform climatology and approach the skill of their autoregressive teacher while replacing hundreds of autoregressive steps with a single timestep. In the real world, they achieve S2S forecast skill comparable to the ECMWF ensemble forecast after ERA5 fine-tuning. The skill of our distilled models scales with increasing synthetic training data, even when that data is orders of magnitude larger than ERA5. This represents the first demonstration that AI-generated synthetic training data can be used to scale long-range forecast skill.
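A sketch of the distillation loop (toy models and sizes: the real teacher is DLESyM and the real student is probabilistic): the teacher's autoregressive rollout synthesizes long-range targets, which supervise a student that covers the same lead time in a single step.

```python
# Long-range distillation sketch: K teacher steps -> one student step.
import torch
import torch.nn as nn

D, K = 32, 100                             # toy state size; teacher steps per target
W = torch.linalg.qr(torch.randn(D, D)).Q   # orthogonal: norm-preserving toy dynamics

def teacher(x):                            # frozen stand-in for a trained AR model
    return x @ W.T

student = nn.Sequential(nn.Linear(D, 64), nn.ReLU(), nn.Linear(64, D))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for it in range(200):
    x0 = torch.randn(16, D)                # sample initial states
    with torch.no_grad():                  # teacher synthesizes long-range targets
        xk = x0
        for _ in range(K):
            xk = teacher(xk)
    loss = ((student(x0) - xk) ** 2).mean()  # one step replaces K steps
    opt.zero_grad(); loss.backward(); opt.step()

print(float(loss))
```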
Tom Beucler, Stephan Rasp, Michael Pritchard, Pierre Gentine
Artificial neural networks have the potential to emulate cloud processes with higher accuracy than the semi-empirical emulators currently used in climate models. However, neural-network models do not intrinsically conserve energy and mass, which is an obstacle to using them for long-term climate predictions. Here, we propose two methods to enforce linear conservation laws in neural-network emulators of physical models: constraining (1) the loss function or (2) the architecture of the network itself. Applied to the emulation of explicitly-resolved cloud processes in a prototype multi-scale climate model, we show that architecture constraints can enforce conservation laws to satisfactory numerical precision, while all constraints help the neural network better generalize to conditions outside of its training set, such as global warming.
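Complementing the architecture-based sketch given earlier in this listing, the loss-function route can be sketched as a soft penalty on the budget residual (illustrative network and constraint, not the paper's model):

```python
# Loss-level linear constraint: penalize violations of sum(y) = 0.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 5))

def constrained_loss(x, y_true, alpha=1.0):
    y = net(x)
    mse = ((y - y_true) ** 2).mean()
    violation = (y.sum(dim=1) ** 2).mean()   # budget residual penalty
    return mse + alpha * violation

loss = constrained_loss(torch.randn(8, 10), torch.randn(8, 5))
loss.backward()
print(float(loss))
```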
Jerry Lin, Mohamed Aziz Bhouri, Tom Beucler, Sungduk Yu, Michael Pritchard
Accurate and computationally viable representations of clouds and turbulence are a long-standing challenge for climate model development. Traditional parameterizations that crudely but efficiently approximate these processes are a leading source of uncertainty in long-term projected warming and precipitation patterns. Machine Learning (ML)-based parameterizations have long been hailed as a promising alternative with the potential to yield higher accuracy at a fraction of the cost of more explicit simulations. However, these ML variants are often unpredictably unstable and inaccurate in coupled testing (i.e. in a downstream hybrid simulation task where they are dynamically interacting with the large-scale climate model). These issues are exacerbated in out-of-distribution climates. Certain design decisions, such as "climate-invariant" feature transformation for moisture inputs, input vector expansion, and the incorporation of temporal history, have been shown to improve coupled performance, but they may be insufficient for coupled out-of-distribution generalization. If feature selection and transformations can inoculate hybrid physics-ML climate models against non-physical, out-of-distribution extrapolation in a changing climate, there is far greater potential for extrapolating from observational data. Otherwise, training on multiple simulated climates becomes an inevitable necessity. While our results show generalization benefits from these design decisions, the improvement obtained does not preclude the necessity of using multi-climate simulated training data.