Data-efficient surrogate modeling of spectral functions using Gaussian processes: An application to the $t$-$t'$-$t''$-$J$ model
Sanket Jantre, Nathan M. Urban, Weiguo Yin, Niraj Aryal
Abstract
Spectral functions encode key many-body information but are costly to compute with high fidelity. Machine-learning surrogates have emerged as a powerful alternative, yet many approaches require large training datasets. We develop a data-efficient surrogate for spectral functions using the $t$-$t'$-$t''$-$J$ model, which describes the motion of a hole in a quantum antiferromagnet. Using $\sim 10^5$ self-consistent Born approximation-based spectra from Lee, Carbone, and Yin [Phys. Rev. B 107, 205132 (2023)], we train a deep-kernel Gaussian process surrogate model with sparse variational inference (DKL-SVGP) using only 10% of the available training spectra. We benchmark against feed-forward neural networks (FFNN) trained on the same reduced subset and on the full dataset. The proposed DKL-SVGP model consistently outperforms the reduced-data FFNN and, despite using only 10% of the training spectra, achieves spectrum-wise errors within the same order of magnitude as the full-data FFNN baseline. Worst-tail diagnostics show improved fidelity on difficult spectra, while peak-level analysis indicates that DKL-SVGP recovers dominant peak heights with comparable accuracy and improves peak-location agreement under a matched-peak evaluation that mitigates rare peak-swapping cases. Overall, these results highlight GP-based surrogates as a competitive and data-efficient approach for spectral-function prediction in scarce-data regimes.
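As background for the surrogate-modeling idea, the sketch below fits an exact Gaussian process with a squared-exponential kernel to a toy one-dimensional "spectrum" (a single Lorentzian peak). It is purely illustrative: the paper's DKL-SVGP additionally learns a neural-network feature map and uses sparse variational inducing points, neither of which is shown here, and all function names, kernel hyperparameters, and the toy data are assumptions of this sketch rather than details taken from the paper.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel k(x, x') = s^2 exp(-(x - x')^2 / (2 l^2))."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-3):
    """Exact GP posterior mean and pointwise variance at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    # Cholesky-based solve for numerical stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Toy "spectrum": a Lorentzian peak A(w) = G / (w^2 + G), G = 0.1,
# sampled at 15 training frequencies (a stand-in for scarce training data).
omega_train = np.linspace(-2.0, 2.0, 15)
A_train = 0.1 / (omega_train**2 + 0.1)

omega_test = np.linspace(-2.0, 2.0, 200)
mean, var = gp_posterior(omega_train, A_train, omega_test)
```

The posterior variance is what distinguishes a GP surrogate from a plain regressor: it grows away from the training frequencies, giving an uncertainty estimate alongside the predicted spectrum.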