OJALÁ: Optimizing J-PAS Astronomy for Large-scale Analysis. A foundation model for the SED of galaxies, QSOs and stars
/ Authors
G. Martínez-Solaeche, R. M. G. Delgado, R. García-Benito, A. Hernán-Caballero, I. Pérez-Rafols, L. Díaz-García, L. Abramo, J. E. Rodríguez-Martín, A. M. Conrado, I. Breda
H. D. Sánchez, I. Márquez, M. Pieri, D. López-Cano, V. Placco, L. Nakazono, A. Pino, V. Marra, J. Alcaniz, N. Benítez, S. Bonoli, S. Carneiro, A. Cenarro, D. Cristóbal-Hornillos, S. Daflon, R. Dupke, A. Ederoclite, C. Hernández-Monteagudo, J. Liu, C. López-Sanjuan, A. Marín-Franch, C. M. D. Oliveira, M. Moles, F. Roig, L. Sodré, K. Taylor, J. Varela, H. V. Ramió, J. M. Vílchez, J. Zaragoza-Cardiel
/ Abstract
The advent of large-scale surveys requires efficient machine learning techniques to exploit the information in massive datasets. We present OJALA, a transformer-based autoregressive foundation model designed to simultaneously classify astronomical objects and infer their physical parameters using 54 narrow bands from J-PAS, combined with broad bands from the DESI Legacy Imaging Surveys and WISE. The model is trained on $\sim20$ million synthetic spectral energy distributions (SEDs) generated from DESI DR1 spectra. We validate OJALA on a sample of $\sim121{,}000$ objects cross-matched between J-PAS and DESI. The model achieves a weighted F1-score of approximately 0.9 for spectral classification (stars, galaxies, and QSOs) at $i<21$. For galaxies, we recover photometric redshifts with a precision of $\sigma_{\rm NMAD}<0.01$, while for QSOs the precision improves significantly at $z>1.5$, reaching $\sigma_{\rm NMAD} \approx 0.006$ at $z \approx 3.5$. We demonstrate robust estimation of physical properties for galaxies, recovering stellar masses and star formation rates with a scatter of approximately 0.11 dex and 0.22 dex, respectively. Furthermore, the model accurately predicts equivalent widths for the major optical emission lines, allowing for the derivation of extinction-corrected H$\alpha$ luminosities with a scatter of 0.29 dex. OJALA successfully reproduces the BPT and WHAN diagnostic diagrams, classifying star-forming, AGN, and passive galaxies with F1-scores typically ranging from 70% to 90% depending on the diagnostic class. For stars, the model reliably infers effective temperature and metallicity, though surface gravity remains challenging. Finally, we show the modularity of the architecture by fine-tuning the pre-trained embeddings to predict black hole masses, a property not included in the primary training, recovering spectroscopic virial estimates with a precision of approximately 0.5 dex. We release the code, model weights, and a comprehensive value-added catalog for the J-PAS Early Data Release.
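The two headline metrics quoted above, $\sigma_{\rm NMAD}$ and the weighted F1-score, can be illustrated with a minimal Python sketch. It assumes the commonly used definition $\sigma_{\rm NMAD} = 1.4826 \times {\rm median}\left(|\delta z - {\rm median}(\delta z)|\right)$ with $\delta z = (z_{\rm phot} - z_{\rm spec})/(1 + z_{\rm spec})$, and a support-weighted mean of per-class F1-scores; the abstract does not state the exact conventions adopted, and the function names `sigma_nmad` and `weighted_f1` are hypothetical, not part of the released code.

```python
from statistics import median


def sigma_nmad(z_phot, z_spec):
    """Normalized median absolute deviation of redshift residuals.

    Assumed convention (not confirmed by the abstract):
    dz = (z_phot - z_spec) / (1 + z_spec).
    """
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    med = median(dz)
    # 1.4826 rescales the MAD so it matches the standard deviation
    # of a Gaussian distribution.
    return 1.4826 * median(abs(d - med) for d in dz)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1-scores, weighted by each class's true count."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Support is the number of true members of the class (tp + fn).
        score += f1 * (tp + fn) / total
    return score
```

With such helpers, a call like `sigma_nmad(z_phot, z_spec)` on the galaxy subsample and `weighted_f1(true_class, predicted_class)` on the star/galaxy/QSO labels would yield numbers comparable in kind to those quoted, provided the same conventions are used.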