Interpretable feature interaction via statistical self-supervised learning on tabular data
/ Authors
/ Abstract
In high-stakes scientific contexts, explainable AI is crucial for deriving meaningful insights from complex tabular data. A formidable challenge is ensuring both rigorous statistical guarantees and clear interpretability in feature extraction. While traditional methods such as principal component analysis are limited by linearity assumptions, powerful neural network approaches often lack the transparency required in scientific domains. To address this gap, we introduce Spofe, a novel self-supervised learning pipeline that makes nonlinear feature interactions interpretable. Spofe marries the power of kernel principal components (KPCs) for capturing complex dependencies with a sparse, principled polynomial representation to achieve clear interpretability with statistical rigor. Our approach bridges data-driven complexity and statistical reliability in three stages. First, it generates self-supervised signals via kernel principal component analysis (KPCA) to model complex patterns. Second, it distills these signals into sparse polynomial functions for interpretability. Third, it constructs the Spofe statistic by aggregating knockoff feature-importance scores across the self-supervised signals, and applies a data-adaptive threshold to identify significant polynomial interactions with rigorous false discovery rate (FDR) control. Extensive experiments on diverse real-world datasets demonstrate the effectiveness of Spofe, which achieves competitive or superior performance relative to other feature-selection methods in regression and classification tasks. In particular, applications to physics datasets highlight the ability of the proposed method to produce scientifically valid and interpretable explanations, reinforcing its practical utility and the critical role of explainability in AI for science.
Journal: Machine Learning: Science and Technology