Linnaeus: A Hierarchical, Multi-Label Framework for Autonomous System Classification
cs.NI
/ Abstract
Autonomous systems (ASes) play diverse roles in today's Internet, from community and research backbones to hyperscale content providers and submarine-cable operators. However, existing taxonomies based solely on network-level features fail to capture their semantic and operational heterogeneity. In this paper, we present Linnaeus, a hierarchical AS-classification framework that combines network-centric data (e.g., topology, BGP announcements) with rich non-network features, using domain-adapted large language models alongside traditional machine-learning techniques. Linnaeus defines a two-level taxonomy with 18 top-level and 38 second-level classes, supports multi-label assignments to reflect hybrid roles (e.g., an AS that is both a research backbone and a transit provider), and offers an end-to-end pipeline from data ingestion to label inference. On a manually annotated dataset of nearly 2,000 ASes, Linnaeus achieves an overall precision of 0.83 and an overall recall of 0.76. We further demonstrate its practical value through case studies that highlight Linnaeus's potential to reveal both structural and semantic dimensions of Internet infrastructure.
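As an illustration of how precision and recall can be computed when each AS may carry several labels from a two-level taxonomy, the following is a minimal sketch, not the paper's evaluation code. It assumes micro-averaging over (AS, label) pairs, which the abstract does not specify. The label names, the "top/second" path encoding, and the helper names `top_level` and `micro_precision_recall` are hypothetical, and the AS numbers come from the range reserved for documentation.

```python
from typing import Dict, Set, Tuple

Labels = Dict[int, Set[str]]  # AS number -> set of "top/second" label paths


def top_level(labels: Set[str]) -> Set[str]:
    """Collapse "top/second" label paths to their top-level class."""
    return {label.split("/", 1)[0] for label in labels}


def micro_precision_recall(truth: Labels, pred: Labels) -> Tuple[float, float]:
    """Micro-averaged precision and recall over (AS, label) pairs.

    Only ASes present in `truth` are scored; an AS with no prediction
    contributes false negatives only.
    """
    tp = fp = fn = 0
    for asn, true_labels in truth.items():
        predicted = pred.get(asn, set())
        tp += len(true_labels & predicted)   # labels correctly assigned
        fp += len(predicted - true_labels)   # labels assigned but wrong
        fn += len(true_labels - predicted)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical annotations and predictions (illustrative only).
truth: Labels = {
    64496: {"research/backbone", "transit/regional"},  # hybrid role
    64497: {"content/hyperscale"},
    64498: {"access/community"},
}
pred: Labels = {
    64496: {"research/backbone"},
    64497: {"content/hyperscale", "transit/regional"},
    64498: {"access/municipal"},  # right top-level class, wrong second level
}

second_level_scores = micro_precision_recall(truth, pred)
top_level_scores = micro_precision_recall(
    {asn: top_level(labels) for asn, labels in truth.items()},
    {asn: top_level(labels) for asn, labels in pred.items()},
)
print("second level:", second_level_scores)
print("top level:", top_level_scores)
```

In this toy data the top-level scores are higher than the second-level scores, because a prediction that lands in the right top-level class but the wrong subclass is only penalized at the finer level.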