How Many Qubits Does a Machine Learning Problem Require?
Authors
Abstract
Quantum machine learning (QML) promises computational advantages for complex learning tasks, but identifying which datasets stand to benefit remains an open question. The recently proposed bit-bit encoding scheme encodes both inputs and outputs as bitstrings, yielding quantum models that are universal approximators. Under bit-bit encoding, the number of representable input-output pairs grows exponentially with the number of qubits, which makes it possible to calculate how many qubits are required to fully represent a given dataset. Datasets that can be covered with fewer than 50 qubits are unlikely to admit a quantum advantage, since models of that size remain classically simulable. We use bit-bit encoding to perform a resource estimation study on both synthetic and real-world classification datasets. On synthetic data, we analyze how qubit requirements scale with the number of features and samples. On real datasets, we compare qubit requirements across different classical dimensionality reduction schemes. We find that all tested datasets require 49 qubits or fewer for full coverage, regardless of the dimensionality reduction method. This suggests that standard, medium-sized, single-label classification datasets are unlikely to see performance gains from QML. In future work, we will explore more complex tasks, such as multi-label, sequential, and regression problems, which may require more than 50 qubits for coverage and could therefore be promising candidates for quantum advantage.
Published in: 2025 IEEE International Conference on Quantum Computing and Engineering (QCE)
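To give a rough sense of the counting argument in the abstract, the sketch below estimates a dataset's qubit requirement under a simplified bit-bit-style encoding. It assumes each feature is discretized to a fixed number of bits and the label occupies its own register, so the qubit count is just the total bit length of an input-output pair; the function name `required_qubits` and this exact formula are illustrative assumptions, not the paper's precise coverage procedure. The 50-qubit classical-simulability threshold matches the one quoted in the abstract.

```python
import math

def required_qubits(n_features: int, bits_per_feature: int, n_classes: int) -> int:
    """Rough qubit estimate for a classification dataset under a
    bit-bit-style encoding (assumed formula, for illustration only):
    one register holding the discretized input bits plus one register
    large enough to hold the class label as a bitstring."""
    input_qubits = n_features * bits_per_feature        # input bitstring length
    output_qubits = math.ceil(math.log2(n_classes))     # label bitstring length
    return input_qubits + output_qubits

if __name__ == "__main__":
    # Hypothetical example: 12 features at 4 bits each, 3 classes
    # -> 48 + 2 = 50 qubits, right at the simulability boundary.
    n = required_qubits(n_features=12, bits_per_feature=4, n_classes=3)
    print(f"estimated qubits: {n}")
    if n < 50:
        print("below 50 qubits: classically simulable, quantum advantage unlikely")
    else:
        print("50 qubits or more: beyond easy classical simulation")
```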