Patch-aware Vector Quantized Codebook Learning for Unsupervised Visual Defect Detection
Authors
Abstract
Unsupervised visual defect detection is essential across various industrial applications. Typically, this involves learning a representation space that captures only the features of normal data and then identifying defects by measuring deviations from this norm. However, balancing the expressiveness and compactness of this space is challenging. The space must be comprehensive enough to encapsulate all regular patterns of normal data, yet not overly expressive, which wastes computational and storage resources and may cause mode collapse, blurring the distinction between normal and defective embeddings and impairing detection accuracy. To overcome these issues, we introduce a novel approach using an extended VQ-VAE framework optimized for unsupervised defect detection. Unlike traditional methods that apply a constant, uniform representation capacity across an image, our model employs a patch-aware dynamic code assignment scheme: it learns to allocate codes of varying resolutions according to the contextual richness of different image regions, optimizing code usage spatially for each sample in a learnable fashion. We further leverage the learned code-allocation strategy at inference time to enlarge the discrepancy between normal and defective samples, thereby improving detection. Extensive experiments on the MVTecAD, BTAD, and MTSD datasets demonstrate that our model achieves state-of-the-art performance.
Venue: 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI)
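To make the core mechanism of the abstract concrete, below is a minimal PyTorch sketch of patch-aware vector quantization: each spatial patch of the encoder output is quantized either coarsely (one code per patch) or finely (one code per location), with the resolution chosen by a learned per-patch gate. All names, sizes, and the Gumbel-softmax gating are illustrative assumptions for this sketch, not the paper's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchAwareVQ(nn.Module):
    """Sketch of patch-aware dynamic code assignment over a VQ codebook.
    Hyperparameters (dim, num_codes, patch) are assumed, not from the paper."""

    def __init__(self, dim=64, num_codes=512, patch=2):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        # One coarse-vs-fine logit pair per (patch x patch) region.
        self.gate = nn.Conv2d(dim, 2, kernel_size=patch, stride=patch)
        self.patch = patch

    def quantize(self, z):
        # z: (B, D, H, W) -> nearest codebook entry per spatial location.
        B, D, H, W = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, D)
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        q = self.codebook(idx).view(B, H, W, D).permute(0, 3, 1, 2)
        # Straight-through estimator so gradients flow back to the encoder.
        return z + (q - z).detach()

    def forward(self, z):
        p = self.patch
        # Fine path: one code per spatial location.
        q_fine = self.quantize(z)
        # Coarse path: pool each patch to one vector, quantize, upsample.
        q_coarse = F.interpolate(
            self.quantize(F.avg_pool2d(z, p)), scale_factor=p, mode="nearest"
        )
        # Learned gate makes a hard per-patch resolution choice
        # (Gumbel-softmax keeps the choice differentiable during training).
        g = F.gumbel_softmax(self.gate(z), tau=1.0, hard=True, dim=1)
        g = F.interpolate(g, scale_factor=p, mode="nearest")
        return g[:, :1] * q_coarse + g[:, 1:] * q_fine

# Usage: quantize a dummy feature map.
vq = PatchAwareVQ()
z = torch.randn(2, 64, 16, 16)
q = vq(z)  # (2, 64, 16, 16), mixed coarse/fine codes per patch

At inference, the same per-patch gate could be reused as described in the abstract: regions whose content does not match the learned allocation pattern receive poorly fitting codes, enlarging the reconstruction gap between normal and defective samples.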