Motif-Driven Contrastive Learning of Graph Representations
/ Authors
/ Abstract
Pre-training Graph Neural Networks (GNNs) via self-supervised contrastive learning has recently drawn considerable attention. However, most existing works focus on node-level contrastive learning, which cannot capture global graph structure. The key challenge in subgraph-level contrastive learning is to sample informative subgraphs that are semantically meaningful. To address this challenge, we propose to learn graph motifs, which are frequently occurring subgraph patterns (e.g., functional groups of molecules), for better subgraph sampling. Our framework, MotIf-driven Contrastive leaRning Of Graph representations (MICRO-Graph), can: 1) use GNNs to extract motifs from large graph datasets; and 2) leverage the learned motifs to sample informative subgraphs for contrastive learning of GNNs. We formulate motif learning as a differentiable clustering problem and adopt EM-clustering to group similar and significant subgraphs into motifs. Guided by these learned motifs, a sampler is trained to generate more informative subgraphs, which are then used to train GNNs through graph-to-subgraph contrastive learning. By pre-training on the ogbg-molhiv dataset with MICRO-Graph, the pre-trained GNN achieves an average improvement of 2.04% ROC-AUC on various downstream benchmark datasets, significantly outperforming other state-of-the-art self-supervised learning baselines.
Journal: IEEE Transactions on Knowledge and Data Engineering
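The abstract describes graph-to-subgraph contrastive learning, where whole-graph embeddings are contrasted against embeddings of subgraphs sampled from them. The sketch below is a minimal, hedged illustration of one standard way such an objective can be written (an InfoNCE/NT-Xent-style loss); it is not the paper's exact formulation, and the GNN encoder, motif learning, and motif-guided sampler are omitted. The function name, temperature value, and placeholder embeddings are illustrative assumptions.

```python
# Illustrative sketch: graph-to-subgraph contrastive loss (InfoNCE-style).
# Assumption: row i of `subgraph_emb` is a subgraph sampled from the graph
# whose embedding is row i of `graph_emb` (positive pair); all other rows
# in the batch serve as negatives.
import torch
import torch.nn.functional as F


def graph_to_subgraph_infonce(graph_emb: torch.Tensor,
                              subgraph_emb: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    # L2-normalize so the dot product is cosine similarity.
    g = F.normalize(graph_emb, dim=-1)
    s = F.normalize(subgraph_emb, dim=-1)
    logits = g @ s.t() / temperature                      # [batch, batch] similarity matrix
    targets = torch.arange(g.size(0), device=g.device)    # diagonal entries are positives
    # Symmetric cross-entropy: graph->subgraph and subgraph->graph directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Random tensors stand in for GNN-produced whole-graph and subgraph embeddings.
    graphs = torch.randn(32, 128)
    subgraphs = torch.randn(32, 128)
    print(graph_to_subgraph_infonce(graphs, subgraphs).item())
```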