A Lightweight and Accurate Spatial-Temporal Transformer for Traffic Forecasting
/ Authors
/ Abstract
We study traffic forecasting under dynamic, possibly periodic, and joint spatial-temporal dependencies between regions. Given the aggregated inflow and outflow traffic of the regions in a city over time slots 0 to <inline-formula><tex-math notation="LaTeX">$t - 1$</tex-math></inline-formula>, we predict the traffic at time <inline-formula><tex-math notation="LaTeX">$t$</tex-math></inline-formula> for any region. Prior work often treats the spatial and temporal dependencies in a decoupled manner, or is computationally intensive to train, with a large number of hyper-parameters that need tuning. We propose ST-TIS, a novel, lightweight and accurate <bold>S</bold>patial-<bold>T</bold>emporal <bold>T</bold>ransformer with <bold>i</bold>nformation fusion and region <bold>s</bold>ampling for traffic forecasting. ST-TIS extends the canonical Transformer with two modules. The information fusion module captures the complex spatial-temporal dependency between regions. 
The region sampling module improves both efficiency and prediction accuracy, cutting the computational complexity of dependency learning from <inline-formula><tex-math notation="LaTeX">$O(n^{2})$</tex-math></inline-formula> to <inline-formula><tex-math notation="LaTeX">$O(n\sqrt{n})$</tex-math></inline-formula>, where <inline-formula><tex-math notation="LaTeX">$n$</tex-math></inline-formula> is the number of regions. With far fewer parameters than state-of-the-art deep learning models, ST-TIS is significantly faster to train offline, in both tuning and computation (with a reduction of up to <inline-formula><tex-math notation="LaTeX">$90\%$</tex-math></inline-formula> in training time and network parameters). 
Despite this training efficiency, extensive experiments show that ST-TIS is substantially more accurate in online prediction than state-of-the-art approaches (with an average improvement of <inline-formula><tex-math notation="LaTeX">$9.5\%$</tex-math></inline-formula> in RMSE and <inline-formula><tex-math notation="LaTeX">$12.4\%$</tex-math></inline-formula> in MAPE compared to STDN and DSAN).
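
To make the complexity claim concrete, the following is a minimal sketch of attention in which each region scores only about the square root of n sampled regions rather than all n, so the number of region pairs scored drops from n squared to roughly n times the square root of n. It is not the ST-TIS implementation: the abstract does not describe how regions are sampled, so uniform random sampling, the function name `sampled_attention`, and the toy feature vectors are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sampled_attention(features, rng):
    """Each region attends to ceil(sqrt(n)) sampled regions instead of all n.

    `features` holds one equal-length float vector per region.
    Uniform sampling is an assumption for illustration only; the abstract
    does not specify the sampling strategy used by ST-TIS.
    Returns the attended vectors and the number of region pairs scored.
    """
    n = len(features)
    k = math.isqrt(n - 1) + 1  # integer ceil(sqrt(n)) for n >= 1
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs, pairs_scored = [], 0
    for query in features:
        sampled = rng.sample(range(n), k)  # k distinct region indices
        scores = [
            sum(q * x for q, x in zip(query, features[j])) / scale
            for j in sampled
        ]
        weights = softmax(scores)
        pairs_scored += k
        # Weighted sum of the sampled regions' feature vectors.
        outputs.append([
            sum(w * features[j][d] for w, j in zip(weights, sampled))
            for d in range(dim)
        ])
    return outputs, pairs_scored


rng = random.Random(0)
n, dim = 100, 4
features = [[rng.random() for _ in range(dim)] for _ in range(n)]
outputs, pairs_scored = sampled_attention(features, rng)
print(pairs_scored, n * n)  # 1000 pairs scored versus 10000 for full attention
```

With 100 regions, each region scores 10 others, giving 1000 scored pairs in place of the 10000 that full pairwise attention would need. The gap widens as the number of regions grows.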
Journal: IEEE Transactions on Knowledge and Data Engineering