Reasoning as a Resource: Optimizing Fast and Slow Thinking in Code Generation Models
/ Authors
/ Abstract
This position paper proposes a fundamental shift in designing code generation models: treating reasoning depth as a controllable resource. Rather than being an incidental by-product of prompting, we argue that the trade-off between rapid, direct answers ("fast thinking") and elaborate, chain-of-thought deliberation ("slow thinking") must be explicitly managed. We contend that optimizing reasoning budgets across the entire model lifecycle—from synthetic data creation and benchmarking to real-world deployment—can unlock superior trade-offs among accuracy, latency, and cost. This paper outlines how adaptive control over reasoning can enrich supervision signals, motivate new multi-dimensional benchmarks, and inform cost-aware, security-conscious deployment policies. By viewing fast and slow thinking as complementary modes to be scheduled, we envision coding agents that think deep when necessary and act fast when possible.
Journal: Proceedings of the 1st ACM SIGPLAN International Workshop on Language Models and Programming Languages