Date          Name
Jun 6, 2022   On Efficient Approximate Queries over Machine Learning Models
Jun 6, 2024   OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
Jun 28, 2025  BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
Apr 22, 2024  Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing
Oct 25, 2023  LLM Performance Predictors are good initializers for Architecture Search
Sep 2, 2025   Dynamic Speculative Agent Planning
Jan 9, 2025   ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries
Nov 3, 2024   EcoAct: Economic Agent Determines When to Register What Action
Oct 18, 2024  Supervised Chain of Thought
Mar 13, 2025  Why Prompt Design Matters and Works: A Complexity Analysis of Prompt Search Space in LLMs
Jun 13, 2025  Semantic Scheduling for LLM Inference