Dynamic Speculative Agent Planning
Yilin Guan, Wenyue Hua, Qingfeng Lan, Fei Sun, Dujian Ding, Devang Acharya, Chi Wang, William Yang Wang
ICLR, 2026
arXiv / code
A dynamic speculative planning framework for LLM-based agents that accelerates multi-step reasoning by adaptively speculating future actions.
AsyncSpade: Efficient Test-Time Scaling with Asynchronous Sparse Decoding
Shuqing Luo*, Yilin Guan*, Pingzhi Li, Hanrui Wang, Tianlong Chen
* Equal contribution
arXiv, 2025
arXiv
An asynchronous framework for efficient test-time scaling: lightweight temporal-regressive query prediction and disaggregated KV-cache filtering overlapped with inference.