Publications

Preprints

Benchmarking and Studying the LLM-based Agent System in End-to-End Software Development

Zhengran Zeng*, Yixin Li*, Rui Xie, Wei Ye, and Shikun Zhang

Published as an arXiv preprint, 2025

We construct E2EDevBench and a hybrid evaluation framework to benchmark LLM-based agent systems for end-to-end software development.