MLEvolve

#1 on MLE-bench — 65.3% medal rate in only 12 hours

Rank: #1
Medal Rate (All): 65.3%
Runtime: 12h
Tasks: 75

About MLEvolve

MLEvolve is a self-evolving multi-agent system that automatically solves Kaggle-style ML competitions through Monte Carlo Graph Search (MCGS). It combines progressive search, experience-driven memory, and multi-mode adaptive code generation into a closed-loop optimization framework.

Beyond ML engineering, MLEvolve generalizes to open mathematical optimization problems, matching or surpassing purpose-built optimization frameworks like AlphaEvolve.

Core Innovations

Progressive MCGS Search

Explores multiple branches in parallel with budget-aware explore/exploit switching and cross-branch fusion when progress stalls.
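
As a rough illustration of budget-aware explore/exploit switching, the sketch below uses a UCB1-style score whose exploration weight decays linearly as the step budget is consumed. The names (`Node`, `exploration_weight`, `select_child`), the linear decay schedule, and the constant `c0` are assumptions made for illustration; this page does not specify MLEvolve's actual selection rule.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    """One candidate solution in the search graph (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0

    @property
    def mean_reward(self) -> float:
        return self.total_reward / self.visits if self.visits else 0.0


def exploration_weight(steps_used: int, budget: int, c0: float = 1.4) -> float:
    """Assumed schedule: explore early, exploit as the budget runs out."""
    remaining = max(0.0, 1.0 - steps_used / budget)
    return c0 * remaining


def select_child(children: list[Node], steps_used: int, budget: int) -> Node:
    """Pick the child with the highest UCB1-style score."""
    c = exploration_weight(steps_used, budget)
    parent_visits = sum(ch.visits for ch in children)

    def score(ch: Node) -> float:
        if ch.visits == 0:
            return math.inf  # always try unvisited children first
        bonus = c * math.sqrt(math.log(parent_visits) / ch.visits)
        return ch.mean_reward + bonus

    return max(children, key=score)


children = [
    Node("well_explored", visits=20, total_reward=14.0),  # mean reward 0.70
    Node("barely_tried", visits=2, total_reward=1.2),     # mean reward 0.60
]
early = select_child(children, steps_used=0, budget=100)
late = select_child(children, steps_used=100, budget=100)
print(early.name, late.name)
```

With the full budget remaining, the exploration bonus favors the rarely visited branch; once the budget is spent, the weight drops to zero and the choice falls back to the branch with the best mean reward.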

Experience-Driven Memory

Stores plan, code, metrics, and success/failure labels per node. BM25 + FAISS retrieval reinforces proven strategies.
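
To make the memory layout concrete, here is a minimal sketch of a per-node record and a lexical lookup over stored plans. It implements Okapi BM25 in pure Python as a stand-in for the lexical half of the retrieval; the dense FAISS half is left out to keep the example dependency-free. The record fields mirror the description above, but all names (`MemoryRecord`, `bm25_scores`, `retrieve`) and the rule of returning only successful records are assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    """Per-node experience entry; field names are illustrative."""
    plan: str
    code: str
    metric: float
    success: bool


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document against the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(memory: list[MemoryRecord], query: str, top_k: int = 2) -> list[MemoryRecord]:
    """Return the top-k successful records ranked by BM25 over their plans."""
    scores = bm25_scores(query, [m.plan for m in memory])
    ranked = sorted(zip(scores, memory), key=lambda pair: pair[0], reverse=True)
    return [m for s, m in ranked if m.success and s > 0][:top_k]


# Toy memory; the code field is left empty because only plans are searched here.
memory = [
    MemoryRecord("gradient boosting with target encoding", "", 0.91, True),
    MemoryRecord("fine-tune a vision transformer with mixup", "", 0.88, True),
    MemoryRecord("gradient boosting with raw categorical features", "", 0.72, False),
]
hits = retrieve(memory, "gradient boosting for tabular data")
print([h.plan for h in hits])
```

In this toy run only the first record is returned: the second shares no terms with the query, and the third matches but carries a failure label.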

Multi-Mode Code Generation

Dynamically selects among Base, Stepwise, and Diff modes according to task state, so that iterations range efficiently from strategy-level changes to precise fixes.
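
One way to picture mode selection is as a small dispatch on task state. The sketch below is purely hypothetical: the meaning attached to each mode, the `TaskState` fields, and the switching rules are assumptions for illustration, not MLEvolve's documented policy.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    BASE = "base"          # assumed: generate a complete solution from a plan
    STEPWISE = "stepwise"  # assumed: revise the solution one stage at a time
    DIFF = "diff"          # assumed: emit a small, targeted patch


@dataclass
class TaskState:
    """Hypothetical summary of where a branch currently stands."""
    has_working_solution: bool
    last_run_failed: bool


def choose_mode(state: TaskState) -> Mode:
    """Illustrative rules: start broad, then narrow to precise fixes."""
    if not state.has_working_solution:
        return Mode.BASE      # nothing runnable yet, so draft a full solution
    if state.last_run_failed:
        return Mode.DIFF      # a known-good base broke, so patch it minimally
    return Mode.STEPWISE      # otherwise refine stage by stage


print(choose_mode(TaskState(has_working_solution=False, last_run_failed=False)))
print(choose_mode(TaskState(has_working_solution=True, last_run_failed=True)))
print(choose_mode(TaskState(has_working_solution=True, last_run_failed=False)))
```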

Closed-Loop Optimization

Connects code execution, metric feedback, and strategy updates into an automated loop for result-driven decisions.
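
The loop can be pictured with a toy example in which a "solution" is a single number, execution returns a score, and the strategy update adjusts the step size. Everything here (`closed_loop`, the step-halving rule, the quadratic score) is an assumed stand-in chosen to keep the sketch runnable; it is not the system's actual update rule.

```python
from typing import Callable


def closed_loop(
    execute: Callable[[float], float],
    start: float,
    step: float = 1.0,
    iterations: int = 30,
) -> tuple[float, float]:
    """Execute a candidate, read its metric, then update the search strategy."""
    best, best_metric = start, execute(start)
    for _ in range(iterations):
        candidate = best + step
        metric = execute(candidate)   # code execution and metric feedback
        if metric > best_metric:
            best, best_metric = candidate, metric  # keep the improvement
        else:
            step = -step / 2          # strategy update: reverse and shrink
    return best, best_metric


def toy_score(x: float) -> float:
    """Toy metric that peaks at x = 3."""
    return -(x - 3.0) ** 2


best, best_metric = closed_loop(toy_score, start=0.0)
print(best, best_metric)
```

Each decision in the loop depends only on the measured metric, which is the sense of "result-driven" used above.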

MLE-bench Results

Performance on the full MLE-bench set (75 tasks). MLEvolve achieves a 65.3% medal rate within a 12-hour runtime budget, ranking #1 among all methods.

Split     Medal rate (%)
Low       80.3 ± 1.5
Medium    64.0 ± 0.9
High      46.7 ± 0.0
All       65.3 ± 0.8
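
As a sanity check, the overall figure can be recomputed as a task-weighted average of the three splits. The split sizes used below (22 Low, 38 Medium, 15 High) come from MLE-bench's published task list and are an assumption here, since this page does not state them.

```python
# Medal rates (%) reported above, keyed by split.
medal_rate = {"Low": 80.3, "Medium": 64.0, "High": 46.7}

# Assumed split sizes (not stated on this page); they sum to the 75 tasks.
n_tasks = {"Low": 22, "Medium": 38, "High": 15}

total = sum(n_tasks.values())
overall = sum(medal_rate[s] * n_tasks[s] for s in medal_rate) / total
print(f"{total} tasks, overall medal rate {overall:.1f}%")
# prints: 75 tasks, overall medal rate 65.3%
```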

Mathematical Optimization

Beyond ML engineering, MLEvolve generalizes to open mathematical optimization problems. On 15 tasks from the AlphaEvolve benchmark, MLEvolve achieves competitive or superior results against purpose-built optimization frameworks including AlphaEvolve and AlphaEvolve-v2.

Problem                    AlphaEvolve    MLEvolve
Hex packing                3.930092       3.928476
Circle/square              2.635863       2.635983
Circle/rectangle           2.365832       2.365832
Heilbronn convex           0.030937       0.030937
Heilbronn triangles        0.036530       0.036530
Kissing number d11         593            592
Sum-diff 1                 1.147989       1.190177
Sum-diff 2                 1.158417       1.158546
Uncertainty inequality     0.352099       0.352099
Autocorrelation 1st        1.505294       1.502863
Autocorrelation 3rd (v)    1.468762       1.458770
Autocorrelation 3rd        1.455643       1.454851
Autocorrelation 2nd        0.896280       0.905422
Max-to-min ratios          12.889266      12.889230
Minimum overlap            0.380923       0.380897

MLEvolve matches or surpasses AlphaEvolve on 14 of 15 tasks.

Citation

@article{du2026mlevolve,
  title={MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery},
  author={Du, Shangheng and Yan, Xiangchao and Shi, Jinxin and Cao, Zongsheng and Feng, Shiyang
          and Liang, Zichen and Sun, Boyuan and Peng, Tianshuo and Zhou, Yifan
          and Li, Xin and Zhou, Jie and He, Liang and Zhang, Bo and Bai, Lei},
  journal={arXiv preprint arXiv:2606.06473},
  year={2026}
}

@article{du2025automlgen,
  title={AutoMLGen: Navigating Fine-Grained Optimization for Coding Agents},
  author={Du, Shangheng and Yan, Xiangchao and Jiang, Dengyang and Yuan, Jiakang
          and Hu, Yusong and Li, Xin and He, Liang and Zhang, Bo and Bai, Lei},
  journal={arXiv preprint arXiv:2510.08511},
  year={2025}
}

@article{feng2026internagent,
  title={InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery},
  author={Shiyang Feng and Runmin Ma and Xiangchao Yan and Yue Fan and Yusong Hu and others},
  journal={arXiv preprint arXiv:2602.08990},
  year={2026}
}

Acknowledgments

The authors would like to thank Yunfeng Zhao and Yazhou Li from The Heart of The Machine (Beijing) Technology Co., Ltd. for their support in the application and promotion of MLEvolve.