MLEvolve
#1 on MLE-bench in 12 Hours, Now Open Source
MLEvolve is an automated machine learning engineering system for Kaggle-style competitions. It combines progressive Monte Carlo Graph Search (MCGS), multi-agent collaboration, and experience-driven memory into a full loop from planning to coding, validation, and iterative optimization.
🚀 Open Source: InternScience / MLEvolve
Main Idea
In long-horizon automation tasks, a system should not stop at writing one solution. It needs to continuously search, validate, and refine. MLEvolve turns Plan → Build → Evaluate → Evolve into a repeatable optimization loop so agents can approach better solutions under limited budgets.
Core Innovations
- 🌲 Progressive MCGS Search: Instead of following a single trial-and-error path, MLEvolve explores multiple candidate branches in parallel on a graph. When progress stalls, it performs cross-branch fusion, recombining useful strategies from the top-performing nodes. Budget-aware explore/exploit switching moves the search smoothly from broad exploration to focused refinement, improving convergence speed and robustness (a selection sketch follows this list).
- 🧠 Experience-Driven Global Memory: Every attempt is stored as a retrievable quadruple of plan, code, metrics, and a success/failure tag. Future nodes can reuse proven patterns and steer around known failure routes, reducing repeated mistakes. As memory grows at runtime, the system becomes increasingly task-aware and self-improving (see the memory sketch below).
- 🛠️ Multi-Mode Adaptive Planning: MLEvolve follows a plan-code decoupled workflow and dynamically selects among Base / Stepwise / Diff modes based on task state. Base quickly builds full solutions, Stepwise decomposes long reasoning chains, and Diff applies targeted incremental patches. The modes can be chained for efficient iteration from broad strategy to precise fixes (see the mode-dispatch sketch below).
- 🔁 Closed-Loop Validation and Optimization: MLEvolve connects proposal generation, code execution, metric feedback, and strategy updates into an automated loop. Each round of feedback directly adjusts the search and planning priorities for the next round, turning the system from a simple code generator into a result-driven decision maker (see the loop sketch below).
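To make budget-aware explore/exploit switching concrete, here is a minimal Python sketch of a UCT-style selection rule whose exploration weight decays as the runtime budget is consumed. Everything here (`uct_score`, `exploration_weight`, the `c_max`/`c_min` constants, the node fields) is an illustrative assumption, not MLEvolve's actual implementation.

```python
import math

def uct_score(total_value: float, visits: int, parent_visits: int, c: float) -> float:
    """Standard UCT: exploit high average value, explore under-visited branches."""
    if visits == 0:
        return float("inf")  # always try an unvisited branch first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

def exploration_weight(budget_used: float, budget_total: float,
                       c_max: float = 1.4, c_min: float = 0.2) -> float:
    """Budget-aware schedule: broad exploration early, focused refinement late."""
    frac = min(budget_used / budget_total, 1.0)
    return c_max - (c_max - c_min) * frac

# Example: choosing a branch halfway through a 12-hour budget (in seconds).
branches = [
    {"value": 3.2, "visits": 4},  # summed validation scores / visit count
    {"value": 0.9, "visits": 1},
]
c = exploration_weight(budget_used=6 * 3600, budget_total=12 * 3600)
best = max(branches, key=lambda b: uct_score(b["value"], b["visits"], parent_visits=5, c=c))
```

Annealing the exploration weight toward zero is one simple way to realize the "broad exploration to focused refinement" behavior the bullet describes: late in the budget, the score is dominated by the exploitation term.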
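The attempt quadruple from the memory bullet might look like the following sketch. The `Attempt` and `GlobalMemory` names and the `val_score` metric key are hypothetical, shown only to make "reuse proven patterns, avoid known failure routes" concrete.

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    plan: str      # natural-language strategy
    code: str      # the solution script
    metrics: dict  # e.g. {"val_score": 0.87}
    success: bool  # success/failure tag

class GlobalMemory:
    """Append-only store of attempts, queried when planning new nodes."""

    def __init__(self) -> None:
        self.attempts: list[Attempt] = []

    def record(self, attempt: Attempt) -> None:
        self.attempts.append(attempt)

    def proven_patterns(self, top_k: int = 3) -> list[Attempt]:
        # Reuse: the best-scoring successful attempts.
        wins = [a for a in self.attempts if a.success]
        return sorted(wins, key=lambda a: a.metrics.get("val_score", 0.0),
                      reverse=True)[:top_k]

    def known_failures(self) -> list[str]:
        # Avoid: plans that have already failed.
        return [a.plan for a in self.attempts if not a.success]
```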
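A toy dispatch for the three planning modes, assuming a deliberately simplified notion of task state; MLEvolve's real selection criteria are richer than these two signals.

```python
from enum import Enum

class Mode(Enum):
    BASE = "base"          # build a full solution from scratch
    STEPWISE = "stepwise"  # decompose a long reasoning chain into steps
    DIFF = "diff"          # patch an existing solution incrementally

def select_mode(has_working_solution: bool, plan_steps: int) -> Mode:
    """Toy dispatch on task state (both input signals are illustrative)."""
    if has_working_solution:
        return Mode.DIFF       # a targeted incremental fix is cheapest
    if plan_steps > 5:
        return Mode.STEPWISE   # long chains benefit from decomposition
    return Mode.BASE           # otherwise, draft a full solution in one pass
```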
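Finally, the closed loop itself, sketched as one possible control flow. Here `task`, `search`, and `memory` stand for hypothetical interfaces, and every method name (`select`, `propose`, `implement`, `run_and_score`, `record`, `update`) is an assumption made for illustration.

```python
import time

def evolve(task, search, memory, budget_seconds: float):
    """One possible shape of the Plan -> Build -> Evaluate -> Evolve loop."""
    start, best = time.time(), None
    while time.time() - start < budget_seconds:
        node = search.select()              # pick a branch to extend
        plan = node.propose(task, memory)   # Plan: draft a strategy, informed by memory
        code = node.implement(plan)         # Build: turn the plan into a script
        metrics = task.run_and_score(code)  # Evaluate: execute and read back metrics
        memory.record(plan, code, metrics)  # store the attempt quadruple
        search.update(node, metrics)        # Evolve: reweight search priorities
        if best is None or metrics["val_score"] > best["val_score"]:
            best = {**metrics, "code": code}
    return best
```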
Results and Impact
MLEvolve ranks first on MLE-bench with an Any Medal rate of 61.33% using only a 12-hour runtime budget. It also serves as a key optimization engine in InternAgent 1.5 for longer-horizon scientific discovery workflows.