Scaling the Horizon, Not the Parameters

Agents-A1

A 35B Mixture-of-Experts agentic model built to scale heterogeneous agent abilities across long-horizon search, engineering, scientific research, instruction following, and tool calling.

35B MoE agentic model
256K served context length
6 evaluation directions
Long-horizon search Engineering Scientific research Instruction following Tool calling

Designed for real agent workflows

Agents-A1 targets tasks where the model must plan, use tools, inspect intermediate state, and keep constraints intact across long contexts.

01

Agentic reasoning

Decomposes complex goals into executable sub-steps, plans ahead, and adapts based on intermediate observations.

02

Tool use

Supports function calling and external tools including APIs, code interpreters, search engines, and task environments.

03

Long context

Handles extended conversations and documents while preserving coherence, recall, and multi-step state.

04

Instruction following

Tracks detailed constraints across diverse domains, from scientific research prompts to structured tool workflows.

Broad agentic benchmark coverage

Agents-A1 is evaluated across long-horizon search, engineering tasks, scientific research, instruction following, general agentic tasks, and scientific agentic tasks.

Performance Matrix Agents-A1 versus leading agentic systems

Hover over any bar to inspect the model and score. Blue bars mark Agents-A1.

Seal-0Long-horizon search, overall SOTA result
56.36
BrowseCompBest among comparable 35B-class models
75.51
SciCodeEngineering tasks, best among comparable models
44.33
FrontierScience-ResearchScientific research, overall SOTA result
40.0
IFBenchInstruction following, overall SOTA result
80.61

Three-stage agent training

The model is trained with a domain-grounded knowledge-action graph that turns agent process traces into trainable targets.

Stage 1

Full-domain supervised fine-tuning

Aligns the base model with broad agentic behaviors across search, engineering, research, tools, and instructions.

Stage 2

Domain-level teacher models

Captures specialized expertise for each domain, giving the final model stronger and more varied supervision.

Stage 3

Multi-teacher on-policy distillation

Transfers expertise across heterogeneous domains with optimization designed for efficient knowledge transfer.

Serve with standard LLM runtimes

Agents-A1 can be served through SGLang or vLLM with OpenAI-compatible endpoints at http://localhost:8000/v1.

SGLang

python -m sglang.launch_server \
  --model-path InternScience/Agents-A1 \
  --port 8000 \
  --tp-size 1 \
  --mem-fraction-static 0.8 \
  --context-length 262144 \
  --reasoning-parser qwen3 \
  --tool-call-parser qwen3_coder

vLLM

vllm serve InternScience/Agents-A1 \
  --port 8000 \
  --tensor-parallel-size 1 \
  --max-model-len 262144 \
  --reasoning-parser qwen3 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder