Academic foundations and open-source projects of the SGIWorld evaluation ecosystem
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows. 1000+ expert-curated tasks based on Science's 125 Big Questions.
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning. 830 expert-verified VQA pairs.
An Open-source Evaluation Toolkit for Scientific General Intelligence. Unified benchmarking across 6 scientific domains.
End-to-End Auto-Research Benchmark. Evaluating AI agents for automated research, from Re-Discovery to New-Discovery, with an interactive task browser and leaderboard.
Dynamic AI Research Knowledge Graph. Interactive visualization and cross-benchmark analysis for tracking AI capabilities across scientific domains.