OpenAI's chief scientist outlines a roadmap to build fully autonomous scientific research agents, extending Codex-style coding automation into all domains of science.
OpenAI chief scientist Jakub Pachocki publicly committed to building fully automated scientific research systems, describing them as a natural extension of Codex-style coding agents. GPT-5, the model powering Codex, has already demonstrated early wins on unsolved math problems and on puzzles in biology, chemistry, and physics. OpenAI is training models on hard math and coding contests specifically to extend autonomous working windows and multi-step task management. Pachocki stated that an automated mathematician is technically buildable today but has been deprioritized in favor of real-world research impact.
The Codex architecture is being stress-tested as a general-purpose agentic substrate — not just for code. What's technically interesting: OpenAI is training long-horizon task management using math and coding contest data, which means the same agent primitives you use for code (tool calls, subtask decomposition, backtracking) will be the interface for scientific problem-solving. If you're building on the Codex or GPT-5 API, expect context window utilization and multi-step orchestration to become the primary performance levers, not just prompt quality.
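The three primitives named above can be sketched as a minimal agent loop. Everything below (the `Subtask` class, the simulated `execute` tool call, the retry/backtrack policy) is an illustrative assumption for clarity, not OpenAI's actual Codex internals:

```python
from dataclasses import dataclass

# Minimal sketch of agent primitives: subtask decomposition,
# tool calls, and backtracking. All names and logic here are
# hypothetical illustrations, not the real Codex architecture.

@dataclass
class Subtask:
    description: str
    attempts: int = 0
    done: bool = False

def decompose(goal: str) -> list[Subtask]:
    # A real agent would ask the model for a plan; we fake one.
    return [Subtask(f"{goal}: step {i}") for i in range(1, 4)]

def execute(task: Subtask) -> bool:
    # Stand-in for a tool call; succeeds on the second attempt,
    # so the retry path below is exercised.
    task.attempts += 1
    return task.attempts >= 2

def run_agent(goal: str, max_retries: int = 3) -> list[Subtask]:
    plan = decompose(goal)
    i = 0
    while i < len(plan):
        if execute(plan[i]):
            plan[i].done = True
            i += 1                      # advance to the next subtask
        elif plan[i].attempts >= max_retries:
            i = max(0, i - 1)           # backtrack: revisit prior subtask
            plan[i].done = False
        # otherwise: retry the same subtask
    return plan

plan = run_agent("refactor module")
print([(t.done, t.attempts) for t in plan])  # → [(True, 2), (True, 2), (True, 2)]
```

The point of the sketch: orchestration quality (how the loop retries, backtracks, and advances) is where the performance lever sits, independent of any single prompt.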
Run a Codex agent on a non-trivial multi-step debugging task this week — measure how many subtasks it correctly decomposes before needing human correction. Use this as your baseline before the next capability bump lands.
Open the Codex CLI or API and submit: 'Analyze this codebase, identify the top 3 architectural bottlenecks, propose refactors for each, then implement the highest-impact one.' Observe how far it gets autonomously before stalling — this is your current agentic ceiling.
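A minimal way to record the baseline suggested above, assuming you log each decomposed subtask and whether the agent handled it without help. The log format and function name are hypothetical conveniences, not any Codex API:

```python
# Hypothetical baseline tracker for the exercise above: count how
# many subtasks the agent decomposes correctly before the first one
# that needs human correction. The log format is an assumption.

def autonomy_baseline(subtask_log: list[tuple[str, bool]]) -> int:
    """Number of consecutive correct subtasks before the first
    human correction."""
    count = 0
    for _description, correct in subtask_log:
        if not correct:
            break
        count += 1
    return count

# Example session log: (subtask description, correct without help?)
log = [
    ("locate bottleneck in request handler", True),
    ("propose caching refactor", True),
    ("implement cache layer", False),   # needed human correction
    ("write regression tests", True),
]
print(autonomy_baseline(log))  # → 2
```

Re-run the same task after each model update and compare the number; a rising baseline is your concrete signal that the agentic ceiling moved.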