
I work on reinforcement learning at Google DeepMind — general objectives beyond a single scalar reward, meta-learning, and diversity in decision-making — and on applying these ideas to mathematics, games, and reasoning in large language models.
Featured Projects
AlphaProof
An AI system that taught itself to prove mathematical theorems in Lean, reaching silver-medal performance at the International Mathematical Olympiad through continuous reinforcement learning.
COrigami
Combines reinforcement learning with Gemini to design origami crease patterns, using a semantic representation and a visual feedback loop to fold arbitrary target shapes.
PuzzleGen
Generates original chess puzzles with reinforcement learning, rewarding uniqueness, counter-intuitiveness, and novelty. Evaluated by chess grandmasters and featured on lichess and chess.com.
AlphaZero db
Explores diversity in AI decision-making by training a league of agents with distinct playing styles, revealing multiple qualitatively different ways to play chess at a superhuman level.
LLMs can't jump
A position paper examining a fundamental limitation of large language models: their difficulty with abductive reasoning, a capacity central to genuine scientific invention.